## Friday, September 30, 2011

### JAGS, dinterval

Edit: I posed this question to Martyn Plummer and got a very useful response. I'll post it once I figure out the details.

One of the hazards of using something like JAGS, with complicated innards written by someone else, is that you don't have much of a chance to the problems coming ahead.  I happen to have a continuous process which is observed only in discrete units and it happens to mimic the dinterval distribution in JAGS.  Unfortunately I recently had to change the number of breaks used by dinterval by an order of magnitude and it turns out that it bogs down fairly quickly.  Here's an example:

cat('
model{
for ( i in 1:N) {
x[i] ~ dunif(0,10^K)
o[i] ~ dinterval(x[i],breaks)
}
}
', file = './dintervalProblem.bugs'
)

library(rjags)
mod <- list()
samp <- list()
timings <- list()
k <- c(3,4,5)
for ( i in 1:length(k) ) {
N <- 100
K <- k[i]
x <- runif(N,0,10^K)
o <- ceiling(x)
breaks <- seq(0,10^K,1)

timings[[i]] <- system.time(
{
mod[[i]] <- jags.model(
file = './dintervalProblem.bugs',
data = list(
N = N,
K = K,
o = o,
breaks = breaks
),
inits = list(
o = o,
x = x
),
n.chains = 1,
quiet = FALSE
)

samp[[i]] <- jags.samples(
model = mod1,
variable.names = c('x','o'),
n.iter = 1000,
thin = 1
)
}
)

}

My digging in the JAGS internals hasn't turned up anything yet which could be responsible for this behavior.  The value of the dinterval node is calculated by:

static unsigned int value(vector const &par, unsigned int ncut){    double t = T(par);    for (unsigned int i = 0; i < ncut; ++i) {  if (t <= CUTPOINTS(par)[i])      return i;    }    return ncut;}

It's not like there's a separate node for every breakpoint... though the number of nodes generated in the graphs does go up with K.

Suggestions?  (Especially all those folks coming here from Andrew Gelman's blog...)