rintervalsjagsrjagsrunjags

In Rjags/runjags, what causes the "node inconsistent with parents" error when using dinterval?


I have wracked my brain trying to come up with a solution to this problem and I'm at wits end! First, the necessary context: Aquatic plants in lakes are sampled with rakes. You throw a rake out into the lake, you pull it back into your boat, and you figure out what plants are on its tines. In our case, we measure both presence/absence as well as "abundance," but in an ordinal/interval-censored way --> it's 0 if species X isn't noticed on the rake at all, 1 if it covers < 25% of the rake's tines, 2 if it covers between 25 and 75%, and 3 if it covers > 75%. However, it's fairly easy to miss a species entirely when it's in low abundance, so 0s are sketchy--they may not represent true absences, and that is really the issue our model is trying to explore.

So, there are really three layers here--a true, fully latent abundance that we don't observe directly at all, a partially latent "true presence/absence" in that we know where true presences are but not where true absences are, and then we have our observed presence/absence data. What's more interesting is that we think some environmental variables may affect both true abundance and true occurrence but differently, and then other variables may affect detectability, and it's those processes we're trying to tease apart.

So, anyhow, my actual model is much larger and more complicated than what I've pasted below, but here is a sort of functional (but probably academically meritless) training version of it that replicates the error I am getting.

#data setup
N = 1500 #Number of cases
obs = sample(c(0,1,2,3), N, 
             replace=T, prob=c(0.7, 0.2, 0.075, 0.025)) #Our observed, interval-censored data.
X1 = rnorm(N) #Some covariate that probably affects both occurrance and abundance but maybe in different ways.
abundances = rep(NA, times = N) #Abundance is a latent variable we don't directly observe. From elsewhere, I know the values here need to be NAs so the model will know to impute them
occur = rep(1, times = N) #Occurance is a degraded form of our abundance data.

#d will be the initials for the abundance data, since this is apparently needed to jumpstart the imputation.
d = vector()
for(o in 1:N) {
  if (obs[o]==0) { d[o] = 0.025; occur[o] = 0 }
  if (obs[o]==1) { d[o] = 0.15 }
  if (obs[o]==2) { d[o] = 0.5 }
  if (obs[o]==3) { d[o] = 0.875 }
}

#Data
test.data = list("N" = N, 
                 "obs" = obs,
                 "X1" = X1, 
                 "abund" = abundances,
                 "lim" = c(0.05, 0.25, 0.75, 0.9999),
                 "occur" = occur)
#Inits
inits = list(abund = d)

cat("model
    {
    for (i in 1:N) {
    
obs[i] ~ dinterval(abund[i], lim)
abund[i] ~ dbeta(theta[i], rho[i]) T(0.0001, 0.9999)
theta[i] <- mu[i] * epsilon
rho[i] <- epsilon * (1-mu[i])

logit(mu[i]) <- alpha1 + X.beta1 * X1[i]

occur[i] ~ dbern(phi[i])
logit(phi[i]) <- alpha2 + X.beta2 * X1[i]

    }
    
#Priors
epsilon ~ dnorm(5, 0.1) T(0.01, 10)
alpha1 ~ dnorm(0, 0.01)
X.beta1 ~ dnorm(0, 0.01)
alpha2 ~ dnorm(0, 0.01)
X.beta2 ~ dnorm(0, 0.01)
}
 ", file = "training.txt")

test.run = jags.model(file = "training.txt", inits = inits, data=test.data, n.chains = 3) 

params = c("epsilon", 
                "alpha1",
                "alpha2",
                "X.beta1", 
                "X.beta2")

run1 = run.jags("training.txt", data = test.data, n.chains=3, burnin = 1000, sample = 5000, adapt = 4000, thin = 2,
                monitor = c(params), method="parallel", modules = 'glm')

At the end, I get this error, and I always get this error any time I try to do something even remotely like this:

Graph information: Observed stochastic nodes: 3000 Unobserved stochastic nodes: 1505 Total graph size: 19519 . Reading parameter file inits1.txt. Initializing model Error in node obs1 Node inconsistent with parents

I've read every posting that covers this error I can find, including this one, this one, this one, and this one. I can surmise from my research and testing that the error is probably occurring for one of the following reasons.

  1. My initials for the latent abundance variable are not adequate somehow. It sounds like this requires pretty useful initial values to work.
  2. One or more of my priors is allowing values that are not permissible OR they are too broad and that's causing problems somehow. This might be especially an issue because of the beta distribution I am using which has strong requirements about not having values outside of 0 and 1.
  3. I am using the dinterval() function incorrectly, which seems likely because it is always the line containing it that trips the error.
  4. My model is somehow mis-specified.

But I can't see where I might be going wrong--I have tried a number of different options for 1 and 2, and so far as I can tell from the documentation (see pages 55-56), I am using dinterval correctly. What am I missing??

In case it's relevant, from what I have gathered, the idea of dinterval() is that the variable on the left of the ~ is the interval-censored version of the variable given in the first argument (here, abundance). Then, the second argument (here, lim) is a vector of "breakpoints" that dictate which intervals the abundance data end up in. So, here, you end up with an observed abundance code of 0 if you are lower than the lowest lim (here, 0.05), 1 if you are in between the first two values in lim, etc. It's like the abundance variable is being pushed through a "binning sieve" created by the lim variable to produce a binned output variable, our observed abundances.

Any guidance would be most welcome!!


Solution

  • I have run your example with JAGS 4.3.0 and rjags 4-12. For me, the version with rjags runs correctly. The version with runjags does not work because you have not provided intial values. This is easily fixed by adding the argument

    inits=list(inits, inits, inits)
    

    to the call to run.jags().

    You have correctly understood the purpose of dinterval. This is an "observable function" which imposes constraints on its parameters via a likelihood. When using dinterval you must always provide initial values that satisfy the constraints from the fist iteration. As far as I can see, your initial values do satisfy the constraints and this is verified by the fact that I can run your example (with initial values).