rbayesianopenbugs

OpenBUGS error in R: "undefined variable"


I am trying to conduct an hierarchical Bayesian analysis using OpenBUGS in R via the library R2OpenBUGS but I keep running into an error message during the early stage of model compilation. I am trying to model plant functional traits as a function of plant size and environment. For several plant species, I have data on traits, diameter at breast height (dbh), and altitude.

My model and data specification are as follows (excluding the data generation part):

model = function() {
  for(i in 1:22){
    mu[i] ~ dnorm(0, 0.00001)
    env[i] ~ dnorm(0, 0.00001)
    s[i] ~ dnorm(0, 0.00001)
    me[i] ~ dnorm(0, 0.00001)

    for(j in 1:np[i]){
      tr[i, j] ~ dnorm(eff[i, j], tau)
      eff[i, j] <- mu[i] + s[i] * log(dbh[i, j]) + env[i] * pow(edev[i, j], 2) * (edev[i, j] / abs(edev[i, j]))
      edev[i, j] <- e[i, j] - me[i]
    }
  }

  tau ~ dnorm(0, 0.00001)
  sig <- 1 / tau
}

library(R2OpenBUGS)
model.file = file.path(tempdir(), "model.txt")
write.model(model, model.file)

data.th = list(tr = tr.th, dbh = dbh, e = elevadj, np = np)
params = c("mu", "env", "s", "me", "sig")
inits = function() { list(mu = rep(1, 22), s = rep(1, 22), env = rep(1, 22), me = rep(1, 22), tau = 1) }

out.th = bugs(data.th, inits, params, model.file, n.iter = 10000, codaPkg = T)

Explanation for the parameters to be modeled:

The data are stored in matrices tr.th, dbh, elevadj, with each row representing each plant species. My data is unbalanced: each species contains different numbers of sample points. np is a vector containing the number of actual (non-NA) sample points for each plant species. (I don't know if actual data are needed to spot the error, if requested I'll post a sample up here)

Whenever I try to run this model, I encounter a error message which says "variable dbh is not defined", and it fails to be compiled. It is quite bizarre because I have successfully run a model before which is not so much different from the one I'm trying to run now. I suspect it's probably just a simple, stupid mistake on formatting or specification, since I have only started Bayesian analysis using OpenBUGS in Rfor a few weeks now, and may be missing some crucial points.

I have looked on the site and found the following questions, but I have not been successful in applying to solve my problem:

Neither has reading the manual helped: it provides the following text, which is too vague for me to understand:

'undefined variable' - undefined variables are allowed in the model so long as they are not used on the right hand side of a relation

I am using RStudio, version 0.98.1091 and OpenBUGS, version 3.2.3 rev 1012.

I would appreciate a lot if someone could help me pinpoint the root of the problem and ways to fix it.


Solution

  • I'm actually not sure if I should edit my question or add comments instead, but I found the source of my problem so I will first try to address it here. In a nutshell, it is not the code that's wrong, but rather the data format.

    As I said above, in my model I use the vector np to specify the number of actual (non-NA) sample points for different plant species, each of which is evaluated in the j loop. It turns out that the code I use to generate np is incorrect, so that it does not actually match the number of non-NA sample points. Thus when the model tries to call the data, it includes the NA in my unbalanced data, and I suspect that is the source of the problem because I've read somewhere elso.because I store my unbalanced data in a matrix.

    In order to prvent this question from becoming a personal blog post, I think the take-home message would be:

    1. Check your codes thoroughly (duh)
    2. Be careful if your data contains NA
    3. It doesn't seem like such a good idea to store data in a matrix, especially when you have unbalanced data

    I'm thinking it might be better to edit my question to address the second and third point more directly, but I'm not yet sure how to do that.