I am trying to conduct an hierarchical Bayesian analysis using OpenBUGS in R via the library R2OpenBUGS
but I keep running into an error message during the early stage of model compilation. I am trying to model plant functional traits as a function of plant size and environment. For several plant species, I have data on traits, diameter at breast height (dbh), and altitude.
My model and data specification are as follows (excluding the data generation part):
model = function() {
for(i in 1:22){
mu[i] ~ dnorm(0, 0.00001)
env[i] ~ dnorm(0, 0.00001)
s[i] ~ dnorm(0, 0.00001)
me[i] ~ dnorm(0, 0.00001)
for(j in 1:np[i]){
tr[i, j] ~ dnorm(eff[i, j], tau)
eff[i, j] <- mu[i] + s[i] * log(dbh[i, j]) + env[i] * pow(edev[i, j], 2) * (edev[i, j] / abs(edev[i, j]))
edev[i, j] <- e[i, j] - me[i]
}
}
tau ~ dnorm(0, 0.00001)
sig <- 1 / tau
}
library(R2OpenBUGS)
model.file = file.path(tempdir(), "model.txt")
write.model(model, model.file)
data.th = list(tr = tr.th, dbh = dbh, e = elevadj, np = np)
params = c("mu", "env", "s", "me", "sig")
inits = function() { list(mu = rep(1, 22), s = rep(1, 22), env = rep(1, 22), me = rep(1, 22), tau = 1) }
out.th = bugs(data.th, inits, params, model.file, n.iter = 10000, codaPkg = T)
Explanation for the parameters to be modeled:
mu
: species means
: parameter for size effectenv
: parameter for environmental (altitudinal) effectme
: theoretical "optimal" altitude for each speciessig
: error term, expressed in varianceThe data are stored in matrices tr.th
, dbh
, elevadj
, with each row representing each plant species. My data is unbalanced: each species contains different numbers of sample points. np
is a vector containing the number of actual (non-NA
) sample points for each plant species. (I don't know if actual data are needed to spot the error, if requested I'll post a sample up here)
Whenever I try to run this model, I encounter a error message which says "variable dbh is not defined", and it fails to be compiled. It is quite bizarre because I have successfully run a model before which is not so much different from the one I'm trying to run now. I suspect it's probably just a simple, stupid mistake on formatting or specification, since I have only started Bayesian analysis using OpenBUGS in Rfor a few weeks now, and may be missing some crucial points.
I have looked on the site and found the following questions, but I have not been successful in applying to solve my problem:
Neither has reading the manual helped: it provides the following text, which is too vague for me to understand:
'undefined variable' - undefined variables are allowed in the model so long as they are not used on the right hand side of a relation
I am using RStudio, version 0.98.1091 and OpenBUGS, version 3.2.3 rev 1012.
I would appreciate a lot if someone could help me pinpoint the root of the problem and ways to fix it.
I'm actually not sure if I should edit my question or add comments instead, but I found the source of my problem so I will first try to address it here. In a nutshell, it is not the code that's wrong, but rather the data format.
As I said above, in my model I use the vector np
to specify the number of actual (non-NA) sample points for different plant species, each of which is evaluated in the j
loop. It turns out that the code I use to generate np
is incorrect, so that it does not actually match the number of non-NA sample points. Thus when the model tries to call the data, it includes the NA
in my unbalanced data, and I suspect that is the source of the problem because I've read somewhere elso.because I store my unbalanced data in a matrix.
In order to prvent this question from becoming a personal blog post, I think the take-home message would be:
NA
I'm thinking it might be better to edit my question to address the second and third point more directly, but I'm not yet sure how to do that.