Outcome prediction using JAGS from R

[Code is updated and does not correspond to error messages anymore]

I am trying to understand how JAGS predicts outcome values (for a mixed Markov model). I've trained the model on a dataset that includes the outcome m and the covariates x1, x2 and x3.

Predicting the outcome without fixing parameter values works in R, but the output seems completely random:

preds <- run.jags("model.txt",
                  data=list(x1=x1, x2=x2, x3=x3, m=m,
                            statealpha=rep(1,times=M), M=M, T=T, N=N), monitor=c("m_pred"),
                  n.chains=1, inits = NA, sample=1)

Compiling rjags model...
Calling the simulation using the rjags method...
Note: the model did not require adaptation
Burning in the model for 4000 iterations...
|**************************************************| 100%
Running the model for 1 iterations...
Simulation complete
Finished running the simulation

However, as soon as I try to fix parameters (i.e. use model estimates to predict outcome m), I get errors:

preds <- run.jags("model.txt",
                  data=list(x1=x1, x2=x2, x3=x3,
                            statealpha=rep(1,times=M), M=M, T=T, N=N, beta1=beta1), monitor=c("m"),
                  n.chains=1, inits = NA, sample=1)

Compiling rjags model...
Error: The following error occured when compiling and adapting the model using rjags:
 Error in rjags::jags.model(model, data = dataenv, n.chains = length(runjags.object$end.state), :
  RUNTIME ERROR:
Compilation error on line 39.
beta1[2,1] is a logical node and cannot be observed

beta1 in this case is a 2x2 matrix of coefficient estimates.

How is JAGS predicting m in the first example (no fixed parameters)? Is it just completely randomly choosing m?
How can I include earlier acquired model estimates to simulate new outcome values?

The model is:

model{
 for (i in 1:N)
 {

for (t in 1:T)
  {
  m[t,i] ~ dcat(ps[i,t,])
  }

for (state in 1:M)
  {
  ps[i,1,state] <- probs1[state]

  for (t in 2:T)
    {
    ps[i,t,state] <- probs[m[(t-1),i], state, i,t]
    }

  for (prev in 1:M){
       for (t in 1:T) {
    probs[prev,state,i,t] <- odds[prev,state,i,t]/totalodds[prev,i,t]
    odds[prev,state,i,t] <- exp(alpha[prev,state,i] +
                                beta1[prev,state]*x1[t,i]
                                + beta2[prev,state]*x2[t,i]
                               + beta3[prev,state]*x3[t,i])
    }}

  alpha[state,state,i] <- 0

      for (t in 1:T) {
  totalodds[state,i,t] <- odds[state,1,i,t] + odds[state,2,i,t]
  }
}
alpha[1,2,i] <- raneffs[i,1]
alpha[2,1,i] <- raneffs[i,2]
raneffs[i,1:2] ~ dmnorm(alpha.means[1:2],alpha.prec[1:2, 1:2])
}

for (state in 1:M)
  {
  beta1[state,state] <- 0
  beta2[state,state] <- 0
  beta3[state,state] <- 0
  }

beta1[1,2] <- rcoeff[1]
beta1[2,1] <- rcoeff[2]
beta2[1,2] <- rcoeff[3]
beta2[2,1] <- rcoeff[4]
beta3[1,2] <- rcoeff[5]
beta3[2,1] <- rcoeff[6]

alpha.Sigma[1:2,1:2] <- inverse(alpha.prec[1:2,1:2])
probs1[1:M] ~ ddirich(statealpha[1:M])
for (par in 1:6)
{
alpha.means[par] ~ dt(T.constant.mu,T.constant.tau,T.constant.k)
rcoeff[par] ~ dt(T.mu, T.tau, T.k)
}

T.constant.mu <- 0
T.mu <- 0
T.constant.tau <- 1/T.constant.scale.squared
T.tau <- 1/T.scale.squared
T.constant.scale.squared <- T.constant.scale*T.constant.scale
T.scale.squared <- T.scale*T.scale
T.scale <- 2.5
T.constant.scale <- 10
T.constant.k <- 1
T.k <- 1
alpha.prec[1:2,1:2] ~ dwish(Om[1:2,1:2],2)
Om[1,1] <- 1
Om[1,2] <- 0
Om[2,1] <- 0
Om[2,2] <- 1

## Prediction
for (i in 1:N)
    {

   m_pred[1,i] <- m[1,i]

    for (t in 2:T)  
      {
      m_pred[t,i] ~ dcat(ps_pred[i,t,])
      }

    for (state in 1:M)
      {
      ps_pred[i,1,state] <- probs1[state]

      for (t in 2:T)
        {
        ps_pred[i,t,state] <- probs_pred[m_pred[(t-1),i], state, i,t]
        }

      for (prev in 1:M)
        {

       for (t in 1:T)
         {
        probs_pred[prev,state,i,t] <- odds_pred[prev,state,i,t]/totalodds_pred[prev,i,t]
        odds_pred[prev,state,i,t] <- exp(alpha[prev,state,i] +
                                    beta1[prev,state]*x1[t,i]
                                    + beta2[prev,state]*x2[t,i]
                                   + beta3[prev,state]*x3[t,i])
        }}

      for (t in 1:T) {
      totalodds_pred[state,i,t] <- odds_pred[state,1,i,t] + odds_pred[state,2,i,t]
       }
      }
  }
}

Solution

TL;DR: I think you're just missing a likelihood.

Your model is complex, so perhaps I'm missing something, but as far as I can tell there is no likelihood. You are supplying the predictors x1, x2, and x3 as data, but you aren't giving any observed m. So in what sense can JAGS be "fitting" the model?

To answer your questions:

Yes, it appears that m is drawn at random from a categorical distribution conditioned on the rest of the model. Since no m is supplied as data, none of the parameter distributions have any reason to update, so your result for m is no different from what you'd get if you just did random draws from all the priors and propagated them through the model yourself, in R or otherwise.
Though it still wouldn't constitute fitting the model in any sense, you would be free to supply values for beta1 if they weren't already defined completely in the model. JAGS is complaining because every element of beta1 is assigned deterministically: the diagonal is fixed at 0, and the off-diagonal entries are beta1[1,2] <- rcoeff[1] and beta1[2,1] <- rcoeff[2], with rcoeff[par] ~ dt(T.mu, T.tau, T.k). That makes beta1 a logical node, and, as the error message says, a logical node cannot be observed; only stochastic nodes (those defined with ~) can be given as data. So if you want to fix the coefficients at earlier estimates, supplying rcoeff as data rather than beta1 should be accepted, since JAGS would then treat those values as observed and propagate them into beta1, beta2 and beta3. But in the model's current form, as far as JAGS is concerned, if you supply beta1 as data, that's in conflict with the fixed definition already in the model.
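To make the first point concrete, here is a minimal sketch in base R of what "random draws from the priors, propagated through the model" looks like. It is a heavily simplified stand-in for the model above, not a reimplementation: it assumes one subject, two states, a single made-up covariate x1, and it leaves out the random effects alpha as well as beta2 and beta3. The names n_time and m_sim are hypothetical.

```r
set.seed(123)

n_time <- 20                      # hypothetical number of time points
x1 <- rnorm(n_time)               # made-up covariate values, not real data

# Prior draws for the two off-diagonal coefficients.
# JAGS dt(0, 1/2.5^2, 1) is a t distribution with 1 df, i.e. a Cauchy with scale 2.5.
rcoeff <- rcauchy(2, location = 0, scale = 2.5)
beta1 <- matrix(0, 2, 2)          # diagonal fixed at 0, as in the model
beta1[1, 2] <- rcoeff[1]
beta1[2, 1] <- rcoeff[2]

# For two states, ddirich(c(1, 1)) reduces to a uniform draw for P(state 1)
p1 <- runif(1)
probs1 <- c(p1, 1 - p1)

# Propagate the prior draws through the Markov chain
m_sim <- integer(n_time)
m_sim[1] <- sample(1:2, 1, prob = probs1)
for (t in 2:n_time) {
  prev <- m_sim[t - 1]
  other <- 3 - prev
  # The odds of staying are exp(0) = 1, so P(switch) = exp(eta) / (1 + exp(eta))
  p_switch <- plogis(beta1[prev, other] * x1[t])
  m_sim[t] <- if (runif(1) < p_switch) other else prev
}
m_sim
```

Nothing in this sketch ever looks at an observed m, which is why the resulting sequence carries no information about the data. Using plogis rather than exp(eta) / (1 + exp(eta)) is only a numerical convenience, since Cauchy draws can be very large.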

I'm stretching here, but my guess is that if you're using JAGS, you have fit (or would like to fit) the model in JAGS too. It's a common pattern to include both an observed response and a desired predicted response in a JAGS model, e.g. something like this:

model {
  b ~ dnorm(0, 1) # prior on b

  for(i in 1:N) {
    y[i] ~ dnorm(b * x[i], 1) # Likelihood of y | b (and fixed precision = 1 for the example)
  }

  for(i in 1:N_pred) {
    pred_y[i] ~ dnorm(b * pred_x[i], 1) # Prediction
  }
}

In this example model, x, y, and pred_x are supplied as data, the unknown parameter b is to be estimated, and we desire the posterior predictions pred_y at each value of pred_x. JAGS knows that the distribution in the first for loop is a likelihood, because y is supplied as data. Posterior samples of b will be constrained by this likelihood. The second for loop looks similar, but since pred_y is not supplied as data, it can do nothing to constrain b. Instead, JAGS knows to simply draw pred_y samples conditioned on b and the supplied pred_x. The values of pred_x are commonly defined to be the same as observed x, giving a predictive interval for each observed data point, or as a regular sequence of values along the x axis to generate a smooth predictive interval.
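The difference between the two loops can be checked without JAGS, because this toy model happens to have a closed-form posterior. The following base R sketch is only that, a check under stated assumptions, and not what JAGS does internally (JAGS would sample b by MCMC). The data, the true slope of 1.5, and the names b_post, b_prior and draw_pred are all made up for illustration. Note that dnorm in JAGS takes a precision, so precision 1 corresponds to sd = 1 in R.

```r
set.seed(42)

# Made-up data for illustration (not from the question)
N <- 30
x <- runif(N, -2, 2)
y <- rnorm(N, mean = 1.5 * x, sd = 1)   # assumed true slope 1.5, precision 1

# pred_x as a regular sequence along the x axis
pred_x <- seq(min(x), max(x), length.out = 25)
n_draws <- 4000

# With b ~ N(0, 1) and y[i] ~ N(b * x[i], 1), the posterior of b is normal:
# precision 1 + sum(x^2), mean sum(x * y) / (1 + sum(x^2))
post_prec <- 1 + sum(x^2)
post_mean <- sum(x * y) / post_prec
b_post <- rnorm(n_draws, post_mean, sqrt(1 / post_prec))

# If y were not supplied, nothing would update b: draws come from the prior
b_prior <- rnorm(n_draws, 0, 1)

# Draw pred_y conditioned on each draw of b:
# one row per draw of b, one column per value of pred_x
draw_pred <- function(b) {
  mu <- outer(b, pred_x)
  mu + matrix(rnorm(length(mu)), nrow = nrow(mu))
}
pred_post <- draw_pred(b_post)
pred_prior <- draw_pred(b_prior)

# 95% predictive intervals at each pred_x
int_post <- apply(pred_post, 2, quantile, probs = c(0.025, 0.975))
int_prior <- apply(pred_prior, 2, quantile, probs = c(0.025, 0.975))
```

Comparing int_post with int_prior shows the effect of the likelihood: the intervals built from b_post are much narrower away from x = 0, while those built from b_prior reflect only the prior on b.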