r, bayesian, categorical-data, jags, rjags

Categorical data model with Dirichlet prior using rjags


I am trying to write a model using JAGS. The problem is that I am really new to the JAGS language. I have a dataset that is basically a survey with 6 questions and one variable to predict. All are categorical: some have 3 possible answers, others have 2 (the variable to predict has 3 possible answers). To build a Bayesian model in JAGS, I decided to use a categorical distribution to model Y (the likelihood) and Dirichlet distributions for the priors.

I have missing values in both the Xs and Y. After writing a model intended to include and impute those values, R gives me a fatal error and I cannot run the model. I cannot find any online source for such a case/model to try and replicate. My model is below; if you find anything strange in it, please point it out so I can fix the problem, and if the model seems fine but you know what causes the error, I would be glad to hear that as well.

Specifications:

- OS: macOS (Intel Core i5)
- RStudio, R version 4.1.2
- rjags version: 4-13

Model : 
model {
  # Likelihood
 for (i in 1:209) {
  FIN[i] ~ dcat(p[i,1:3])
  logit(p[i,1]) <- alpha[1] + beta[1]*AGE[i] + beta[2]*SEX[i] + beta[3]*INC[i] + beta[4]*POL[i] + beta[5]*AREA[i] + beta[6]*STAT[i]
  logit(p[i,2]) <- alpha[2] + beta[1]*AGE[i] + beta[2]*SEX[i] + beta[3]*INC[i] + beta[4]*POL[i] + beta[5]*AREA[i] + beta[6]*STAT[i]
  logit(p[i,3]) <- alpha[3] + beta[1]*AGE[i] + beta[2]*SEX[i] + beta[3]*INC[i] + beta[4]*POL[i] + beta[5]*AREA[i] + beta[6]*STAT[i]
 
  # -- Models for the predictors with missing values --
  AGE[i] ~ dcat(p_age)
  SEX[i] ~ dbern(p_sex)
  INC[i] ~ dcat(p_inc)
  POL[i] ~ dcat(p_pol)
  STAT[i]~ dbern(p_stat)
 }
  # Priors
  for (j in 1:6) {
  beta[j] ~ dnorm(0, 0.001) # Normal prior for beta
 }
  for (k in 1:3) {
  alpha[k] ~ dnorm(0, 0.001) # Normal prior for alpha
  }
  # -- Priors for dependent variables --
  p_age ~ ddirch(c(0.3444976,0.3157895,0.3349282))
  p_inc ~ ddirch(c(0.2248804,0.3971292,0.2870813))
  p_pol ~ ddirch(c(0.4019139,0.1913876,0.3732057 ))
  p_sex ~ dbeta(2,1)
  p_stat ~ dbeta(2,1)
}

Feel free to propose any modification in the model that may work for my case of data.

The first 10 lines of my data are the following:

   AGE SEX INC POL AREA FIN STAT
1    3   0   2   2    1   2    1
2    2   0   3   3    1   3    1
3    1   0   2  NA    1   2    1
4    3   1   2   1    1   1    0
5    3   1   3   3    3   2   NA
6    1   0   2   1    3   3   NA
7    3   1   1   3    3   1    1
8    1   0   1   3    2   1    0
9    3   1  NA   3    3   2    0
10   1   0  NA   1    1   2    1

The variable to predict is FIN; the rest are the explanatory variables.


Solution

  • It looks like your likelihood is a kind of mix between the multinomial and ordered logit. In the multinomial likelihood, you would have different coefficients for each variable across the categories of the outcome, but all coefficients for the reference category would be set to zero. For the ordinal logit likelihood, you would, as you have done here, constrain all coefficients to be the same across categories of the outcome, except for the intercepts. However, the intercepts are forced to be ordered, and the probabilities calculated from the ordered cut points are cumulative (rather than being per category). Since I'm not sure which you intend, let's look at both. First, the multinomial likelihood:

      for (i in 1:209) {
        FIN[i] ~ dcat(p[i,1:3])
        log(q[i,1]) <- alpha[1] + beta[1, 1]*AGE[i] + beta[2, 1]*SEX[i] + beta[3, 1]*INC[i] + beta[4, 1]*POL[i] + beta[5, 1]*AREA[i] + beta[6, 1]*STAT[i]
        log(q[i,2]) <- alpha[2] + beta[1, 2]*AGE[i] + beta[2, 2]*SEX[i] + beta[3, 2]*INC[i] + beta[4, 2]*POL[i] + beta[5, 2]*AREA[i] + beta[6, 2]*STAT[i]
        log(q[i,3]) <- alpha[3] + beta[1, 3]*AGE[i] + beta[2, 3]*SEX[i] + beta[3, 3]*INC[i] + beta[4, 3]*POL[i] + beta[5, 3]*AREA[i] + beta[6, 3]*STAT[i]
        for(k in 1:3){
          p[i,k] <- q[i,k]/sum(q[i,])
        }
      }
      for (j in 1:6) {
        beta[j,1] <- 0 # reference-category coefficients fixed at zero
        for(k in 2:3){
          beta[j,k] ~ dnorm(0, 0.001) # Normal prior for beta
        }
      }
      alpha[1] <- 0
      for (k in 2:3) {
        alpha[k] ~ dnorm(0, 0.001) # Normal prior for alpha
      }
    

    Note here that we change logit(p[i,k]) to log(q[i,k]): the probability of observation i being in category k is p[i,k] = exp(eta[i,k]) / (exp(eta[i,1]) + exp(eta[i,2]) + exp(eta[i,3])), where eta[i,k] is the linear predictor for category k. Since q[i,k] = exp(eta[i,k]), this is exactly what the p[i,k] <- q[i,k]/sum(q[i,]) line computes.

    We set the first category coefficients to zero as the reference.
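
    To see why that reference-category constraint is needed, here is a minimal sketch in plain R (with made-up linear-predictor values) showing that the softmax is invariant to shifting every linear predictor by a constant, so without pinning one category's coefficients to zero the model would not be identified:

      softmax <- function(eta) exp(eta) / sum(exp(eta))
      eta <- c(0, 1.2, -0.5)  # hypothetical linear predictors for the 3 categories
      softmax(eta)            # a probability vector summing to 1
      softmax(eta + 10)       # identical probabilities: the shift cancels out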

    The ordinal model likelihood would look like this:

    for (i in 1:209) {
      FIN[i] ~ dcat(p[i,1:3])
      logit(q[i,1]) <- alpha[1] - beta[1]*AGE[i] + beta[2]*SEX[i] + beta[3]*INC[i] + beta[4]*POL[i] + beta[5]*AREA[i] + beta[6]*STAT[i]
      logit(q[i,2]) <- alpha[2] - beta[1]*AGE[i] + beta[2]*SEX[i] + beta[3]*INC[i] + beta[4]*POL[i] + beta[5]*AREA[i] + beta[6]*STAT[i]
      p[i,1] <- q[i,1]
      p[i,2] <- q[i,2]-q[i,1]
      p[i,3] <- 1-q[i,2]
    }
    for (j in 1:6) {
        beta[j] ~ dnorm(0, 0.001) # Normal prior for beta
    }
    for (k in 1:2) {
      a[k] ~ dnorm(0, 0.001) # Normal prior for the unsorted cut points
    }
    alpha <- sort(a)
    

    Here, the alpha parameters are the cut points, with the conceptual boundaries alpha[0] = -Inf and alpha[3] = +Inf. The important point is that the cut points must be increasing, which the sort() call above enforces.
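
    As a quick illustration in plain R (made-up values for the cut points and the linear predictor), here is how the two cumulative probabilities map to three per-category probabilities that sum to one:

      alpha <- c(-1, 1)                    # increasing cut points
      eta   <- 0.3                         # hypothetical linear predictor value
      q <- plogis(alpha - eta)             # cumulative probabilities P(Y <= k)
      p <- c(q[1], q[2] - q[1], 1 - q[2])  # per-category probabilities
      p                                    # all between 0 and 1
      sum(p)                               # exactly 1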

    Let's try to run both models:

    Multinomial Model

    mnl_mod <- "model {
      # Likelihood
      for (i in 1:10) {
        FIN[i] ~ dcat(p[i,1:3])
        log(q[i,1]) <- alpha[1] + beta[1, 1]*AGE[i] + beta[2, 1]*SEX[i] + beta[3, 1]*INC[i] + beta[4, 1]*POL[i] + beta[5, 1]*AREA[i] + beta[6, 1]*STAT[i]
        log(q[i,2]) <- alpha[2] + beta[1, 2]*AGE[i] + beta[2, 2]*SEX[i] + beta[3, 2]*INC[i] + beta[4, 2]*POL[i] + beta[5, 2]*AREA[i] + beta[6, 2]*STAT[i]
        log(q[i,3]) <- alpha[3] + beta[1, 3]*AGE[i] + beta[2, 3]*SEX[i] + beta[3, 3]*INC[i] + beta[4, 3]*POL[i] + beta[5, 3]*AREA[i] + beta[6, 3]*STAT[i]
        for(k in 1:3){
          p[i,k] <- q[i,k]/sum(q[i,])
        }
        # -- Models for the predictors with missing values --
        AGE[i] ~ dcat(p_age)
        SEX[i] ~ dbern(p_sex)
        INC[i] ~ dcat(p_inc)
        POL[i] ~ dcat(p_pol)
        STAT[i]~ dbern(p_stat)
      }
      for (j in 1:6) {
        beta[j,1] <- 0
        for(k in 2:3){
          beta[j,k] ~ dnorm(0, .1) # Normal prior for beta
        }
      }
      alpha[1] <- 0
      for (k in 2:3) {
        alpha[k] ~ dnorm(0, .1) # Normal prior for alpha
      }
      # Priors
      # -- Priors for the predictor models --
      p_age ~ ddirch(c(0.3444976,0.3157895,0.3349282))
      p_inc ~ ddirch(c(0.2248804,0.3971292,0.2870813))
      p_pol ~ ddirch(c(0.4019139,0.1913876,0.3732057 ))
      p_sex ~ dbeta(2,1)
      p_stat ~ dbeta(2,1)
    }"
    
    dat <- read.table(text="AGE SEX INC POL AREA FIN STAT
    1    3   0   2   2    1   2    1
    2    2   0   3   3    1   3    1
    3    1   0   2  NA    1   2    1
    4    3   1   2   1    1   1    0
    5    3   1   3   3    3   2   NA
    6    1   0   2   1    3   3   NA
    7    3   1   1   3    3   1    1
    8    1   0   1   3    2   1    0
    9    3   1  NA   3    3   2    0
    10   1   0  NA   1    1   2    1", header=TRUE)
    
    
    dl <- as.list(dat)
    library(rjags)
    #> Loading required package: coda
    #> Linked to JAGS 4.3.1
    #> Loaded modules: basemod,bugs
    mnl_j <- jags.model(textConnection(mnl_mod), data=dl, n.chains=2)
    #> Compiling model graph
    #>    Resolving undeclared variables
    #>    Allocating nodes
    #> Graph information:
    #>    Observed stochastic nodes: 55
    #>    Unobserved stochastic nodes: 24
    #>    Total graph size: 268
    #> 
    #> Initializing model
    update(mnl_j, 10000)
    mnl_samps <- coda.samples(mnl_j, variable.names=c("alpha", "beta"), n.iter=5000)
    

    Model Summary:

    summary(mnl_samps)
    #> 
    #> Iterations = 11001:16000
    #> Thinning interval = 1 
    #> Number of chains = 2 
    #> Sample size per chain = 5000 
    #> 
    #> 1. Empirical mean and standard deviation for each variable,
    #>    plus standard error of the mean:
    #> 
    #>               Mean    SD Naive SE Time-series SE
    #> alpha[1]   0.00000 0.000  0.00000        0.00000
    #> alpha[2]  -0.98913 2.607  0.02607        0.12603
    #> alpha[3]  -1.44130 2.869  0.02869        0.12171
    #> beta[1,1]  0.00000 0.000  0.00000        0.00000
    #> beta[2,1]  0.00000 0.000  0.00000        0.00000
    #> beta[3,1]  0.00000 0.000  0.00000        0.00000
    #> beta[4,1]  0.00000 0.000  0.00000        0.00000
    #> beta[5,1]  0.00000 0.000  0.00000        0.00000
    #> beta[6,1]  0.00000 0.000  0.00000        0.00000
    #> beta[1,2] -0.64053 1.351  0.01351        0.07672
    #> beta[2,2] -0.45833 2.317  0.02317        0.07423
    #> beta[3,2]  2.15004 1.633  0.01633        0.10340
    #> beta[4,2] -0.52331 1.455  0.01455        0.09729
    #> beta[5,2] -0.17880 1.503  0.01503        0.08903
    #> beta[6,2]  1.86677 2.012  0.02012        0.06045
    #> beta[1,3] -2.50116 1.827  0.01827        0.08436
    #> beta[2,3] -2.66265 2.719  0.02719        0.06562
    #> beta[3,3]  3.98474 2.097  0.02097        0.12879
    #> beta[4,3] -0.86416 1.609  0.01609        0.08633
    #> beta[5,3]  0.07456 1.597  0.01597        0.07076
    #> beta[6,3]  0.45412 2.697  0.02697        0.09375
    #> 
    #> 2. Quantiles for each variable:
    #> 
    #>              2.5%     25%     50%     75%  97.5%
    #> alpha[1]   0.0000  0.0000  0.0000  0.0000 0.0000
    #> alpha[2]  -6.0760 -2.7514 -0.9464  0.7636 4.0737
    #> alpha[3]  -7.2179 -3.3325 -1.3085  0.4784 3.9364
    #> beta[1,1]  0.0000  0.0000  0.0000  0.0000 0.0000
    #> beta[2,1]  0.0000  0.0000  0.0000  0.0000 0.0000
    #> beta[3,1]  0.0000  0.0000  0.0000  0.0000 0.0000
    #> beta[4,1]  0.0000  0.0000  0.0000  0.0000 0.0000
    #> beta[5,1]  0.0000  0.0000  0.0000  0.0000 0.0000
    #> beta[6,1]  0.0000  0.0000  0.0000  0.0000 0.0000
    #> beta[1,2] -3.2657 -1.5539 -0.6686  0.2970 1.9910
    #> beta[2,2] -5.0281 -2.0270 -0.4329  1.1616 4.0349
    #> beta[3,2] -0.9553  1.0735  2.0957  3.1838 5.5768
    #> beta[4,2] -3.5925 -1.4003 -0.4761  0.4134 2.3058
    #> beta[5,2] -3.2009 -1.1905 -0.1978  0.8687 2.7151
    #> beta[6,2] -2.1637  0.5484  1.8749  3.1853 5.8102
    #> beta[1,3] -6.4089 -3.6391 -2.3866 -1.2600 0.8229
    #> beta[2,3] -7.9316 -4.4922 -2.6946 -0.8260 2.7573
    #> beta[3,3]  0.1021  2.5416  3.8960  5.3833 8.2670
    #> beta[4,3] -4.0675 -1.9166 -0.8612  0.2320 2.2527
    #> beta[5,3] -3.0094 -1.0119  0.0389  1.1190 3.3034
    #> beta[6,3] -4.8934 -1.3242  0.4582  2.2557 5.8092
    

    Diagnostics (the reference-category parameters are structural zeros with no variance, so they are dropped before running gelman.diag):

    novar <- which(apply(mnl_samps[[1]], 2, sd) == 0)
    gelman.diag(mnl_samps[,-novar])
    #> Potential scale reduction factors:
    #> 
    #>           Point est. Upper C.I.
    #> alpha[2]        1.00       1.01
    #> alpha[3]        1.00       1.00
    #> beta[1,2]       1.01       1.04
    #> beta[2,2]       1.00       1.02
    #> beta[3,2]       1.00       1.01
    #> beta[4,2]       1.00       1.01
    #> beta[5,2]       1.00       1.01
    #> beta[6,2]       1.00       1.00
    #> beta[1,3]       1.00       1.02
    #> beta[2,3]       1.00       1.00
    #> beta[3,3]       1.00       1.01
    #> beta[4,3]       1.01       1.01
    #> beta[5,3]       1.00       1.01
    #> beta[6,3]       1.00       1.01
    #> 
    #> Multivariate psrf
    #> 
    #> 1.01
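
    A note on the missing data: JAGS treats the NA entries in the modelled variables as stochastic nodes, so if you also want posterior draws of the imputed values, you can monitor those variables directly (a sketch; the observed entries simply come back as constants):

      imp_samps <- coda.samples(mnl_j,
                                variable.names=c("INC", "POL", "STAT"),
                                n.iter=5000)
      summary(imp_samps)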
    

    Ordered Logit Model

    
    ol_mod <- "model {
      # Likelihood
      for (i in 1:10) {
        FIN[i] ~ dcat(p[i,1:3])
        logit(q[i,1]) <- alpha[1] - beta[1]*AGE[i] + beta[2]*SEX[i] + beta[3]*INC[i] + beta[4]*POL[i] + beta[5]*AREA[i] + beta[6]*STAT[i]
        logit(q[i,2]) <- alpha[2] - beta[1]*AGE[i] + beta[2]*SEX[i] + beta[3]*INC[i] + beta[4]*POL[i] + beta[5]*AREA[i] + beta[6]*STAT[i]
        p[i,1] <- q[i,1]
        p[i,2] <- q[i,2]-q[i,1]
        p[i,3] <- 1-q[i,2]
        # -- Models for the predictors with missing values --
        AGE[i] ~ dcat(p_age)
        SEX[i] ~ dbern(p_sex)
        INC[i] ~ dcat(p_inc)
        POL[i] ~ dcat(p_pol)
        STAT[i]~ dbern(p_stat)
      }
      for (j in 1:6) {
          beta[j] ~ dnorm(0, 0.25) # Normal prior for beta
      }
      a[1] ~ dnorm(-1, 0.25) # Normal prior for alpha
      a[2] ~ dnorm(1, 0.25) # Normal prior for alpha
      alpha <- sort(a)
      # Priors
      # -- Priors for the predictor models --
      p_age ~ ddirch(c(0.3444976,0.3157895,0.3349282))
      p_inc ~ ddirch(c(0.2248804,0.3971292,0.2870813))
      p_pol ~ ddirch(c(0.4019139,0.1913876,0.3732057 ))
      p_sex ~ dbeta(2,1)
      p_stat ~ dbeta(2,1)
    }"
    
    ol_j <- jags.model(textConnection(ol_mod), data=dl, n.chains=2, inits=list(a = c(-1,1)))
    #> Compiling model graph
    #>    Resolving undeclared variables
    #>    Allocating nodes
    #> Graph information:
    #>    Observed stochastic nodes: 55
    #>    Unobserved stochastic nodes: 18
    #>    Total graph size: 201
    #> 
    #> Initializing model
    update(ol_j, 10000)
    ol_samps <- coda.samples(ol_j, variable.names=c("alpha", "beta"), n.iter=5000)
    

    Model Summary:

    
    
    summary(ol_samps)
    #> 
    #> Iterations = 11001:16000
    #> Thinning interval = 1 
    #> Number of chains = 2 
    #> Sample size per chain = 5000 
    #> 
    #> 1. Empirical mean and standard deviation for each variable,
    #>    plus standard error of the mean:
    #> 
    #>             Mean     SD Naive SE Time-series SE
    #> alpha[1] -1.0566 1.4416 0.014416        0.04071
    #> alpha[2]  2.9102 1.4574 0.014574        0.04246
    #> beta[1]  -1.1004 0.9484 0.009484        0.04393
    #> beta[2]   1.4218 1.5966 0.015966        0.04189
    #> beta[3]  -2.1456 1.1222 0.011222        0.05998
    #> beta[4]   0.8146 0.8649 0.008649        0.03884
    #> beta[5]  -0.3696 0.8421 0.008421        0.03201
    #> beta[6]  -0.5356 1.3743 0.013743        0.03513
    #> 
    #> 2. Quantiles for each variable:
    #> 
    #>              2.5%     25%     50%      75%   97.5%
    #> alpha[1] -3.91901 -2.0171 -1.0359 -0.06762  1.6986
    #> alpha[2]  0.06765  1.9184  2.9027  3.88113  5.8073
    #> beta[1]  -3.09245 -1.7250 -1.0667 -0.44667  0.6355
    #> beta[2]  -1.69368  0.3684  1.4133  2.49551  4.5652
    #> beta[3]  -4.50615 -2.8821 -2.0831 -1.35712 -0.1230
    #> beta[4]  -0.88252  0.2309  0.8094  1.38204  2.5435
    #> beta[5]  -2.08284 -0.9168 -0.3691  0.20091  1.2354
    #> beta[6]  -3.19611 -1.4704 -0.5316  0.35694  2.1961
    

    Diagnostics:

    gelman.diag(ol_samps)
    #> Potential scale reduction factors:
    #> 
    #>          Point est. Upper C.I.
    #> alpha[1]       1.00       1.01
    #> alpha[2]       1.00       1.01
    #> beta[1]        1.01       1.01
    #> beta[2]        1.00       1.02
    #> beta[3]        1.01       1.01
    #> beta[4]        1.01       1.06
    #> beta[5]        1.01       1.05
    #> beta[6]        1.00       1.00
    #> 
    #> Multivariate psrf
    #> 
    #> 1.02
    

    Created on 2023-04-09 with reprex v2.0.2

    Note that in the above code I increased the precision on the coefficient priors to get something that looked close to convergence; with lots of data, you should be able to loosen those up again in your real model if you like.