bayesian stan rstan

Specifying an integer latent variable in Stan


I'm learning Bayesian data analysis and am trying to replicate Trond Reitan's tutorials in Stan. The tutorials were originally written for WinBUGS.

Specifically, I have the following data and model:

weta.windata<-list(numdet=c(0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 1, 2, 0, 3, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 1, 0, 3, 1, 1, 3, 1, 1, 2, 0, 2, 1, 1, 1, 1,0, 0, 0, 2, 0, 2, 4, 3, 1, 0, 0, 2, 0, 2, 2, 1, 0, 0, 1),
           numvisit=c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 3, 3, 4, 4, 4, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,4, 4, 4, 4, 4, 4, 4 ,4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
           nsites=72)

model_string1="
data{
int nsites;
real<lower=0> numdet[nsites];
real<lower=0> numvisit[nsites];
}
parameters{
real<lower=0> p;
real<lower=0> psi;
int<lower=0> z[nsites];
}
model{
p~uniform(0,1);
psi~uniform(0,1);

for(i in 1:nsites){
z[i]~ bernoulli(psi);
p.site[i]~z[i]*p;
numdet[i]~binomial(numvisit[i],p.site[i]);
}
}
"

mcmc_samples <- stan(model_code=model_string1, 
                      data=weta.windata, 
                      pars=c("p","psi","z"), 
                      chains=3, iter=30000, warmup=10000)

The context is detecting weta in fields. There are 72 sites. For each site, researchers visited several times (numvisit) and recorded the number of visits on which weta were found (numdet).

z is a latent variable indicating whether a site has weta (1) or not (0). psi is the probability that a site has weta. p is the probability of detecting weta on a visit to a site that has them.

The problem is that I cannot declare z as an integer:

parameters or transformed parameters cannot be integer or integer array;  found declared type int, parameter name=z
Problem with declaration.

However, if I declare z as real, that is,

real<lower=0> z[nsites];

the problem becomes that a real variable cannot be given a Bernoulli distribution:

No matches for: 

  real ~ bernoulli(real)

I'm very new to Stan. Forgive me if this question is very silly.


Solution

  • Stan doesn't support integer parameters or hacks to let you pretend real variables are integers. What it does support is marginalizing the integer variables out of the density. You can then reconstruct them with much more efficiency and much higher tail resolution.

    The chapter in the manual on latent discrete parameters is the place to start. It includes an implementation of the Cormack-Jolly-Seber (CJS) population models, which may be familiar. I implemented the Dorazio and Royle occupancy models as a case study, and Hiroki Ito translated the entire Kéry and Schaub book to Stan. They're all linked under users >> documentation on the web site.
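
To make the marginalization concrete for the model in the question, here is a minimal sketch. It is not taken from the manual or the case study. For each site, z is summed out of the likelihood: a site with at least one detection must be occupied, so it contributes psi * Binomial(numdet | numvisit, p). A site with no detections is either occupied but missed on every visit, or unoccupied, so it contributes psi * Binomial(0 | numvisit, p) + (1 - psi). The name model_string2 and the shorter iter/warmup settings are my own choices. The sketch also assumes that numdet and numvisit are declared as int (binomial needs integer counts) and that the rstan version in use still accepts the older array syntax used in the question; newer Stan releases write array[nsites] int numdet; instead.

```r
library(rstan)

# Sketch only: z is marginalized out of the likelihood instead of being sampled.
model_string2 <- "
data {
  int<lower=1> nsites;
  int<lower=0> numdet[nsites];    // counts must be int for binomial
  int<lower=0> numvisit[nsites];
}
parameters {
  real<lower=0, upper=1> p;       // detection probability
  real<lower=0, upper=1> psi;     // occupancy probability
}
model {
  // The (0, 1) bounds already imply uniform priors on p and psi.
  for (i in 1:nsites) {
    if (numdet[i] > 0) {
      // At least one detection: the site must be occupied (z = 1).
      target += log(psi) + binomial_lpmf(numdet[i] | numvisit[i], p);
    } else {
      // No detections: occupied but missed on every visit, or unoccupied.
      target += log_sum_exp(log(psi) + binomial_lpmf(0 | numvisit[i], p),
                            log1m(psi));
    }
  }
}
generated quantities {
  // Reconstruct z by drawing from its conditional posterior at each draw.
  int<lower=0, upper=1> z[nsites];
  for (i in 1:nsites) {
    if (numdet[i] > 0) {
      z[i] = 1;
    } else {
      real lp_occ = log(psi) + binomial_lpmf(0 | numvisit[i], p);
      real lp_unocc = log1m(psi);
      z[i] = bernoulli_rng(exp(lp_occ - log_sum_exp(lp_occ, lp_unocc)));
    }
  }
}
"

mcmc_samples <- stan(model_code = model_string2,
                     data = weta.windata,
                     pars = c("p", "psi", "z"),
                     chains = 3, iter = 2000, warmup = 1000)
```

The posterior mean of each z[i] in the output is then an estimate of the probability that site i is occupied. Because p and psi are continuous and z no longer has to be sampled, far fewer iterations than the 30000 in the question are likely to be enough, though that is worth checking against the usual convergence diagnostics.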