rstanarm

Unique intercepts approach for categorical variables in "rstanarm" package in R


Background: McElearth (2016) in his rethinking book pages 158-159, uses an index variable instead of dummy coding for a 3-category variable called "clade" to predict "kcal.per.g" (linear regression).

Question: I was wondering if we could apply the same approach in "rstanarm"? I have provided data and R code for a possible demonstration below.

library("rethinking") # A github package not on CRAN
data(milk)
d <- milk
d$clade_id <- coerce_index(d$clade) # Index variable maker
#[1] 4 4 4 4 4 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 # index variable

# Model Specification:
fit1 <- map(
alist(
kcal.per.g ~ dnorm( mu , sigma ) ,
mu <- a[clade_id] ,
a[clade_id] ~ dnorm( 0.6 , 10 ) ,
sigma ~ dunif( 0 , 10 )
 ) ,
data = d )

Solution

  • The most analogous way to do this using the rstanarm package is with

    library(rstanarm)
    fit1 <- stan_glmer(kcal.per.g ~ 1 + (1 | clade_id), data = milk,
                       prior_intercept = normal(0.6, 1, autoscale = FALSE), 
                       prior_aux = exponential(rate = 1/5, autoscale = FALSE),
                       prior_covariance = decov(shape = 10, scale = 1))
    

    However, this is not exactly the same for the following reasons:

    1. Bounded uniform priors on sigma are not implemented because they are not a good idea, so I have used an exponential distribution with an expectation of 5 instead
    2. Fixing the standard deviation on a is not implemented either so I have used a gamma distribution with an expectation of 10
    3. Hierarchical models in rstanarm (and lme4) are parameterized with deviations from common parameters, so rather than using an expectation of 0.6 for a, I have used an expectation of 0.6 for the global intercept and the prior on a is normal with an expectation of zero. This means you need to call coef(fit1) rather than ranef(fit1) to see the "intercepts" as they are parameterized in the original model.