Is there some way of directly (jointly) modeling the random intercepts estimated with lme4's lmer() or brms? For example, in the code below I fit a hierarchical model, extract the random intercepts, and then model them.
One downside of this two-step approach is that it ignores the error in the estimated intercepts. That is easily addressed with a robust covariance matrix, weighted least squares, etc. (a weighted fit is sketched after the code below), but jointly estimating all of this in a single model would be preferable.
For context: I am interested in this because I am estimating an item response model in which each random intercept is a person's ability at a point in time, and I want to predict those abilities. I'll be doing all of this within a much more complex Bayesian model.
library(lme4)
library(tibble)

set.seed(123)

# Simulate longitudinal data
N <- 100
time <- 2

# Time-varying data
df <- tibble(person = rep(1:N, time),
             x = rnorm(N * time),
             y = 2 + x * runif(N * time))

# Fit hierarchical model
mod <- lmer(y ~ -1 + (1 | person), df)

# Time-invariant data (constant within person)
df_person <- data.frame(ints = data.frame(ranef(mod))$condval,
                        sex = rbinom(N, size = 1, prob = 0.5))

# Model intercepts as function of time-invariant feature
summary(lm(ints ~ 1 + sex, df_person))
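For example, here is a minimal sketch of the weighted-least-squares version of the second stage, weighting each intercept by the inverse of its conditional variance (the condsd column that ranef() reports). This accounts for the intercepts' uncertainty, but it's still not a joint model.

# WLS fix: weight each estimated intercept by the inverse of its
# conditional variance, taken from ranef()
re <- data.frame(ranef(mod))
df_person$w <- 1 / re$condsd^2
summary(lm(ints ~ 1 + sex, df_person, weights = w))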
I don't know about lme4 or brms, but it can be done directly in Stan. Here's a way to replicate your model with everything jointly estimated; see this example (in Python, not R) and this paper for more.
We model the observed values of $y$ with by-person random intercepts $\alpha_j$ (one for each person $j$) and a scale parameter $\sigma$:
$$y_i \sim \text{Normal}(\alpha_{j[i]}, \sigma)$$
We model the by-person random intercepts as a function of the person's sex and the coefficient $\beta$, plus a mean $\mu_\alpha$ and a second scale parameter $\sigma_\alpha$:
$$\alpha_j \sim \text{Normal}(\mu_\alpha + \beta \, \text{sex}_j, \sigma_\alpha)$$
Here's the Stan code for the model described above. It uses a non-centered parameterization: the "raw" values of $\alpha_j$ have a standard normal prior, and we convert them to the "real" values of $\alpha_j$ using $\mu_\alpha$ and $\sigma_\alpha$ (plus the sex effect $\beta \, \text{sex}_j$).
data {
  int<lower=0> N_obs;                          // number of observations
  int<lower=2> N_person;                       // number of persons
  int<lower=1,upper=N_person> person[N_obs];   // person associated with each observation
  vector<lower=0,upper=1>[N_person] sex;       // sex of each person
  vector[N_obs] y;                             // observed outcomes
}
parameters {
  real<lower=0> sigma;                 // observation-level variation
  vector[N_person] alpha_person_raw;   // random intercepts for persons (raw, non-centered)
  real mu_alpha;                       // mean of random intercepts for persons
  real<lower=0> sigma_alpha;           // scale of random intercepts for persons
  real beta;                           // coefficient for sex
}
transformed parameters {
  vector[N_person] alpha_person = (alpha_person_raw * sigma_alpha) +
                                  mu_alpha +
                                  (sex * beta);
}
model {
  sigma ~ exponential(1);
  alpha_person_raw ~ std_normal();
  mu_alpha ~ std_normal();
  sigma_alpha ~ exponential(1);
  beta ~ std_normal();
  y ~ normal(alpha_person[person], sigma);
}
I recreated the dataset using the same rules as in the original example, but in a slightly different format that's easier for Stan to work with. (Save the Stan program above as two_level_model.stan; that's the filename used when fitting below.)
library(tidyverse)

N <- 100   # number of persons, as in the original example
time <- 2  # number of time points per person

person.df = data.frame(person = 1:N, sex = rbinom(N, size = 1, prob = 0.5))

obs.df = data.frame(person = rep(1:N, time),
                    x = rnorm(N * time)) %>%
  mutate(y = 2 + (x * runif(N * time)))
library(rstan)

stan.data = list(
  N_obs = nrow(obs.df),
  N_person = nrow(person.df),
  person = obs.df$person,
  sex = person.df$sex,
  y = obs.df$y
)

stan.model = stan("two_level_model.stan", data = stan.data, chains = 3)
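Before comparing the approaches, rstan's print() method gives a quick summary of the jointly estimated parameters (along with Rhat and effective-sample-size diagnostics):

# Posterior summaries and convergence diagnostics for the key parameters
print(stan.model, pars = c("mu_alpha", "beta", "sigma_alpha", "sigma"))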
For comparison, I re-fit the two-stage model using the new datasets.
first.level.m = lmer(y ~ -1 + (1 | person), obs.df)

second.level.m = lm(intercept ~ 1 + sex,
                    person.df %>%
                      mutate(intercept = ranef(first.level.m)$person[["(Intercept)"]]))
Now, let's compare the estimated parameter values. The two approaches agree that the interval for the effect of sex includes 0 and that the average of the random intercepts is approximately 2. (Naturally, these estimates are influenced by the fact that x isn't yet included as a predictor in the models.) They estimate approximately the same error term at the observation level, but the joint model estimates a substantially smaller error term for the model of the random intercepts. I'm not sure how the model-fitting procedure of lme4 relates to the exponential prior I put on $\sigma_\alpha$ in the Stan model.
library(tidybayes)
bind_rows(
spread_draws(stan.model, mu_alpha, beta, sigma_alpha, sigma) %>%
ungroup() %>%
dplyr::select(.draw, mu_alpha, beta, sigma_alpha, sigma) %>%
pivot_longer(cols = -.draw, names_to = "parameter") %>%
group_by(parameter) %>%
summarise(lower.95 = quantile(value, 0.025),
lower.50 = quantile(value, 0.25),
est = median(value),
upper.50 = quantile(value, 0.75),
upper.95 = quantile(value, 0.975)) %>%
ungroup() %>%
mutate(parameter = case_when(parameter == "mu_alpha" ~ "mu[alpha]",
parameter == "beta" ~ "beta",
parameter == "sigma_alpha" ~ "sigma[alpha]",
parameter == "sigma" ~ "sigma"),
model = "joint"),
summary(second.level.m)$coefficients %>%
data.frame() %>%
rownames_to_column("parameter") %>%
mutate(lower.95 = Estimate + (Std..Error * qt(0.025, second.level.m$df.residual)),
lower.50 = Estimate + (Std..Error * qt(0.25, second.level.m$df.residual)),
est = Estimate,
upper.50 = Estimate + (Std..Error * qt(0.75, second.level.m$df.residual)),
upper.95 = Estimate + (Std..Error * qt(0.975, second.level.m$df.residual)),
parameter = case_when(parameter == "(Intercept)" ~ "mu[alpha]",
parameter == "sex" ~ "beta"),
model = "two-stage") %>%
dplyr::select(model, parameter, est, matches("lower|upper")),
data.frame(est = summary(second.level.m)$sigma,
parameter = "sigma[alpha]",
model = "two-stage"),
data.frame(est = summary(first.level.m)$sigma,
parameter = "sigma",
model = "two-stage")
) %>%
mutate(parameter = fct_relevel(parameter, "beta", "mu[alpha]", "sigma[alpha]",
"sigma")) %>%
ggplot(aes(x = parameter, color = model)) +
geom_linerange(aes(ymin = lower.95, ymax = upper.95), size = 1,
position = position_dodge(width = 0.3)) +
geom_linerange(aes(ymin = lower.50, ymax = upper.50), size = 2,
position = position_dodge(width = 0.3)) +
geom_point(aes(y = est), size = 3, position = position_dodge(width = 0.3)) +
scale_x_discrete(labels = scales::parse_format()) +
labs(x = "Parameter", y = "Estimated parameter value", color = "Model") +
coord_flip() +
theme_bw()