I need to estimate and plot a logistic multilevel model. I've got the binary dependent variable employment status (empl) (0 = unemployed; 1 = employed) and level of internet connectivity (isoc) as (continuous) independent variable and need to include random effects (intercept and slope) alongside the education level (educ) (1 = low-skilled worker; 2 = middle-skilled; 3 = high-skilled). Also I have some control variables I'm not going to mention here. I'm using the glmer function of the lme4 package. Here is a sample data frame and my (simplified) code:
library(lme4)
library(lmerTest)
library(tidyverse)
library(dplyr)
library(sjPlot)
library(moonBook)
library(sjmisc)
library(sjlabelled)
set.seed(1212)
d <- data.frame(empl=c(1,1,1,0,1,0,1,1,0,1,1,1,0,1,0,1,1,1,1,0),
isoc=runif(20, min=-0.2, max=0.2),
educ=sample(1:3, 20, replace=TRUE))
Results:
empl isoc educ
1 1 0.078604108 1
2 1 0.093667591 3
3 1 -0.061523272 2
4 0 0.009468908 3
5 1 -0.169220134 2
6 0 -0.038594789 3
7 1 0.170506490 1
8 1 -0.098487991 1
9 0 0.073339737 1
10 1 0.144211813 3
11 1 -0.133510687 1
12 1 -0.045306606 3
13 0 0.124211903 3
14 1 -0.003908486 3
15 0 -0.080673475 3
16 1 0.061406993 3
17 1 0.015401951 2
18 1 0.073351501 2
19 1 0.075648137 2
20 0 0.041450192 1
Fit:
m <- glmer(empl ~ isoc + (1 + isoc | educ),
data=d,
family=binomial("logit"),
nAGQ = 0)
summary(m)
Now the question: I'm looking for a plot with three graphs, one graph for each educ-level, with the probabilities (values just between 0 and 1). Here is an sample image from the web:
Below is my (simplified) code for the plot. But it only produces crap I cannot interpret.
plot_model(m, type="pred",
terms=c("isoc [all]", "educ"),
show.data=TRUE)
There is one thing I can do to get a "kind-of" right plot but I have to alter the model above in a way I think it's wrong (keyword: multicollinearity). Additionally I don't think the three graphs of this plot are correct either. The modified model looks like this:
m <- glmer(empl ~ isoc + educ (1 + isoc | educ),
data=d,
family=binomial("logit"),
nAGQ = 0)
summary(m)
I appreciate any help! I think my problem resembles this question but unfortunately there has been no answer to this yet and unfortunately I'm not able to comment with my low reputation.
I think you want
plot_model(m, type="pred",
pred.type = "re",
terms = c("isoc[n=100]","educ"), show.data = TRUE)
pred.type = "re"
takes the random effects into account when making predictionsisoc[n=100]
uses 100 distinct values across the range of isoc
- this is better than making predictions only at the observed values of isoc
, which is what [all]
specifiesFor the example you've given the prediction lines are all on top of each other (because the fit is singular/the random-effects variance is effectively zero), but that's presumably because your sample data set is so small.
For what it's worth, although this is a perfectly well-posed programming problem, I would not recommend treating educ
as a random effect:
Feel free to ask more questions about your model setup/definition on CrossValidated