I'm very new to multilevel modeling and doing data analysis for repeated measures. I am trying to figure out if my model is set up correctly using the nlme
package and if it's set up correctly to answer the question I want to answer. I want to see if ius
moderates the relationship between na
and worry
.
Variables
id - subject id
count - time variable; day of collection
worry - outcome (collected daily, continuous variable)
na - predictor (collected daily, continuous variable)
ius - moderator (collected at baseline, continuous variable)
I also created lag variables for na
(lag_na
) and worry
(lag_worry
) so I can control for the previous days na
and worry
though I'm not sure if this was the right thing to do.
Here is my code:
library(lme4)
# Here's an example dataset:
Dataset <- structure(list(id = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L), levels = c("1", "2", "3",
"5"), class = "factor"), count = c(1, 2, 3, 1, 2, 3, 4, 1, 2,
3, 4, 1, 2, 3, 4, 5, 6), na = c(1, 0, 0, 18, 13, 3, 5, 29, 15,
19, 21, 3, 5, 2, 2, 18, 19), worry = c(0, 1, 0, 0, 0, 0, 0, 2,
2, 1, 2, 0, 0, 3, 0, 4, 3), ius = c(35, 35, 35, 65, 65, 65, 65,
44, 44, 44, 44, 53, 53, 53, 53, 53, 53), lag_na = c(NA, 1, 0,
NA, 18, 13, 3, NA, 29, 15, 19, NA, 3, 5, 2, 2, 18), lag_worry = c(NA,
0, 1, NA, 0, 0, 0, NA, 2, 2, 1, NA, 0, 0, 3, 0, 4)), row.names = c(NA,
-17L), groups = structure(list(id = structure(1:4, levels = c("1",
"2", "3", "5"), class = "factor"), .rows = structure(list(1:3,
4:7, 8:11, 12:17), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
model <- lmer(worry ~ na*ius + lag_na + lag_worry + count + (1 | id), REML=FALSE, data = Dataset)
For a variable to be an "effect moderator" (at least as the term is used in epidemiologic discussion) there would need to be a material change in the predictions from models with and without the interaction term in the model. You have a model with an interaction between ius
and na
> model <- lmer(worry ~ na*ius + lag_na + lag_worry + count + (1 | id), REML=FALSE, data = Dataset)
> model
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: worry ~ na * ius + lag_na + lag_worry + count + (1 | id)
Data: Dataset
AIC BIC logLik deviance df.resid
49.0113 54.0958 -15.5056 31.0113 4
Random effects:
Groups Name Std.Dev.
id (Intercept) 0.7525
Residual 0.6096
Number of obs: 13, groups: id, 4
Fixed Effects:
(Intercept) na ius lag_na lag_worry count na:ius
2.185346 -0.169745 -0.092891 0.128599 -0.871304 1.004783 0.003021
# Now remove the interaction term
> model <- lmer(worry ~ na + ius + lag_na + lag_worry + count + (1 | id), REML=FALSE, data = Dataset)
> model
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: worry ~ na + ius + lag_na + lag_worry + count + (1 | id)
Data: Dataset
AIC BIC logLik deviance df.resid
47.4562 51.9758 -15.7281 31.4562 5
Random effects:
Groups Name Std.Dev.
id (Intercept) 0.7212
Residual 0.6325
Number of obs: 13, groups: id, 4
Fixed Effects:
(Intercept) na ius lag_na lag_worry count
1.474439 -0.006056 -0.076298 0.122280 -0.840278 0.951945
From what I can see there is almost no change in measures of global fit (AIC, BIC or deviance). Do you want to proceed further in determining what the differences in predictions are with such a small dataset? There would be a difference in the predictions between these two models, but there seems to be little evidence that they would be considered "material". The method of examining what the data shows versus the respective models is described in this post to the stats.SE forum: https://stats.stackexchange.com/questions/33059/testing-for-moderation-with-continuous-vs-categorical-moderators/33090#33090
Plot (scatterplot) worry
as the y-axis and na
on the x-axis. Then for the non-interaction model plot the single line at the mean of ius
, You're going to find some difficulty in doing this sensibly because the values of `ius are not at all normally distributed. (I discovered this when I went to color the points in a scatterplot:
findInterval(Dataset$ius, c(30,45, 52, 66))
[1] 1 1 1 3 3 3 3 1 1 1 1 3 3 3 3 3 3
> table(Dataset$ius)
35 44 53 65
3 4 6 4
When you plot the points with the four groups you find that the ranges of the outcome and the predictor within groups of identical ius
measures are much smaller that the full dataset ranges. It really makes little sense to use an interaction model with continuous variables in this setting:
png(); plot(worry~jitter(na,3), Dataset, col=2+findInterval(Dataset$ius, c(30,36, 52, 56, 66))); dev.off()
So I see two compelling reasons not to use this analysis as evidence for effect moderation. Whether you want to built a categorical prediction model might be determined by how much more data could be gathered. Seems to be a pretty sparse dataset for any conclusions, but there is suggestion of some sort of grouping effect.