Tags: r, lme4, multilevel-analysis

Multilevel modeling for repeated measures data - time and lagged variables


I'm very new to multilevel modeling and to analyzing repeated measures data. I'm trying to figure out whether my model, fit with the lme4 package, is specified correctly and whether it can answer the question I'm interested in: does ius moderate the relationship between na and worry?

Variables

id - subject id
count - time variable; day of collection
worry - outcome (collected daily, continuous variable)
na - predictor (collected daily, continuous variable)
ius - moderator (collected at baseline, continuous variable)

I also created lag variables for na (lag_na) and worry (lag_worry) so I can control for the previous day's na and worry, though I'm not sure if this was the right thing to do.
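
For reference, one common way to build such per-subject lags is with dplyr (a minimal sketch, not necessarily how the columns above were made; it assumes the rows are ordered by count within each id, and Dataset is the example data shown below):

library(dplyr)

# Group by subject, order by day, then shift each daily series down one
# row so that row t holds the value from day t-1 (each subject's first
# day becomes NA)
Dataset <- Dataset %>%
  group_by(id) %>%
  arrange(count, .by_group = TRUE) %>%
  mutate(lag_na = dplyr::lag(na),
         lag_worry = dplyr::lag(worry)) %>%
  ungroup()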

Here is my code:

library(lme4)
# Here's an example dataset:
Dataset <- structure(list(id = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L), levels = c("1", "2", "3", 
"5"), class = "factor"), count = c(1, 2, 3, 1, 2, 3, 4, 1, 2, 
3, 4, 1, 2, 3, 4, 5, 6), na = c(1, 0, 0, 18, 13, 3, 5, 29, 15, 
19, 21, 3, 5, 2, 2, 18, 19), worry = c(0, 1, 0, 0, 0, 0, 0, 2, 
2, 1, 2, 0, 0, 3, 0, 4, 3), ius = c(35, 35, 35, 65, 65, 65, 65, 
44, 44, 44, 44, 53, 53, 53, 53, 53, 53), lag_na = c(NA, 1, 0, 
NA, 18, 13, 3, NA, 29, 15, 19, NA, 3, 5, 2, 2, 18), lag_worry = c(NA, 
0, 1, NA, 0, 0, 0, NA, 2, 2, 1, NA, 0, 0, 3, 0, 4)), row.names = c(NA, 
-17L), groups = structure(list(id = structure(1:4, levels = c("1", 
"2", "3", "5"), class = "factor"), .rows = structure(list(1:3, 
    4:7, 8:11, 12:17), ptype = integer(0), class = c("vctrs_list_of", 
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

model <- lmer(worry ~ na*ius + lag_na + lag_worry + count + (1 | id), REML=FALSE, data = Dataset)

Solution

  • For a variable to be an "effect moderator" (at least as the term is used in epidemiologic discussion), there would need to be a material change in the predictions from models with and without the interaction term. You have a model with an interaction between ius and na:

    > model <- lmer(worry ~ na*ius + lag_na + lag_worry + count + (1 | id), REML=FALSE, data = Dataset)
    > model
    Linear mixed model fit by maximum likelihood  ['lmerMod']
    Formula: worry ~ na * ius + lag_na + lag_worry + count + (1 | id)
       Data: Dataset
         AIC      BIC   logLik deviance df.resid 
     49.0113  54.0958 -15.5056  31.0113        4 
    Random effects:
     Groups   Name        Std.Dev.
     id       (Intercept) 0.7525  
     Residual             0.6096  
    Number of obs: 13, groups:  id, 4
    Fixed Effects:
    (Intercept)           na          ius       lag_na    lag_worry        count       na:ius  
       2.185346    -0.169745    -0.092891     0.128599    -0.871304     1.004783     0.003021  
    
    # Now remove the interaction term
    > model <- lmer(worry ~ na + ius + lag_na + lag_worry + count + (1 | id), REML=FALSE, data = Dataset)
    > model
    Linear mixed model fit by maximum likelihood  ['lmerMod']
    Formula: worry ~ na + ius + lag_na + lag_worry + count + (1 | id)
       Data: Dataset
         AIC      BIC   logLik deviance df.resid 
     47.4562  51.9758 -15.7281  31.4562        5 
    Random effects:
     Groups   Name        Std.Dev.
     id       (Intercept) 0.7212  
     Residual             0.6325  
    Number of obs: 13, groups:  id, 4
    Fixed Effects:
    (Intercept)           na          ius       lag_na    lag_worry        count  
       1.474439    -0.006056    -0.076298     0.122280    -0.840278     0.951945  
    

    From what I can see, there is almost no change in the measures of global fit (AIC, BIC, or deviance). Do you want to proceed further in determining what the differences in predictions are with such a small dataset? There would be a difference in the predictions between the two models, but there is little evidence that it would be considered "material". The method of examining what the data show versus what the respective models predict is described in this post on the stats.SE forum: https://stats.stackexchange.com/questions/33059/testing-for-moderation-with-continuous-vs-categorical-moderators/33090#33090
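
    One way to formalize that AIC/BIC comparison is a likelihood-ratio test via anova(); a minimal sketch, assuming the two fits above are saved under the illustrative names m_int and m_main (both are already fit with REML = FALSE, i.e. maximum likelihood, which is what the test requires):

    # Hypothetical object names for the two models shown above
    m_int  <- lmer(worry ~ na * ius + lag_na + lag_worry + count + (1 | id),
                   REML = FALSE, data = Dataset)
    m_main <- lmer(worry ~ na + ius + lag_na + lag_worry + count + (1 | id),
                   REML = FALSE, data = Dataset)
    anova(m_main, m_int)  # chi-squared test on 1 df for the na:ius term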

    Plot (scatterplot) worry on the y-axis against na on the x-axis. Then, for the non-interaction model, plot the single prediction line at the mean of ius (a sketch of that follows the output below). You're going to find some difficulty doing this sensibly, because the values of ius are not at all normally distributed. (I discovered this when I went to color the points in a scatterplot.)

    > findInterval(Dataset$ius, c(30,45, 52, 66))
     [1] 1 1 1 3 3 3 3 1 1 1 1 3 3 3 3 3 3
    > table(Dataset$ius)
    
    35 44 53 65 
     3  4  6  4 
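
    The "single line at the mean of ius" could be drawn like this; a minimal sketch, assuming the main-effects fit above is saved as m_main (re.form = NA requests the population-level prediction, with the random effects set to zero):

    # Predict worry across the observed range of na, holding ius and the
    # other covariates at representative (mean) values
    nd <- data.frame(na = seq(min(Dataset$na), max(Dataset$na), length.out = 50),
                     ius = mean(Dataset$ius),
                     lag_na = mean(Dataset$lag_na, na.rm = TRUE),
                     lag_worry = mean(Dataset$lag_worry, na.rm = TRUE),
                     count = mean(Dataset$count))
    nd$pred <- predict(m_main, newdata = nd, re.form = NA)
    plot(worry ~ jitter(na, 3), Dataset)
    lines(pred ~ na, data = nd)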
    

    When you plot the points with the four groups, you find that the ranges of the outcome and the predictor within groups of identical ius values are much smaller than the full-dataset ranges. It really makes little sense to use an interaction model with continuous variables in this setting:

    png(); plot(worry~jitter(na,3), Dataset, col=2+findInterval(Dataset$ius, c(30,36, 52, 56, 66))); dev.off()
    

    [Scatterplot of worry against jittered na, with points colored by ius group]

    So I see two compelling reasons not to use this analysis as evidence for effect moderation. Whether you want to build a categorical prediction model might be determined by how much more data could be gathered. This seems to be a pretty sparse dataset for drawing any conclusions, but there is a suggestion of some sort of grouping effect.
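
    If more data were gathered, a categorical version of the model could look something like the sketch below; the ius_grp bins and cut points are purely illustrative, and the example dataset is far too small to actually estimate this:

    # Bin ius into groups (arbitrary cut points chosen to separate the
    # observed values) and let the na slope differ by group
    Dataset$ius_grp <- cut(Dataset$ius, breaks = c(30, 45, 60, 70))
    model_cat <- lmer(worry ~ na * ius_grp + count + (1 | id),
                      REML = FALSE, data = Dataset)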