This is my data frame (please copy and paste to reproduce):
Control <- replicate(2, c("112", "113", "116", "118", "127", "131", "134", "135", "136", "138", "143", "148", "149", "152", "153", "155", "162", "163"))
EPD <- replicate(2, c("101", "102", "103", "104", "105", "106", "107", "108", "109", "110", "114", "115", "117", "119", "120", "122", "124", "125", "126", "128", "130", "133", "137", "139", "140", "141", "142", "144", "145", "147"))
Subject <- c(Control, EPD)
Group <- c(replicate(36, "Control"), replicate(60, "Patient"))
Side <- c(replicate(18, "L"), replicate(18, "R"), replicate(30, "L"), replicate(30, "R"))
Control_Volume_L <- c(99, 119, 119, 146, 127, 96, 100, 132, 103, 103, 107, 142, 140, 134, 117, 117, 133, 143)
Control_Volume_R <- c(93, 123, 114, 152, 122, 105, 98, 138, 111, 110, 115, 137, 142, 140, 124, 102, 153, 143)
EPD_Volume_L <- c(132, 115, 140, 102, 130, 131, 110, 124, 102, 111, 93, 92, 94, 104, 92, 115, 144, 118, 104, 132, 90, 102, 94, 112, 106, 105, 79, 114, 104, 108)
EPD_Volume_R <- c(136, 116, 143, 105, 136, 137, 103, 121, 105, 115, 97, 97, 93, 108, 91, 117, 147, 111, 97, 129, 85, 107, 91, 116, 113, 101, 75, 108, 95, 98)
Volume <- c(Control_Volume_L, Control_Volume_R, EPD_Volume_L, EPD_Volume_R)
Control_FA_L <- c(0.43, 0.39, 0.38, 0.58, 0.37, 0.5, 0.35, 0.36, 0.72, 0.38, 0.45, 0.30, 0.47, 0.30, 0.67, 0.34, 0.42, 0.29)
Control_FA_R <- c(0.36, 0.49, 0.55, 0.59, 0.33, 0.41, 0.32, 0.50, 0.59, 0.52, 0.32, 0.40, 0.49, 0.33, 0.46, 0.39, 0.37, 0.33)
EPD_FA_L <- c(0.25, 0.39, 0.36, 0.42, 0.21, 0.40, 0.43, 0.16, 0.31, 0.41, 0.39, 0.40, 0.35, 0.29, 0.31, 0.24, 0.39, 0.36, 0.54, 0.38, 0.34, 0.28, 0.42, 0.33, 0.40, 0.36, 0.42, 0.28, 0.40, 0.41)
EPD_FA_R <- c(0.26, 0.36, 0.36, 0.61, 0.22, 0.33, 0.36, 0.34, 0.35, 0.37, 0.39, 0.45, 0.30, 0.31, 0.50, 0.31, 0.29, 0.43, 0.41, 0.21, 0.38, 0.28, 0.66, 0.33, 0.50, 0.27, 0.46, 0.37, 0.26, 0.39)
FA <- c(Control_FA_L, Control_FA_R, EPD_FA_L, EPD_FA_R)
data <- data.frame(Subject, Group, Side, Volume, FA)
I then run a linear mixed model for FA values with the nlme package:
library(nlme)
lmm <- lme(FA ~ Group + Side + Volume, ~ 1|Subject, data = data)
summary(lmm)
Because "Side" is not a significant factor it is dropped from the model:
lmm <- lme(FA ~ Group + Volume, ~ 1|Subject, data = data)
summary(lmm)
I would like to run a post-hoc analysis for the "Group" factor (two levels: "Control" and "Patient"). Normally I would run the following code to perform post-hoc analysis on factors with more than two levels using the multcomp package:
library(multcomp)
summary(glht(lmm, linfct=mcp(Group ="Tukey")))
I do not believe Tukey's multiple comparisons post-hoc test would be appropriate in this case since our factor only has two levels. What would be the appropriate post-hoc test in this scenario? I would like to see the model estimated differences between the two levels of the "Group" factor ("Control" and "Patient"). Any feedback would be much appreciated!
A couple of things. First, just because Side
is non-significant isn't necessarily a reason to drop it from the model. If there are theoretical reasons to drop (e.g. if there's no reason why it should relate to the outcome) it or if there are issues with the measurement/data itself, then it's maybe more valid to drop.
Second, since Group
is a binary variable, you shouldn't need to do any post-hoc comparisons. The coefficient in your output for Group
will represent the average difference between control and patient groups while controlling for all other variables in the model. So, in the output from the model with Side
included, it looks like patients score .08 units lower on their FA score, on average, than those in the control group. If this metric is meaningful in itself, then you may just report it this way. If not, you may want to standardize it.
Hope this helps.