How to account for paired observations in statistical tests other than t-test
? Below I discuss two examples in which I try to do so with a mixed-effect approach and fail.
Example 1: how to reproduce t.test(..., paired=T)
in lm()
?
# simulate data
set.seed(66)
x1 <- rnorm(n=100, mean=30, sd=6)
x2 <- rnorm(n=100, mean=60, sd=6)
# arrange the data in a dataset
dd <- data.frame(ID=rep(paste("ID", seq(1,100, by=1), sep="_"),2),
response=c(x1,x2),
group=c(rep("A", 100), rep("B", 100))
)
t.test(x1,x2, paired=F)
summary(lm(response~group, data=dd)) # same outcome
If the observations are paired one can account for it with t.test()
but how to do so in lm()
(if at all possible)? I tried to use a mixed-effect model approach, but:
summary(lmerTest::lmer(response~group + (1+group|ID), data=dd))
Gives an error:
Error: number of observations (=200) <= number of random effects (=200) for term (1 + group | ID);
the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable
While:
summary(lmerTest::lmer(response~group + (1|ID), data=dd))
Runs but the fixed effect parameter estimates and associated Std. Errors and t values are identical to those produced by lm()
.
Example 2: linear regression with paired observations
Let's imagine that the observations in the dataset I created were from subjects measured 30 days apart - namely, each of the 100 subjects were measured on day 0, then again on day 30 - and we wanted to estimate the rate of change over time:
dd$time=c(rep(0,100), rep(30, 100)) # add "time" variable to dd
Data look like this (linear regression in black, paired data linked by red lines):
lm1 <- lm(response~time, data=dd)
lm1
does not account for the paired nature of the observations.
I thought of running a mixed-effect model that allowed for each pair of data to differ in intercept and slope, but R protests again that I am trying to estimate too many parameters:
lmerTest::lmer(response ~ time + (time | ID), data=dd)
# Error: number of observations (=200) <= number of random effects (=200) for term (time | ID);
# the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable
A simpler model that allows data pairs to differ in intercept but not in slope, namely:
lmer(response ~ time + (1 | ID), data=dd)
Complains that:
boundary (singular) fit: see ?isSingular
But runs and produces fixed effect estimates that are identical to those produced by lm()
.
[UPDATE]
@Limey reminded me that a paired t-test is nothing but a t-test assessing whether the pairwise differences between the two groups differ from zero. Such pairwise difference could be used to perform any paired statistical test beside a t test. To verify this I created three different "Response" variables that are a combination of x1
and x2
ordered in different ways (respectively: original random order; x1
in increasing and x2
in decreasing order; both in increasing order).
dd$response2 <- c(sort(x1, decreasing = FALSE), sort(x2, decreasing = T))
dd$response3 <- c(sort(x1, decreasing = FALSE), sort(x2, decreasing = F))
I calculated the corresponding differences:
dd$diff1 <- c((dd$response[1:100]-dd$response[1:100]),
(dd$response[101:200]-dd$response[1:100]))
dd$diff2 <- c((dd$response2[1:100]-dd$response2[1:100]),
(dd$response2[101:200]-dd$response2[1:100]))
dd$diff3 <- c((dd$response3[1:100]-dd$response3[1:100]),
(dd$response3[101:200]-dd$response3[1:100]))
And used them to perform linear models:
lm2.diff1 <- lm(diff1~time, data=dd)
lm2.diff2 <- lm(diff2 ~time, data=dd)
lm2.diff3 <- lm(diff3 ~time, data=dd)
I expected them to differ in their slope estimates, but they were all the same:
summary(lm2.diff1)$coeff[2] # 0.9993754
summary(lm2.diff2)$coeff[2] # 0.9993754
summary(lm2.diff3)$coeff[2] # 0.9993754
Their slope estimate is the same estimated from the corresponding "unpaired" linear models (lm(response~time)
, lm(response2~time)
, and lm(response3~time)
). What am I missing?
Good question! There are a couple of tricky bits here.
t.test()
and by hand:pairedtest1 <- t.test(x1,x2, paired=TRUE)
d <- x1-x2
n <- length(x1)
tstat <- mean(d)/(sd(d)/sqrt(n)) ## -37.58846
pval <- 2*pt(abs(tstat), lower.tail=FALSE, df=n-1) ## 2.065802e-60
all.equal(pairedtest1$p.value,pval) ## TRUE
all.equal(unname(pairedtest1$statistic),tstat) ## TRUE
lmer
to fit it if you want:m0 <- lme4::lmer(response~group+(group|ID), data=dd, REML=TRUE,
control=lmerControl(check.nobs.vs.nRE="ignore", calc.deriv=FALSE))
(Note that we can also see if two models are giving equivalent fits by comparing the log-likelihoods or REML criteria - when we have overfitted models like this one, different models can get equivalent fits with different linear combinations of model parameters ...)
If we run
library(lmerTest)
coef(summary(as(m1,"lmerModLmerTest"),ddf="Kenward-Roger"))["groupB",]
(the default Satterthwaite ddf calculation fails here) we get
Estimate Std. Error df t value Pr(>|t|)
2.998126e+01 7.976192e-01 9.900000e+01 3.758844e+01 2.065922e-60
the t-statistic and p-value very closely match the results above (I could just have said summary()
but wanted to pull out this specific row of the coefficient table).
m1 <- lme4::lmer(response~group+(1|ID), data=dd, REML=TRUE)
As you noted, the fit is singular (if you check, the RE variance will be listed as 0). Here the t-statistic and p-value are a little off (at this moment, I'm not quite sure why the previous model worked!). The reason is that, for this data set, the within-group variance is slightly greater than the between-group variance. In general we expect var(between) = sigma^2_between + sigma_2_within/n
, and this works asymptotically/in expectation, but in small data sets the ordering can be in the direction we observe here, in which case we would need a negative variance in order to perfectly fit the data.
resids <- with(dd,response-ave(response,group, FUN=mean))
wv <- var(resids-ave(resids, dd$ID, FUN=mean)) ## 15.82
bv <- var(tapply(resids, list(dd$ID), FUN=mean)) ## 14.92
(If we fit the same model with lme
it appears OK and gives us a small [but non-zero] estimate for the among-group intercept variance, but if we try intervals(m2)
it complains that the approximate var-cov matrix is not positive definite ...)
library(nlme)
m2 <- lme(response~group,random=list(ID=pdCompSymm(form=~group-1)),
data=dd,method="REML")
all.equal(tstat^2, anova(m2)[["F-value"]][2]) ## TRUE
all.equal(pval, anova(m2)[["p-value"]][2]) ## TRUE
The p-value from anova()
matches our result above, and the F-statistic matches the square of our t-statistic.
glmmTMB
: we have to be a little bit careful because the default cs()
covariance structure in glmmTMB
allows different variances for each term, while corCompSymm
imposes a homogeneous variance:m3 <- glmmTMB::glmmTMB(response~group+cs(group-1|ID),
data=dd, REML=TRUE)
m4 <- update(m3, map=list(theta=factor(c(1,1,2))))
(the map
argument sets the first two elements of the random-effects parameter vector, which correspond to the log-sd of the variation in the first group and the second group, to be identical)
The coefficient table gets the t-statistic (which it calls a z value) correct, but doesn't have a concept of "denominator degrees of freedom", so does a Z-test rather than a t-test ...
coef(summary(m4))$cond["groupB",]
Estimate Std. Error z value Pr(>|z|)
groupB 29.98126 0.7976217 37.58832 0