I have constructed a model that accounts for a covariate. There are two dependent variables ("A", "B"), two independent variables ("C", "D"), and one continuous covariate ("E"). I ran a MANCOVA as follows:
x<-cbind(A,B) #combining dependent variables
y<-cbind(C,D) #combining independent variables
fit<-manova(x~y+E)
summary(fit, test="Pillai")
This all worked, and I found that there was an effect of the covariate on the dependent variables. I therefore want to adjust for the covariate using estimated marginal means from the emmeans package. However, when I try to run the following code I receive these errors:
library(emmeans)
emmeans(fit,~y+E)
>Error in eval(expr, envir, enclos) : object 'spc.l$Ghopper.Start..g.' not found
>Error in ref_grid(object, ...) : Perhaps a 'data' or 'params' argument is needed
Here is my data:
structure(list(ï..insect = c(105L, 106L, 107L, 108L, 110L, 112L,
113L, 114L, 115L, 116L, 117L, 118L, 119L, 120L, 121L, 122L, 123L,
125L, 126L, 127L, 128L), C = structure(c(1L, 2L, 1L, 2L, 2L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L
), .Label = c("Pair A 7p:35c-35p:7c", "Pair B 7p:35c-28p:14c"
), class = "factor"), D = structure(c(1L, 1L, 2L, 2L, 1L, 2L,
1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L), .Label = c("F",
"M"), class = "factor"), E = c(0.357, 0.259, 0.128, 0.104, 0.248,
0.111, 0.218, 0.213, 0.13, 0.123, 0.335, 0.22, 0.247, 0.295,
0.297, 0.219, 0.132, 0.194, 0.207, 0.266, 0.234), A = c(0.025333333,
0.041666665, 0.043833332, 0.046333331, 0.108499995, 0.051999997,
0.101833329, 0.06083333, 0.059499998, 0.056166664, 0.017833333,
0.053666664, 0.066333331, 0.025499998, 0.073666664, 0.149333324,
0.044666665, 0.047499998, 0.051833331, 0.020499999, 0.062499997
), B = c(0.050666667, 0.020333321, 0.023166668, 0.029666645,
0.032499992, 0.028999981, 0.029166671, 0.024166656, 0.025500002,
0.020833325, 0.021166667, 0.038333304, 0.023666669, 0.022499981,
0.040333336, 0.121666569, 0.023333335, 0.017500002, 0.01816666,
0.018500001, 0.024499989)), .Names = c("ï..insect", "C", "D",
"E", "A", "B"), class = "data.frame", row.names = c(NA, -21L))
I am sure there is a simple fix to this problem, but I am a bit lost, and there are few questions posted on Stack Exchange about emmeans!
Here is something that works, computationally:
R> fit = manova(x ~ cbind(C, D) + E, data = dat)
R> ref_grid(fit)
'emmGrid' object with variables:
C = 1.5238
D = 1.2857
E = 0.21605
rep.meas = multivariate response levels: A, B
R> emmeans(fit, ~ C + D + E)
C D E emmean SE df lower.CL upper.CL
1.52381 1.285714 0.2160476 0.04438095 0.005377854 17 0.03303467 0.05572723
Results are averaged over the levels of: rep.meas
Confidence level used: 0.95
I'll say more later about these results. So, replacing y with cbind(C, D) in the model call will make it work (computationally). Using y directly, I get an error message:
R> fit0 = manova(x ~ y+E, data = dat)
R> ref_grid(fit0)
Error in model.matrix(trms, m, contrasts.arg = object$contrasts)[, nm, :
subscript out of bounds
This is not the same error message shown in the OP, and I can only guess that the scoping is different somehow. But what is happening here is that C and D are actually factors, not numeric predictors; cbind(C, D) converts them to a numeric matrix with two columns. There is some technical reason for the error that I need to investigate and correct.
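To see that coercion in isolation, here is a tiny illustration with two made-up factors (not the OP's data):
f1 <- factor(c("Pair A", "Pair B", "Pair A"))   # a two-level factor
f2 <- factor(c("F", "M", "F"))                  # another two-level factor
cbind(f1, f2)   # a numeric matrix of the underlying integer codes, not the labels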
But the important thing is that neither fit nor fit0 is the model you want to use for post-hoc comparisons, because after all, C and D are factors. The reference grid, and hence the EMMs, for fit are based on the average values of the numeric recodings of C and D, and the average E. That's why there is only one row of emmeans output.
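You can verify this from the data itself. Assuming the data frame is named dat, as in the model calls above, the reference-grid values shown earlier are just these means:
mean(as.numeric(dat$C))   # 1.5238 -- average of the 1/2 coding of C
mean(as.numeric(dat$D))   # 1.2857 -- average of the 1/2 coding of D
mean(dat$E)               # 0.21605 -- average value of the covariate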
What I think is needed is to take the cbind out of the call:
R> fit1 = manova(x ~ C + D + E, data = dat)
R> summary(fit1)
Df Pillai approx F num Df den Df Pr(>F)
C 1 0.07083 0.6098 2 16 0.55559
D 1 0.03150 0.2602 2 16 0.77408
E 1 0.37794 4.8606 2 16 0.02242
Residuals 17
R> ref_grid(fit1)
'emmGrid' object with variables:
C = Pair A 7p:35c-35p:7c, Pair B 7p:35c-28p:14c
D = F, M
E = 0.21605
rep.meas = multivariate response levels: A, B
Since E is numeric and reduced to only its mean, you need not include it in the emmeans call:
R> emmeans(fit1, ~ C + D)
C D emmean SE df lower.CL upper.CL
Pair A 7p:35c-35p:7c F 0.05021337 0.011726789 17 0.02547201 0.07495473
Pair B 7p:35c-28p:14c F 0.05576090 0.008808415 17 0.03717677 0.07434503
Pair A 7p:35c-35p:7c M 0.01962942 0.016299910 17 -0.01476039 0.05401922
Pair B 7p:35c-28p:14c M 0.02517695 0.019816637 17 -0.01663250 0.06698640
Results are averaged over the levels of: rep.meas
Confidence level used: 0.95
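If instead you want the adjusted means at a particular value of E rather than at its mean, the at argument can override the covariate setting; this is just a sketch with an arbitrary value of 0.3, not something I ran on these data:
emmeans(fit1, ~ C + D, at = list(E = 0.3))   # EMMs with E fixed at 0.3 instead of its mean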
The EMM results above are averaged over the two repeated measures in the dependent variable. You might want to separate them out (a sketch of that follows the next output), or average over the levels of some other factor. Since neither C nor D is significant, I'll just get the EMMs and comparison thereof for rep.meas:
R> emmeans(fit1, pairwise ~ rep.meas)
$emmeans
rep.meas emmean SE df lower.CL upper.CL
A 0.04551606 0.008547982 17 0.02748139 0.06355073
B 0.02987426 0.006885079 17 0.01534801 0.04440050
Results are averaged over the levels of: C, D
Confidence level used: 0.95
$contrasts
contrast estimate SE df t.ratio p.value
A - B 0.01564181 0.005645167 17 2.771 0.0131
Results are averaged over the levels of: C, D
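If you do want the factor EMMs shown separately for each response rather than averaged over rep.meas, a conditional specification should work; this is a sketch I have not run on these data:
emmeans(fit1, ~ C + D | rep.meas)   # separate EMM tables for responses A and B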
Since E is significant and it is a covariate, it might be interesting to see if there is a different slope for each level of rep.meas:
R> emtrends(fit1, pairwise ~ rep.meas, var = "E")
$emtrends
rep.meas E.trend SE df lower.CL upper.CL
A -0.3472133 0.1737591 17 -0.7138129 0.01938632
B 0.0213882 0.1399564 17 -0.2738940 0.31667042
Results are averaged over the levels of: C, D
Confidence level used: 0.95
$contrasts
contrast estimate SE df t.ratio p.value
A - B -0.3686015 0.1147521 17 -3.212 0.0051
Results are averaged over the levels of: C, D
Also, try this to get a nice graph of the predictions (result not shown):
emmip(fit1, rep.meas ~ E|C*D, at=list(E = c(.1,.35)))
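If you also want confidence bands on that plot, emmip accepts a CIs argument; again just a sketch, not run here:
emmip(fit1, rep.meas ~ E|C*D, at = list(E = c(.1,.35)), CIs = TRUE)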