I have a mixed effects meta-analysis model (rma.mv) containing a continuous polynomial moderator and a categorical moderator. I would like to predict on new data for plotting purposes.
As a reproducible example, we can copy the dataframe available here: http://www.metafor-project.org/doku.php/tips:non_linear_meta_regression
dat <- structure(list(yi = c(0.99, 0.54, -0.01, 1.29, 0.66, -0.12, 1.18,
-0.23, 0.03, 0.73, 1.27, 0.38, -0.19, 0.01, 0.31, 1.41, 1.32, 1.22, 1.24,
-0.05, 1.17, -0.17, 1.29, -0.07, 0.04, 1.03, -0.16, 1.25, 0.27, 0.27, 0.07,
0.02, 0.7, 1.64, 1.66, 1.4, 0.76, 0.8, 1.91, 1.27, 0.62, -0.29, 0.17, 1.05,
-0.34, -0.21, 1.24, 0.2, 0.07, 0.21, 0.95, 1.71, -0.11, 0.17, 0.24, 0.78,
1.04, 0.2, 0.93, 1, 0.77, 0.47, 1.04, 0.22, 1.42, 1.24, 0.15, -0.53, 0.73,
0.98, 1.43, 0.35, 0.64, -0.09, 1.06, 0.36, 0.65, 1.05, 0.97, 1.28), vi =
c(0.018, 0.042, 0.031, 0.022, 0.016, 0.013, 0.066, 0.043, 0.092, 0.009,
0.018, 0.034, 0.005, 0.005, 0.015, 0.155, 0.004, 0.124, 0.048, 0.006, 0.134,
0.022, 0.004, 0.043, 0.071, 0.01, 0.006, 0.128, 0.1, 0.156, 0.058, 0.044,
0.098, 0.154, 0.117, 0.013, 0.055, 0.034, 0.152, 0.022, 0.134, 0.038, 0.119,
0.145, 0.037, 0.123, 0.124, 0.081, 0.005, 0.026, 0.018, 0.039, 0.062, 0.012,
0.132, 0.02, 0.138, 0.065, 0.005, 0.013, 0.101, 0.051, 0.011, 0.018, 0.012,
0.059, 0.111, 0.073, 0.047, 0.01, 0.007, 0.055, 0.019, 0.104, 0.056, 0.006,
0.094, 0.009, 0.008, 0.02 ), xi = c(9.4, 6.3, 1.9, 14.5, 8.4, 1.8, 11.3,
4.8, 0.7, 8.5, 15, 11.5, 4.5, 4.3, 4.3, 14.7, 11.4, 13.4, 11.5, 0.1, 12.3,
1.6, 14.6, 5.4, 2.8, 8.5, 2.9, 10.1, 0.2, 6.1, 4, 5.1, 12.4, 10.1, 13.3,
12.4, 7.6, 12.6, 12, 15.5, 4.9, 0.2, 6.4, 9.4, 1.7, 0.5, 8.4, 0.3, 4.3, 1.7,
15.2, 13.5, 6.4, 3.8, 8.2, 11.3, 11.9, 7.1, 9, 9.9, 7.8, 5.5, 9.9, 2.6,
15.5, 15.3, 0.2, 3.2, 10.1, 15, 10.3, 0, 8.8, 3.6, 15, 6.1, 3.4, 10.2, 10.1,
13.7)), class = "data.frame", row.names = c(NA, -80L))
Then add a factor:
dat$fac<-rep(letters[1:3],length=length(dat$xi))
Then run a model with a cubic polynomial (okay it's not an rma.mv because I couldn't find a convenient mixed effects example but an rma should work the same):
res.cub<-rma(yi,vi,mods=~poly(xi,3,raw=T)*fac,data=dat)
I can predict onto the existing moderator values with no problems, and if I append these to the original dataframe I can plot different lines for different factors:
mypreds<-predict(res.cub,addx=TRUE)
dat$pred<-mypreds$pred
ggplot(data=dat,aes(x=xi,y=pred,colour=fac,group=fac)) +
geom_line() +
geom_point() +
theme_bw()
But let's say I really want those lines to be super smooth (no straight lines between points). The example in the link above shows how to do that for a single continuous moderator, by creating a vector with lots of values of xi and predicting on to those. However I can't get the prediction grid set up properly to do that for each level of the factor.
I understand factors need to be specified as dummy variables in predict.rma, so I think the below produces the new data matrix I need:
xs<-seq(-1, 16, length=50)
newmods1<-as.matrix(expand.grid(unname(poly(xs, degree=3, raw=TRUE)),unname(rep(c(0,1))),
unname(rep(c(0,1))),unname(rep(c(0,1)))))
lotsofpreds<-predict(res.cub, newmods=newmods1)
But I get "Error: Could not find variable 'Var1' in the model."
I guess my problem is to do with the names of the variables in the new matrix, but I can't find an example of how to deal with this as most other examples don't seem to use named variables in the newmods argument (e.g. the examples in the predict.rma helpfile, and examples in the link above).
Any advice on how to get this working would be much appreciated, thanks in advance!
Here's a more generalisable solution using model.matrix(), in case anyone else is doing weird and wonderful predictions from rma models:
#model (from original post):
res.cub<-rma(yi,vi,mods=~poly(xi,3,raw=T)*fac,data=dat)
#copy and save model formula:
newform <- (~poly(xi,3,raw=T)*fac)
#create new x values wanted:
n <- 50
xs <- seq(-1, 16, length=n)
#generate x values for all levels of factors using same term names in same order as in model formula
newgrid <- data.frame(expand.grid(xi=xs,fac=levels(as.factor(dat$fac))))
#create the new model matrix and remove the intercept
predgrid<-model.matrix(newform,data=newgrid)[,-1]
#predict onto the new model matrix
mypreds <- predict(res.cub, newmods=predgrid)
#attach predictions to variables for plotting
newgrid$pred<-mypreds$pred
#plot
ggplot(data=newgrid, aes(x=xi, y=pred, colour=fac, group=fac)) +
geom_line() +
geom_point() +
theme_bw()
As a bonus, if one is doing adjusted estimates or marginal means (http://www.metafor-project.org/doku.php/tips:computing_adjusted_effects), and wants to average over levels of certain variables, then the following code snippet can be inserted just before predicting onto the new matrix:
#turn matrix temporarily into a dataframe
predgrid2<-as.data.frame(predgrid)
#run a for loop to replace all columns containing 'fac' with the column mean
for (colname in colnames(predgrid2)) {
if (grepl('fac', colname)){
predgrid2[[colname]] <- colMeans(predgrid2[colname])
}
}
#remove duplicates (combinations of x with different fac levels no longer exist)
predgrid2<-predgrid2[!duplicated(predgrid2),]
#turn back into a matrix so predict.rma can read it
predgrid<-as.matrix(predgrid2)