rmetafor

predict.rma() onto complex new data in metafor (polynomials and factor levels)


I have a mixed effects meta-analysis model (rma.mv) containing a continuous polynomial moderator and a categorical moderator. I would like to predict on new data for plotting purposes.

As a reproducible example, we can copy the dataframe available here: http://www.metafor-project.org/doku.php/tips:non_linear_meta_regression

dat <- structure(list(yi = c(0.99, 0.54, -0.01, 1.29, 0.66, -0.12, 1.18,
-0.23, 0.03, 0.73, 1.27, 0.38, -0.19, 0.01, 0.31, 1.41, 1.32, 1.22, 1.24,
-0.05, 1.17, -0.17, 1.29, -0.07, 0.04, 1.03, -0.16, 1.25, 0.27, 0.27, 0.07,
0.02, 0.7, 1.64, 1.66, 1.4, 0.76, 0.8, 1.91, 1.27, 0.62, -0.29, 0.17, 1.05,
-0.34, -0.21, 1.24, 0.2, 0.07, 0.21, 0.95, 1.71, -0.11, 0.17, 0.24, 0.78,
1.04, 0.2, 0.93, 1, 0.77, 0.47, 1.04, 0.22, 1.42, 1.24, 0.15, -0.53, 0.73,
0.98, 1.43, 0.35, 0.64, -0.09, 1.06, 0.36, 0.65, 1.05, 0.97, 1.28), vi =
c(0.018, 0.042, 0.031, 0.022, 0.016, 0.013, 0.066, 0.043, 0.092, 0.009,
0.018, 0.034, 0.005, 0.005, 0.015, 0.155, 0.004, 0.124, 0.048, 0.006, 0.134,
0.022, 0.004, 0.043, 0.071, 0.01, 0.006, 0.128, 0.1, 0.156, 0.058, 0.044,
0.098, 0.154, 0.117, 0.013, 0.055, 0.034, 0.152, 0.022, 0.134, 0.038, 0.119,
0.145, 0.037, 0.123, 0.124, 0.081, 0.005, 0.026, 0.018, 0.039, 0.062, 0.012,
0.132, 0.02, 0.138, 0.065, 0.005, 0.013, 0.101, 0.051, 0.011, 0.018, 0.012,
0.059, 0.111, 0.073, 0.047, 0.01, 0.007, 0.055, 0.019, 0.104, 0.056, 0.006,
0.094, 0.009, 0.008, 0.02 ), xi = c(9.4, 6.3, 1.9, 14.5, 8.4, 1.8, 11.3,
4.8, 0.7, 8.5, 15, 11.5, 4.5, 4.3, 4.3, 14.7, 11.4, 13.4, 11.5, 0.1, 12.3,
1.6, 14.6, 5.4, 2.8, 8.5, 2.9, 10.1, 0.2, 6.1, 4, 5.1, 12.4, 10.1, 13.3,
12.4, 7.6, 12.6, 12, 15.5, 4.9, 0.2, 6.4, 9.4, 1.7, 0.5, 8.4, 0.3, 4.3, 1.7,
15.2, 13.5, 6.4, 3.8, 8.2, 11.3, 11.9, 7.1, 9, 9.9, 7.8, 5.5, 9.9, 2.6,
15.5, 15.3, 0.2, 3.2, 10.1, 15, 10.3, 0, 8.8, 3.6, 15, 6.1, 3.4, 10.2, 10.1,
13.7)), class = "data.frame", row.names = c(NA, -80L))

Then add a factor:

dat$fac<-rep(letters[1:3],length=length(dat$xi))

Then run a model with a cubic polynomial (okay it's not an rma.mv because I couldn't find a convenient mixed effects example but an rma should work the same):

res.cub<-rma(yi,vi,mods=~poly(xi,3,raw=T)*fac,data=dat)

I can predict onto the existing moderator values with no problems, and if I append these to the original dataframe I can plot different lines for different factors:

mypreds<-predict(res.cub,addx=TRUE)
dat$pred<-mypreds$pred    
ggplot(data=dat,aes(x=xi,y=pred,colour=fac,group=fac)) +
 geom_line() +
 geom_point() +
 theme_bw()

enter image description here

But let's say I really want those lines to be super smooth (no straight lines between points). The example in the link above shows how to do that for a single continuous moderator, by creating a vector with lots of values of xi and predicting on to those. However I can't get the prediction grid set up properly to do that for each level of the factor.

I understand factors need to be specified as dummy variables in predict.rma, so I think the below produces the new data matrix I need:

xs<-seq(-1, 16, length=50)
newmods1<-as.matrix(expand.grid(unname(poly(xs, degree=3, raw=TRUE)),unname(rep(c(0,1))),
                                unname(rep(c(0,1))),unname(rep(c(0,1)))))
lotsofpreds<-predict(res.cub, newmods=newmods1)

But I get "Error: Could not find variable 'Var1' in the model."

I guess my problem is to do with the names of the variables in the new matrix, but I can't find an example of how to deal with this as most other examples don't seem to use named variables in the newmods argument (e.g. the examples in the predict.rma helpfile, and examples in the link above).

Any advice on how to get this working would be much appreciated, thanks in advance!


Solution

  • Here's a more generalisable solution using model.matrix(), in case anyone else is doing weird and wonderful predictions from rma models:

    #model (from original post):
    res.cub<-rma(yi,vi,mods=~poly(xi,3,raw=T)*fac,data=dat)
    
    #copy and save model formula:
    newform <- (~poly(xi,3,raw=T)*fac) 
    
    #create new x values wanted:
    n  <- 50
    xs <- seq(-1, 16, length=n) 
    
    #generate x values for all levels of factors using same term names in same order as in model formula
    newgrid <- data.frame(expand.grid(xi=xs,fac=levels(as.factor(dat$fac)))) 
    
    #create the new model matrix and remove the intercept
    predgrid<-model.matrix(newform,data=newgrid)[,-1] 
    
    #predict onto the new model matrix
    mypreds <- predict(res.cub, newmods=predgrid) 
    
    #attach predictions to variables for plotting
    newgrid$pred<-mypreds$pred 
    
    #plot
    ggplot(data=newgrid, aes(x=xi, y=pred, colour=fac, group=fac)) + 
      geom_line() +
      geom_point() +
      theme_bw()
    

    As a bonus, if one is doing adjusted estimates or marginal means (http://www.metafor-project.org/doku.php/tips:computing_adjusted_effects), and wants to average over levels of certain variables, then the following code snippet can be inserted just before predicting onto the new matrix:

    #turn matrix temporarily into a dataframe
    predgrid2<-as.data.frame(predgrid)
    
    #run a for loop to replace all columns containing 'fac' with the column mean
    for (colname in colnames(predgrid2)) {
      if (grepl('fac', colname)){
        predgrid2[[colname]] <- colMeans(predgrid2[colname])
      }
    }
    
    #remove duplicates (combinations of x with different fac levels no longer exist)
    predgrid2<-predgrid2[!duplicated(predgrid2),]
    
    #turn back into a matrix so predict.rma can read it
    predgrid<-as.matrix(predgrid2)