Obtaining mean effect size estimates and confidence intervals when a factor level combination is missing

I fitted a model using the metafor package with a moderator interaction for which there is no data for a combination of factor levels.

library(orchaRd)
data(fish)
warm_dat <- fish
warm_dat$factor2<-as.factor(seq(c("A","B","C"),133) )

mod_fit <- metafor::rma.mv(yi = lnrr, V = lnrr_vi,
random = list(~1 | group_ID, ~1 | es_ID),
mods = ~ trait.type*factor2,
method = "REML", test = "t",
control=list(optimizer="optim", optmethod="Nelder-Mead"), data = warm_dat)

I wanted to use the orchaRd package to obtain mean effect sizes and confidence intervals but I get an error.

results1 <- mod_results(mod_fit, group = "group_ID", mod = "trait.type", by = "Measurement_cat")
results2 <- mod_results(mod_fit, group = "group_ID", mod = "trait.type")

Error in ref_grid(result, ...) : Something went wrong: Non-conformable elements in reference grid.

I had a look at the code used in mod_results and found that it relies on emmeans. I tried to adapt the code by manually adding the missing combination of factor levels to the reference grid but the results I obtained were identical to estimates provided by the summary function.

coef1<-mod_fit$b
coef1[12]<-NA
names(coef1)<-c(row.names(mod_fit$b),"trait.typelife-history:factor2C")
vcov1<-stats::vcov(mod_fit)
vcov1<- rbind(vcov1,rep(NA, times=11))
vcov1<- cbind(vcov1,rep(NA, times=12))
row.names(vcov1)<-c(row.names(vcov(mod_fit)),"trait.typelife-history:factor2C")
colnames(vcov1)<-row.names(vcov1)

 grid <- emmeans::qdrg(formula =  stats::formula(mod_fit), 
       data = warm_dat, coef = coef1, vcov = vcov1, 
      df = mod_fit$k - 1)
    mm <- emmeans::emmeans(grid, specs =~trait.type * factor2, df = as.numeric(mod_fit$ddf[[1]]))
    mm_pi <- pred_interval_esmeans(mod_fit, mm, mod = trait.type)

Any suggestions as to how to obtain estimates and confidnece intervals in this instance would be much appreciated.

Solution

Following my 2nd comment, I think you may kind of have things backwards, in that you are setting some estimates you need to NA rather than creating a longer list of coefficients that includes NAs. Try something like this:

X = model.matrix(stats::formula(mod_fit), data = warm_dat)
b = mod_fit$b
full.coef = X[1, ] * NA   # a bunch of NAs with the right names
full.coef[names(b)] = b   # so now full.coef has NAs only where excluded from b
grid <- emmeans::qdrg(formula =  stats::formula(mod_fit), 
       data = warm_dat, coef = full.coef, vcov = vcov(mod_fit), 
      df = mod_fit$k - 1)
# etc.

Note that the vcov part is supposed to be the incomplete version with rows and columns dropped.

Update

The code in the OP does not quite work right. So for factor2 I used

warm_dat$factor2 = factor(rep(LETTERS[1:3], 137))[1:410]

The fix I proposed almost works. But an additional issue is that the intercept is named intrcpt instead of the conventional (Intercept) returned by model.matrix. (What's so wrong with package developers following established conventions???) Anyway this necessitates an additional line of code to set the intercept.

So below is the result I obtain. Note that the second element is correctly flagged as non-estimable.

data(fish, package="orchaRd")
warm_dat <- fish
warm_dat$factor2 = factor(rep(LETTERS[1:3], 137))[1:410]
mod_fit <- metafor::rma.mv(yi = lnrr, V = lnrr_vi,
                           random = list(~1 | group_ID, ~1 | es_ID),
                           mods = ~ trait.type*factor2,
                           method = "REML", test = "t",
                           control=list(optimizer="optim", optmethod="Nelder-Mead"), data = warm_dat)
## Warning: Redundant predictors dropped from the model.


library(emmeans)

b = mod_fit$b
full.coef = NA * model.matrix(stats::formula(mod_fit), data = warm_dat)[1, ]
full.coef[rownames(b)] = b   # so now full.coef has NAs only where excluded from b
full.coef[1] = b[1]       # needed because of unconventional naming of the intercept
RG <- emmeans::qdrg(formula =  stats::formula(mod_fit), 
                      data = warm_dat, coef = full.coef, vcov = vcov(mod_fit), 
                      df = mod_fit$k - 1)
confint(RG)
##  trait.type   factor2 prediction     SE  df lower.CL upper.CL
##  behaviour    A         0.953537 0.2992 409   0.3654   1.5417
##  life-history A           nonEst     NA  NA       NA       NA
##  morphology   A         0.001582 0.0224 409  -0.0425   0.0456
##  physiology   A         0.038965 0.0454 409  -0.0502   0.1282
##  behaviour    B        -0.028576 0.1827 409  -0.3878   0.3306
##  life-history B         0.444830 0.1296 409   0.1901   0.6995
##  morphology   B         0.004905 0.0226 409  -0.0394   0.0492
##  physiology   B        -0.000301 0.0467 409  -0.0922   0.0916
##  behaviour    C         0.057822 0.1889 409  -0.3135   0.4291
##  life-history C         0.972144 0.1695 409   0.6389   1.3054
##  morphology   C        -0.004610 0.0234 409  -0.0507   0.0415
##  physiology   C         0.004195 0.0404 409  -0.0752   0.0836
## 
## Confidence level used: 0.95

^{Created on 2024-07-26 with reprex v2.1.0}