Average of multiple ggpredict objects

I have generated 10 model summaries after fitting the same model on 10 different subsets of the dataset which is as follows

library(mice)
data("nhanes")
head(nhanes)

imp <- mice(nhanes, print = FALSE, m = 10, seed = 24415)
df <- complete(imp, "long")
model_fit <- lapply(1:10, function(i) {
  model = lm(bmi ~ age + hyp + chl,
             data = subset(df, `.imp`==i))
})

From this I get different ggpredict objects

ggpredict(model_fit[[1]], c("age", "hyp"))
ggpredict(model_fit[[2]], c("age", "hyp"))
ggpredict(model_fit[[3]], c("age", "hyp"))
ggpredict(model_fit[[4]], c("age", "hyp"))
ggpredict(model_fit[[5]], c("age", "hyp"))
ggpredict(model_fit[[6]], c("age", "hyp"))
ggpredict(model_fit[[7]], c("age", "hyp"))
ggpredict(model_fit[[8]], c("age", "hyp"))
ggpredict(model_fit[[9]], c("age", "hyp"))
ggpredict(model_fit[[10]], c("age", "hyp"))

I am looking for an efficient way to a) Estimate the average of all the ggpredict objects by hp and age

The expected output would look like this.

  age    hp      Predicted                        95% C.I
  --------------------------------------------------------
  1      1       (28.38 + 29.35 + 27.3...)/10      (26.67 +2 6.83 + 25.25...)/10 ; (30.08 + 31.87 + 29.35....)/10 
  2      1       (24.21 + 26.01 + 25.40...)/10     (22.71 + 23.56 + 23.54...)/10 ;  (25.71 + 28.46 + 27.26....)/10
  3      1       (20.05 + 22.67 + 23.51...)/10     (17.13 + 17.89 + 20.08..)/10 ;   (22.96 + 27.44 + 26.94...)/10

  1      2       (31.82 + 29.35 +28.87...)/10    (28.07 + 23.58 + 24.24...)/10 ;  (35.58 + 35.13 + 33.49....)/10 
  2      2       (27.66 + 26.01 +26.97...)/10    (24.88 + 22.02 + 23.43...)/10 ;  (30.43 + 30.00 + 30.52....)/10  
  3      2       (23.49 + 22.67 +25.08...)/10    (20.63 + 18.73 + 21.50...)/10 ;  (26.35 + 26.61 + 28.66....)/10

b) Plot based on the final averaged values using ggplot function.

So far I tried storing the results from each ggpredict function as list object and

`Reduce(`+`, list_ggpred)/length(list_ggpred)`

I got warning,

 " In Ops.factor(left, right) : `+1 not meaningful for factors.

Any suggestions highly appreciated. Thanks.

Solution

Simply use pool_predictions():

library(ggeffects)
# example for multiple imputed datasets
data("nhanes2", package = "mice")
imp <- mice::mice(nhanes2, printFlag = FALSE)
predictions <- lapply(1:5, function(i) {
  m <- lm(bmi ~ age + hyp + chl, data = mice::complete(imp, action = i))
  predict_response(m, "age")
})
pool_predictions(predictions)
#> # Predicted values of bmi
#> 
#> age   | Predicted |       95% CI
#> --------------------------------
#> 20-39 |     30.09 | 28.17, 32.02
#> 40-59 |     24.60 | 21.48, 27.72
#> 60-99 |     21.75 | 18.24, 25.26
#> 
#> Adjusted for:
#> * hyp =     no
#> * chl = 196.32

# and:
# pool_predictions(predictions) |> plot()

^{Created on 2024-04-30 with reprex v2.1.0}

Note that pool_predictions() account for the missing values / multiple imputations and adjusts the standard error, thus you get slightly larger confidence intervals than just taking the mean.