[SOLVED] Monte Carlo and pooling a list of models fitted to multiply imputed data

I created a lavaan model from an existing dataframe and used that model to generate a list of dataframes (with some data missing from each dataframe). Then I conducted multiple imputation to create a list of mids objects (to address the missing data), and fit the original lavaan model to each of those mids objects using semTools::runMI. My final product is a list of lavaan.mi objects. (I can easily get the summary for each lavaan.mi object individually, or I can make another list of lavaan.parameterEstimates/lavaan.data.frame objects.)

Now that I have these lavaan.mi objects in a list called final_imp, is there any way to pool them into a single lavaan.parameterEstimates object? Or is that impossible, since each lavaan.mi object technically came from its own unique dataframe (generated from the original model)?

set.seed(123)
suppressMessages(library(mice))
suppressMessages(library(lavaan))
suppressMessages(library(simsem))
suppressMessages(library(semTools))
suppressMessages(library(tidyverse))

data(mtcars)
model <- 'gear ~ carb'
fit <- sem(model, data = mtcars)
make_missing <- miss(package = "mice", m = 2, maxit = 2, seed = 123)

biglist <- sim(
  nRep = 10,
  model = fit,
  n = 5,
  rawData = mtcars,
  miss = make_missing,
  lavaanfun = "sem",
  modelBoot = TRUE,
  seed = 123,
  dataOnly = TRUE)
#> Progress: 1 / 11 
#> Progress: 2 / 11 
#> Progress: 3 / 11 
#> Progress: 4 / 11 
#> Progress: 5 / 11 
#> Progress: 6 / 11 
#> Progress: 7 / 11 
#> Progress: 8 / 11 
#> Progress: 9 / 11 
#> Progress: 10 / 11 
#> Progress: 11 / 11

run_mi <- function(x) { 
  mice::mice(x, m = 2, maxit = 2, seed = 123, printFlag = FALSE)
}

df_imp <- purrr::map(biglist, run_mi)
#> Warning: Number of logged events: 4
#> Warning: Number of logged events: 1

#> Warning: Number of logged events: 1

#> Warning: Number of logged events: 1
#> Warning: Number of logged events: 3
#> Warning: Number of logged events: 1

#> Warning: Number of logged events: 1
#> Warning: Number of logged events: 2
class(df_imp)
#> [1] "list"
class(df_imp[[1]])
#> [1] "mids"

run_sem <- function(x) {
  runMI(model = model, data = x, fun = "sem", miPackage = "mice", seed = 123)
}

final_imp <- purrr::map(df_imp, run_sem)
class(final_imp)
#> [1] "list"
length(final_imp)
#> [1] 10
class(final_imp[[1]])
#> [1] "lavaan.mi"
#> attr(,"package")
#> [1] "semTools"

Solution

You can use the summary() method on a lavaan.mi object (returned by the runMI() function). That is the semTools analog of lavaan::parameterEstimates(), and it returns results pooled across that object's imputations. See the class?lavaan.mi help page for details about the arguments and other applicable methods.
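
As a minimal sketch of that suggestion, assuming final_imp is the list of lavaan.mi objects built in the question and that the installed semTools version still provides the lavaan.mi class, summary() can be mapped over the list. The name pooled_list is only illustrative. Each element is pooled across the imputations within one simulated data set, not across the simulated data sets themselves.

```r
library(semTools)  # provides runMI() and the lavaan.mi class used in the question
library(purrr)

# Assumption: `final_imp` is the list of lavaan.mi objects from the question.
# summary() pools across the imputations stored inside each lavaan.mi object.
# `pooled_list` is a hypothetical name chosen for this sketch.
pooled_list <- purrr::map(final_imp, summary)

# Pooled estimates for the first simulated data set
pooled_list[[1]]
```

Depending on the semTools version, summary() may also print each table as a side effect while the list is being built.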