I have imputed data saved as a mids object and am trying to adapt my usual workflow around imputed data. However, I cannot figure out how to use sjPlot's tab_corr() and tab_df() and psych's describe() on a mids object.
My goal is to generate a table of descriptive statistics and a correlation matrix without averaging the imputed datasets together. I was able to generate correlations using miceadds::micombine.cor, but the output isn't formatted like a typical correlation matrix. I can also compute means, SDs, etc. of individual variables from the mids object, but I'm looking for something that will generate a table.
library(mice)
library(miceadds)
library(sjPlot)
library(tidyverse)
library(psych)
set.seed(123)
## correlation matrix
data(nhanes)
imp <- mice(nhanes, print = FALSE)
head(micombine.cor(mi.res = imp)) # ugly
#> variable1 variable2 r rse fisher_r fisher_rse fmi
#> 1 age bmi -0.38765907 0.1899398 -0.40904214 0.2234456 0.09322905
#> 2 age hyp 0.51588273 0.1792162 0.57071301 0.2443348 0.25939786
#> 3 age chl 0.37685482 0.2157535 0.39638877 0.2515615 0.30863126
#> 4 bmi hyp -0.01748158 0.2244419 -0.01748336 0.2245067 0.10249784
#> 5 bmi chl 0.29082393 0.2519295 0.29946608 0.2752862 0.44307791
#> 6 hyp chl 0.30271060 0.1984525 0.31250096 0.2185381 0.04935528
#> t p lower95 upper95
#> 1 -1.83061192 0.06715849 -0.68949235 0.0288951
#> 2 2.33578315 0.01950255 0.09156846 0.7816509
#> 3 1.57571320 0.11509191 -0.09636276 0.7111171
#> 4 -0.07787455 0.93792784 -0.42805131 0.3990695
#> 5 1.08783556 0.27666771 -0.23557593 0.6852881
#> 6 1.42996130 0.15272813 -0.11531056 0.6296450
data(iris)
iris %>%
  select(-c(Species)) %>%
  tab_corr() # pretty
              Sepal.Length  Sepal.Width  Petal.Length  Petal.Width
Sepal.Length                -0.118       0.872***      0.818***
Sepal.Width   -0.118                     -0.428***     -0.366***
Petal.Length  0.872***      -0.428***                  0.963***
Petal.Width   0.818***      -0.366***    0.963***
Computed correlation used pearson-method with listwise-deletion.
## descriptive statistics
psych::describe(imp) # error
#> Warning in mean.default(x, na.rm = na.rm): argument is not numeric or logical:
#> returning NA
#> Error in is.data.frame(x): 'list' object cannot be coerced to type 'double'
mean(imp$data$age) # inefficient
#> [1] 1.76
iris %>%
  select(-c(Species)) %>%
  psych::describe() %>%
  select(-(c(vars, n, median, trimmed, mad))) %>%
  tab_df() # pretty
mean sd min max range skew kurtosis se
5.84 0.83 4.30 7.90 3.60 0.31 -0.61 0.07
3.06 0.44 2.00 4.40 2.40 0.31 0.14 0.04
3.76 1.77 1.00 6.90 5.90 -0.27 -1.42 0.14
1.20 0.76 0.10 2.50 2.40 -0.10 -1.36 0.06
Created on 2021-12-11 by the reprex package (v2.0.1)
I have created two functions, mice_df and mice_cor (link to GitHub repo here), that will generate a correlation matrix and a table of descriptive statistics from a mids object using Rubin's Rules.
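If you would rather not install anything, here is a rough sketch of what pooling descriptives involves. It is not the code behind mice_df, only an illustration using mice itself. pool_stat is a hypothetical helper name, and averaging SDs, minima and maxima across imputations is a simplifying assumption (under Rubin's Rules the pooled point estimate is the mean of the per-imputation estimates, but some people prefer to pool variances instead of SDs).

```r
library(mice)

set.seed(123)
data(nhanes)
imp <- mice(nhanes, m = 5, print = FALSE)

# Hypothetical helper (not part of mice or any other package):
# compute `fun` for every variable in each completed dataset,
# then average the per-imputation estimates.
pool_stat <- function(imp, fun) {
  per_imp <- sapply(seq_len(imp$m), function(i) {
    sapply(mice::complete(imp, i), fun)
  })
  rowMeans(per_imp) # one pooled value per variable
}

desc <- data.frame(
  mean = pool_stat(imp, mean),
  sd   = pool_stat(imp, sd),
  min  = pool_stat(imp, min),
  max  = pool_stat(imp, max)
)
round(desc, 2)
```

The result is an ordinary data frame, so it can be passed on to a table formatter such as tab_df(). Note that this averages statistics across the imputed datasets, not the datasets themselves.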
gtsummary will neatly format models based on mids objects.
library(mice)
library(gtsummary)
library(missy)
library(dplyr)
data(nhanes)
imp <- mice(nhanes, m = 3, print = FALSE)
mod <- with(imp, lm(age ~ bmi + chl))
tbl_regression(as.mira(mod)) %>% as_kable()
vs <- c("bmi", "chl", "age", "hyp")
title <- "Table 1: Correlation matrix"
mice_cor(imp = imp,
         vs = vs,
         title = title)
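As an alternative for the correlation table, the long output of miceadds::micombine.cor from the question can be reshaped into a square matrix. This is a minimal sketch that relies only on the variable1, variable2 and r columns shown in the question's output; cor_mat is just an illustrative name.

```r
library(mice)
library(miceadds)

set.seed(123)
data(nhanes)
imp <- mice(nhanes, print = FALSE)

# Pooled correlations in long format: one row per variable pair
res <- micombine.cor(mi.res = imp)

# Start from an identity matrix and fill in the off-diagonal cells
vars <- colnames(nhanes)
cor_mat <- diag(1, length(vars))
dimnames(cor_mat) <- list(vars, vars)

v1 <- as.character(res$variable1)
v2 <- as.character(res$variable2)
cor_mat[cbind(v1, v2)] <- res$r # fill cell (variable1, variable2)
cor_mat[cbind(v2, v1)] <- res$r # mirror it so the matrix is symmetric

round(cor_mat, 3)
```

As far as I know, tab_corr() also accepts a matrix of correlation coefficients, so cor_mat can probably be passed to it directly. In that case any significance stars would not come from the pooled tests, so take the p-values from the p column of res instead.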