I have a dataset with a couple of missing values and need to run propensity score matching using the variable 'y' as the treatment variable and x1, x2 and x3 as adjustment variables. When I run the following code with MatchIt
ModMatch <- matchit(y ~ x1+x2+x3, method = 'nearest', data = data)
I get the error 'Missing values exist in the data'.
I have therefore tried to run a multiple imputation using mice:
ImputedDF <- mice(data)
ModMatch <- matchit(y ~ x1+x2+x3, method = 'nearest', data = ImputedDF)
And I get the error 'cannot coerce an object of class mids to a dataframe'. I probably need a way to extract an imputed data frame. Does anyone know if that is possible?
You should use the MatchThem package, which was specifically designed for performing matching after multiple imputation. The matchthem() function calls matchit() and performs matching within each imputed dataset. You can then check balance in the imputed datasets using the cobalt package, which was designed to be compatible with MatchThem. Afterward, you can use the with() function in MatchThem to estimate the effect. Here's an example of this workflow:
library(mice); library(MatchThem); library(cobalt)
library(survey) #Provides svyglm(), used in the effect estimation step
#Impute the data with 20 imputations (more is better)
imp <- mice(data, m = 20)
#Perform matching within each imputation
ModMatch <- matchthem(y ~ x1 + x2 + x3, method = 'nearest', data = imp)
#Assess balance
bal.tab(ModMatch, un = TRUE)
love.plot(ModMatch)
#Estimate the effect
summary(pool(with(ModMatch, svyglm(outcome ~ y + x1 + x2 + x3))))
I would caution you that you are using advanced statistical techniques that should not be used by someone without advanced training. Using the defaults in mice and MatchThem is rarely a good idea.
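As a rough illustration of what moving away from the defaults can look like, the sketch below states the imputation and matching choices explicitly. The data frame sim is simulated and hypothetical, and the particular values (pmm, m = 5, a caliper of 0.2, 1:1 matching) are placeholders for illustration only, not recommendations; the right settings depend on your data and estimand.

```r
library(mice)
library(MatchThem)

# Simulated stand-in data (hypothetical): binary treatment y, covariates x1-x3
set.seed(42)
n <- 300
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- rbinom(n, 1, plogis(-0.5 + 0.5 * sim$x1 - 0.3 * sim$x2))
sim$x1[sample(n, 30)] <- NA   # introduce missing covariate values

# State the imputation method, number of imputations, and seed explicitly
imp <- mice(sim, m = 5, method = "pmm", seed = 42, printFlag = FALSE)

# State the matching approach and options explicitly; extra arguments
# such as caliper and ratio are passed on to matchit()
matched <- matchthem(y ~ x1 + x2 + x3, datasets = imp,
                     approach = "within", method = "nearest",
                     caliper = 0.2, ratio = 1)
class(matched)
```

Writing the choices out like this makes them visible when you check balance and report the analysis, even if you end up keeping some default values.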
Regarding the error messages you were getting: the output of a call to mice() is not a data frame; it's a mids object. The data argument in matchit() requires a data frame. matchthem() accepts a mids object to perform the matching within each imputed dataset.
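To see the difference between a mids object and a data frame, and to look at one imputed dataset directly, something like the following should work. It is a minimal sketch that uses nhanes, a small example dataset shipped with mice, as a stand-in for your data; the object names are illustrative.

```r
library(mice)

# nhanes is a small example dataset shipped with mice; it contains missing values
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

class(imp)           # "mids": a container holding all imputations
is.data.frame(imp)   # FALSE, which is why matchit() cannot use it directly

# complete() extracts one filled-in dataset as an ordinary data frame
first_completed <- complete(imp, 1)
is.data.frame(first_completed)
head(first_completed)
```

This is useful for inspecting the imputations, but matching on a single completed dataset ignores the variability between imputations, so the matchthem() workflow above is still the one to use for the analysis itself.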