
Why do PSM results differ between Matching and MatchIt in R?


I conducted propensity score matching in R using the packages "Matching" and "MatchIt", but the two gave different numbers of matches.

The dataset is available at http://web.hku.hk/~bcowling/data/propensity.csv or http://web.hku.hk/~bcowling/examples/propensity.htm.

example <- propensity

The code using "Matching" was:

m.ps <- glm(trt ~ age + risk + severity, family="binomial", data=example)

example$ps <- predict(m.ps, type="response")

PS.m <- Match(Y=example$death, Tr=example$trt, X=example$ps, M=1, caliper=0.2, replace=FALSE)

summary(PS.m)

SE.........  0.041299 
T-stat.....  -2.1126 
p.val......  0.034634 

Original number of observations..............  400 
Original number of treated obs...............  192 
Matched number of observations...............  149 
Matched number of observations  (unweighted).  149 

Caliper (SDs)........................................   0.2 
Number of obs dropped by 'exact' or 'caliper'  43  

The number of matches was 149.

The code using "MatchIt" was:

psm <- matchit(trt ~ age + risk + severity, data=example, method="nearest", caliper=0.2)

summary(psm)

Sample Sizes:
         Control Treated
All           208     192
Matched       161     161
Unmatched      47      31
Discarded       0       0

The number of matches was 161, compared with 149 when using Matching. Why do the two differ?


Solution

  • Two reasons: 1) Matching works through the units in the order they appear in the dataset, while MatchIt by default works through them in descending order of the propensity score, and 2) Matching uses a nonzero distance tolerance by default, meaning that any two units with a propensity score difference of .00001 or less are considered exactly matched, whereas MatchIt has no such tolerance.

    To ensure the results are the same between Matching and MatchIt, set m.order = "data" in matchit() and set distance.tolerance = 0 in Match().

    PS.m <- Match(Y = example$death, Tr = example$trt, X = example$ps, M = 1,
                  caliper = 0.2, replace = FALSE, ties = FALSE,
                  distance.tolerance = 0)
    
    psm <- matchit(trt ~ age + risk + severity, data = example,
                   method = "nearest", caliper = 0.2,
                   m.order = "data")
    
    cobalt::bal.tab(psm, weights = PS.m)
    #> Call
    #>  matchit(formula = trt ~ age + risk + severity, data = example, 
    #>     method = "nearest", m.order = "data", caliper = 0.2)
    #> 
    #> Balance Measures
    #>              Type Diff.matchit Diff.Match
    #> distance Distance       0.0043     0.0043
    #> age       Contin.       0.0902     0.0902
    #> risk      Contin.      -0.0348    -0.0348
    #> severity  Contin.      -0.0342    -0.0342
    #> 
    #> Effective sample sizes
    #>         Control Treated
    #> All         208     192
    #> matchit     149     149
    #> Match       149     149
    

    Created on 2022-02-22 by the reprex package (v2.0.1)

    Here I used cobalt::bal.tab() to verify that the matched sample sizes are the same and the balance statistics are identical, indicating that both packages produce the same matched sample.
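
To see why the matching order alone can change the number of matches, here is a minimal base-R sketch of greedy 1:1 nearest-neighbor matching without replacement. It is a toy illustration, not the actual algorithm inside either package: the function greedy_match() and the four propensity scores are made up for this example, and the caliper is taken as an absolute difference in propensity score rather than in standard deviation units.

```r
# Toy greedy 1:1 nearest-neighbor matching without replacement.
# `order` gives the sequence in which treated units pick their match.
# Hypothetical helper for illustration; not part of Matching or MatchIt.
greedy_match <- function(ps_treated, ps_control, caliper, order) {
  available <- rep(TRUE, length(ps_control))
  n_matched <- 0
  for (i in order) {
    d <- abs(ps_treated[i] - ps_control)
    d[!available] <- Inf          # controls already used cannot be reused
    j <- which.min(d)             # closest remaining control
    if (d[j] <= caliper) {        # accept only if within the caliper
      available[j] <- FALSE
      n_matched <- n_matched + 1
    }
  }
  n_matched
}

# Made-up propensity scores
ps_treated <- c(A = 0.50, B = 0.60)
ps_control <- c(c1 = 0.54, c2 = 0.40)

# Dataset order: A is matched first
n_data_order <- greedy_match(ps_treated, ps_control, caliper = 0.11,
                             order = seq_along(ps_treated))

# Descending propensity score: B is matched first
n_desc_order <- greedy_match(ps_treated, ps_control, caliper = 0.11,
                             order = order(ps_treated, decreasing = TRUE))

print(c(data_order = n_data_order, descending = n_desc_order))
```

In dataset order, A takes c1, which leaves B with only c2, and c2 lies outside the caliper, so only one pair is formed. In descending order, B takes c1 first and A can still reach c2 within the caliper, giving two pairs. The same kind of effect, accumulated over many units, is consistent with the 149 versus 161 difference in the question.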