r  r-marginaleffects

Join results of marginaleffects::predictions() back to main df?


I've run an lm model and then run predictions() from marginaleffects on the output. I want to join this back to my main df (the one I fed into lm), but I can't see which option is the right one to use in this case.

Does anyone know what I need to do here? The output of predictions() has a rowid but joining on an index (which may have changed order) seems like a risky way forward.

For example, take the following code (from the documentation):

library(marginaleffects)
library(dplyr)  # for %>%

mod <- lm(mpg ~ hp + factor(cyl), data = mtcars)
pred <- predictions(mod)

pred %>% head()

 Estimate Std. Error    z Pr(>|z|) 2.5 % 97.5 %
     20.0      1.204 16.6   <0.001  17.7   22.4
     20.0      1.204 16.6   <0.001  17.7   22.4
     26.4      0.962 27.5   <0.001  24.5   28.3
     20.0      1.204 16.6   <0.001  17.7   22.4
     15.9      0.992 16.0   <0.001  14.0   17.9
     20.2      1.219 16.5   <0.001  17.8   22.5

Columns: rowid, estimate, std.error, statistic, p.value, conf.low, conf.high, mpg, hp, cyl 


> mtcars %>% head()
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

There is one prediction in pred for each row in mtcars, but how do I join them?


Solution

  • Ah, I got it! You just need to pass the original df to the newdata argument of predictions(). The output then keeps every column of the original df alongside the predictions, so no separate join is needed. E.g.

    mod <- lm(mpg ~ hp + factor(cyl), data = mtcars)
    pred <- predictions(mod, newdata = mtcars)
    
    pred %>% head()
    
     Estimate Std. Error    z Pr(>|z|) 2.5 % 97.5 % cyl disp  hp drat   wt qsec vs am gear
         20.0      1.204 16.6   <0.001  17.7   22.4   6  160 110 3.90 2.62 16.5  0  1    4
         20.0      1.204 16.6   <0.001  17.7   22.4   6  160 110 3.90 2.88 17.0  0  1    4
         26.4      0.962 27.5   <0.001  24.5   28.3   4  108  93 3.85 2.32 18.6  1  1    4
         20.0      1.204 16.6   <0.001  17.7   22.4   6  258 110 3.08 3.21 19.4  1  0    3
         15.9      0.992 16.0   <0.001  14.0   17.9   8  360 175 3.15 3.44 17.0  0  0    3
         20.2      1.219 16.5   <0.001  17.8   22.5   6  225 105 2.76 3.46 20.2  1  0    3
     carb
        4
        4
        1
        1
        2
        1
    
    Columns: rowid, estimate, std.error, statistic, p.value, conf.low, conf.high, mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
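
If you want extra reassurance that each prediction sits next to the right row, here is a minimal sketch of one way to check it. The copy `df` and the key column `car` are my own additions for illustration (mtcars keeps the car names in its row names, so I copy them into a regular column first); the comparison against base R's predict() is only a sanity check, not something the package requires.

```r
library(marginaleffects)

# Hypothetical setup: copy mtcars and store the row names in an explicit key column
df <- mtcars
df$car <- rownames(mtcars)

mod <- lm(mpg ~ hp + factor(cyl), data = df)

# Passing the full df as newdata carries its columns (including the key) through
pred <- predictions(mod, newdata = df)

# Each prediction now sits next to its key and the original variables
head(as.data.frame(pred)[, c("car", "estimate", "mpg", "hp", "cyl")])

# Sanity check: estimates match base R predictions taken in the same row order
all.equal(pred$estimate, unname(predict(mod, newdata = df)))
```

With an explicit key like this in the output, you can also merge the predictions into any other table on `car` rather than relying on rowid.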