I've run an lm model and then run predictions() from marginaleffects on the output. I want to join this back to my main df (the one I fed into lm), but I can't see which option is the right one to use in this case.
Does anyone know what I need to do here? The output of predictions() has a rowid column, but joining on an index (which may have changed order) seems like a risky way forward.
For example, take the following code (from the documentation):
library(marginaleffects)  # provides predictions()
library(dplyr)            # provides the %>% pipe
mod <- lm(mpg ~ hp + factor(cyl), data = mtcars)
pred <- predictions(mod)
pred %>% head()
Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
20.0 1.204 16.6 <0.001 17.7 22.4
20.0 1.204 16.6 <0.001 17.7 22.4
26.4 0.962 27.5 <0.001 24.5 28.3
20.0 1.204 16.6 <0.001 17.7 22.4
15.9 0.992 16.0 <0.001 14.0 17.9
20.2 1.219 16.5 <0.001 17.8 22.5
Columns: rowid, estimate, std.error, statistic, p.value, conf.low, conf.high, mpg, hp, cyl
mtcars %>% head()
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
There is one prediction in pred for each row in mtcars, but how do I join these?
Ah, I got it! It seems you just need to pass the original df to the newdata argument of predictions(), e.g.:
mod <- lm(mpg ~ hp + factor(cyl), data = mtcars)
pred <- predictions(mod, newdata = mtcars)
pred %>% head()
Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 % cyl disp hp drat wt qsec vs am gear carb
20.0 1.204 16.6 <0.001 17.7 22.4 6 160 110 3.90 2.62 16.5 0 1 4 4
20.0 1.204 16.6 <0.001 17.7 22.4 6 160 110 3.90 2.88 17.0 0 1 4 4
26.4 0.962 27.5 <0.001 24.5 28.3 4 108 93 3.85 2.32 18.6 1 1 4 1
20.0 1.204 16.6 <0.001 17.7 22.4 6 258 110 3.08 3.21 19.4 1 0 3 1
15.9 0.992 16.0 <0.001 14.0 17.9 8 360 175 3.15 3.44 17.0 0 0 3 2
20.2 1.219 16.5 <0.001 17.8 22.5 6 225 105 2.76 3.46 20.2 1 0 3 1
Columns: rowid, estimate, std.error, statistic, p.value, conf.low, conf.high, mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
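If you want extra reassurance that the rows line up, one option is to carry an explicit key through newdata and compare the estimates against base R's predict(). This is only a rough sketch: the mtcars_id data frame and the car key column are names I made up for illustration, and it assumes predictions() keeps every column of newdata, as the output above suggests.

```r
library(marginaleffects)

# Hypothetical key column "car", built from the row names so that each row
# has a stable identifier that does not depend on row order.
mtcars_id <- mtcars
mtcars_id$car <- rownames(mtcars)

mod <- lm(mpg ~ hp + factor(cyl), data = mtcars_id)
pred <- predictions(mod, newdata = mtcars_id)

# Sanity check 1: the estimates should match base R predict(), row for row.
base_fit <- unname(predict(mod, newdata = mtcars_id))
stopifnot(isTRUE(all.equal(pred$estimate, base_fit)))

# Sanity check 2: the key column should come back in the same order it went in.
stopifnot(identical(as.character(pred$car), mtcars_id$car))
```

If both checks pass, pred already contains the original columns plus the predictions, so no separate join is needed. The car column is there as a named key in case you later need to merge pred with another table.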