I cannot replicate the output of marginaleffects::avg_predictions by hand when using the by argument.
In the following example, the estimate returned by marginaleffects for the nodegree==0 group equals 8290. How can I obtain it by hand?
library(tidyverse)
mydf <- Rdatasets::rddata("lalonde")
mod <- lm(re78 ~ treat + married + nodegree, data=mydf)
summary(mod)
marginaleffects::avg_predictions(mod, variables="treat", by="nodegree") # 8290
mdf <- mydf
mdf$treat = 0
mdf$nodegree = 0
res1 <- predict(mod, newdata = mdf, type = "response")
mean(res1) # nope
mdf <- mydf %>% filter(nodegree==0)
mdf$treat = 0
res2 <- predict(mod, newdata = mdf, type = "response")
mean(res2) # nope
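Variables passed to the variables argument are treated as counterfactual: the entire data set is replicated once for each of their values (see ?avg_predictions). So, to replicate the result by hand: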
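# Two full copies of the data, one per counterfactual value of treat
mdf <- mdf2 <- mydf
mdf$treat = 0
mdf2$treat = 1
mdf <- rbind(mdf, mdf2)
res1 <- predict(mod, newdata = mdf, type = "response")
# Average the stacked predictions within each observed nodegree group
mean(res1[mdf$nodegree==0]) # 8290
mean(res1[mdf$nodegree==1]) # 6046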
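The same steps can be wrapped in a small function so the stacking and the grouping are explicit. This is only a sketch in base R: avg_pred_by_hand and its arguments are hypothetical names made up for illustration, not part of marginaleffects, and it assumes the model works with predict() and that by names a single column of the data.

```r
# Hypothetical helper (not a marginaleffects function): average counterfactual
# predictions within the observed groups of `by`.
avg_pred_by_hand <- function(model, data, variable, values, by) {
  # Build one full copy of the data per counterfactual value of `variable`
  copies <- lapply(values, function(v) {
    d <- data
    d[[variable]] <- v
    d
  })
  stacked <- do.call(rbind, copies)

  # Predict on the stacked data (for lm, the default is the response scale)
  stacked$estimate <- predict(model, newdata = stacked)

  # Mean prediction within each observed level of `by`
  aggregate(stacked["estimate"], by = stacked[by], FUN = mean)
}
```

With the objects from the question, avg_pred_by_hand(mod, mydf, variable = "treat", values = c(0, 1), by = "nodegree") should reproduce the two group means computed above. Note that nodegree is never overwritten: the grouping uses its observed values, which is why setting it to 0 for every row in the first attempt gives a different number.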