I have two types of models (GAM and beta regression) that I am running for my data and I am using a lapply function to iterate over multiple variables. I am encountering an error when trying to export result summaries into a csv. Reproducible example below
library(mgcv)
library(betareg)
data(mtcars)
vars <- names(mtcars[c(2:11)])
vars
models = lapply(vars, function(x) {
gam(substitute(i ~ s(mpg) , list(i = as.name(x))),
method = 'REML',
data = mtcars)
})
y=lapply(models, summary)
names(y) <- vars # To add names of the model instead of list number
df=purrr::map_df(y, broom::tidy, .id = 'vars')
write.csv(df, "results/summary.csv")
The map_df and tidy method work for standard linear models.
df=purrr::map_df(y, broom::tidy, .id = 'vars')
but not for summary.betareg or summary.gam list
I have also tried df=as.data.frame(do.call(rbind,y))
which is closer but still not quite right.
Is there a way to extract these summaries easily?
The issue is that broom
does not provide a tidy
method for objects of class summary.gam
whereas for linear models aka models of class lm
we have both a tidy.lm
and a tidy.summary.lm
method. While the implementation of the methods differ slightly, tidy.lm
calls summary
and you should get the same df from both methods.
And the same holds for tidy.gam
. As with tidy.lm
it calls summary
under the hood. Hence, there is no need to use summary
and you could achieve your desired result by calling tidy
on the "raw" model object, i.e. use y <- models
:
library(mgcv)
#> Loading required package: nlme
#> This is mgcv 1.9-1. For overview type 'help("mgcv-package")'.
vars <- names(mtcars[c(2:11)])
models <- lapply(vars, function(x) {
gam(substitute(i ~ s(mpg), list(i = as.name(x))),
method = "REML",
data = mtcars
)
})
y <- models
names(y) <- vars
df <- purrr::map_df(y, broom::tidy, .id = "vars")
df
#> # A tibble: 10 × 6
#> vars term edf ref.df statistic p.value
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 cyl s(mpg) 4.57 5.56 34.7 0
#> 2 disp s(mpg) 2.54 3.17 34.4 0
#> 3 hp s(mpg) 3.05 3.80 16.9 0.00000130
#> 4 drat s(mpg) 1.00 1.00 26.0 0.0000180
#> 5 wt s(mpg) 2.32 2.91 38.1 0
#> 6 qsec s(mpg) 1.00 1.00 6.37 0.0171
#> 7 vs s(mpg) 1.00 1.00 23.7 0.0000345
#> 8 am s(mpg) 1.00 1.00 16.9 0.000285
#> 9 gear s(mpg) 1.00 1.00 8.99 0.00540
#> 10 carb s(mpg) 1.00 1.00 13.1 0.00109
And just as reference, for a simple linear model we get the following outputs when applying tidy
to the raw model object and the output of ´summary()`:
model <- lm(mpg ~ hp, mtcars)
broom::tidy(model)
#> # A tibble: 2 × 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 30.1 1.63 18.4 6.64e-18
#> 2 hp -0.0682 0.0101 -6.74 1.79e- 7
broom::tidy(summary(model))
#> # A tibble: 2 × 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 30.1 1.63 18.4 6.64e-18
#> 2 hp -0.0682 0.0101 -6.74 1.79e- 7