In the spirit of purr, broom, modelr, I am trying to create a "meta" data.frame in which each row denotes the dataset (d) and the model parameters (yvar, xvars, FEvars). For instance:
iris2 <- iris %>% mutate(Sepal.Length=Sepal.Length^2)
meta <- data.frame(n=1:4,
yvar = c('Sepal.Length','Sepal.Length','Sepal.Length','Sepal.Length'),
xvars= I(list(c('Sepal.Width'),
c('Sepal.Width','Petal.Length'),
c('Sepal.Width'),
c('Sepal.Width','Petal.Length'))),
data= I(list(iris,iris,iris2,iris2)) )
Now, I would like to run a model for each column of "meta". And then add a list column "model" with the model output object. To run the model I use an auxiliary function that uses a dataset, a y variable and a vector of x variables:
OLS_help <- function(d,y,xvars){
paste(y, paste(xvars, collapse=" + "), sep=" ~ ") %>% as.formula %>%
lm(d)
}
y <- 'Sepal.Length'
xvars <- c('Sepal.Width','Petal.Length')
OLS_help(iris,y,xvars)
How can I execute OLS_help
for all the rows of meta and adding the output of OLS_help as a list column in meta
? I tryed the following code, but it did not work:
meta %>% mutate(model = map2(d,yvar,xvars,OLS_help) )
Error: Can't convert a `AsIs` object to function
Call `rlang::last_error()` to see a backtrace
OBS: The solution to when only the "data" (nested) list column (corvered in Hadley's book here) is:
by_country <- gapminder %>% group_by(country, continent) %>% nest()
country_model <- function(df) { lm(lifeExp ~ year, data = df) }
by_country <- by_country %>% mutate(model = map(data, country_model))
We can use pmap
in the following way
df <- meta %>%
as_tibble() %>%
mutate_if(is.factor, as.character) %>%
mutate(fit = pmap(
list(yvar, xvars, data),
function(y, x, df) lm(reformulate(x, response = y), data = df)))
## A tibble: 4 x 5
# n yvar xvars data fit
# <int> <chr> <I<list>> <I<list>> <list>
#1 1 Sepal.Length <chr [1]> <df[,5] [150 × 5]> <lm>
#2 2 Sepal.Length <chr [2]> <df[,5] [150 × 5]> <lm>
#3 3 Sepal.Length <chr [1]> <df[,5] [150 × 5]> <lm>
#4 4 Sepal.Length <chr [2]> <df[,5] [150 × 5]> <lm>
Explanation: pmap
iterates over multiple arguments simultaneously (similar to base R's Map
); here we simultaneously loop throw entries in column yvar
, xvar
and data
, then use reformulate
to construct the formula to be used within lm
. We store the lm
fit object in column fit
.