I have a dataset with multiple columns for the outcome variables that I would like to predict with the same preprocessing steps and models. Is there a way to run the same recipe and models (with tuning; I'm using workflow_map()) on multiple outcome variables, fitting a separate model for each outcome?
Essentially, I want to loop through the same preprocessing steps and models for each outcome. I want to avoid having to do this:
model_recipe1 <- recipe(outcome_1 ~ ., data) %>%
  step_1
model_recipe2 <- recipe(outcome_2 ~ ., data) %>%
  step_1
model_recipe3 <- recipe(outcome_3 ~ ., data) %>%
  step_1
and would instead like to do something like this:
model_recipe <- recipe(outcome[i] ~ ., data) %>%
  step_1
I'm not sure if we 100% recommend the approach you are trying, but it will work in some circumstances:
library(tidymodels)
folds <- bootstraps(mtcars, times = 5)
wf_set <- workflow_set(list(mpg ~ ., wt ~ ., disp ~ .), list(linear_reg()))
workflow_map(wf_set, "fit_resamples", resamples = folds)
#> # A workflow set/tibble: 3 × 4
#>   wflow_id             info             option    result
#>   <chr>                <list>           <list>    <list>
#> 1 formula_1_linear_reg <tibble [1 × 4]> <opts[1]> <rsmp[+]>
#> 2 formula_2_linear_reg <tibble [1 × 4]> <opts[1]> <rsmp[+]>
#> 3 formula_3_linear_reg <tibble [1 × 4]> <opts[1]> <rsmp[+]>
Created on 2022-08-04 by the reprex package (v2.0.1)
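One thing to watch with formulas like mpg ~ . is that the dot expands to every other column, so the remaining outcome columns (here wt and disp) end up as predictors. If that is not what you want, a minimal base R sketch for building the formulas from a vector of outcome names could look like the following. The names outcomes, predictors, and formulas are just illustrative, and dropping the other outcomes from the predictors is an assumption about your use case:

```r
# Illustrative outcome columns, using mtcars as in the example above
outcomes <- c("mpg", "wt", "disp")

# Assumption: none of the outcome columns should be used as a predictor
predictors <- setdiff(names(mtcars), outcomes)

# Build one formula per outcome, e.g. mpg ~ cyl + hp + ...
formulas <- lapply(outcomes, function(y) reformulate(predictors, response = y))

# Name each formula after its outcome so it is easy to look up later
names(formulas) <- outcomes
```

The resulting list can be passed as the first argument of workflow_set() in place of the hand-written list of formulas.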
To make many recipes in an iterative fashion, you'll need a bit of metaprogramming such as with rlang. You can write a function to take (in this case) a string and create a recipe:
library(rlang)
my_recipe <- function(outcome) {
  # sym() turns the string (e.g. "mpg") into a symbol for the formula's left-hand side
  form <- new_formula(sym(outcome), expr(.))
  recipe(form, data = mtcars) %>%
    step_normalize(all_numeric_predictors())
}
And then you can use this function with purrr::map() across your outcomes:
library(tidymodels)
library(rlang)
folds <- bootstraps(mtcars, times = 5)
wf_set <- workflow_set(
  map(c("mpg", "wt", "disp"), my_recipe),
  list(linear_reg())
)
workflow_map(wf_set, "fit_resamples", resamples = folds)
#> # A workflow set/tibble: 3 × 4
#>   wflow_id            info             option    result
#>   <chr>               <list>           <list>    <list>
#> 1 recipe_1_linear_reg <tibble [1 × 4]> <opts[1]> <rsmp[+]>
#> 2 recipe_2_linear_reg <tibble [1 × 4]> <opts[1]> <rsmp[+]>
#> 3 recipe_3_linear_reg <tibble [1 × 4]> <opts[1]> <rsmp[+]>
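The default ids (recipe_1_linear_reg and so on) don't tell you which outcome each row belongs to. A possible refinement, sketched here under the assumption that you are happy to name the list of recipes after the outcomes, is to pass a named list to workflow_set(). The helper recipe_for() is hypothetical and mirrors my_recipe(), using sym() on the string value:

```r
library(tidymodels)
library(rlang)

outcomes <- c("mpg", "wt", "disp")

# Hypothetical helper: builds `outcome ~ .` from a string and adds the shared steps
recipe_for <- function(outcome) {
  form <- new_formula(sym(outcome), expr(.))
  recipe(form, data = mtcars) %>%
    step_normalize(all_numeric_predictors())
}

# Naming the list elements carries each outcome name into wflow_id
recipes <- set_names(map(outcomes, recipe_for), outcomes)

wf_set <- workflow_set(recipes, list(linear_reg()))
wf_set$wflow_id
```

The ids should then read mpg_linear_reg, wt_linear_reg, and disp_linear_reg, which makes it easier to tell the outcomes apart after workflow_map(), for example when calling collect_metrics() on the result.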