I can use the retain=TRUE parameter of the prep() function to store the preprocessed training data in the recipe. The help page of prep() highly recommends using workflows. But how can I pass the retain=TRUE parameter in this case?
Here's a toy example.
library(tidymodels)
mod <- linear_reg()
rec <- recipe(displ ~ cyl + drv, data = mpg) %>%
  step_filter(drv != "r")
wf <- workflow() %>% add_model(mod) %>% add_recipe(rec)
fitted <- wf %>% fit(data=mpg)
fitted %>% extract_recipe() %>% bake(new_data = NULL)
I expected this to return the preprocessed data, but instead I got the following error message:
Error in juice(): Use retain = TRUE in prep() to be able to extract the training set.
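Currently, you cannot pass prep() arguments such as retain via a workflow. However, you can extract the recipe from the fitted workflow and bake the original training data explicitly: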
library(tidymodels)
mod <- linear_reg()
rec <-
  recipe(disp ~ cyl + carb, data = mtcars) %>%
  # added to show that the recipe was processed:
  step_mutate(half_cyl = cyl / 2) %>%
  step_filter(carb != 4)
wf <- workflow(rec, mod)
fitted <- wf %>% fit(data = mtcars)
fitted %>%
  extract_recipe() %>%
  bake(new_data = mtcars)
#> # A tibble: 32 × 4
#>      cyl  carb  disp half_cyl
#>    <dbl> <dbl> <dbl>    <dbl>
#>  1     6     4  160         3
#>  2     6     4  160         3
#>  3     4     1  108         2
#>  4     6     1  258         3
#>  5     8     2  360         4
#>  6     6     1  225         3
#>  7     8     4  360         4
#>  8     4     2  147.        2
#>  9     4     2  141.        2
#> 10     6     4  168.        3
#> # ℹ 22 more rows
Created on 2024-10-22 with reprex v2.1.0
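Note that the output above still has 32 rows, including rows with carb equal to 4: step_filter() has skip = TRUE by default, so it is applied when the recipe is trained but not when bake() is called with new_data. If the goal is the training set exactly as the model saw it, the following is a minimal sketch of two possible routes, not the only way to do it. It assumes the same mtcars recipe as above; the object names train_processed and mold are made up for illustration, and the second route assumes a workflows version that provides extract_mold().

```r
library(tidymodels)

mod <- linear_reg()

rec <-
  recipe(disp ~ cyl + carb, data = mtcars) %>%
  step_mutate(half_cyl = cyl / 2) %>%
  step_filter(carb != 4)

# Route 1: prep the recipe yourself, outside the workflow.
# retain = TRUE keeps the processed training set in the prepped recipe,
# so bake(new_data = NULL) can return it (with the filter applied).
train_processed <-
  rec %>%
  prep(training = mtcars, retain = TRUE) %>%
  bake(new_data = NULL)

# Route 2: fit the workflow as usual and pull out the "mold", which holds
# the processed predictors and outcomes that were passed to the model.
fitted <- workflow(rec, mod) %>% fit(data = mtcars)
mold <- extract_mold(fitted)

nrow(train_processed)   # number of rows left after the filter
nrow(mold$predictors)   # same rows, predictors only
mold$outcomes           # the matching outcome column (disp)
```

Route 1 preps the recipe a second time, separately from the workflow, so it costs an extra pass over the data. Route 2 reuses what the workflow already computed, but returns predictors and outcomes as two separate tibbles.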