r tidymodels r-recipes

How can I pass the retain parameter when fitting a tidymodels workflow?


I can pass retain = TRUE to prep() to store the preprocessed training data in the recipe. The help page for prep() highly recommends using workflows. But how can I pass retain = TRUE in that case?

Here's a toy example.

library(tidymodels)
mod <- linear_reg()
rec <- recipe(displ ~ cyl + drv, data=mpg)  %>%
  step_filter(drv != "r")
wf <- workflow() %>% add_model(mod) %>% add_recipe(rec)
fitted <- wf %>% fit(data=mpg)
fitted %>% extract_recipe() %>% bake(new_data = NULL)

I expected this to return the preprocessed data, but instead I got the following error message:

Error in juice(): Use retain = TRUE in prep() to be able to extract the training set.


Solution

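  • Currently, you cannot pass arguments to prep() via a workflow.

    However, you can extract the fitted recipe and bake it on the original data instead of using new_data = NULL. Note that step_filter() has skip = TRUE by default, so it is not applied when baking new data; that is why all 32 rows, including those with carb == 4, appear in the output below.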

    library(tidymodels)
    mod <- linear_reg()
    rec <-
      recipe(disp ~ cyl + carb, data = mtcars)  %>%
      # added to show that the recipe was processed:
      step_mutate(half_cyl = cyl / 2) %>% 
      step_filter(carb != 4)
    wf <- workflow(rec, mod)
    fitted <- wf %>% fit(data = mtcars)
    fitted %>% 
      extract_recipe() %>% 
      bake(new_data = mtcars)
    #> # A tibble: 32 × 4
    #>      cyl  carb  disp half_cyl
    #>    <dbl> <dbl> <dbl>    <dbl>
    #>  1     6     4  160         3
    #>  2     6     4  160         3
    #>  3     4     1  108         2
    #>  4     6     1  258         3
    #>  5     8     2  360         4
    #>  6     6     1  225         3
    #>  7     8     4  360         4
    #>  8     4     2  147.        2
    #>  9     4     2  141.        2
    #> 10     6     4  168.        3
    #> # ℹ 22 more rows
    

    Created on 2024-10-22 with reprex v2.1.0
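
If you need the training data exactly as the model saw it, with the step_filter() rows removed, here is a minimal sketch of two possible routes. The first assumes it is acceptable to prep the recipe a second time outside the workflow, where retain = TRUE can be set directly. The second uses extract_mold() on the fitted workflow, which should hold the processed predictors and outcomes used for fitting. The object names prepped, train_processed, and molded are only illustrative.

```r
library(tidymodels)

mod <- linear_reg()
rec <-
  recipe(disp ~ cyl + carb, data = mtcars) %>%
  step_mutate(half_cyl = cyl / 2) %>%
  step_filter(carb != 4)

# Route 1: prep the recipe yourself, outside the workflow.
# retain = TRUE keeps the processed training set inside the prepped recipe,
# so bake(new_data = NULL) can return it (with the filter applied).
prepped <- prep(rec, training = mtcars, retain = TRUE)
train_processed <- bake(prepped, new_data = NULL)

# Route 2: fit the workflow as usual and pull out the "mold",
# i.e. the preprocessed predictors and outcomes passed to the model.
fitted <- workflow(rec, mod) %>% fit(data = mtcars)
molded <- extract_mold(fitted)

nrow(train_processed)      # rows remaining after step_filter()
nrow(molded$predictors)    # should match the number above
```

Route 1 repeats the preprocessing, which is cheap here but may matter for expensive recipes; Route 2 reuses what the workflow already computed, but returns predictors and outcomes as separate tibbles.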