When I use the step_regex function to build a recipe for a model, it creates additional columns for certain patterns in the original column. Is there a way to exclude the original column from the recipe once I'm done with it?
For example, in the code below, the result contains both the original description column and the two new columns created by step_regex. I want a solution that is integrated with the recipe object, so that I can use it directly in caret::train.
library(recipes)
data(covers)
rec <- recipe(~ description, covers) %>%
  step_regex(description, pattern = "(rock|stony)", result = "rocks") %>%
  step_regex(description, pattern = "ratake families")
rec2 <- prep(rec, training = covers)
with_dummies <- bake(rec2, newdata = covers)
Just found the solution. I think I can change the role for the columns I don't want to be used as predictors.
rec <- rec %>% add_role(description, new_role = "dont_use")
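
One caveat on the role approach: in later recipes releases, add_role adds a role on top of the existing one, while update_role replaces it, so description may still count as a predictor after add_role. If the goal is simply to drop the column from the baked output, another option is to end the recipe with step_rm. Below is a minimal sketch of that idea, not a tested caret workflow. The toy data frame is made up for illustration (it stands in for covers), and the data is passed to bake positionally because the name of that argument has changed between recipes versions.

```r
library(recipes)

# Hypothetical toy data standing in for `covers`; only the column name matches.
toy <- data.frame(
  description = c("rock outcrop complex", "stony land",
                  "ratake families complex", "plain soil"),
  stringsAsFactors = FALSE
)

rec <- recipe(~ description, data = toy) %>%
  step_regex(description, pattern = "(rock|stony)", result = "rocks") %>%
  step_regex(description, pattern = "ratake families") %>%
  # Remove the source column once the regex indicator columns exist.
  step_rm(description)

prepped <- prep(rec, training = toy)

# Data passed positionally to sidestep the newdata / new_data naming difference.
baked <- bake(prepped, toy)

names(baked)
```

Because step_rm is part of the recipe itself, the removal should carry over wherever the recipe is used, which is what the question asks for. If the column needs to stay in the data but not be used as a predictor, update_role(description, new_role = "dont_use") is the role-based alternative to try.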