r tidymodels r-recipes

Only keep interaction term in recipe formula


I'm trying to build a recipe whose final formula includes the interaction of one variable with another, but not the main effect of that second variable on its own. In base R I can specify exactly which interactions I want with a colon in the formula, but I can't figure out how to do this with recipes. I've put together a quick reprex below showing roughly what I'm after; if anyone has any advice, that would be great :)

library(tidymodels)

basic_mod <- lm(Petal.Length ~ Petal.Width + Petal.Width:Species, data = iris)

iris_rec <- recipe(Petal.Length ~ Petal.Width + Species, data = iris) |> 
  step_dummy("Species") |> 
  step_interact(~ Petal.Width:starts_with("Species")) 

formula(iris_rec |> prep()) # This formula includes Species on its own as well as the interaction term
#> Petal.Length ~ Petal.Width + Species_versicolor + Species_virginica + 
#>     Petal.Width_x_Species_versicolor + Petal.Width_x_Species_virginica
#> <environment: 0x127838968>

iris_rec |> 
  remove_role(starts_with("Species"), old_role = "predictor") |> 
  prep() |> 
  formula() # This formula still includes Species on its own
#> Petal.Length ~ Petal.Width + Species_versicolor + Species_virginica + 
#>     Petal.Width_x_Species_versicolor + Petal.Width_x_Species_virginica
#> <environment: 0x1106178a0>

Created on 2022-11-21 with reprex v2.0.2


Solution

  • If I'm following you correctly, you would use step_interact() to make the interactions and then look for the default separator ("_x_") to keep only those terms. We ask that you make dummy variables before interactions.

    library(tidymodels)
    
    rec <- 
      recipe(Petal.Length ~ ., data = iris) %>% 
      step_dummy(Species) %>% 
      # dummy indicators, by default, start with {varname}_
      step_interact(~ Petal.Width:starts_with("Species_")) %>% 
      step_select(all_outcomes(), contains("_x_"))
    
    rec %>% prep() %>% bake(new_data = NULL) %>% names()
    #> [1] "Petal.Length"                     "Petal.Width_x_Species_versicolor"
    #> [3] "Petal.Width_x_Species_virginica"
    

    Created on 2022-11-21 by the reprex package (v2.0.1)
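
Note that selecting only the "_x_" columns also drops the Petal.Width main effect, whereas the lm() call in the question keeps it. If the goal is to match that model, one possible variation (a sketch, assuming the default dummy names and the default "_x_" separator) is to remove just the Species dummy columns with step_rm() instead of selecting on the separator. The object names rec_keep_main and baked are made up for illustration.

```r
library(tidymodels)

rec_keep_main <-
  recipe(Petal.Length ~ Petal.Width + Species, data = iris) %>%
  step_dummy(Species) %>%
  step_interact(~ Petal.Width:starts_with("Species_")) %>%
  # Drop only the dummy main effects. The interaction columns start with
  # "Petal.Width_x_", so starts_with("Species_") does not match them.
  step_rm(starts_with("Species_"))

baked <- rec_keep_main %>% prep() %>% bake(new_data = NULL)

# Should list Petal.Width, Petal.Length and the two interaction columns
names(baked)
```

Fitting lm(Petal.Length ~ ., data = baked) should then use the same terms as the base R formula Petal.Length ~ Petal.Width + Petal.Width:Species.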