I'd like to create I-spline features in tidymodels. One way to do this would be to use step_mutate() as follows:
library(tidymodels)
library(tidyverse)
library(splines2)
data <- data.frame(x = seq(0, 10, 0.1))
rec <- recipe(. ~ x, data = data) %>%
  step_mutate(
    model.matrix(~isp(x, df = 5))
  ) %>%
  prep()
bake(rec, new_data = data)
#> # A tibble: 101 × 2
#> x model.matrix(~isp(x, df = 5…¹ [,"isp(x, df = 5)1"] [,"isp(x, df = 5)2"]
#> <dbl> <dbl> <dbl> <dbl>
#> 1 0 1 0 0
#> 2 0.1 1 0.0776 0.00118
#> 3 0.2 1 0.151 0.00461
#> 4 0.3 1 0.219 0.0102
#> 5 0.4 1 0.284 0.0177
#> 6 0.5 1 0.344 0.0271
#> 7 0.6 1 0.400 0.0382
#> 8 0.7 1 0.453 0.0509
#> 9 0.8 1 0.502 0.0651
#> 10 0.9 1 0.548 0.0806
#> # ℹ 91 more rows
#> # ℹ abbreviated name: ¹`model.matrix(~isp(x, df = 5))`[,"(Intercept)"]
#> # ℹ 1 more variable: `model.matrix(~isp(x, df = 5))`[4:6] <dbl>
Created on 2024-06-29 with reprex v2.1.0
This is, however, a 101x2 data frame. The spline features are somehow nested inside the second column:
bake(rec, new_data = data) %>%
  dim()
#> [1] 101 2
Is there a way I can use step_mutate() or other tidymodels functions to get each feature as its own column straight out of the call to bake()?
The short answer is no.
The long answer: ideally you would only use step_mutate() to create a single column at a time. If you want to create multiple columns at once, I would suggest writing your own custom step.
For this specific issue, if you want I-splines from the {splines2} package, you can use step_spline_monotone(), which does just that.
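A minimal sketch of what that could look like with the data from the question. The `deg_free = 5` value is an assumption chosen to mirror `df = 5` above, and the resulting basis is not guaranteed to match the `isp()` call column for column; `keep_original_cols = TRUE` is only there so `x` stays in the output. The {splines2} package needs to be installed for this step to run.

```r
library(recipes)  # step_spline_monotone() relies on {splines2} being installed

data <- data.frame(x = seq(0, 10, 0.1))

rec <- recipe(~ x, data = data) %>%
  # one numeric column per I-spline basis function, computed from x
  step_spline_monotone(x, deg_free = 5, keep_original_cols = TRUE) %>%
  prep()

baked <- bake(rec, new_data = data)

dim(baked)    # rows x (x plus the spline columns)
names(baked)  # each spline feature is its own top-level column
```

Each basis function comes back as an ordinary numeric column, so no unnesting is needed after bake().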
This also illustrates why step_mutate() isn't going to help here. A spline basis like this should be learned from the training data, so the step needs to save some information at prep() time in order to apply the same transformation to new data. step_mutate() simply re-evaluates its expression on whatever data it is given.
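To make the "learned" part concrete, here is a small sketch, again assuming `deg_free = 5`. The names `train`, `rows`, and `new_points` are made up for illustration. The recipe is prepped once on the full data and then baked on three of those points as if they were new data; because the spline settings were stored at prep() time, the rows should agree with the corresponding rows of the training output.

```r
library(recipes)  # step_spline_monotone() relies on {splines2} being installed

train <- data.frame(x = seq(0, 10, 0.1))
rows <- c(26, 51, 76)                      # hypothetical subset standing in for new data
new_points <- train[rows, , drop = FALSE]

rec <- recipe(~ x, data = train) %>%
  step_spline_monotone(x, deg_free = 5) %>%
  prep()                                   # the spline basis is estimated here, once

# Bake the "new" points and the full training set with the same prepped recipe
from_new <- as.matrix(bake(rec, new_data = new_points))
from_train <- as.matrix(bake(rec, new_data = train))[rows, , drop = FALSE]

# Compare values only; row names differ between the two matrices
same_basis <- isTRUE(all.equal(from_new, from_train, check.attributes = FALSE))
same_basis
```

Recomputing a spline basis from scratch on the three new points alone would place the knots differently, which is the situation a trained step avoids.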