step_rename does not work like dplyr::rename


With dplyr::rename I can rename columns if they exist:

library(dplyr)
df <- data.frame(a_old = 1:3, b_new = 11:13)
lkp <- c(a_new = "a_old", b_new = "b_old")
df %>% rename(any_of(lkp))

#   a_new b_new
# 1     1    11
# 2     2    12
# 3     3    13

Now I want to achieve exactly the same behaviour with recipes::step_rename, but I get strange column names:

library(recipes)
recipe(df) %>%
  step_rename(any_of(lkp)) %>%
  prep() %>%
  bake(new_data = NULL)
# # A tibble: 3 x 2
#   `any_of(lkp)...a_new` b_new
#                   <int> <int>
# 1                     1    11
# 2                     2    12
# 3                     3    13

How can I replicate the dplyr::rename functionality with step_rename?


Solution

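  • It appears that step_rename doesn't honor the named-vector shortcut that any_of supports inside dplyr::rename. (The existence of the step_*_at functions, which mimic since-superseded mechanisms in dplyr, further suggests this.)

    We can reproduce the desired effect using step_rename_at: select whichever old names actually exist with any_of(unname(lkp)), then have fn map each selected old name back to its new name via the lookup.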

    df <- structure(list(a_old = 1:3, b_new = 11:13), class = "data.frame", row.names = c(NA, -3L))
    lkp <- c(a_new = "a_old", b_new = "b_old")
    library(recipes)
    library(dplyr)
    
    recipe(~., data = df) |>
      step_rename_at(any_of(unname(lkp)), fn = ~ names(lkp)[ match(.x, lkp) ]) |>
      prep() |>
      bake(new_data=NULL)
    # # A tibble: 3 × 2
    #   a_new b_new
    #   <int> <int>
    # 1     1    11
    # 2     2    12
    # 3     3    13
    

    There are other ways to do the "replace the value with its name" lookup that I'm using names(.)[ match(., .) ] for; feel free to adapt it to your preferences. The main point of my suggestion is to switch to step_rename_at().
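
As one possible alternative to the names(.)[ match(., .) ] idiom, here is a minimal base-R sketch that inverts the lookup once and then indexes it by old name. The names inv and rename_from_lookup are made up for illustration; they are not part of recipes or dplyr.

```r
# Lookup from the question: names are the new column names, values the old ones.
lkp <- c(a_new = "a_old", b_new = "b_old")

# Invert it once so that old names index new names.
# `inv` and `rename_from_lookup` are illustrative names only.
inv <- setNames(names(lkp), lkp)

# Map a character vector of old names to the corresponding new names.
rename_from_lookup <- function(x) unname(inv[x])

rename_from_lookup("a_old")
rename_from_lookup(c("b_old", "a_old"))
```

Since fn is documented as accepting a plain function as well as a formula, this should be usable in the recipe above as step_rename_at(any_of(unname(lkp)), fn = rename_from_lookup). Note that a name missing from the lookup would map to NA, so it relies on any_of(unname(lkp)) selecting only columns that appear in lkp.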