r dplyr tidyr

Most idiomatic way to mutate multiple similar columns?


I'm generating multiple columns in a mutate() call, all by using a function of 1) an already-existing column and 2) some value which is different for each output column. The following code produces what I want, but it smells:

library(tidyverse)  # loads dplyr, tidyr, stringr and tibble

df <- tibble(base_string = c("a", "b", "c"))
df_desired_result <- df |>
  mutate(
    one = str_c(base_string, "1"),
    two = str_c(base_string, "2"),
    three = str_c(base_string, "3")
  )
df_desired_result
# A tibble: 3 × 4
  base_string one   two   three
  <chr>       <chr> <chr> <chr>
1 a           a1    a2    a3   
2 b           b1    b2    b3   
3 c           c1    c2    c3   

This doesn't scale: with many more output columns, each one would need its own hand-written line.

The best improvement I've come up with is:

df_also_desired_result <- df |>
  expand_grid(
    tibble(
      number_name = c("one", "two", "three"),
      number_string = c("1", "2", "3")
    )
  ) |>
  mutate(final_string = str_c(base_string, number_string)) |>
  pivot_wider(
    id_cols = base_string,
    names_from = number_name,
    values_from = final_string
  )
df_also_desired_result
# A tibble: 3 × 4
  base_string one   two   three
  <chr>       <chr> <chr> <chr>
1 a           a1    a2    a3   
2 b           b1    b2    b3   
3 c           c1    c2    c3   

But this seems too verbose. Would love any suggestions on a nicer way to do this.


Solution

  • Thanks, all, for the answers. Edward's is so nice that I initially accepted it, but I changed my mind because I really wanted something that uses the tidyverse. Today I ended up getting the kind of thing I was originally looking for by creating constant columns for the numbers and then overwriting them in place with across():

    df <- tibble(base_string = c("a", "b", "c"))
    df |>
      mutate(
        one = "1", two = "2", three = "3",
        across(one:three, \(x) str_c(base_string, x))
      )
    

    And if the column definitions live in a named vector, you can use !!! to splice its elements into mutate() as named arguments.

    column_definitions <- c(one = "1", two = "2", three = "3")
    df |>
      mutate(
        !!!column_definitions,
        across(all_of(names(column_definitions)), \(x) str_c(base_string, x))
      )
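
For anyone who wants to convince themselves that the spliced version really matches the hand-written mutate() from the question, here is a minimal check. It only reuses the code above; the names hand_written and via_splice are mine, and wrapping the selection in all_of() is an optional choice that makes it explicit the column names come from an external character vector.

```r
library(dplyr)
library(stringr)
library(tibble)

df <- tibble(base_string = c("a", "b", "c"))

# The original, fully written-out version from the question.
hand_written <- df |>
  mutate(
    one = str_c(base_string, "1"),
    two = str_c(base_string, "2"),
    three = str_c(base_string, "3")
  )

# The spliced version: !!! turns each named element into a constant column,
# then across() overwrites those columns with base_string + constant.
column_definitions <- c(one = "1", two = "2", three = "3")
via_splice <- df |>
  mutate(
    !!!column_definitions,
    across(all_of(names(column_definitions)), \(x) str_c(base_string, x))
  )

# Stops with an error if the two results differ.
stopifnot(isTRUE(all.equal(hand_written, via_splice)))
```

This works because mutate() evaluates its arguments in order, so the constant columns already exist by the time across() selects them.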