tidyversetidyr

pivot_longer from multiple columns into a singular names_to and two values_to


I've spent most of the day on this and finally calling in some help. There are multiple entries here on related questions, but none that quite get at what I'm trying to do.

Below is an example df.

x <- tibble::tribble(
    ~kpu_operating, ~kpu_procedure, ~sf_operating, ~sf_procedure,
                3L,             2L,          572L,          120L,
                3L,             1L,          440L,          121L,
                6L,             NA,          535L,          122L,
                3L,             NA,          542L,          123L
    )

Im trying to pivot longer such that "kpu_" and "sf_" are stripped away. Then I'm trying to collect "operating" and "procedure" from all four columns into a single new column. Lastly, I would like the two columns that r starts_with("kpu_") to be put into a new column called "n_unit" and the two columns that r starts_with("sf_") to be placed into a new column called "sf_unit". It would look something like this:

y <- tibble::tribble(
          ~unit, ~n_unit, ~sf_unit,
    "Operating",      3L,     572L,
    "Operating",      3L,     440L,
    "Operating",      6L,     535L,
    "Operating",      3L,     542L,
    "Procedure",      2L,     120L,
    "Procedure",      1L,     121L,
    "Procedure",      NA,     122L,
    "Procedure",      NA,     123L
    )

I've tried endless variations of this:

x <- x |>
    pivot_longer(cols = starts_with("kpu_"),
                 names_prefix = "kpu_",
                 names_to = "unit",
                 values_to =  "n_unit") |>
    pivot_longer(cols = starts_with("sf_"),
                 names_prefix = "sf_",
                 names_to = "unit2",
                 values_to = "sf_unit")

and another:

x <- x |> pivot_longer(df, cols  = c(starts_with("kpu_", "sf_")),
                 names_prefix = c("kpu_", "sf_"),
                 names_to = "unit",
                 values_to = "n_unit")

Here are some of the similar questions I've studied:

Is there way to pivot_longer to multiple values columns in R?

Using pivot_longer with existing names_to column

tidyverse pivot_longer several sets of columns, but avoid intermediate mutate_wider steps

https://cran.r-project.org/web/packages/tidyr/vignettes/pivot.html

pivot_longer into multiple columns

https://dcl-wrangle.stanford.edu/pivot-advanced.html

Thank you in advance for any help.


Solution

  • This can be done with tidyr::names_pattern() and a bit of regex:

    library(tibble)
    library(dplyr)
    library(tidyr)
    
    x <- tibble::tribble(
      ~kpu_operating, ~kpu_procedure, ~sf_operating, ~sf_procedure,
      3L,             2L,          572L,          120L,
      3L,             1L,          440L,          121L,
      6L,             NA,          535L,          122L,
      3L,             NA,          542L,          123L
    )
    
    
    y <- x |>
      pivot_longer(cols = everything(),
                   names_to = c(".value", "unit"),
                   names_pattern = "^(.*)_(.*)$") |>
      rename(n_unit = "kpu", sf_unit = "sf") |>
      arrange(unit)
    
    y
    # # A tibble: 8 × 3
    #   unit      n_unit sf_unit
    #   <chr>      <int>   <int>
    # 1 operating      3     572
    # 2 operating      3     440
    # 3 operating      6     535
    # 4 operating      3     542
    # 5 procedure      2     120
    # 6 procedure      1     121
    # 7 procedure     NA     122
    # 8 procedure     NA     123