I've spent most of the day on this and finally calling in some help. There are multiple entries here on related questions, but none that quite get at what I'm trying to do.
Below is an example df.
x <- tibble::tribble(
~kpu_operating, ~kpu_procedure, ~sf_operating, ~sf_procedure,
3L, 2L, 572L, 120L,
3L, 1L, 440L, 121L,
6L, NA, 535L, 122L,
3L, NA, 542L, 123L
)
Im trying to pivot longer such that "kpu_" and "sf_" are stripped away. Then I'm trying to collect "operating" and "procedure" from all four columns into a single new column. Lastly, I would like the two columns that r starts_with("kpu_")
to be put into a new column called "n_unit" and the two columns that r starts_with("sf_")
to be placed into a new column called "sf_unit". It would look something like this:
y <- tibble::tribble(
~unit, ~n_unit, ~sf_unit,
"Operating", 3L, 572L,
"Operating", 3L, 440L,
"Operating", 6L, 535L,
"Operating", 3L, 542L,
"Procedure", 2L, 120L,
"Procedure", 1L, 121L,
"Procedure", NA, 122L,
"Procedure", NA, 123L
)
I've tried endless variations of this:
x <- x |>
pivot_longer(cols = starts_with("kpu_"),
names_prefix = "kpu_",
names_to = "unit",
values_to = "n_unit") |>
pivot_longer(cols = starts_with("sf_"),
names_prefix = "sf_",
names_to = "unit2",
values_to = "sf_unit")
and another:
x <- x |> pivot_longer(df, cols = c(starts_with("kpu_", "sf_")),
names_prefix = c("kpu_", "sf_"),
names_to = "unit",
values_to = "n_unit")
Here are some of the similar questions I've studied:
Is there way to pivot_longer to multiple values columns in R?
Using pivot_longer with existing names_to column
tidyverse pivot_longer several sets of columns, but avoid intermediate mutate_wider steps
https://cran.r-project.org/web/packages/tidyr/vignettes/pivot.html
pivot_longer into multiple columns
https://dcl-wrangle.stanford.edu/pivot-advanced.html
Thank you in advance for any help.
This can be done with tidyr::names_pattern()
and a bit of regex:
library(tibble)
library(dplyr)
library(tidyr)
x <- tibble::tribble(
~kpu_operating, ~kpu_procedure, ~sf_operating, ~sf_procedure,
3L, 2L, 572L, 120L,
3L, 1L, 440L, 121L,
6L, NA, 535L, 122L,
3L, NA, 542L, 123L
)
y <- x |>
pivot_longer(cols = everything(),
names_to = c(".value", "unit"),
names_pattern = "^(.*)_(.*)$") |>
rename(n_unit = "kpu", sf_unit = "sf") |>
arrange(unit)
y
# # A tibble: 8 × 3
# unit n_unit sf_unit
# <chr> <int> <int>
# 1 operating 3 572
# 2 operating 3 440
# 3 operating 6 535
# 4 operating 3 542
# 5 procedure 2 120
# 6 procedure 1 121
# 7 procedure NA 122
# 8 procedure NA 123