I'm trying to create a new column that concatenates all values of a specific set of columns that contain a specific string in their names, with a semi-colon separator. I'm working within dplyr
, so I'm seeking a tidyverse
solution.
I attempted to use grepl()
in combination with mutate()
, case_when()
, and paste()
to identify the columns that contained the string I wanted in their names ("Games"
) and concatenate their contents together into a new column. When that failed, I attempted to use str_detect
instead, but that didn't work out.
As far as I can tell, my issue is that I can't correctly instruct the code to evaluate all column names and then return the ones that have a string that contains my specified pattern. I've tried using contains("Games")
, colnames(.x)
, and other variations on those arguments. I know I could do this if I specifically named every column I wished to paste together, but I prefer a relative solution that prevents me from typing out multiple names.
Thank you!
# Sample Data
test<-as_tibble(data.frame(`ID` = c("1","2","3"),
`Gender` = c("Female","Male","Non-Binary"),
`Games_Chess`=c("Chess",NA,"Chess"),
`Games_Clue`=c("Clue",NA,NA),
`Games_Scrabble`=c("Scrabble",NA,"Scrabble")))
# A tibble: 3 × 5
ID Gender Games_Chess Games_Clue Games_Scrabble
<chr> <chr> <chr> <chr> <chr>
1 1 Female Chess Clue Scrabble
2 2 Male NA NA NA
3 3 Non-Binary Chess NA Scrabble
# Desired Output
ID Gender Games_Chess Games_Clue Games_Scrabble Games
1 Female Chess Clue Scrabble Chess; Clue; Scrabble
2 Male NA NA NA NA
3 Non-Binary Chess NA Scrabble Chess; Scrabble
# Attempted Code 1
test<-test%>%
mutate(`Games` = case_when(str_detect(colnames(test),"Games") ~ paste(.x, collapse = ";"), TRUE ~ NA))
# Error Code 1
Error in `mutate()`:
ℹ In argument: `Games = case_when(...)`.
Caused by error in `case_when()`:
! Failed to evaluate the right-hand side of formula 1.
Caused by error:
! object '.x' not found
# Attempted Code 2
test<-test%>%
mutate(`Games` = case_when(grepl("Games",.) ~ paste(., collapse = ";"), TRUE ~ NA))
# Error Code 2
Error in `mutate()`:
ℹ In argument: `Games = case_when(...)`.
Caused by error:
! `Games` must be size 3 or 1, not 4.
Run `rlang::last_trace()` to see where the error occurred.
Though not entirely dplyr
-based, the following solution does still fall under the tidyverse
using tidyr
:
test %>%
tidyr::unite(
# Name of new column
col = "Games",
# Select columns to unite using tidy-select syntax
dplyr::starts_with("Games"),
# Specify semi-colon as separator
sep = "; ",
# Keep original Games_* columns
remove = FALSE,
# Remove NA's prior to concatenation
na.rm = TRUE
)
Which results in the following output:
#> ID Gender Games Games_Chess Games_Clue Games_Scrabble
#> 1 1 Female Chess; Clue; Scrabble Chess Clue Scrabble
#> 2 2 Male <NA> <NA> <NA>
#> 3 3 Non-Binary Chess; Scrabble Chess <NA> Scrabble
Edit: optionally, if you'd like the new column to be at the very end of your data, then you can add the following code at the end using the %>%
pipe operator to move the new column to the end:
dplyr::relocate(
Games,
.after = dplyr::everything()
)