I have a process where I am merging dataframes together, and then generating an ID variable by concatenating some of the columns together. However, the names of the columns that need to be concatenated are in a character vector.
I can do this in separate steps like this:
new <- merge(old1, old2, by = "MergeVar")
id <- do.call(paste0, new[idcols])
new$id <- id
However, I'd like to do this using pipes. If the columns were fixed, then I know I could do something like this:
new <- merge(old1, old2, by = "MergeVar") %>%
mutate(id = paste0(colA, colB))
Is there a nice way to merge these two approaches together so that I can keep the pipe structure but reference the column names dynamically?
Assume the test data shown in the Note at the end. BOD
comes with R.
1) With base R we use merge
and then create a list with a single x
component holding the merge
result and then use that in the last leg of the pipe.
old1 |>
merge(old2, by = by_var) |>
list(x = _) |>
with(transform(x, id = do.call("paste0", x[idcols])))
## Time demand.x demand.y id
## 1 1 8.3 8.3 8.38.3
## 2 2 10.3 10.3 10.310.3
## 3 3 19.0 19.0 1919
2) With dplyr we can use left_join
and use !!
and !!!
to insert the variables.
library(dplyr)
old1 %>%
left_join(old2, join_by(!!sym(by_var))) %>%
mutate(id = paste0(!!!syms(idcols)))
## Time demand.x demand.y id
## 1 1 8.3 8.3 8.38.3
## 2 2 10.3 10.3 10.310.3
## 3 3 19.0 19.0 1919
3) With magrittr but no other packages we can rewrite (1) as:
library(magrittr)
old1 %>%
merge(old2, by = by_var) %>%
transform(id = do.call("paste0", .[idcols]))
## Time demand.x demand.y id
## 1 1 8.3 8.3 8.38.3
## 2 2 10.3 10.3 10.310.3
## 3 3 19.0 19.0 1919
old1 <- old2 <- head(BOD, 3)
by_var <- "Time"
idcols <- c("demand.x", "demand.y")