I have 18 pairs of variables and I would like to do pairwise math on them to calculate 18 new variables. The across() function in dplyr is quite handy when applying a formula to one column. Is there a way to apply across() to pairs of columns?
A tiny example with simple division of two variables (my actual code will be more complex, with some ifelse calls, ...):
library(tidyverse)
library(glue)
# filler data
df <- data.frame("label" = c('a','b','c','d'),
"A" = c(4, 3, 8, 9),
"B" = c(10, 0, 4, 1),
"error_A" = c(0.4, 0.3, 0.2, 0.1),
"error_B" = c(0.3, 0, 0.4, 0.1))
# what I want to have in the end
# instead of just 2 (A, B), I have 18
df1 <- df %>% mutate(
'R_A' = A/error_A,
'R_B' = B/error_B
)
# what I'm thinking about doing to use both variables A and error_A to calculate the new column
df2 <- df %>% mutate(
across(c('A','B'),
~.x/{HOW DO I USE THE COLUMN WHOSE NAME IS glue('error_',.x)},
.names = 'R_{.col}'
))
One option is map/reduce. Specify the columns of interest ('nm1'), loop over them with map, select those columns from the dataset, reduce by dividing, rename the columns after column binding (the _dfc suffix), and bind those to the original dataset:
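library(dplyr)
library(purrr)
library(stringr)
# base names of the column pairs
nm1 <- c('A', 'B')
# for each name, select the value column and its error column, then divide
# (relies on A preceding error_A in df, so the division is value / error)
map_dfc(nm1, ~ df %>%
select(ends_with(.x)) %>%
reduce(., `/`) ) %>%
# name the new columns R_A, R_B and append them to the original data
rename_all(~ str_c('R_', nm1)) %>%
bind_cols(df, .)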
Output:
# label A B error_A error_B R_A R_B
#1 a 4 10 0.4 0.3 10 33.33333
#2 b 3 0 0.3 0.0 10 NaN
#3 c 8 4 0.2 0.4 40 10.00000
#4 d 9 1 0.1 0.1 90 10.00000
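Another option uses across, with cur_column() supplying the name of the current column so that get() can look up the matching error_ column: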
df %>%
mutate(across(c(A, B),
~ .x / get(str_c('error_', cur_column())),
.names = 'R_{.col}'))
# label A B error_A error_B R_A R_B
#1 a 4 10 0.4 0.3 10 33.33333
#2 b 3 0 0.3 0.0 10 NaN
#3 c 8 4 0.2 0.4 40 10.00000
#4 d 9 1 0.1 0.1 90 10.00000
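Since the question mentions that the real calculation involves ifelse, here is a minimal base R sketch of the same pairing idea, swapping across for Map so that each value column is handed to a function together with its error column. The names vars and ratio_with_guard are hypothetical, and returning NA when the error is 0 is only an assumed rule for illustration, not something taken from the question.

```r
# filler data from the question
df <- data.frame(label   = c("a", "b", "c", "d"),
                 A       = c(4, 3, 8, 9),
                 B       = c(10, 0, 4, 1),
                 error_A = c(0.4, 0.3, 0.2, 0.1),
                 error_B = c(0.3, 0, 0.4, 0.1))

# base names of the pairs (hypothetical name; extend to all 18 in practice)
vars <- c("A", "B")

# hypothetical helper: value / error, with NA where the error is zero
# (the zero-error rule is an assumption made for this sketch)
ratio_with_guard <- function(value, error) {
  ifelse(error == 0, NA_real_, value / error)
}

# Map walks the value columns and the error columns in parallel,
# and the resulting list is assigned to the new R_ columns
df[paste0("R_", vars)] <- Map(ratio_with_guard,
                              df[vars],
                              df[paste0("error_", vars)])

df
```

The pairing here depends only on the naming pattern (error_ plus the base name), not on column order, so it should carry over to more pairs by extending vars.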