I have two dataframes: one in which I would like to replace values (df_1), the other one from which I would like to obtain the values for replacement (df_2). Please consider the example data below:
df_1 <- data.frame(
var=c("xAp", "xBp", "sCp", "sABp", "dBCp", "dCBp"),
A=NA,
B=NA,
C=NA)
df_2 <- data.frame(A=1, B=40, C=25)
If in df_1 the column name occurs in the first column, then I want to replace the value in that column and row by a value from df_2, the value that corresponds to this column name. So imagine cell df_1[1,2]. The column name is A. The value A occurs in the first column (in df_1[1,1]). This means I want to replace the NA value with the value that belongs to A in df_2, which is 1.
If the column name does not occur in the first column, I want it replaced by zero.
As I want to perform this action for every row, I have been thinking about a mutate combined with across. I am however stuck already when trying to extract column names and comparing them to values in the first column.
data.frame(
var=c("xAp", "xBp", "sCp", "sABp", "dBCp", "dCBp"),
A=c(1, 0, 0, 1, 0, 0),
B=c(0, 40, 0, 40, 40, 40),
C=c(0, 0, 25, 0, 25, 25))
It would be great if someone can help out. Thanks!
Here is one option - loop across
the column names of 'df_2', create a condition whether the 'var' column substring exists in (cur_column()
), then return the value of 'df_2' for that corresponding column or else return 0 in case_when
library(dplyr)
library(stringr)
out2 <- df_1 %>%
mutate(across(all_of(names(df_2)),
~ case_when(str_detect(var, cur_column()) ~ df_2[[cur_column()]], TRUE ~ 0)))
-checking with OP's expected
identical(out, out2)
[1] TRUE