Say I have the following dataframes:
DF1 <- data.frame("A" = rep(c("A","B"), 18),
"B" = rep(c("C","D","E"), 12),
"NUM"= rep(rnorm(36,10,1)),
"TEST" = rep(NA,36))
DF2 <- data.frame("A" = rep("A",6),
"B" = rep(c("C","D"),6),
"VAL" = rep(c(1,3),3))
*Note: Each unique combination of variables A
and B
in DF2
should have a unique VAL
.
For each row, I would like to replace the NA
in TEST
with the corresponding value of VAL
in DF1
if the values in columns A
and A
match and the values in columns B
and B
match for that row. Otherwise, I'd leave TEST
as NA
. How would I do this without looping through each combination using match?
Ideally, an answer would scale to two data frames with many columns to match upon.
As Akrun mentioned in comments, your lookup table (DF2) needs to be reduced to just its unique A/B combinations. For your current dataframe, this isn't a problem, but you will need additional rules if there are multiple possible values for the same combination. From there, the solution is easy:
DF2.u <- unique(DF2)
DF3 <- merge(DF1, DF2.u, all = T)
Note that this will produce a new dataframe with an empty TEST column (all values NA
), and a VAL column assigned from DF2. To do exactly what you wanted (replace TEST with VAL where possible), here is some slightly clunkier code:
DF1$TEST <- merge(DF1, DF2.u, all = T)$VAL
EDIT: in response to your question, you can boil down DF2 if necessary quite simple:
DF2$C <- c(1:12) #now unique() won't work
DF2.u <- unique(DF2[1:3])
A B VAL
1 A C 1
2 A D 3