I want to merge df1 and df2, by pulling in only the numerics from df2 into df1? This would be a nested (Lookup(xlookup) in excel, but Im having difficulty working this in r? Any info would be much appreciated.
df1 <- data.frame(Names = c("A","B","C","D","E"),
Rank = c("R1","R3","R4","R2","R5"),
Time_in_rank = c(2,4.25,3,1.5,5))
df2 <- data.frame(Time_in_rank =c(0,1,2,3,4),
R1 =c(20000,25000,30000,35000,40000),
R2 =c(45000,50000,55000,60000,65000),
R3 =c(70000,75000,80000,85000,90000),
R4 =c(95000,96000,97000,98000,100000),
R5 =c(105000,107000,109000,111000,112000))
Desired output
Names Time_in_rank Rank Salary
A 2 R1 30000
B 4.25 R3 90000
C 3 R4 98000
Close but no cigar - Merge two data frames considering a range match between key columns
Close but still no cigar - Complex non-equi merge in R
You could use a tidyverse
approach:
library(dplyr)
library(tidyr)
df1 %>%
mutate(time_floor = floor(Time_in_rank)) %>%
left_join(df2 %>%
pivot_longer(-Time_in_rank, names_to = "Rank", values_to = "Salary"),
by = c("Rank", "time_floor" = "Time_in_rank")) %>%
select(-time_floor)
This returns
Names Rank Time_in_rank Salary
1 A R1 2.00 30000
2 B R3 4.25 90000
3 C R4 3.00 98000
4 D R2 1.50 50000
5 E R5 5.00 NA
The main ideas are:
df1
's Time_in_rank
rounded down corresponds to Time_in_rank
used in df2
.df2
is converted into a long format to join by time and rank.