I have dataframe, df_master, which is comprised of longitudinal data for multiple patients ("record_id") where each row represents a day of follow-up ("time").
A second data frame has relapse time points for patients were they to relapse. In this toy example, patient A relapses once at time = 2, patient B relapses at time = 1 and time = 3, and patient C never relapses.
Toy data:
library(tidyverse)
record_id <- c("A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C", "C")
time <- c(1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5)
df_master <- tibble(record_id, time)
record_id <- c("A", "B", "B")
day_relapse <- c(2, 1, 3)
df_relapses <- tibble(record_id, day_relapse)
Our two dataframes look like this:
> print(df_master)
# A tibble: 12 × 2
record_id time
<chr> <dbl>
1 A 1
2 A 2
3 A 3
4 B 1
5 B 2
6 B 3
7 B 4
8 C 1
9 C 2
10 C 3
11 C 4
12 C 5
> print(df_relapses)
# A tibble: 3 × 2
record_id day_relapse
<chr> <dbl>
1 A 2
2 B 1
3 B 3
I'd like to create a merged dataframe that includes all of the data from df_master where the variable day_relapse is added to the new dataframe, df_merged, where the value for day_relapse is NA except for when time == day_relapse for a given record_id in which case day_relapse will equal the time value. The merged dataframe should look like this:
# A tibble: 12 × 3
record_id time day_relapse
<chr> <dbl> <dbl>
1 A 1 NA
2 A 2 2
3 A 3 NA
4 B 1 1
5 B 2 NA
6 B 3 3
7 B 4 NA
8 C 1 NA
9 C 2 NA
10 C 3 NA
11 C 4 NA
12 C 5 NA
How do I generate the merged dataframe? The strange formatting of the merged df is required for a type of swim plot that I am creating. Thanks in advance.
We could duplicate the day_relapse column with the same name (time
) as the other table we want to join with. Then we'll get day_relapse
when it's available in the 2nd table and NA when it's not.
df_master |>
left_join(df_relapses |> mutate(time = day_relapse))
Result
Joining with `by = join_by(record_id, time)`
# A tibble: 12 × 3
record_id time day_relapse
<chr> <dbl> <dbl>
1 A 1 NA
2 A 2 2
3 A 3 NA
4 B 1 1
5 B 2 NA
6 B 3 3
7 B 4 NA
8 C 1 NA
9 C 2 NA
10 C 3 NA
11 C 4 NA
12 C 5 NA