I have an R data frame that has an ID column with multiple records for an ID. When the flag is set to 1 for an ID, I want to create a column new timeline that starts from 1 and increases sequentially in increments of 6 (1,6,12...). How can I achieve this in R using dplyr ?
Below is a sample data frame
ID | Timepoint | Flag |
---|---|---|
A | 0 | 0 |
A | 6 | 0 |
A | 12 | 0 |
A | 18 | 1 |
A | 24 | 0 |
A | 30 | 0 |
A | 36 | 0 |
Expected Dataframe
ID | Timepoint | Flag | New_Timepoint |
---|---|---|---|
A | 0 | 0 | |
A | 6 | 0 | |
A | 12 | 0 | |
A | 18 | 1 | 1 |
A | 24 | 0 | 6 |
A | 30 | 0 | 12 |
A | 36 | 0 | 18 |
Another dplyr
option
df %>%
group_by(ID) %>%
mutate(New_Timepoint = pmax(1, Timepoint - c(NA, Timepoint[Flag == 1])[cumsum(Flag) + 1])) %>%
ungroup()
gives
ID Timepoint Flag New_Timepoint
<chr> <int> <int> <dbl>
1 A 0 0 NA
2 A 6 0 NA
3 A 12 0 NA
4 A 18 1 1
5 A 24 0 6
6 A 30 0 12
7 A 36 0 18