I have a dataframe with the first three columns 'Person', 'NoShow', and 'date'. I want to create the two additional columns 'prior_noshow' and 'prior_noshow_f', but I cannot figure out how to do this using the lag function.
Person NoShow date prior_noshow prior_noshow_f
1 123 0 2019-01-01 NA NA
2 123 1 2019-04-01 0 0
3 123 0 2020-01-01 1 1
4 123 1 2021-01-01 1 1
5 123 0 2021-04-30 1 2
6 123 0 2022-01-05 1 2
7 334 0 2019-07-07 NA NA
8 334 1 2019-10-11 0 0
9 334 1 2020-07-07 1 1
10 334 0 2021-01-04 1 2
Column 'prior_noshow' should show whether a person missed a previous appointment. So, person 123 had their first noshow on 2019-04-01, and I want every following row for person 123 to have prior_noshow==1.
Column 'prior_noshow_f" should show the frequency of previously missed appointments. So, person 123 should have prior_noshow_f==1 for dates 2020-01-01 and 2021-01-01 and prior_noshow_f==2 for dates 2021-04-30 and 2022-01-05.
Code to create the dataset:
df <- data.frame(Person = c('123','123','123','123','123','123','334','334','334','334'),
NoShow = c(0,1,0,1,0,0,0,1,1,0),
date = c('2019-01-01','2019-04-01','2020-01-01','2021-01-01','2021-04-30','2022-01-05','2019-07-07','2019-10-11','2020-07-07','2021-01-04')
)
I've tried different approaches I saw in previous discussions here regarding the lag-function, but I haven't able to figure this out.
Any help would be greatly appreciated!
You need to lag
the cumsum
of NoShow
to get prior_noshow_f
, and prior_noshow
is just as.numeric(prior_noshow_f > 0)
, so you could do:
library(tidyverse)
df %>%
group_by(Person) %>%
mutate(prior_noshow = lag(as.numeric(cumsum(NoShow) > 0)),
prior_noshow_f = lag(cumsum(NoShow)))
#> # A tibble: 10 x 5
#> # Groups: Person [2]
#> Person NoShow date prior_noshow prior_noshow_f
#> <int> <int> <chr> <dbl> <int>
#> 1 123 0 2019-01-01 NA NA
#> 2 123 1 2019-04-01 0 0
#> 3 123 0 2020-01-01 1 1
#> 4 123 1 2021-01-01 1 1
#> 5 123 0 2021-04-30 1 2
#> 6 123 0 2022-01-05 1 2
#> 7 334 0 2019-07-07 NA NA
#> 8 334 1 2019-10-11 0 0
#> 9 334 1 2020-07-07 1 1
#> 10 334 0 2021-01-04 1 2
Created on 2022-08-22 with reprex v2.0.2