I have longitudinal patient data in R. I would like to subset the patients in the `patid` column that meet this condition: three or more observations within a one-year period, where one year means any 12-month window rather than a calendar year.
Table 1 (dates are in dd/mm/yyyy format):
patid | observation_date |
---|---|
1 | 07/07/2016 |
1 | 07/08/2016 |
1 | 07/11/2016 |
1 | 07/07/2019 |
2 | 07/05/2015 |
2 | 02/12/2016 |
3 | 07/05/2015 |
3 | 07/06/2015 |
3 | 16/06/2015 |
4 | 07/05/2015 |
4 | 02/12/2016 |
4 | 18/12/2016 |
4 | 15/01/2017 |
library(tidyverse)
library(lubridate)

# Read the example data; observation_date is read as character (dd/mm/yyyy)
df <- read_table(
  "patid observation_date
1 07/07/2016
1 07/08/2016
1 07/11/2016
1 07/07/2019
2 07/05/2015
2 02/12/2016
3 07/05/2015
3 07/06/2015
3 16/06/2015
4 07/05/2015
4 02/12/2016
4 18/12/2016
4 15/01/2017"
) %>%
  # Parse the day/month/year strings into Date objects
  mutate(observation_date = as.Date(observation_date, format = "%d/%m/%Y"))
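# Count observations per patient per calendar year and keep counts of 3 or more.
# Note: this groups by calendar year, not by a rolling 12-month window, so a
# cluster that straddles a year boundary (patient 4: 02/12/2016 to 15/01/2017) is missed.
df %>%
  count(patid, year = year(observation_date)) %>%
  filter(n >= 3)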
# A tibble: 2 x 3
  patid  year     n
  <dbl> <dbl> <int>
1     1  2016     3
2     3  2015     3
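The count above works per calendar year, so it does not quite match the "any 12 months" condition in the question. One possible way to check a rolling window is sketched below: sort each patient's dates, then test whether the observation two rows later falls within 12 months of the current one. The column name `third_date` and the object name `qualifying` are illustrative, and treating a gap of exactly 12 months as "within one year" is an assumption that may need adjusting.

```r
library(dplyr)
library(lubridate)

# Same example data as in the question, rebuilt here so the snippet is self-contained
df <- data.frame(
  patid = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
  observation_date = as.Date(
    c("07/07/2016", "07/08/2016", "07/11/2016", "07/07/2019",
      "07/05/2015", "02/12/2016",
      "07/05/2015", "07/06/2015", "16/06/2015",
      "07/05/2015", "02/12/2016", "18/12/2016", "15/01/2017"),
    format = "%d/%m/%Y"
  )
)

qualifying <- df %>%
  arrange(patid, observation_date) %>%
  group_by(patid) %>%
  # Date of the observation two rows later for the same patient (NA if there is none)
  mutate(third_date = lead(observation_date, 2)) %>%
  # Keep rows that start a run of 3 observations spanning at most 12 months
  # (assumption: a gap of exactly 12 months still counts)
  filter(!is.na(third_date),
         third_date <= (observation_date %m+% years(1))) %>%
  ungroup() %>%
  distinct(patid)

# All rows for the patients that satisfy the condition
result <- df %>% semi_join(qualifying, by = "patid")
```

On the example data this should keep patients 1, 3 and 4; patient 4 qualifies because of the three observations between 02/12/2016 and 15/01/2017, which the calendar-year count does not pick up.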