I have longitudinal patient data in R. I would like to subset the patients in the patid column based on this condition: three or more occurrences within a one-year period (one year = any 12-month period). Here is the code to reproduce the table:
df <- structure(list(patid = c("1", "1", "1", "1", "2", "2", "3", "3",
"3", "4", "4", "4", "4"), observation_date = c("07/07/2016",
"07/08/2016", "07/11/2016", "07/07/2019", "07/05/2015", "02/12/2016",
"07/05/2015", "07/06/2015", "16/06/2015", "07/05/2015", "02/12/2016",
"18/12/2016", "15/01/2017")), class = "data.frame", row.names = c(NA,
-13L))
Table 1:
patid | observation_date |
---|---|
1 | 07/07/2016 |
1 | 07/08/2016 |
1 | 07/11/2016 |
1 | 07/07/2019 |
2 | 07/05/2015 |
2 | 02/12/2016 |
3 | 07/05/2015 |
3 | 07/06/2015 |
3 | 16/06/2015 |
4 | 07/05/2015 |
4 | 02/12/2016 |
4 | 18/12/2016 |
4 | 15/01/2017 |
The expected table would look like this (a list of the patids that meet the criterion: three or more observations within any 12-month interval):
patid |
---|
1 |
3 |
4 |
The answer above was helpful. I figured I could also do this:
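library(dplyr)

# First I parse the dd/mm/yyyy strings as dates and, within each patid,
# measure the span from each observation back to the one two rows earlier.
# A span under 365 days means three observations fall within one year.
# (The first two rows per patid get NA and are dropped by filter().)
multi_pts <- df %>%
  mutate(observation_date = as.Date(observation_date, format = "%d/%m/%Y")) %>%
  arrange(patid, observation_date) %>%
  group_by(patid) %>%
  mutate(span = as.numeric(observation_date - lag(observation_date, n = 2))) %>%
  filter(span < 365) %>%
  ungroup()

# Then I keep each qualifying patid once
count_multi_pts <-
  multi_pts %>%
  distinct(patid)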
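For a check that does not depend on dplyr, here is a minimal base R sketch of the criterion from the question, run against the sample data. It assumes the dates are dd/mm/yyyy strings and treats "one year" as 365 days; the helper `has_k_within` and its arguments `k` and `window_days` are illustrative names, not part of any package. A patient qualifies when some observation falls within the window of the observation two positions earlier in that patient's sorted dates. Comparing against the immediately preceding observation is not enough, because two short consecutive gaps can still add up to more than a year.

```r
# Sample data from the question (dates are dd/mm/yyyy strings).
df <- data.frame(
  patid = c("1", "1", "1", "1", "2", "2", "3", "3", "3", "4", "4", "4", "4"),
  observation_date = c("07/07/2016", "07/08/2016", "07/11/2016", "07/07/2019",
                       "07/05/2015", "02/12/2016", "07/05/2015", "07/06/2015",
                       "16/06/2015", "07/05/2015", "02/12/2016", "18/12/2016",
                       "15/01/2017"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if any k dates fall within window_days days.
has_k_within <- function(dates, k = 3, window_days = 365) {
  d <- sort(dates)
  if (length(d) < k) return(FALSE)
  # Span in days from each date back to the date k - 1 positions earlier.
  spans <- as.numeric(diff(d, lag = k - 1))
  any(spans < window_days)
}

# Parse the dates, then apply the helper to each patient's dates.
dates <- as.Date(df$observation_date, format = "%d/%m/%Y")
qualifies <- vapply(split(dates, df$patid), has_k_within, logical(1))

result <- data.frame(patid = names(qualifies)[qualifies],
                     stringsAsFactors = FALSE)
print(result)
```

On the sample data this returns patids 1, 3 and 4, matching the expected table. By contrast, three dates such as 2020-01-01, 2020-06-01 and 2021-03-01 have consecutive gaps of 152 and 273 days but span 425 days in total, so `has_k_within` returns FALSE for them.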