Tags: r, repeat, longitudinal

How to filter repeated data based on repeated id and date


I have longitudinal patient data in R. I would like to subset the patients in the patid column that meet this condition: three or more observations within a one-year period, where one year means any 12-month window rather than a calendar year. Dates are in day/month/year format.

Table1:

patid observation_date
1 07/07/2016
1 07/08/2016
1 07/11/2016
1 07/07/2019
2 07/05/2015
2 02/12/2016
3 07/05/2015
3 07/06/2015
3 16/06/2015
4 07/05/2015
4 02/12/2016
4 18/12/2016
4 15/01/2017

Solution

    library(tidyverse)
    library(lubridate)
    
    df <- read_table(
      "patid    observation_date
    1   07/07/2016
    1   07/08/2016
    1   07/11/2016
    1   07/07/2019
    2   07/05/2015
    2   02/12/2016
    3   07/05/2015
    3   07/06/2015
    3   16/06/2015
    4   07/05/2015
    4   02/12/2016
    4   18/12/2016
    4   15/01/2017"
    ) %>%
      # parse the day/month/year strings into Date values
      mutate(observation_date = as.Date(observation_date, format = "%d/%m/%Y"))
    
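    # Count observations per patient per calendar year and keep counts of 3+.
    # Note: this bins by calendar year, not by a rolling 12-month window, so
    # three visits that span a year boundary (e.g. patid 4) are not detected.
    df %>%
      count(patid, year = year(observation_date)) %>%
      filter(n >= 3)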
    
    # A tibble: 2 x 3
      patid  year     n
      <dbl> <dbl> <int>
    1     1  2016     3
    2     3  2015     3
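
The count above groups by calendar year, so it can miss a run of three observations that crosses a year boundary (patient 4 in the sample data has 02/12/2016, 18/12/2016 and 15/01/2017). If "any 12-month period" is meant literally, one possible approach is to sort each patient's dates and check whether the observation two rows later falls within one year of the current one. This is only a sketch: the column name third_date and the result name qualifying are made up for illustration, and it assumes the window is inclusive (an observation exactly one year later still counts).

```r
library(dplyr)
library(lubridate)

# Sample data from the question, with dates parsed as day/month/year
df <- tibble(
  patid = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
  observation_date = dmy(c(
    "07/07/2016", "07/08/2016", "07/11/2016", "07/07/2019",
    "07/05/2015", "02/12/2016",
    "07/05/2015", "07/06/2015", "16/06/2015",
    "07/05/2015", "02/12/2016", "18/12/2016", "15/01/2017"
  ))
)

qualifying <- df %>%
  group_by(patid) %>%
  arrange(observation_date, .by_group = TRUE) %>%
  # third_date (hypothetical helper column): the date two observations later
  mutate(third_date = lead(observation_date, 2)) %>%
  # keep the whole patient if any run of three dates fits in 12 months
  # (assumption: the window boundary is inclusive)
  filter(any(third_date <= observation_date %m+% years(1), na.rm = TRUE)) %>%
  ungroup()

unique(qualifying$patid)
# expected for the sample data: 1 3 4
```

Because the dates are sorted within each patient, comparing each observation with the one two positions later is enough: if those two are within a year of each other, the one in between is as well. Patients with fewer than three observations only produce NA comparisons and are dropped.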