rdatetidyverseamr

How can i filter rows by one condition (id) which are repeated in two days


i have a df, which containing 4 columns:

  1. microbiological data (mo),
  2. sample_date,
  3. and patient id (pid_code),
  4. id (for each rows).

Its looks like this:

id mo sample_date pid_code
1 Staphylococcus epidermidis 2019-12-13 120
2 Staphylococcus epidermidis 2019-12-14 120
3 Staphylococcus capitis 2020-01-13 131
4 Staphylococcus hominis 2021-11-14 155
5 Staphylococcus hominis 2021-11-17 155
6 Micrococcus luteus 2023-06-06 133
7 Micrococcus luteus 2023-06-07 133

i want to filter those rows where the pid_code and mo is same, and repeated in 2 days.

So i want to get back these rows.

id mo sample_date pid_code
1 Staphylococcus epidermidis 2019-12-13 120
2 Staphylococcus epidermidis 2019-12-14 120
6 Micrococcus luteus 2023-06-06 133
7 Micrococcus luteus 2023-06-07 133

Do you have something advise? Thanks


Solution

  • Here is a dplyr solution:

    library(dplyr)
    df %>% 
      filter(any(abs(difftime(sample_date, lag(sample_date), units = "days")) <= 2), .by=c(mo, pid_code))
    
                  id          mo sample_date pid_code
    1 Staphylococcus epidermidis  2019-12-13      120
    2 Staphylococcus epidermidis  2019-12-14      120
    6    Micrococcus      luteus  2023-06-06      133
    7    Micrococcus      luteus  2023-06-07      133