For this question you need to copy the code below to get the df. I have repeat patient data of 5 patients. Each row represents a consultation with a consultation date (consdate). The follow up time for each patient is from regstartdate till end_date. My date of interest is portal_reg_date. I would like to only keep rows that meet this criteria: (1) Keep rows where consdate occurs 1 year before portal_reg_date (2) keep rows where consdate occurs 1 year after portal_reg_date (3) among those that meet criteria (1) and (2) only keep rows that have 1 year before AND 1 year after portal_reg_date that is within regstartdate AND end_date [so only keep patids that have 1 year follow up before and after portal_reg_date
dfr <- tibble::tribble(
~patid, ~consdate, ~portal_reg_date, ~regstartdate, ~end_date,
1, "2020-02-26", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2015-06-19", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-06-28", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2020-03-02", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-04-11", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-06-02", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-05-23", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-05-21", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-06-18", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-04-08", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-06-24", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2020-10-17", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-07-03", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2011-06-02", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-02-10", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2013-12-09", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-01-31", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-09-30", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2021-05-17", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-11-04", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2011-04-27", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-01-07", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-01-31", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-03-13", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2016-12-22", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-09-01", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-07-31", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-02-03", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-01-10", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-10-02", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-10-30", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-02-24", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2016-12-21", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-02-28", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-05-20", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-04-08", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-01-12", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-06-06", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-10-31", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-03-11", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2016-12-08", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-07-13", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-05-09", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-03-20", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-10-29", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-02-27", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2015-10-05", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-05-06", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-05-09", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-09-12", "2017-06-19", "2010-10-06", "2021-08-15"
)
As an example these are the rows that would qualify:
*If consdate is on: 2020-02-26 and portal_reg_date is on: 2017-06-19 This wouldn't qualify because consdate is more than a year after portal_reg_date
*If consdate is on: 2017-04-11 and portal_reg_date is on: 2017-06-19 This would qualify because consdate is within 1 year before portal_reg_date. also regstartdate is on 2010-10-06 and end_date is on 2021-08-15 which include both the consdate and portal_reg_date.
*If consdate is on: 2018-06-18 and portal_reg_date is on: 2017-06-19 This would qualify because consdate is within 1 year after portal_reg_date also regstartdate is on 2010-10-06 and end_date is on 2021-08-15 which include both the consdate and portal_reg_date.
Here is a solution using some verbs from the tidyverse. Fewer rows are needed to reproduce the problem (here 50) and that also ease the reading of the problem.
# To export the data, use head to fetch few rows
# + converting date as character
# dfr_raw <- head(df %>% mutate(across(where(is.Date), as.character)), 50)
# datapasta::tribble_paste(dfr_raw)
dfr <- tibble::tribble(
~patid, ~consdate, ~portal_reg_date, ~regstartdate, ~end_date,
1, "2020-02-26", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2015-06-19", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-06-28", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2020-03-02", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-04-11", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-06-02", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-05-23", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-05-21", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-06-18", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-04-08", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-06-24", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2020-10-17", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-07-03", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2011-06-02", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-02-10", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2013-12-09", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-01-31", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-09-30", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2021-05-17", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-11-04", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2011-04-27", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-01-07", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-01-31", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-03-13", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2016-12-22", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-09-01", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-07-31", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-02-03", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-01-10", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-10-02", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-10-30", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-02-24", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2016-12-21", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-02-28", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-05-20", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-04-08", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-01-12", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-06-06", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-10-31", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-03-11", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2016-12-08", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2018-07-13", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2017-05-09", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-03-20", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-10-29", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-02-27", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2015-10-05", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2014-05-06", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-05-09", "2017-06-19", "2010-10-06", "2021-08-15",
1, "2019-09-12", "2017-06-19", "2010-10-06", "2021-08-15"
)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
dfr <- dfr %>% mutate_at(-1, as_date)
treshlod <- dyears(1)
dfr %>%
arrange_all() %>%
mutate(diff = consdate - portal_reg_date) %>%
filter(
consdate - treshlod < portal_reg_date & portal_reg_date < consdate + treshlod,
regstartdate < consdate & consdate < end_date
)
#> # A tibble: 15 × 6
#> patid consdate portal_reg_date regstartdate end_date diff
#> <dbl> <date> <date> <date> <date> <drtn>
#> 1 1 2016-12-08 2017-06-19 2010-10-06 2021-08-15 -193 days
#> 2 1 2016-12-21 2017-06-19 2010-10-06 2021-08-15 -180 days
#> 3 1 2016-12-22 2017-06-19 2010-10-06 2021-08-15 -179 days
#> 4 1 2017-01-10 2017-06-19 2010-10-06 2021-08-15 -160 days
#> 5 1 2017-01-31 2017-06-19 2010-10-06 2021-08-15 -139 days
#> 6 1 2017-02-10 2017-06-19 2010-10-06 2021-08-15 -129 days
#> 7 1 2017-02-24 2017-06-19 2010-10-06 2021-08-15 -115 days
#> 8 1 2017-03-13 2017-06-19 2010-10-06 2021-08-15 -98 days
#> 9 1 2017-04-11 2017-06-19 2010-10-06 2021-08-15 -69 days
#> 10 1 2017-05-09 2017-06-19 2010-10-06 2021-08-15 -41 days
#> 11 1 2017-10-30 2017-06-19 2010-10-06 2021-08-15 133 days
#> 12 1 2018-01-12 2017-06-19 2010-10-06 2021-08-15 207 days
#> 13 1 2018-05-21 2017-06-19 2010-10-06 2021-08-15 336 days
#> 14 1 2018-05-23 2017-06-19 2010-10-06 2021-08-15 338 days
#> 15 1 2018-06-18 2017-06-19 2010-10-06 2021-08-15 364 days
Created on 2022-09-28 with reprex v2.0.2