Hi I have this data frame which contains the start and end dates of healthcare services. It looks like this:
id <- c("A", "B", "B", "C", "D")
start_date <- c("2014-08-18", "2014-08-04", "2016-10-10", "2016-04-01", "2015-12-31")
end_date <- c("2018-09-01", "2019-09-22", "2016-10-18", "2016-05-01", "2016-04-08")
df <- data.frame(id, start_date, end_date)
id start_date end_date
1 A 2014-08-18 2018-09-01
2 B 2014-08-04 2019-09-22
3 B 2016-10-10 2016-10-18
4 C 2016-04-01 2016-05-01
5 D 2015-12-31 2016-04-08
The IDs are not unique because people may had multiple services simultaneously. These services may spanned across many years. However, I would like to calculate numbers of weeks for each service within the financial year 2016, which starts at 2016-04-01 and ends at 2017-03-31. The data frame that I would like to have is:
id start_date end_date wks_FY16
1 A 2014-08-18 2018-09-01 52
2 B 2014-08-04 2019-09-22 52
3 B 2016-10-10 2016-10-18 1
4 C 2016-04-01 2016-05-01 4
5 D 2015-12-31 2016-04-08 1
where "wks_FY16" represents financial year 2016.
I think difftime
should be useful, but I don't know how to specify the period that I need. I will be really grateful for your help on this.
We get the difference in 'weeks' using difftime
after making corrections for the '2016' year
library(dplyr)
df %>%
mutate(across(start_date:end_date, as.Date),
wks_FY16 = as.integer(difftime(pmin(end_date,
as.Date('2017-03-31')), pmax(as.Date('2016-04-01'),
start_date), units = "weeks")))