I created an custom function to add working days to a date. The function depends on the following packages:
library(tidyverse)
library(lubridate)
library(tidyquant)
This is the function I created:
add_workingdays <- function(start_date, number_of_days, switch_count_weekendsholidays = TRUE, remove_weekends = TRUE, holidays = NULL){
start_date <- start_date %>% as.Date()
if (!is.Date(start_date)) stop("add_workingdays(): start_date must be a date.", call. = FALSE)
target_date <- start_date + number_of_days
if(switch_count_weekendsholidays){
target_date_lenght <- tidyquant::WORKDAY_SEQUENCE(start_date, target_date, remove_weekends, holidays = holidays) %>% length()
while(target_date_lenght != number_of_days) {
target_date <- target_date + 1
target_date_lenght <- tidyquant::WORKDAY_SEQUENCE(start_date, target_date, remove_weekends, holidays = holidays) %>% length()
}
}
target_date %>% return()
}
When I run the function in the following scenario, it works without problems.
add_workingdays(start_date = '2022-04-08' %>% as.Date(), number_of_days = 5)
[1] "2022-04-14"
'2022-04-08' %>% as.Date() %>% add_workingdays(number_of_days = 5)
[1] "2022-04-14"
But when I try to use it within a mutate
function in a tibble, I get error messages I do not understand.
I use the following code and it gives the error at the end:
tibble(
+ dates = rep('2022-04-08' %>% as.Date()), #) seq.Date(from = '2022-04-08' %>% as.Date(), by = 'days', length.out = 5),
+ days_to_add = rep(10:5)
+ ) %>%
+ print() %>%
+ mutate(
+ target_date = add_workingdays(start_date = dates, number_of_days = days_to_add)
+ )
# A tibble: 6 x 2
dates days_to_add
<date> <int>
1 2022-04-08 10
2 2022-04-08 9
3 2022-04-08 8
4 2022-04-08 7
5 2022-04-08 6
6 2022-04-08 5
Error in `mutate()`:
! Problem while computing `target_date =
add_workingdays(start_date = dates, number_of_days =
days_to_add)`.
Caused by error in `seq.Date()`:
! 'from' must be of length 1
Run `rlang::last_error()` to see where the error occurred.
Can anyone explain to me what I do wrong when using this custom function within a mutate function?
The error has nothing to do with mutate
. It comes from the improper input for the argument in the tidyquant::WORKDAY_SEQUENCE
function. If you check this function by printing it to the console, you get this:
tidyquant::WORKDAY_SEQUENCE #with no parens
#function (start_date, end_date, remove_weekends = TRUE, holidays = NULL)
#{
# day_sequence <- DATE_SEQUENCE(start_date, end_date, by = "day")
# ret_tbl <- tibble::tibble(day_sequence = day_sequence) %>%
# ...#and more
which shows that this function uses DATE_SEQUENCE
function. Likewise, if you check this function, you'll find that it uses seq.Date
function, which requires a single date for the from
argument. For example:
seq.Date(from = as.Date("2020-01-01"), to = as.Date("2020-01-03"), by = 'days')
#[1] "2020-01-01" "2020-01-02" "2020-01-03"
WORKDAY_SEQUENCE("2020-01-01", "2020-01-03")
#[1] "2020-01-01" "2020-01-02" "2020-01-03"
If you entry more than one date data to from
, you get the same error:
seq.Date(from = c(as.Date("2020-01-01"), as.Date("2020-01-02")), to = as.Date("2020-01-03"), by = 'days')
#Error in seq.Date(c(as.Date("2020-01-01"), as.Date("2020-01-02")), #as.Date("2020-01-03"), :
# 'from' must be of length 1
WORKDAY_SEQUENCE(c("2020-01-01", "2020-01-02"), "2020-01-03")
#Error in seq.Date(from = AS_DATE(start_date), to = AS_DATE(end_date), :
# 'from' must be of length 1
To avoid this error, when you want to entry a column for start_date
, you have to vectorize the WORKDAY_SEQUENCE
function, which means you apply the function to each date in the column. As @danloo shows, you can use Vectorize
. You can also use *apply
family.
Case 1 : multiple dates for start_date
and a single date forend_date
.
In this case, you can use lapply
or sapply
.
lapply(c("2020-01-01", "2020-01-02"), WORKDAY_SEQUENCE, end_date = "2020-01-03")
#[[1]]
#[1] "2020-01-01" "2020-01-02" "2020-01-03"
#[[2]]
#[1] "2020-01-02" "2020-01-03"
You also need to vectorize length()
function. Because lapply
returns a list, if you need to return a vector so that it can be used to create a column in a data frame, you can use unlist()
:
c("2020-01-01", "2020-01-02") %>%
lapply(WORKDAY_SEQUENCE, end_date = "2020-01-03") %>%
lapply(length) %>%
unlist()
#[1] 3 2
Hence, if you'd like to follow the lapply
method above, the relevant line in your function should be revised from
target_date_lenght <- tidyquant::WORKDAY_SEQUENCE(start_date,
target_date, remove_weekends,
holidays = holidays) %>%
length()
to
target_date_length <- start_date %>%
lapply(tidyquant::WORKDAY_SEQUENCE,
end_date = target_date,
remove_weekends = remove_weekends,
holidays = holidays) %>%
lapply(length) %>%
unlist()
Case 2 : multiple dates for both of start_date
and end_date
In this case, you can use Map
or mapply
target_date_length <- mapply(tidyquant::WORKDAY_SEQUENCE,
start_date = start_date,
end_date = end_date,
remove_weekends = remove_weekends,
holidays = holidays)%>%
lapply(length) %>%
unlist()