I am trying to convert dates inside a dataframe to business days using the bizdays package. The dataframe may have some missing values (NA), so I added an ifelse statement to skip these empty cells, but it seems to break the code and I don't know why.
This is a small example of the error:
library(bizdays)
library(dplyr)
holidays <- c("2022-03-01",
"2022-03-07",
"2022-03-08",
"2022-03-25")
start_date = as.Date("01/01/2010", format = "%d/%m/%Y")
end_date = as.Date("01/01/2060", format = "%d/%m/%Y")
calendar <- create.calendar("my_cal",
holidays = holidays,
weekdays =c("saturday", "sunday"),
start.date = start_date,
end.date = end_date)
bizdays.options$set(default.calendar="my_cal")
date_1 <- "2022-03-13" # sunday
print(adjust.previous(date_1)) # friday "2022-03-11"
# Build "2022-03-01" ... "2022-03-31" as character strings
days <- sprintf("2022-03-%02d", 1:31)
df <- data.frame(days = days)
df_1 <- df %>% mutate(days_1 = adjust.previous(days))
head(df_1) # correct
# days days_1
#1 2022-03-01 2022-02-28
#2 2022-03-02 2022-03-02
#3 2022-03-03 2022-03-03
#4 2022-03-04 2022-03-04
#5 2022-03-05 2022-03-04
#6 2022-03-06 2022-03-04
df_2 <- df %>% mutate(days_2 = ifelse(is.na(days),
days,
adjust.previous(days)))
head(df_2) # date is converted to a number
# days days_2
#1 2022-03-01 19051
#2 2022-03-02 19053
#3 2022-03-03 19054
#4 2022-03-04 19055
#5 2022-03-05 19055
#6 2022-03-06 19055
This has nothing to do with the bizdays package. It happens because ifelse() does not keep the class of its yes/no arguments, so objects of class Date come back as numeric. See this example:
class(Sys.Date()) # Date
ifelse(TRUE, Sys.Date(), Sys.Date()) # 19066
class(ifelse(TRUE, Sys.Date(), Sys.Date())) # numeric
Conversely:
if(TRUE) class(Sys.Date()) # Date
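If the numbers have already been produced, they can be turned back into dates, since they are just days since 1970-01-01. A minimal base R sketch (the vector `d` is made-up example data, not taken from your dataframe):

```r
# Example dates, one of them missing
d <- as.Date(c("2022-03-01", NA, "2022-03-05"))

# ifelse() drops the Date class and returns days since 1970-01-01
res <- ifelse(is.na(d), NA, d)
class(res)       # "numeric"

# Convert the numbers back to Date
restored <- as.Date(res, origin = "1970-01-01")
class(restored)  # "Date"
```

This only repairs the symptom, so it is mainly useful for seeing what ifelse() did to the column.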
In your case, it seems to me that the ifelse() is unnecessary, as adjust.previous handles NA values:
df$days[1] = NA
df_2 <- df %>% mutate(
days_2 = adjust.previous(days)
)
# Seems to work
head(df_2)
# days days_2
# 1 <NA> <NA>
# 2 2022-03-02 2022-03-02
# 3 2022-03-03 2022-03-03
# 4 2022-03-04 2022-03-04
# 5 2022-03-05 2022-03-04
# 6 2022-03-06 2022-03-04
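If you do want to keep a conditional inside mutate(), dplyr::if_else() is a stricter alternative to ifelse() that keeps the Date class. A rough sketch, where `previous_day()` is a hypothetical stand-in for adjust.previous() (it just subtracts one day) and `df_demo` is made-up data, so that the example runs without a calendar:

```r
library(dplyr)

# Made-up data: a Date column with one missing value
df_demo <- data.frame(days = as.Date(c(NA, "2022-03-05", "2022-03-06")))

# Hypothetical stand-in for adjust.previous(): steps back one day
previous_day <- function(x) x - 1

# if_else() needs both branches to have the same type, hence as.Date(NA)
df_demo <- df_demo %>%
  mutate(days_2 = if_else(is.na(days), as.Date(NA), previous_day(days)))

class(df_demo$days_2)  # "Date"
```

Note that both branches must be dates here, so a character column would need converting with as.Date() first.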
However, if this isn't working for your real data, I would just leave the dplyr world, which is great but slightly weaker when subsetting columns, and do it in base R:
df_3 <- df
df_3$days_3 <- as.Date(0, origin = "1970-01-01") # Create date column
df_3$days_3[is.na(df_3$days)] <- NA # Fill NA
df_3$days_3[!is.na(df_3$days)] <- adjust.previous(df_3$days[!is.na(df_3$days)]) # Fill values
# Output as above
head(df_3)
# days days_3
# 1 <NA> <NA>
# 2 2022-03-02 2022-03-02
# 3 2022-03-03 2022-03-03
# 4 2022-03-04 2022-03-04
# 5 2022-03-05 2022-03-04
# 6 2022-03-06 2022-03-04
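The same base R pattern can be wrapped in a small function if you need it for several columns. This is only a sketch: `apply_non_na_dates()` is a hypothetical helper name, and the anonymous function below is a stand-in for adjust.previous() so the example runs on its own:

```r
# Hypothetical helper: apply `f` to the non-missing elements of `x`
# and return a Date vector that is NA wherever `x` was NA.
apply_non_na_dates <- function(x, f) {
  out <- rep(as.Date(NA), length(x))  # all-NA Date vector
  keep <- !is.na(x)
  out[keep] <- f(x[keep])             # fill only the non-missing positions
  out
}

days_demo <- c(NA, "2022-03-05", "2022-03-06")

# Stand-in for adjust.previous(): parse the string and step back one day
res_demo <- apply_non_na_dates(days_demo, function(d) as.Date(d) - 1)
class(res_demo)  # "Date"
```

With the calendar from the question set up, passing adjust.previous as `f` should give the same result as the three assignment lines above.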