r date datetime weekend

R: Difference between two dates excluding weekends and given list of holidays specific to state


This extends the question from this post:

Difference between two dates excluding weekends and given list of holidays in R

What if I have holidays that are specific to states? How do I incorporate holidays by state?

holiday <- data.frame(h = as.Date(c("2016/05/3", "2016/05/3"),'%Y/%m/%d'), 
                      s = c('state1','state2'),
                      stringsAsFactors=FALSE
                      )

d <- data.frame(d1 = as.Date(c('2016/05/2','2016/05/2'),'%Y/%m/%d'),
                d2= as.Date(c('2016/05/10','2016/05/10'),'%Y/%m/%d'),
                s = c('state1','state3'),
                stringsAsFactors=FALSE)

produces

> d
          d1         d2      s
1 2016-05-02 2016-05-10 state1
2 2016-05-02 2016-05-10 state3
> holiday
           h      s
1 2016-05-03 state1
2 2016-05-03 state2

The solution offered in the linked post excludes a single list of holiday dates, but it doesn't take the state into account. How can I incorporate the state?

The desired output is an extra column in d that counts the days between d1 and d2, excluding weekends and the holidays specific to each row's state.


Solution

  • Here is an inelegant solution to my own question. I hope you find it useful.

    library('dplyr')
    
    holiday <- data.frame(h = as.Date(c("2016/05/3", "2016/05/3"),'%Y/%m/%d'), 
                          s = c('state1','state2'),
                          stringsAsFactors=FALSE
                          )
    
    d <- data.frame(d1 = as.Date(c('2016/05/2','2016/05/2'),'%Y/%m/%d'),
                    d2= as.Date(c('2016/05/10','2016/05/10'),'%Y/%m/%d'),
                    s = c('state1','state3'),
                    stringsAsFactors=FALSE)
    
    state <- as.character(unique(d$s))
    
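    # Count the working days after `a` up to and including `b`,
    # skipping weekends (%u is 6 or 7) and the dates in `h`
    f <- function(a, b, h) {
      days <- seq(a, b, 1)[-1]
      sum(!format(days, "%u") %in% c("6", "7") & !days %in% h)
    }
    # Vectorize over the start and end dates only; `h` is passed whole
    vf <- Vectorize(f, c("a", "b"))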
    
    # Compute the difference one state at a time, each with its own holidays
    datalist <- list()
    for (i in seq_along(state)) {
      # holidays for this state as a Date vector (empty if the state has none)
      holi <- holiday %>%
        filter(s == state[i]) %>%
        pull(h)
    
      # rows of d for this state, with the working-day count added
      dat <- d %>%
        filter(s == state[i])
      dat$diff <- with(dat, vf(d1, d2, holi))
    
      datalist[[i]] <- dat
    }
    
    # stack the per-state pieces back into one data frame
    new_df <- dplyr::bind_rows(datalist)
    

    The result counts the days excluding weekends and the holidays specific to each state:

              d1         d2      s diff
    1 2016-05-02 2016-05-10 state1    5
    2 2016-05-02 2016-05-10 state3    6
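
The same per-state lookup can probably be done without the loop. The following is only a sketch in base R, reusing the example data from the question. The names `holidays_by_state` and `count_workdays` are introduced here for illustration and are not part of the original answer. It assumes that `d2` is never earlier than `d1`.

```r
# Example data from the question
holiday <- data.frame(h = as.Date(c("2016-05-03", "2016-05-03")),
                      s = c("state1", "state2"),
                      stringsAsFactors = FALSE)

d <- data.frame(d1 = as.Date(c("2016-05-02", "2016-05-02")),
                d2 = as.Date(c("2016-05-10", "2016-05-10")),
                s  = c("state1", "state3"),
                stringsAsFactors = FALSE)

# Named list of Date vectors, one entry per state that has holidays
holidays_by_state <- split(holiday$h, holiday$s)

# Same counting rule as f() in the answer: days after `a` up to and
# including `b`, skipping weekends and the dates in `h`
count_workdays <- function(a, b, h) {
  days <- seq(a, b, by = "day")[-1]
  sum(!format(days, "%u") %in% c("6", "7") & !days %in% h)
}

# Walk the rows of d in parallel and look up each row's own holidays
d$diff <- mapply(function(a, b, state) {
  h <- holidays_by_state[[state]]
  # a state with no listed holidays gets an empty Date vector
  if (is.null(h)) h <- as.Date(character(0))
  count_workdays(a, b, h)
}, d$d1, d$d2, d$s)

print(d)
```

Because the lookup happens row by row, the rows of `d` stay in their original order, whereas the loop above returns them grouped by state.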