NumPy busday_count not considering holidays


I have a dataset and need to count the days from a given date to today, excluding a given list of holidays. Weekends should be counted as regular days.

Data sample: [image]

This is the code I tried:

import pandas as pd
import numpy as np
from datetime import date
df = pd.read_excel('C:\\sample.xlsx')

# Get today's date
df["today"] = date.today()
# Convert both columns to day-precision datetime64
start = df["R_REL_DATE"].values.astype('datetime64[D]')
end = df["today"].values.astype('datetime64[D]')
holiday = ['2021-06-19', '2021-06-20']
# Count the days between start and end
days = np.busday_count(start, end, weekmask='1111111', holidays=holiday)
# Add the result as a new column
df["Days"] = days
df

When I run this code, it gives the difference between R_REL_DATE and today, but it doesn't subtract the given holidays. [image]

Please help, I want the given list of holidays deducted from the days.


Solution

  • Convert both today and R_REL_DATE to pandas datetime with pd.to_datetime() before casting them to datetime64[D]:

    import pandas as pd
    import numpy as np
    import datetime
    df = pd.DataFrame({'R_REL_DATE': {0: '7/23/2020', 1: '8/26/2020'},
     'DAYS IN QUEUE': {0: 338, 1: 304}})
    df["today"] = pd.to_datetime(datetime.date.today())
    df["R_REL_DATE"] = pd.to_datetime(df["R_REL_DATE"])
    start = df["R_REL_DATE"].values.astype('datetime64[D]')
    end = df["today"].values.astype('datetime64[D]')
    holiday = ['2021-06-19', '2021-06-20']
    # busday_count counts from start up to, but not including, end
    days = np.busday_count(start, end, weekmask='1111111', holidays=holiday)
    # Subtract 1 so the start date itself is not counted either
    df["Days"] = days - 1
    df
    Out[1]: 
      R_REL_DATE  DAYS IN QUEUE      today  Days
    0 2020-07-23            338 2021-06-27   336
    1 2020-08-26            304 2021-06-27   302
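
To check that the holidays really are being deducted, it can help to run the count with and without the holidays argument on fixed dates. The following is only a small sketch: it reuses the two sample dates from above and pins the end date to 2021-06-27 (the value of today in the output shown) so the numbers are reproducible. The variable names without_holidays and with_holidays are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample start dates from the answer, parsed to day-precision datetime64
start = pd.to_datetime(pd.Series(["7/23/2020", "8/26/2020"])).values.astype("datetime64[D]")
# Fixed end date instead of date.today(), so the result does not change over time
end = np.datetime64("2021-06-27")
holiday = ["2021-06-19", "2021-06-20"]

# weekmask='1111111' treats all seven days of the week as valid days
without_holidays = np.busday_count(start, end, weekmask="1111111")
with_holidays = np.busday_count(start, end, weekmask="1111111", holidays=holiday)

print(without_holidays)  # [339 305]
print(with_holidays)     # [337 303]
```

Both holidays fall between the start and end dates, so each count drops by 2. A holiday outside that range would leave the count unchanged.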