pythonpandasdate

How to calculate the quantity of business days between two dates using Pandas


I created a pandas df with columns named start_date and current_date. Both columns have a dtype of datetime64[ns]. What's the best way to find the quantity of business days between the current_date and start_date column?

I've tried:

from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())

projects_df['start_date'] = pd.to_datetime(projects_df['start_date'])
projects_df['current_date'] = pd.to_datetime(projects_df['current_date'])

projects_df['days_count'] = len(pd.date_range(start=projects_df['start_date'], end=projects_df['current_date'], freq=us_bd))

I get the following error message:

Cannot convert input....start_date, dtype: datetime64[ns]] of type <class 'pandas.core.series.Series'> to Timestamp

I'm using Python version 3.10.4.


Solution

  • pd.date_range's parameters need to be datetimes, not series.
    For this reason, we can use df.apply to apply the function to each row.
    In addition, pandas has bdate_range which is just date_range with freq defaulting to business days, which is exactly what you need.
    Thanks to Felipe Whitaker for showing me np.busday_count which does this faster! Applying it on each row will give you a count of business days between two given dates (and even has customizable weekday masks) (docs)

    Using apply and a lambda function, we can create a new Series calculating business days between each start and current date for each row.

    projects_df['start_date'] = pd.to_datetime(projects_df['start_date'])
    projects_df['current_date'] = pd.to_datetime(projects_df['current_date'])
    
    projects_df['days_count'] = projects_df.apply(lambda row: len(np.busday_count(row['start_date'], row['current_date'])), axis=1)
    

    Using a random sample of 10 date pairs, my output is the following:

               start_date        current_date  bdays
    0 2022-01-03 17:08:04 2022-05-20 00:53:46    100
    1 2022-04-18 09:43:02 2022-06-10 16:56:16     40
    2 2022-09-01 12:02:34 2022-09-25 14:59:29     17
    3 2022-04-02 14:24:12 2022-04-24 21:05:55     15
    4 2022-01-31 02:15:46 2022-07-02 16:16:02    110
    5 2022-08-02 22:05:15 2022-08-17 17:25:10     12
    6 2022-03-06 05:30:20 2022-07-04 08:43:00     86
    7 2022-01-15 17:01:33 2022-08-09 21:48:41    147
    8 2022-06-04 14:47:53 2022-12-12 18:05:58    136
    9 2022-02-16 11:52:03 2022-10-18 01:30:58    175