I generated a new column displaying the 7th business day of the year and month from this df:
YearOfSRC MonthNumberOfSRC
0 2022 3
1 2022 4
2 2022 5
3 2022 6
4 2021 4
... ... ... ...
20528 2022 1
20529 2022 2
20530 2022 3
20531 2022 4
20532 2022 5
With this code:
df['PredictionDate'] = (pd
.to_datetime(df[['YearOfSRC', 'MonthNumberOfSRC']]
.set_axis(['year' ,'month'], axis=1)
.assign(day=1)
)
.sub(pd.offsets.BusinessDay(1))
.add(pd.offsets.BusinessDay(7))
)
To output this dataframe (df_final
) with the new column, PredictionDate
:
YearOfSRC MonthNumberOfSRC PredictionDate
0 2022 3 2022-03-09
1 2022 4 2022-04-11
2 2022 5 2022-05-10
3 2022 6 2022-06-09
4 2021 4 2021-04-09
... ... ... ...
20528 2022 1 2022-01-11
20529 2022 2 2022-02-09
20530 2022 3 2022-03-09
20531 2022 4 2022-04-11
20532 2022 5 2022-05-10
(More details here)
However, I would like to make use of CustomBusinessDay
and Python's Holiday
package to modify the rows of PredictionDate
where a holiday in the first week would push back the 7th business day by 1 business day. I know that CustomBusinessDay
has a parameter for a holiday list so in a modular solution I would assign the list from the holiday
library to that parameter. I know I could hard-code the added business day by increasing the day by 1 for all months where there is a holiday in the first week, but I would prefer a solution that is more dynamic. I have tried this instead of the above code but I get a KeyError:
df_final['PredictionDate'] = (pd
.to_datetime(df_final[['YearOfSRC', 'MonthNumberOfSRC']]
.set_axis(['year' ,'month'], axis=1)
.assign(day=1)
)
.sub(df_final.apply(lambda x : pd.offsets.CustomBusinessDay(1, holidays = holidays.US(years = x['YearOfSRC']).keys())))
.add(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(7, holidays = holidays.US(years = x['YearOfSRC']).keys())))
)
KeyError: 'YearOfSRC'
I'm sure I am implementing pandas apply
and lambda
functions incorrectly here, but I don't know why the error would be a key error when that's clearly a column in df_final
.
Per my comment above, try this:
df_final['PredictionDate'] = (pd
.to_datetime(df_final[['YearOfSRC', 'MonthNumberOfSRC']]
.set_axis(['year' ,'month'], axis=1)
.assign(day=1)
)
.sub(df_final.apply(lambda x : pd.offsets.CustomBusinessDay(1, holidays = holidays.US(year = x['YearOfSRC']).items()), axis=1))
.add(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(7, holidays = holidays.US(year = x['YearOfSRC']).items()), axis=1))
)