I have a dataframe with years of data. I want to detect, assign True/False or 1/0 if it is a Holiday.
My code:
df = pd.DataFrame(index=['2004-10-01', '2004-10-02', '2004-10-03', '2004-10-04', '2004-10-05',
'2004-10-06', '2004-10-07', '2004-10-08', '2004-10-09', '2004-10-10',
'2018-07-25', '2018-07-26', '2018-07-27', '2018-07-28', '2018-07-29',
'2018-07-30', '2018-07-31', '2018-08-01', '2018-08-02', '2018-08-03'])
import holidays
# Detect US holidays
hldys = holidays.country_holidays('US')
# I have to apply this to each date in the dataframe index
df['isHoliday?'] = df.index.map(lambda x: hldy.get(x))
Present output:
isHoliday?
2004-10-01 None
2004-10-02 None
2004-10-03 None
2004-10-04 None
2004-10-05 None
2004-10-06 None
2004-10-07 None
2004-10-08 None
2004-10-09 None
2004-10-10 None
2018-07-25 None
2018-07-26 None
2018-07-27 None
2018-07-28 None
2018-07-29 None
2018-07-30 None
2018-07-31 None
2018-08-01 None
2018-08-02 None
Update I found the solution
us_hldys = holidays.country_holidays('US')
df['isHoliday?'] = df.index.to_series().apply(lambda x: x in us_hldys)
isHoliday?
2004-10-01 False
2004-10-02 False
2004-10-03 False
2004-10-04 False
2004-10-05 False
2004-10-06 False
2004-10-07 False
2004-10-08 False
2004-10-09 False
2004-10-10 False
2018-07-25 False
2018-07-26 False
2018-07-27 False
2018-07-28 False
2018-07-29 False
2018-07-30 False
2018-07-31 False
2018-08-01 False
2018-08-02 False
2018-08-03 False
A different strategy is to pre-calculate holidays to avoid apply
:
df.index = pd.to_datetime(df.index)
hol = pd.Series(holidays.country_holidays('US', years=range(df.index.min().year,
df.index.max().year+1)))
df['isHoliday?'] = df.index.isin(hol)
Output:
>>> df
isHoliday?
2004-10-01 False
2004-10-02 False
2004-10-03 False
2004-10-04 False
2004-10-05 False
2004-10-06 False
2004-10-07 False
2004-10-08 False
2004-10-09 False
2004-10-10 False
2018-07-25 False
2018-07-26 False
2018-07-27 False
2018-07-28 False
2018-07-29 False
2018-07-30 False
2018-07-31 False
2018-08-01 False
2018-08-02 False
2018-08-03 False
>>> hol.sort_index()
2004-01-01 New Year's Day
2004-01-19 Martin Luther King Jr. Day
2004-02-16 Washington's Birthday
2004-05-31 Memorial Day
2004-07-04 Independence Day
...
2018-10-08 Columbus Day
2018-11-11 Veterans Day
2018-11-12 Veterans Day (Observed)
2018-11-22 Thanksgiving
2018-12-25 Christmas Day
Length: 169, dtype: object