
Why does generating a list of holidays for the year 2005 blow up this function?


I am writing a Python script using datetime, holidays and dateutil to determine whether a given date in YYYY-MM-DD format is a trading holiday. I'm using a list comprehension to remove holidays on which the market stays open from the default list of holidays provided by the holidays library:

import datetime, holidays
import dateutil.easter as easter

def to_date(date_string):
    return datetime.datetime.strptime(date_string,'%Y-%m-%d').date()

def is_trading_holiday(date):
    us_holidays = holidays.UnitedStates(years=date.year)
    # generate list without columbus day and veterans day since markets are open on those days
    trading_holidays = [ "Columbus Day", "Columbus Day (Observed)", "Veterans Day", "Veterans Day (Observed)"]
    custom_holidays = [ date for date in us_holidays if us_holidays[date] not in trading_holidays ]
    # add good friday to list since markets are closed on good friday
    custom_holidays.append(easter.easter(year=date.year) - datetime.timedelta(days=2))

    return date in custom_holidays

if __name__=="__main__":
    first_date = to_date('2020-01-03')
    second_date = to_date('2015-11-26') # Thanksgiving
    third_date = to_date('2005-01-01') # New Years
    fourth_date = to_date('2005-01-07')

    print(is_trading_holiday(first_date))
    print(is_trading_holiday(second_date))
    print(is_trading_holiday(third_date))
    print(is_trading_holiday(fourth_date))

I've tested this for a variety of dates and it seems to work in all cases but one. When I use dates from the year 2005, the function blows up with:

Traceback (most recent call last):
  File "./test.py", line 26, in <module>
    print(is_trading_holiday(third_date))
  File "./test.py", line 11, in is_trading_holiday
    custom_holidays = [ date for date in us_holidays if us_holidays[date] not in trading_holidays ]
  File "./test.py", line 11, in <listcomp>
    custom_holidays = [ date for date in us_holidays if us_holidays[date] not in trading_holidays ]
RuntimeError: dictionary changed size during iteration

I have no idea what is special about 2005 that makes this function blow up, or even whether the year is what is causing the problem (I have tested dates going back to the seventies, and they work). I am not modifying the dictionary I am iterating over in the list comprehension (or at least I don't think I am), so I'm not sure what this error is trying to tell me.

Anyone know what is going on here? Am I missing something obvious?


Solution

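  • There seems to be a bug (or special case) in the UnitedStates class: for 2005 it generates the entry datetime.date(2004, 12, 31): "New Year's Day (Observed)". January 1, 2005 fell on a Saturday, so the observed holiday lands on the preceding Friday, which belongs to 2004. When your list comprehension reaches that key, the lookup us_holidays[date] references a different year (one that has not been loaded yet), and the library alters the dictionary you are traversing. That is what triggers the RuntimeError.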

    You can work around that problem by iterating over the items rather than re-accessing the dictionary with the keys:

    ... for date, name in us_holidays.items() if name not in trading_holidays]
    

    Alternatively, you could copy the keys into a list first so that the iteration doesn't run over the live dictionary:

    ... for date in list(us_holidays) if us_holidays[date] not in trading_holidays]
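
To see the mechanism in isolation, here is a minimal sketch that uses only the standard library. LazyYearCalendar is a hypothetical stand-in, not the real holidays API: it simply assumes, as described above, that looking up a date from a year that has not been loaded yet adds that year's entries to the dictionary. The helper function names are also made up for illustration.

```python
import datetime


class LazyYearCalendar(dict):
    """Hypothetical stand-in for a holiday calendar (not the holidays API).

    Assumption: looking up a date from a year that is not loaded yet
    makes the calendar add that year's entries to itself.
    """

    def __init__(self, year):
        super().__init__()
        self.loaded_years = set()
        self._load(year)

    def _load(self, year):
        self.loaded_years.add(year)
        new_year = datetime.date(year, 1, 1)
        self[new_year] = "New Year's Day"
        if new_year.weekday() == 5:
            # Saturday: the observed day is the Friday before,
            # which falls in the previous calendar year.
            observed = new_year - datetime.timedelta(days=1)
            self[observed] = "New Year's Day (Observed)"
        self[datetime.date(year, 7, 4)] = "Independence Day"

    def __getitem__(self, key):
        if key.year not in self.loaded_years:
            self._load(key.year)  # grows the dict as a side effect
        return dict.__getitem__(self, key)


def dates_by_key_lookup(cal, excluded):
    # Re-accesses the dict by key while iterating over it (the failing pattern).
    return [d for d in cal if cal[d] not in excluded]


def dates_by_items(cal, excluded):
    # Iterates over (date, name) pairs, so no lazy lookup is triggered.
    return [d for d, name in cal.items() if name not in excluded]


def dates_by_snapshot(cal, excluded):
    # Iterates over a copy of the keys; the dict may grow safely meanwhile.
    return [d for d in list(cal) if cal[d] not in excluded]


if __name__ == "__main__":
    excluded = {"Independence Day"}
    try:
        dates_by_key_lookup(LazyYearCalendar(2005), excluded)
    except RuntimeError as exc:
        print("key lookup failed:", exc)
    print(dates_by_items(LazyYearCalendar(2005), excluded))
    print(dates_by_snapshot(LazyYearCalendar(2005), excluded))
```

In this sketch, a year such as 2006 works with all three helpers because its January 1 is not a Saturday, so no key from the previous year ever appears in the dictionary. Note that the snapshot variant still triggers the lazy load (the dictionary ends up holding the 2004 entries too), while the items() variant never touches another year.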