python dictionary yahoo-finance

Historical stock prices ten days before holidays for the past twenty years


I'm still a beginner, but I have been enthusiastically learning Python for a while, and here is a project I'm working on. I need to collect historical stock prices for the ten days before each US public holiday over the past twenty years. Here is what I have so far (using pandas_datareader and holidays):

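import datetime
import holidays
import pandas as pd
import pandas_datareader.data as web
start = datetime.datetime(1995, 1, 1)
end = datetime.datetime(2015, 12, 31)
history_price = web.get_data_yahoo('SPY', start, end)
us_holidays = holidays.UnitedStates()
dates = pd.date_range(start, end)  # assumption: `dates` was not defined in the original snippet
test = []
for i in dates:
    if i in us_holidays:
        # .ix has been removed from pandas; .reindex keeps NaN for days with no price
        window = pd.date_range(end=i, periods=11, freq='B')
        test.append(history_price['Adj Close'].reindex(window))
test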

And the result is like this:

Freq: B, Name: Adj Close, dtype: float64, 1995-02-06    32.707565
 1995-02-07    32.749946
 1995-02-08    32.749946
 1995-02-09    32.749946
 1995-02-10    32.792328
 1995-02-13    32.802975
 1995-02-14    32.845356
 1995-02-15    33.025457
 1995-02-16    32.983076
 1995-02-17    32.855933
 1995-02-20          NaN

The list "test" has 233 elements. My question is: how can I convert this list into a dictionary with the holidays as the keys and the stock prices as the values under each key?

Thank you in advance for your guidance.


Solution

  • This uses a dictionary comprehension and a list comprehension to generate the ten U.S. workdays preceding each holiday. The stock prices for those days are then stored in a dictionary (keyed on holiday) as an array of prices, most recent first (h-1) and oldest last (h-10).

    from pandas.tseries.holiday import USFederalHolidayCalendar
    from pandas.tseries.offsets import CustomBusinessDay
    
    start = '1995-01-01'
    end = '2015-12-31'
    days = 10
    
    # Business-day offset that skips weekends and U.S. federal holidays.
    bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())
    holidays = USFederalHolidayCalendar().holidays(start=start, end=end)
    
    # Map each holiday to the ten preceding U.S. workdays, most recent first.
    dates = {holiday: [holiday - bday_us * n for n in range(1, days + 1)]
             for holiday in holidays}
    
    >>> dates
    {...
    Timestamp('2015-12-25 00:00:00'): [
        Timestamp('2015-12-24 00:00:00'),
        Timestamp('2015-12-23 00:00:00'),
        Timestamp('2015-12-22 00:00:00'),
        Timestamp('2015-12-21 00:00:00'),
        Timestamp('2015-12-18 00:00:00'),
        Timestamp('2015-12-17 00:00:00'),
        Timestamp('2015-12-16 00:00:00'),
        Timestamp('2015-12-15 00:00:00'),
        Timestamp('2015-12-14 00:00:00'),
        Timestamp('2015-12-11 00:00:00')]}
    
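    # .ix has been removed from pandas; .reindex gives NaN for any date missing from the price index.
    result = {holiday: history_price['Adj Close'].reindex(dates[holiday]).values for holiday in dates}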
    
    >>> result
    {...
     Timestamp('2015-12-25 00:00:00'): 
       array([ 203.56598 ,  203.902497,  201.408393,  199.597201,  197.964166,
               201.55487 ,  204.673725,  201.722125,  199.626485,  198.622952])}
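
The approach above can also be tried without downloading anything. The following is only a minimal sketch under a few assumptions: the prices are synthetic (one made-up value per weekday of 2015, standing in for history_price['Adj Close']), the helper name prices_before_holidays is hypothetical, and each value is kept as a Series rather than an array so the dates stay attached to the prices.

```python
import numpy as np
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

start, end, days = '2015-01-01', '2015-12-31', 10

us_calendar = USFederalHolidayCalendar()
# Business-day offset that skips weekends and U.S. federal holidays.
bday_us = CustomBusinessDay(calendar=us_calendar)

# Synthetic stand-in for history_price['Adj Close']: one made-up price per
# weekday. Real market data would replace this series.
index = pd.bdate_range(start, end)
prices = pd.Series(np.linspace(100.0, 120.0, len(index)),
                   index=index, name='Adj Close')


def prices_before_holidays(prices, start, end, days=10):
    """Return {holiday: Series of the `days` workdays before it, most recent first}."""
    result = {}
    for holiday in us_calendar.holidays(start=start, end=end):
        window = [holiday - bday_us * n for n in range(1, days + 1)]
        # reindex yields NaN for dates absent from the price index
        # instead of raising a KeyError.
        result[holiday] = prices.reindex(window)
    return result


result = prices_before_holidays(prices, start, end, days)
christmas = result[pd.Timestamp('2015-12-25')]
print(christmas)
```

Because the lookup uses reindex, a workday with no price shows up as NaN rather than raising an error. In this sketch that happens for New Year's Day 2015, whose ten preceding workdays fall in 2014, outside the synthetic index. With real data the same thing would presumably happen on days when the exchange is closed but that are not federal holidays.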