I'm still a noob, but I have been enthusiastically learning Python for a while, and here's a project I'm working on. I need to collect historical stock prices for the ten days before each US public holiday over the past twenty years. Here's what I've done so far (using pandas_datareader and holidays):
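import datetime
import holidays
import pandas as pd
import pandas_datareader.data as web

start = datetime.datetime(1995, 1, 1)
end = datetime.datetime(2015, 12, 31)
history_price = web.get_data_yahoo('SPY', start, end)
us_holidays = holidays.UnitedStates()
test = []
for i in dates:
    if i in us_holidays:
        # Adjusted close for the 11 business days ending on the holiday itself
        test.append(history_price['Adj Close'].ix[pd.date_range(end=i, periods=11, freq='B')])
test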
And the result is like this:
Freq: B, Name: Adj Close, dtype: float64, 1995-02-06 32.707565
1995-02-07 32.749946
1995-02-08 32.749946
1995-02-09 32.749946
1995-02-10 32.792328
1995-02-13 32.802975
1995-02-14 32.845356
1995-02-15 33.025457
1995-02-16 32.983076
1995-02-17 32.855933
1995-02-20 NaN
The length of the list "test" is 233. My question is: how can I convert this list into a dictionary with the holidays as the keys and the stock prices as the values under each key?
Thank you in advance for your guidance.
This uses a dictionary comprehension and a list comprehension to generate the ten U.S. business days preceding each holiday. The stock prices for those days are then stored in a dictionary (keyed on holiday) as an array of prices, most recent first (h-1) and oldest last (h-10).
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
start = '1995-01-01'
end = '2015-12-31'
days = 10
calendar = USFederalHolidayCalendar()
# Business-day offset that skips weekends and U.S. federal holidays
bday_us = CustomBusinessDay(calendar=calendar)
# For each holiday, list the `days` business days before it, most recent first
dates = {holiday: [holiday - bday_us * n for n in range(1, days + 1)]
         for holiday in calendar.holidays(start=start, end=end)}
>>> dates
{...
Timestamp('2015-12-25 00:00:00'): [
Timestamp('2015-12-24 00:00:00'),
Timestamp('2015-12-23 00:00:00'),
Timestamp('2015-12-22 00:00:00'),
Timestamp('2015-12-21 00:00:00'),
Timestamp('2015-12-18 00:00:00'),
Timestamp('2015-12-17 00:00:00'),
Timestamp('2015-12-16 00:00:00'),
Timestamp('2015-12-15 00:00:00'),
Timestamp('2015-12-14 00:00:00'),
Timestamp('2015-12-11 00:00:00')]}
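To see why the custom offset matters, here is a minimal sketch comparing a plain business-day offset with `bday_us`. The variable names `monday`, `plain_prev`, and `us_prev` are made up for illustration; the date is chosen because 2015-12-25 fell on a Friday.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import BDay, CustomBusinessDay

bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# Monday after Christmas 2015 (illustrative date)
monday = pd.Timestamp('2015-12-28')

# A plain business day only skips weekends, so it lands on the holiday itself
plain_prev = monday - BDay(1)   # Timestamp('2015-12-25 00:00:00')

# The custom offset also skips federal holidays
us_prev = monday - bday_us      # Timestamp('2015-12-24 00:00:00')

print(plain_prev, us_prev)
```

This is why the list for 2015-12-25 above contains only dates that are neither weekends nor federal holidays.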
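# .ix was removed in pandas 1.0; .reindex looks up the same labels and gives NaN for any missing date.
# Assumes history_price is the DataFrame from the question, so the 'Adj Close' column is selected.
result = {holiday: history_price['Adj Close'].reindex(dates[holiday]).values for holiday in dates}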
>>> result
{...
Timestamp('2015-12-25 00:00:00'):
array([ 203.56598 , 203.902497, 201.408393, 199.597201, 197.964166,
201.55487 , 204.673725, 201.722125, 199.626485, 198.622952])}
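If you want to try the approach without downloading anything, the following is a rough, self-contained sketch. It assumes a recent pandas, where `.ix` is no longer available, so it uses `.reindex` for the lookup. The `prices` series is synthetic stand-in data (not real SPY prices), and the shorter date range is only there to keep the example quick.

```python
import numpy as np
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

start, end, days = '2015-01-01', '2015-12-31', 10

calendar = USFederalHolidayCalendar()
bday_us = CustomBusinessDay(calendar=calendar)

# Synthetic stand-in for history_price['Adj Close']: one made-up value per
# U.S. business day, starting early enough to cover the first holiday's window.
index = pd.date_range('2014-12-01', end, freq=bday_us)
prices = pd.Series(np.arange(len(index), dtype=float), index=index, name='Adj Close')

# Ten business days before each holiday, most recent first
dates = {holiday: [holiday - bday_us * n for n in range(1, days + 1)]
         for holiday in calendar.holidays(start=start, end=end)}

# One array of prices per holiday; .reindex yields NaN for dates with no price
result = {holiday: prices.reindex(window).values
          for holiday, window in dates.items()}

christmas = pd.Timestamp('2015-12-25')
print(dates[christmas][0], dates[christmas][-1])
print(result[christmas])
```

Because `.reindex` returns NaN rather than raising for dates that are missing from the price data, it is worth checking the arrays for NaN before doing any calculations on them.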