I want to predict daily volatility with an EGARCH(1,1) model using the arch package.
Prediction interval: 01-04-2015 to 12-06-2018 (mm-dd-yyyy format).
So I should take data from (for example) 2013 to 2015, fit an EGARCH(1,1) model on it, and then predict daily volatility for 01-04-2015 to 12-06-2018.
I tried to write it like this:
# Packages That we need
from pandas_datareader import data as web
from arch import arch_model
import pandas as pd
import numpy as np  # needed for np.log below
#---------------------------------------
# grab Microsoft daily adjusted close price data from '01-03-2013' to '12-06-2018' and store it in DataFrame
df = pd.DataFrame(web.get_data_yahoo('MSFT' , start='01-03-2013' , end='12-06-2018')['Adj Close'])
#---------------------------------------
# calculate daily rate of return that is necessary for predicting daily Volatility by EGARCH
daily_rate_of_return_EGARCH = np.log(df.loc[ : '01-04-2015']/df.loc[ : '01-04-2015'].shift())
# drop NaN values
daily_rate_of_return_EGARCH = daily_rate_of_return_EGARCH.dropna()
#---------------------------------------
# Volatility Forecasting By EGARCH(1,1)
model_EGARCH = arch_model(daily_rate_of_return_EGARCH, vol='EGARCH' , p = 1 , o = 0 , q = 1)
fitted_EGARCH = model_EGARCH.fit(disp='off')
#---------------------------------------
# and finally, Forecasting step
# Note that as mentioned in `purpose` section, predict interval should be from '01-04-2015' to end of the data frame
horizon = len(df.loc['01-04-2015' : ])
volatility_FORECASTED = fitted_EGARCH.forecast(horizon = horizon , method='simulation')
and then I got this error:
MemoryError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_12900/1021856026.py in <module>
1 horizon = len(df.loc['01-04-2015':])
----> 2 volatility_FORECASTED = fitted_EGARCH.forecast(horizon = horizon , method='simulation')
MemoryError: Unable to allocate 3.71 GiB for an array with shape (503, 1000, 989) and data type float64
It seems arch is trying to store a huge amount of data.
What I expect is a simple pandas.Series that contains daily volatility predictions from '01-04-2015' until '12-06-2018'. Specifically, I mean something like this:
(Note: date format is mm-dd-yyyy)
(DATE) (VOLATILITY)
'01-04-2015' .....
'01-05-2015' .....
'01-06-2015' .....
. .
. .
. .
'12-06-2018' .....
How can I achieve this?
You only need to pass the reindex=False keyword, and the memory requirement drops dramatically. You need a recent version of the arch package to use this feature. It changes the output shape of the forecast to include only the forecast values, so the alignment is different from the historical behavior.
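To see roughly why the saving is so large, here is the arithmetic behind the 3.71 GiB in the traceback. This is only a sketch: the helper `array_gib` is made up for illustration, and I am assuming the first dimension (503) is the number of rows in the fitted returns and that it shrinks to 1 when only the last observation is used as the forecast origin.

```python
def array_gib(shape, itemsize=8):
    """Size in GiB of a dense array with the given shape (float64 = 8 bytes per item)."""
    n_items = 1
    for dim in shape:
        n_items *= dim
    return n_items * itemsize / 2**30

# Shape from the MemoryError: (rows, simulations, horizon)
full = array_gib((503, 1000, 989))

# Assumed shape when only one forecast origin is kept
single = array_gib((1, 1000, 989))

print(f"one slot per row of the sample: {full:.2f} GiB")
print(f"single forecast origin:         {single * 1024:.1f} MiB")
```

The first figure matches the error message, so the allocation is mostly slots for rows that never receive a forecast.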
# Packages That we need
from pandas_datareader import data as web
from arch import arch_model
import pandas as pd
import numpy as np  # needed for np.log below
#---------------------------------------
# grab Microsoft daily adjusted close price data from '01-03-2013' to '12-06-2018' and store it in DataFrame
df = pd.DataFrame(web.get_data_yahoo('MSFT' , start='01-03-2013' , end='12-06-2018')['Adj Close'])
#---------------------------------------
# calculate daily rate of return that is necessary for predicting daily Volatility by EGARCH
daily_rate_of_return_EGARCH = np.log(df.loc[ : '01-04-2015']/df.loc[ : '01-04-2015'].shift())
# drop NaN values
daily_rate_of_return_EGARCH = daily_rate_of_return_EGARCH.dropna()
#---------------------------------------
# Volatility Forecasting By EGARCH(1,1)
model_EGARCH = arch_model(daily_rate_of_return_EGARCH, vol='EGARCH' , p = 1 , o = 0 , q = 1)
fitted_EGARCH = model_EGARCH.fit(disp='off')
#---------------------------------------
# and finally, Forecasting step
# Note that as mentioned in `purpose` section, predict interval should be from '01-04-2015' to end of the data frame
horizon = len(df.loc['01-04-2015' : ])
volatility_FORECASTED = fitted_EGARCH.forecast(horizon = horizon , method='simulation', reindex=False)
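To get from the forecast object to the date-indexed pandas.Series asked for in the question, one option is to take the square root of the forecast variances and attach the out-of-sample dates. A minimal sketch follows; `forecast_to_volatility_series` is a hypothetical helper (not part of arch), and the demo numbers are stand-in values, not real forecasts.

```python
import numpy as np
import pandas as pd

def forecast_to_volatility_series(variance_row, dates):
    """Turn one row of forecast variances (one value per horizon step)
    into a volatility Series indexed by the matching dates."""
    vol = np.sqrt(np.asarray(variance_row, dtype=float))
    if len(vol) != len(dates):
        raise ValueError("need exactly one date per horizon step")
    return pd.Series(vol, index=pd.DatetimeIndex(dates), name="volatility")

# Stand-in values only, to show the shape of the result
demo_variance = [0.0001, 0.0004, 0.0009]
demo_dates = pd.to_datetime(["2015-01-05", "2015-01-06", "2015-01-07"])
demo_series = forecast_to_volatility_series(demo_variance, demo_dates)
print(demo_series)
```

With the objects above, the call would presumably be `forecast_to_volatility_series(volatility_FORECASTED.variance.iloc[-1], df.loc['01-04-2015':].index)`, assuming the last row of `.variance` holds the forecasts made from the final observation and that `horizon` equals the number of remaining dates. The values are in the same units as the returns passed to the model.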