I am creating 50 time series models, one for housing prices in each US state, using pyramid ARIMA.
The data is from a .csv with Date, State, and Median_Listing_Price columns.
I've created the models and would like to predict values beyond my existing data, but I have no idea how to do this.
I have a chart that looks like this:
And I want a chart that looks something like this:
I would also like to output the forecasted values to a new .csv.
Current code:
# Indexing and creating series
df = pd.read_csv(f'state_csvs/{state}.csv', parse_dates=['Date'], date_parser=dateparse, index_col=0, header=0)
data = df[['Median_Listing_Price']]
# Divide into train and validation set
train = data.loc['2013-11':'2017-01']
valid = data.loc['2017-02':]
# Building the model
model = auto_arima(train, start_p=1, start_q=1, max_p=3, max_q=3, m=12, start_P=0, seasonal=True, d=1, D=1,
                   trace=True, error_action='ignore', suppress_warnings=True)
model.fit(train)
forecast = model.predict(n_periods=len(valid))
forecast = pd.DataFrame(forecast, index=valid.index, columns=['Prediction'])
# Plot the predictions for validation set
plt.plot(train, label='Train')
plt.plot(valid, label='Valid')
plt.plot(forecast, label='Prediction')
plt.title(f'{state}')
plt.show()
You can continue from your code as follows:
import matplotlib.pyplot as plt
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
# First get the final parameters chosen by auto_arima
model.summary()  # look for a line like: Model: ARIMA(x, y, z)
# Then refit on the full price series (data, not the raw df) with those parameters
model_future = ARIMA(data, order=(x, y, z))
results_future = model_future.fit()
# Forecast the 12 periods that follow the last observation
predictions_future = results_future.predict(len(data), len(data) + 11)
data.plot(legend=True, figsize=(12, 8))
predictions_future.plot(legend=True)
plt.show()
# To write the future predictions to a CSV (path_to_folder must point to a file, not a directory):
predictions_future.to_csv(path_to_folder)
Output in my case:
I forecast 12 months into the future; you can set your own horizon and parameters.
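An alternative that avoids copying the order by hand is to reuse the fitted auto_arima model itself, which also keeps any seasonal terms that a plain order=(x, y, z) would leave out. The following is only a minimal sketch. It assumes monthly data dated at the start of each month and a model object exposing predict(n_periods=...) as in the question. The helper name forecast_future and the "Prediction" column label are hypothetical.

```python
import numpy as np
import pandas as pd
from pandas.tseries.frequencies import to_offset


def forecast_future(model, last_date, n_periods=12, freq="MS"):
    """Return a DataFrame of n_periods forecasts dated after last_date.

    model     -- any fitted object with predict(n_periods=...), e.g. the
                 result of auto_arima (assumption: already fitted)
    last_date -- date of the last observation the model was fitted on
    freq      -- pandas frequency alias; "MS" (month start) is an assumption
    """
    # predict() may return an ndarray or a Series depending on the
    # library version, so normalise to a plain float array.
    values = np.asarray(model.predict(n_periods=n_periods), dtype=float)

    # The first forecast date is one period after the last observation.
    start = pd.Timestamp(last_date) + to_offset(freq)
    future_index = pd.date_range(start=start, periods=n_periods,
                                 freq=freq, name="Date")

    return pd.DataFrame({"Prediction": values}, index=future_index)
```

In the question's code this could be used after refitting on the whole series, for example model.fit(data) followed by future = forecast_future(model, data.index[-1], 12), so that the forecast starts after the last observed month rather than after the training window. The result can then be drawn with plt.plot(future, label='Forecast') and saved with future.to_csv(...) to a file path of your choice.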