I am performing a time series analysis and was checking for stationarity using the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test. I have loaded the data using the following:
import pandas as pd
import numpy as np
path = 'https://raw.githubusercontent.com/selva86/datasets/master/daily-min-temperatures.csv'
df = pd.read_csv(path, parse_dates=['Date'], index_col='Date')
df.plot(title='Daily Temperatures', figsize=(14,8), legend=None);
This is the code I used, but I am unable to display the results.
# define function for kpss test
from statsmodels.tsa.stattools import kpss
# define KPSS
def kpss_test(timeseries):
    print('Results of KPSS Test:')
    kpsstest = kpss(timeseries, regression='c')
    kpss_output = pd.Series(kpsstest[0:3], index=['Test Statistic','p-value','Lags Used'])
    for key,value in kpsstest[3].items():
        kpss_output['Critical Value (%s)'%key] = value
You are almost there. The function builds kpss_output but never returns it, so there is nothing to display. Just return kpss_output, like so:
def kpss_test(timeseries):
    print('Results of KPSS Test:')
    kpsstest = kpss(timeseries, regression='c')
    kpss_output = pd.Series(kpsstest[0:3], index=['Test Statistic','p-value','Lags Used'])
    for key,value in kpsstest[3].items():
        kpss_output['Critical Value (%s)'%key] = value
    return kpss_output
When you call kpss_test(df.Temp), you will get:
Test Statistic            0.06511
p-value                   0.10000
Lags Used                30.00000
Critical Value (10%)      0.34700
Critical Value (5%)       0.46300
Critical Value (2.5%)     0.57400
Critical Value (1%)       0.73900
dtype: float64
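If it helps to read the numbers above, here is a minimal sketch of how the test statistic can be compared against a critical value. The helper name interpret_kpss and its level argument are hypothetical (they are not part of statsmodels), and the input is assumed to have the same shape as the tuple indexed in kpss_test: statistic, p-value, lags, and a dict of critical values. The figures are copied from the output above rather than recomputed.

```python
def interpret_kpss(result, level="5%"):
    """Compare a KPSS statistic with the critical value at `level`.

    `result` is assumed to look like the tuple indexed in kpss_test:
    (statistic, p_value, lags, {level: critical_value, ...}).
    This helper is illustrative only and is not a statsmodels function.
    """
    statistic, p_value, lags, critical_values = result[:4]
    critical = critical_values[level]
    # The KPSS null hypothesis is that the series is stationary
    # (around a constant when regression='c'), so a statistic above
    # the critical value is evidence against stationarity.
    reject = statistic > critical
    verdict = "reject stationarity" if reject else "cannot reject stationarity"
    return {
        "statistic": statistic,
        "critical": critical,
        "level": level,
        "reject_null": reject,
        "verdict": verdict,
    }


# Values copied from the output shown above, not recomputed here.
posted = (0.06511, 0.1, 30, {"10%": 0.347, "5%": 0.463, "2.5%": 0.574, "1%": 0.739})
summary = interpret_kpss(posted)
print(summary["verdict"])
```

With the posted numbers the statistic (0.06511) is below every listed critical value, so the test gives no evidence against stationarity at those levels. Note also that the p-value statsmodels reports comes from a lookup table bounded at 0.1, so 0.10000 here should be read as "at least 0.1".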