python pandas time-series decomposition

Time series decomposition


I have a time series that I want to decompose. Here is a sample of the dataset (a DataFrame named train, holding stock prices):

        Date    Close
7389    2014-12-24  104.589996
7390    2014-12-26  105.059998
7391    2014-12-29  105.330002
7392    2014-12-30  105.360001
7393    2014-12-31  104.5700

Here is my code:

train_dec = copy.deepcopy(train)
train_dec.index = pd.to_datetime(train_dec['Date'])
train_dec.index.freq = 'D'

# Transform DataFrame into a Series
train_series = train_dec['Close']

train_decomposition = seasonal_decompose(train_series, model='additive')

train_trend = train_decomposition.trend
train_seasonal = train_decomposition.seasonal
train_residual = train_decomposition.resid

I tried both with and without converting to a Series, and I also tried setting the frequency to 'D'.

I keep getting errors such as:

ValueError: Inferred frequency None from passed values does not conform to passed frequency D

or

ValueError: You must specify a period or x must be a pandas object with a PeriodIndex or a DatetimeIndex with a freq not set to None

when I do not set frequency.

Maybe it is because the data have gaps (weekends) when there is no data point (stock price). Should I convert it to a weekly format? But how can I do this if there are gaps (e.g. if I have removed outliers)?

It must be something trivial, but I cannot see the solution.

Your help is greatly appreciated!


Solution

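  • You need to specify the period when doing seasonal decomposition. The example also reindexes the data to a full daily calendar and forward-fills the missing days, so the index has a regular frequency (period=3 is used only because the sample is tiny):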

    import pandas as pd
    from statsmodels.tsa.seasonal import seasonal_decompose
    import matplotlib.pyplot as plt

    data = {
        'Date': ['2014-12-24', '2014-12-26', '2014-12-29', '2014-12-30', '2014-12-31'],
        'Close': [104.589996, 105.059998, 105.330002, 105.360001, 104.5700]
    }
    train = pd.DataFrame(data)

    # Use the dates as a DatetimeIndex
    train['Date'] = pd.to_datetime(train['Date'])
    train = train.set_index('Date')

    # Reindex to a full daily calendar; days without a price become NaN rows
    idx = pd.date_range(start=train.index.min(), end=train.index.max(), freq='D')
    train = train.reindex(idx)

    # Fill the missing days with the last known close
    train['Close'] = train['Close'].ffill()

    # Pass period explicitly; the series needs at least 2 * period observations
    decomposition = seasonal_decompose(train['Close'], model='additive', period=3)
    fig = decomposition.plot()
    plt.show()
    
    
