python pandas numpy time-series calculation

Efficient calculation of volatility using EWMA


I am trying to calculate the volatility using EWMA (Exponentially Weighted Moving Average).

Here is the function I developed:

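import numpy as np

def ewm_std(x, param=0.99):
    n = len(x)
    # Exponentially decaying weights; the most recent observation gets weight 1
    coefs = param ** np.arange(n)[::-1]
    # Unweighted mean of the window
    mean_x = np.mean(x)
    squared_diff = (x - mean_x) ** 2
    # Square root of the weighted average of the squared deviations
    res = np.sqrt(np.dot(squared_diff, coefs) / np.sum(coefs))
    return res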

The weighting parameter is always set to 0.99, and the window size is 255.
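
For reference, here is the quantity the function above computes for one window, written out as a formula. The symbols are my own notation, not from any library: $\lambda$ stands for `param` (0.99), $n$ for the window size (255), and $x_0, \dots, x_{n-1}$ for the values in the window, oldest first.

```latex
\sigma = \sqrt{\frac{\sum_{i=0}^{n-1} \lambda^{\,n-1-i}\,(x_i - \bar{x})^2}{\sum_{i=0}^{n-1} \lambda^{\,n-1-i}}},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=0}^{n-1} x_i
```

Note that $\bar{x}$ is the plain (unweighted) mean of the window. As far as I can tell, this is why pandas' built-in `ewm(...).std()` does not return the same numbers: it centres on an exponentially weighted mean instead.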

I apply the function to my DataFrame (each column is a scenario, each row a date) as follows:

df.rolling(window=255).apply(ewm_std, raw=True)

The problem is that this is very slow, and I have not managed to optimize it. Any faster version must produce the same results.

How can I improve the performance?


Solution

  • Assuming "col" is your column of interest, you could use pure NumPy with sliding_window_view:

    import numpy as np
    import pandas as pd
    from numpy.lib.stride_tricks import sliding_window_view as swv
    
    # test input
    np.random.seed(0)
    df = pd.DataFrame({'col': np.random.random(1000)})
    
    window = 255
    param  = 0.99
    coefs = param ** np.arange(window)[::-1]
    
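    # x has shape (len(df) - window + 1, window): one row per rolling window
    x = swv(df['col'].to_numpy(), window)
    mean_x = x.mean(axis=1)[:, None]
    squared_diff = (x - mean_x) ** 2
    res = np.sqrt(np.dot(squared_diff, coefs) / np.sum(coefs))
    # align with the original index; the first window-1 rows stay NaN
    df['out'] = pd.Series(res, index=df.index[window-1:])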
    
    # validation
    df['expected'] = df.rolling(window=window)['col'].apply(ewm_std, raw=True)
    np.allclose(df.loc[window-1:, 'expected'], df.loc[window-1:, 'out'])
    # True
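
Since the question mentions a DataFrame with one scenario per column, the same idea can probably be extended to all columns at once by sliding the window along axis 0. The following is only a sketch under that assumption: `rolling_ewm_std` is a hypothetical helper name, and it assumes the frame is fully numeric and has at least `window` rows (otherwise `sliding_window_view` raises a `ValueError`).

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def rolling_ewm_std(frame, window=255, param=0.99):
    """Rolling weighted std for every column of `frame` (hypothetical helper)."""
    values = frame.to_numpy(dtype=float)
    # Same weights as in ewm_std: the most recent row of each window gets weight 1
    coefs = param ** np.arange(window)[::-1]
    # View of shape (n_rows - window + 1, n_cols, window); no data is copied here
    windows = sliding_window_view(values, window, axis=0)
    # Unweighted mean of each window, kept broadcastable against `windows`
    mean_x = windows.mean(axis=-1, keepdims=True)
    # This subtraction materializes a full (rows, cols, window) array
    squared_diff = (windows - mean_x) ** 2
    res = np.sqrt(squared_diff @ coefs / coefs.sum())
    # Pad the first window-1 rows with NaN, like DataFrame.rolling does
    out = np.full(values.shape, np.nan)
    out[window - 1:] = res
    return pd.DataFrame(out, index=frame.index, columns=frame.columns)


# Small usage example on random data (three made-up scenario columns)
rng = np.random.default_rng(0)
scenarios = pd.DataFrame(rng.random((1000, 3)), columns=["s1", "s2", "s3"])
result = rolling_ewm_std(scenarios, window=255, param=0.99)
```

One caveat: the intermediate `squared_diff` array holds rows × columns × window floats, so with many scenarios it may be worth calling the helper on blocks of columns rather than on the whole frame to keep memory in check.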