I am trying to calculate the volatility using EWMA (Exponentially Weighted Moving Average).
Here is the function I developed:
import numpy as np

def ewm_std(x, param=0.99):
    n = len(x)
    # exponential weights: the most recent observation gets weight 1
    coefs = param ** np.arange(n)[::-1]
    mean_x = np.mean(x)
    squared_diff = (x - mean_x) ** 2
    res = np.sqrt(np.dot(squared_diff, coefs) / np.sum(coefs))
    return res
The weighting parameter is always set to 0.99, and the window size is 255.
I apply the function to my DataFrame (each column is a scenario, each row a date) as follows:
df.rolling(window=255).apply(ewm_std, raw=True)
The problem is that it is very slow, and I have not managed to optimize it. The results must stay exactly the same.
How can I improve the performance?
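Assuming "col" is your column of interest, you could use pure NumPy and sliding_window_view. Since the window size and the weighting parameter are fixed, the weights can be computed once and applied to every window in a single dot product:
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view as swv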
# test input
np.random.seed(0)
df = pd.DataFrame({'col': np.random.random(1000)})
window = 255
param = 0.99
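# weights: the oldest observation gets param**(window-1), the newest gets 1
coefs = param ** np.arange(window)[::-1]
# one row per rolling window, shape (len(df) - window + 1, window)
x = swv(df['col'], window)
mean_x = x.mean(axis=1)[:, None]
squared_diff = (x - mean_x) ** 2
res = np.sqrt(np.dot(squared_diff, coefs) / np.sum(coefs))
# align each result with the last row of its window; earlier rows stay NaN
df['out'] = pd.Series(res, index=df.index[window-1:])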
# validation
df['expected'] = df.rolling(window=window)['col'].apply(ewm_std, raw=True)
np.allclose(df.loc[window-1:, 'expected'], df.loc[window-1:, 'out'])
# True
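Since the question mentions a DataFrame with one scenario per column, the same idea can probably be extended to all columns at once by taking the sliding windows along the row axis. The following is only a sketch: the function name rolling_ewm_std is hypothetical, and it assumes the DataFrame is fully numeric and has at least `window` rows. Note that it materializes an intermediate array of shape (rows, columns, window), so memory use may become a concern for wide frames.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view as swv


def rolling_ewm_std(df, window=255, param=0.99):
    """Hypothetical helper: rolling EWMA standard deviation for every column."""
    # weights: oldest observation gets param**(window-1), newest gets 1
    coefs = param ** np.arange(window)[::-1]
    values = df.to_numpy(dtype=float)
    # windows along the rows; shape (n_rows - window + 1, n_cols, window)
    x = swv(values, window, axis=0)
    mean_x = x.mean(axis=-1, keepdims=True)
    squared_diff = (x - mean_x) ** 2
    # weighted average of squared deviations over the window axis
    res = np.sqrt(squared_diff @ coefs / coefs.sum())
    # the first window-1 rows have no complete window and stay NaN
    out = np.full(values.shape, np.nan)
    out[window - 1:] = res
    return pd.DataFrame(out, index=df.index, columns=df.columns)
```

It would be called as rolling_ewm_std(df) in place of df.rolling(window=255).apply(ewm_std, raw=True). If memory is an issue, processing the columns in smaller batches should give the same result.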