A rolling arithmetic mean can simply be computed with NumPy's convolve function, but how could I efficiently create an array of running geometric means of some array a with a given window size?
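For reference, the arithmetic version I have in mind looks roughly like this (a minimal sketch, full windows only, no edge handling):

import numpy as np

a = np.array([0.5, 2.0, 4.0])
window = 2

# Uniform kernel; mode="valid" keeps only positions with a full window.
rolling_mean = np.convolve(a, np.ones(window) / window, mode="valid")
print(rolling_mean)
# >>> [1.25 3.  ]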
To give an example, for an array:
[0.5, 2.0, 4.0]
and window size 2 (with the window size decreasing at the edges),
I want to quickly generate the array:
[0.5, 1.0, 2.83, 4.0]
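(That is: 0.5 from the single-element window [0.5] at the left edge, sqrt(0.5 * 2.0) = 1.0, sqrt(2.0 * 4.0) ≈ 2.83, and 4.0 from the single-element window [4.0] at the right edge.)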
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
from scipy.stats import gmean

window = 2
a = [0.5, 2.0, 4.0]

# Pad with nan on both sides so that the windows shrink at the edges.
padded = np.pad(a, window - 1, mode="constant", constant_values=np.nan)
# All overlapping windows as rows of a 2-D view (no copying).
windowed = sliding_window_view(padded, window)
# Geometric mean per window; nan_policy="omit" ignores the nan padding.
result = gmean(windowed, axis=1, nan_policy="omit")
print(result)
# >>> [0.5 1. 2.82842712 4. ]
This uses gmean() from scipy, padding with nan (which, in combination with gmean(…, nan_policy="omit"), produces the decreasing window sizes at the boundaries) and sliding_window_view() to create the running result. If you don't need the decreasing window sizes at the boundaries (this is referring to your comment), you can skip the padding step and choose the nan_policy that suits you best.
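For instance, without the shrinking windows at the boundaries, you can drop the padding entirely and keep only the full windows (a minimal sketch, using the same a and window as above):

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
from scipy.stats import gmean

window = 2
a = np.array([0.5, 2.0, 4.0])

# No padding: only positions with a complete window remain.
windowed = sliding_window_view(a, window)
result = gmean(windowed, axis=1)
print(result)
# >>> [1. 2.82842712]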
Update: Realizing that gmean() provides a weights argument, we can replace the nan padding with an equivalent weights array (1 for the actual values, 0 for the padded values), and are then free again to choose the nan_policy of our liking, even in the case of decreasing window sizes at the boundaries. This means we could write:
# Pad with 1.0 instead of nan; the padded entries get weight 0 below.
padded = np.pad(a, window - 1, mode="constant", constant_values=1.)
windowed = sliding_window_view(padded, window)
# Weights: 1 for the actual values, 0 for the padded positions.
weights = sliding_window_view(
    np.pad(np.ones_like(a), window - 1, mode="constant", constant_values=0.),
    window)
result = gmean(windowed, axis=1, weights=weights)
…which will produce exactly the same result as above. My gut feeling tells me that the original version is faster, but I did not do any speed tests.
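If you want to check, a quick timing sketch along these lines should do; the array size, window, and repetition count here are arbitrary choices of mine, so the outcome may well differ for other inputs and SciPy versions:

import timeit

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
from scipy.stats import gmean

a = np.random.default_rng(0).uniform(0.1, 10.0, 10_000)
window = 50

def nan_version():
    # nan padding + nan_policy="omit", as in the original version.
    padded = np.pad(a, window - 1, mode="constant", constant_values=np.nan)
    windowed = sliding_window_view(padded, window)
    return gmean(windowed, axis=1, nan_policy="omit")

def weights_version():
    # Padding with 1.0 and zero weights for the padded positions.
    padded = np.pad(a, window - 1, mode="constant", constant_values=1.)
    windowed = sliding_window_view(padded, window)
    weights = sliding_window_view(
        np.pad(np.ones_like(a), window - 1, mode="constant", constant_values=0.),
        window)
    return gmean(windowed, axis=1, weights=weights)

print("nan padding:", timeit.timeit(nan_version, number=10))
print("weights:    ", timeit.timeit(weights_version, number=10))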