I need to calculate a running median in Python. Currently I do it like this:
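import numpy
med_y = []
med_x = []
for i in numpy.arange(240, 380, 1):
    # median of dy over the window i < dx < i + 20
    med_y.append(numpy.median(dy[(dx > i) & (dx < i + 20)]))
    # x-coordinate of the window centre
    med_x.append(i + 10)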
Here the data are stored in dx (x-coordinates) and dy (y-coordinates). The median is taken over dy and plotted against dx, which has to be shifted by half the window (window/2). This assumes evenly spaced x; the window size here is 20.
Is there a shorter way?
For example, running average can be done like this:
cs = numpy.cumsum(dy)
y_20 = (cs[20:] - cs[:-20])/20.0
x_20 = dx[10:-10]
Predefined running X functions in site-packages are also ok.
This is the shortest:
from scipy.ndimage import median_filter
values = [1,1,1,0,1,1,1,1,1,1,1,2,1,1,1,10,1,1,1,1,1,1,1,1,1,1,0,1]
print(median_filter(values, 7, mode='mirror'))
and it works correctly at the edges (or you can choose how the edges are handled with the mode argument).
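To connect this back to the loop in the question, here is a minimal sketch on made-up data (the random dy and the integer-spaced dx are assumptions, not the asker's data). With integer spacing, the strict inequalities in the loop leave 19 samples per window, so the matching filter size is 19 rather than 20:

```python
import numpy as np
from scipy.ndimage import median_filter

# Hypothetical stand-in data: integer-spaced x and noisy y.
rng = np.random.default_rng(0)
dx = np.arange(200, 420)
dy = rng.normal(size=dx.size)

# Loop from the question: i < dx < i + 20 selects 19 samples centred on i + 10.
loop_x = np.arange(240, 380) + 10
loop_y = np.array([np.median(dy[(dx > i) & (dx < i + 20)])
                   for i in np.arange(240, 380)])

# One filter call over the whole array, then pick out the same window centres.
filt_y = median_filter(dy, size=19, mode='mirror')
centres = np.searchsorted(dx, loop_x)

print(np.allclose(loop_y, filt_y[centres]))
```

The filter returns one value per input sample, so no x shift is needed: filt_y[k] belongs to dx[k]. This only holds for evenly spaced x, since median_filter counts samples rather than x distance.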
And any general running X is done like this (running standard deviation as an example):
import numpy
from scipy.ndimage import generic_filter
values = numpy.array([0,1,2,3,4,5,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1]).astype('float')
print(generic_filter(values, numpy.std, size=7, mode='mirror'))
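In the above, the float input type is important: by default the output array has the same dtype as the input, so integer input would give integer results and lose the fractional part of the standard deviation.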
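A small sketch of why the dtype matters, using a shortened version of the array above (the variable names are illustrative only). As far as I know, the output argument of generic_filter also accepts a dtype, which is an alternative to converting the input:

```python
import numpy as np
from scipy.ndimage import generic_filter

values_int = np.array([0, 1, 2, 3, 4, 5, 1, 1, 1, 1, 1, 1])
values_float = values_int.astype(float)

# Integer input: the result array is integer too, so the std values
# cannot keep their fractional part.
as_int = generic_filter(values_int, np.std, size=7, mode='mirror')

# Float input: the result keeps full precision.
as_float = generic_filter(values_float, np.std, size=7, mode='mirror')

# Alternative: keep the integer input but request a float output array.
forced = generic_filter(values_int, np.std, size=7, mode='mirror', output=float)

print(as_int.dtype, as_float.dtype, forced.dtype)
print(as_int)
print(as_float)
```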
Useful links:
improving code efficiency: standard deviation on sliding windows