In my PostgresSQL database, I have a table with columns id
and an integer value
.
The values contain noise that needs to be cleared out. I would like to do it by passing n
(e.g. 5) consecutive values to the function and then calculate their median (or average or something similar). So given the records with values 4, 1, 3, 4, 4, 2, 5, 4, 6, 4
the first output would be the median of the first 5 values (4, 1, 3, 4, 4
), the second one would be the median of the first 5 values with offset one (1, 3, 4, 4, 2
), then offset 2 and so forth.
I would like to use the function in a query like this SELECT id, value, noisless_value(value) FROM measurements
. Is this doable with Postgres? Or should such computations be performed outside of the database?
Unfortunately, Postgres does not have a convenient way to calculating a running median using window functions. So, the following handles both the average and the median using a lateral join:
select t.*, t2.*
from t cross join lateral
(select avg(value) as avg_5, percentile_cont(0.5) within group (order by value) as median_5
from (select t2.*
from t t2
where t2.id >= t.id
order by t2.id asc
limit 5
) t2
) t2;
The average, I should point out, is simpler using window functions:
select t.*,
avg(value) over (order by id range between current row and 4 following) as avg_5
from t;
Here is a db<>fiddle.