How do I fill backward, forward, and linear interpolation of a column in a table that has column timestamp in bigquery?
I have this table:
timestamp | mycol |
---|---|
1 | null |
2 | null |
3 | 69 |
4 | null |
5 | 71 |
6 | 72 |
7 | null |
I expect outcome like this:
timestamp | mycol |
---|---|
1 | 69 |
2 | 69 |
3 | 69 |
4 | 70 |
5 | 71 |
6 | 72 |
7 | 72 |
see below query, the result is inspired from https://justrocketscience.com/post/interpolation_sql/ however, my query handles the nulls if last row is a null
WITH input_data as (
SELECT 1 as _timestamp, null as my_col
UNION ALL
SELECT 2 as _timestamp, null as my_col
UNION ALL
SELECT 3 as _timestamp, 69 as my_col
UNION ALL
SELECT 4 as _timestamp, null as my_col
UNION ALL
SELECT 5 as _timestamp, 71 as my_col
UNION ALL
SELECT 6 as _timestamp, 72 as my_col
UNION ALL
SELECT 7 as _timestamp, null as my_col
) SELECT *,
cast(case when input_data.my_col is not null then input_data.my_col
when lag(input_data.my_col) over (ORDER BY _timestamp, my_col) is not null and
lead(input_data.my_col) over (ORDER BY _timestamp, my_col) is not null
then ( lag(input_data.my_col) over (ORDER BY _timestamp, my_col) +
lead(input_data.my_col) over (ORDER BY _timestamp, my_col) ) / 2
else IFNULL(FIRST_VALUE(my_col IGNORE NULLS) OVER (item_window ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
LAG(my_col) OVER (ORDER BY _timestamp, my_col))
end
as int64)
as new_value
FROM input_data
WINDOW item_window as (ORDER BY input_data._timestamp)
See query output in the screenshot