Given the following scheme and values I want to compute the value of each row starting by adding 1 to the previous row value. If the row value is null then set null. If there is a value then add 1 from the previous value.
How can I achieve this in Postgres? I've tried recursive CTE but without success.
Input data:
symbol date close computed
TBBB 2024-02-22 14:30:00+00 19.05
TBBB 2024-02-23 14:30:00+00 19.55
TBBB 2024-02-24 14:30:00+00 20.6
TBBB 2024-02-25 14:30:00+00 21.3
TBBB 2024-02-26 14:30:00+00 20.43 20.4
TBBB 2024-02-27 14:30:00+00 20.21
TBBB 2024-02-28 14:30:00+00 20.74
TBBB 2024-02-29 14:30:00+00 20.09
TBBB 2024-03-01 14:30:00+00 20.79
TBBB 2024-03-02 14:30:00+00 20.87
TBBB 2024-03-03 14:30:00+00 20.69
TBBB 2024-03-04 14:30:00+00 20.19
TBBB 2024-03-05 14:30:00+00 20.9
TBBB 2024-03-06 14:30:00+00 20.99
TBBB 2024-03-07 14:30:00+00 21.28
TBBB 2024-03-08 14:30:00+00 21.27
Expected result:
symbol date close computed
TBBB 2024-02-22 14:30:00+00 19.05
TBBB 2024-02-23 14:30:00+00 19.55
TBBB 2024-02-24 14:30:00+00 20.6
TBBB 2024-02-25 14:30:00+00 21.3
TBBB 2024-02-26 14:30:00+00 20.43 20.4
TBBB 2024-02-27 14:30:00+00 20.21 21.4
TBBB 2024-02-28 14:30:00+00 20.74 22.4
TBBB 2024-02-29 14:30:00+00 20.09 23.4
TBBB 2024-03-01 14:30:00+00 20.79 24.4
TBBB 2024-03-02 14:30:00+00 20.87 25.4
TBBB 2024-03-03 14:30:00+00 20.69 26.4
TBBB 2024-03-04 14:30:00+00 20.19 27.4
TBBB 2024-03-05 14:30:00+00 20.9 28.4
TBBB 2024-03-06 14:30:00+00 20.99 29.4
TBBB 2024-03-07 14:30:00+00 21.28 30.4
TBBB 2024-03-08 14:30:00+00 21.27 31.4
The +1 from previous value is just for simplicity. In reality I would have a math formula there.
What I've tried:
Some simpler example and not related to the question: here I calculate the computed column based on the close. This was a initial example, it's not what I want:
WITH computed_values AS (SELECT symbol,
date,
close,
CASE
WHEN LAG(computed, 5) OVER (PARTITION BY symbol ORDER BY date) IS NULL THEN null
ELSE (close * (2.0 / (period + 1))) +
(LAG(computed, 5) OVER (PARTITION BY symbol ORDER BY date) * (1 - (2.0 / (period + 1))))
END AS computed
FROM (SELECT symbol,
date,
close,
close AS computed,
5 AS period
FROM my_table
WHERE symbol = 'TBBB'
ORDER BY date) subquery)
SELECT *
FROM computed_values;
Generated by AI and not working:
WITH NumberedRows AS (
SELECT
symbol,
date,
close,
ROW_NUMBER() OVER (ORDER BY date ASC) AS row_num
FROM my_table
),
KnownClose AS (
SELECT
symbol,
date,
close,
row_num
FROM NumberedRows
WHERE close IS NOT NULL
),
RecursiveClose AS (
SELECT
symbol,
date,
close,
row_num
FROM KnownClose
UNION ALL
SELECT
nr.symbol,
nr.date,
rc.close - 2 AS close,
nr.row_num
FROM RecursiveClose rc
JOIN NumberedRows nr ON nr.row_num = rc.row_num - 1
WHERE nr.close IS NULL
UNION ALL
SELECT
nr.symbol,
nr.date,
rc.close + 2 AS close,
nr.row_num
FROM RecursiveClose rc
JOIN NumberedRows nr ON nr.row_num = rc.row_num + 1
WHERE nr.close IS NULL
) SELECT symbol,
date,
close
FROM RecursiveClose
ORDER BY row_num;
This is one of the rare cases that lends itself to a procedural solution.
Replace all instances of tbl
with your actual (schema-qualified) table name:
-- create function
CREATE OR REPLACE FUNCTION f_compute()
RETURNS SETOF tbl
LANGUAGE plpgsql AS
$func$
DECLARE
r tbl; -- use table type as row variable
r0 tbl;
BEGIN
FOR r IN
SELECT * FROM tbl t ORDER BY t.symbol, t.date
LOOP
IF r.symbol = r0.symbol -- same symbol
AND r0.computed IS NOT NULL -- last computed exists
AND r.computed IS NULL THEN -- this computed is null
r.computed := r0.computed + 1; -- your computation here
END IF;
RETURN NEXT r;
r0 := r; -- remember last row
END LOOP;
END
$func$;
Call:
SELECT * FROM f_compute();
Produces exactly your desired result. Compare performance with EXPLAIN ANALYZE
.
Related:
If you don't want to persist a function, consider a temporary function. See: