sqlpostgresqlplpgsqlwindow-functionsdynamic-sql

How to declare variables and use them to UPDATE each row based on the preceding row(s)


I'm looping through table1:

create table table1(id,"date",quantity,"value")as values
 (1,'2024-10-01',1,1)
,(2,'2024-10-02',1,1)
,(3,'2024-10-03',1,1)
,(4,'2024-10-04',1,1)
,(5,'2024-10-05',1,1)
,(6,'2024-10-06',1,1)
,(7,'2024-10-07',1,1);

and updating the column value with the value stored in the variable _previous_value (which I need to calculate the value of the next row in the loop).

Is there a better way to execute the update using those variables, without converting the command to a string?

Something like:

UPDATE table1 SET value = {_previous_value} WHERE table1.id = {_r.id};

I know I can use LAG() to get the value from the previous row, but in this specific case I need the function. It works as it is, I just want to know if there's a "correct" way to do this without concatenation and the EXECUTE command.

CREATE OR REPLACE FUNCTION functiontest() 
RETURNS VOID LANGUAGE plpgsql AS $f$
DECLARE _r RECORD;
DECLARE _previous_value DOUBLE PRECISION := 0;
BEGIN

FOR _r IN
    SELECT table1.date, table1.id  
    FROM table1
    ORDER BY table1.date, table1.id
LOOP
    WITH c AS
    (   SELECT 
        table1.id,
        table1.quantity + _previous_value AS total
        FROM table1 
        WHERE table1.id = _r.id 
        ORDER BY table1.date, table1.id
        LIMIT 1
    )
    SELECT
    COALESCE(c.total, 0)
    INTO
    _previous_value
    FROM c;

    EXECUTE 
    'UPDATE table1 SET value = ' || _previous_value || 
    ' WHERE table1.id = ' || _r.id || ';';
END LOOP;
END $f$;

DB<>Fiddle


Solution

  • Is there a better way to execute the update using those variables, without converting the command to a string? (...) a "correct" way to do this without concatenation and the EXECUTE command.

    To avoid concatenation, you can execute..using: demo1 at db<>fiddle

    EXECUTE $dsql$ UPDATE table1 
                   SET value = $1 
                   WHERE table1.id = $2 ;
            $dsql$ USING _previous_value, _r.id;
    

    To avoid execute entirely, you can let the update read from the select directly:
    demo2 at db<>fiddle

    CREATE OR REPLACE FUNCTION functiontest() 
    RETURNS VOID LANGUAGE plpgsql AS $f$
    DECLARE _r RECORD;
            _previous_value DOUBLE PRECISION := 0;
    BEGIN
    FOR _r IN SELECT table1.date, table1.id  
              FROM table1
              ORDER BY table1.date, table1.id
    LOOP WITH c AS(
          SELECT table1.id
               , table1.quantity + _previous_value AS total
          FROM table1 
          WHERE table1.id = _r.id 
          ORDER BY table1.date
                 , table1.id
          LIMIT 1
        ),updated as(
          UPDATE table1 
          SET value = COALESCE(c.total, 0)
          FROM c
          WHERE table1.id = c.id
          RETURNING value)--value used in that update gets returned
        SELECT value 
        FROM updated 
        INTO _previous_value;--and saved for the next iteration
    END LOOP;
    END $f$;
    

    Keep in mind you're by default in read committed transaction isolation mode, so unless you add some sort of locking, you might see anomalies if a concurrent client decides to insert into/update/delete from that table while you're in the middle of looping over it.


    You could also run this whole update operation in one go, in a single plain SQL query:
    demo3 at db<>fiddle

    with new_state as (
       select id
            , sum(quantity)over(order by date,id) as new_value
       from table1)
    update table1
    set value=new_value
    from new_state
    where table1.id=new_state.id;
    

    Looping in PL/pgSQL, while intuitive, typically performs worse than an equivalent in plain SQL so this should also speed things up. In a test on 200k random rows, doing the same thing takes: