sqlpostgresqlsql-update

UPDATE query gets stuck sometimes


I use SQL query to affect 120000 rows. Sometimes it works correctly (3-5 seconds), but often it gests stuck.

DB: Postgres. Tried on docker image of Postgres 16, on server DB. Tried with query console, tried from Java code - result is the same. Sometimes it works, sometimes it gets stuck.

update
    renamed_account.renamed_account
set
    level=0,
    chain_id=gen_random_uuid()
where
    old_number not in (
        select distinct new_number from renamed_account.renamed_account
    );

Table definition:

CREATE TABLE IF NOT EXISTS renamed_account.renamed_account
(
    ID          bigserial PRIMARY key,
    OLD_NUMBER  varchar(40),
    NEW_NUMBER  varchar(40),
    CHAIN_ID    varchar(36),
    LEVEL       integer
);

CREATE INDEX old_number_idx ON renamed_account.renamed_account (OLD_NUMBER);
CREATE INDEX new_number_idx ON renamed_account.renamed_account (NEW_NUMBER);

Solution

  • NOT IN with a subquery is a classical anti-pattern. Even worse, neither new_number nor old_number are defined NOT NULL. So the query results in dubious logic.

    Use NOT EXISTS instead, which is faster and does not break if the subquery returns a null value.

    UPDATE renamed_account.renamed_account r
    SET    level = 0
         , chain_id = gen_random_uuid ()
    WHERE  NOT EXISTS (
       SELECT FROM renamed_account.renamed_account r1
       WHERE  r1.new_number = r.old_number 
       );
    

    See:

    However, your query still shouldn't get "stuck". You probably have concurrency issues on top of the slow / incorrect query. While your query is "stuck", study the output of:

    SELECT * FROM pg_stat_activity;
    

    The column wait_event tells you what the "stuck" query is waiting for. Take action accordingly.

    For multiple concurrent transactions writing to the table concurrently, adopt the policy to update rows in a consistent, deterministic order within the same transaction - as much as possible. Say, order by the PK id.
    UPDATE has no ORDER BY. The workaround is to lock rows with SELECT ... FOR UPDATE in a CTE after ORDER BY id.

    WITH sel AS (
       SELECT id
       FROM   renamed_account.renamed_account r
       WHERE  NOT EXISTS (
          SELECT FROM renamed_account.renamed_account r1
          WHERE  r1.new_number = r.old_number 
          )
       ORDER BY id
       FOR   UPDATE
       )
    UPDATE renamed_account.renamed_account r
    SET    level = 0
         , chain_id = gen_random_uuid ()
    FROM  sel
    WHERE sel.id = r.id
       );
    

    Related:

    Still, you should see a deadlock. The only other common case is a session that is "idle in transaction" or a long running transaction after having locked rows you are trying to update. While that idle session just sits there and does not commit or roll back, your query has to wait. Can also be diagnosed with pg_stat_activity. Look at the state column. Any mention of "idle in transaction" is problematic. If such a transaction blocks your UPDATE, it's actually "stuck". See: