postgresqldeadlockpostgresql-15cloudnative-pg

Should a single update query create multiple exclusive locks on the same table in postgres (CNPG)?


My system is running instances of postgres managed using CloudNativePG; there's a primary and a couple of replicas.

When I run a query like this update <table> set <boolean-column> = NOT <boolean-column>, it's taking a while (> 19 hours) to update about ~100 million rows. I should also mention that there exists a view in the database on this table (mentioning this because it may be relevant to REFRESH MATERIALIZED VIEW CONCURRENTLY mentioned later in this post). Upon inspection, this is what I got:

SELECT 
  l.pid, application_name app, state, query,
  age(clock_timestamp(), state_change) AS change,
  age(clock_timestamp(), query_start) AS age,
  wait_event, mode, granted 
FROM pg_stat_activity p INNER JOIN pg_locks l on p.pid = l.pid 
WHERE query NOT LIKE '% FROM pg_stat_activity %' 
ORDER BY granted, age;
pid app state query change age wait_event mode granted
874173 psql active update t set b = NOT b; 19:39:06.762432 19:39:06.76241 DataFileRead RowExclusiveLock t
874173 psql active update t set b = NOT b; 19:39:06.762445 19:39:06.762419 DataFileRead ExclusiveLock t
874173 psql active update t set b = NOT b; 19:39:06.762448 19:39:06.762424 DataFileRead ExclusiveLock t
874173 psql active update t set b = NOT b; 19:39:06.762451 19:39:06.762427 DataFileRead RowExclusiveLock t

Solution

  • Does it make sense for the same query to have multiple ExclusiveLock locks on the same table? They are not 2 separate queries issued by different connections since the pid is the same; it's running in the same process.

    Those ExclusiveLock locks aren't on a table, at all. Include pg_locks.relation and you'll see they don't concern any specific object (it shows null). The pg_locks.locktype will tell you they both just identify that transaction, and that it began to alter the database state:
    demo at db<>fiddle

    create table t(b) as select true from generate_series(1,5e4);
    create index on t(b);
    
    create extension if not exists dblink;--spawning a parallel session
    select dblink_connect('_','');
    select dblink_send_query('_','begin;update t set b=not b;select pg_sleep(9)');
    
    SELECT pid
         , l.locktype
         , l.relation::regclass
         , left(p.query,12) as query
         , p.state
         , p.wait_event
         , l.mode
         , l.granted 
    FROM pg_stat_activity AS p 
    JOIN pg_locks AS l 
    USING(pid)
    WHERE pid <> pg_backend_pid(); --ignore my own query in pg_stat_activity
    
    pid locktype relation query state wait_event mode granted
    787 relation t_b_idx begin; updat active null RowExclusiveLock t
    787 relation t begin; updat active null RowExclusiveLock t
    787 virtualxid null begin; updat active null ExclusiveLock t
    787 transactionid null begin; updat active null ExclusiveLock t

    And why does it also have RowExclusiveLock locks and again, 2 of them?

    One RowExclusiveLock is for the table you're updating and I'm guessing the other concerns an index on it, which the update also has to process. The pg_locks.relation::regclass will tell you that.

    I understand from the documentation on ExclusiveLock locks that this lock is only acquired by REFRESH MATERIALIZED VIEW CONCURRENTLY.

    It's an example, not the only situation it appears in. You can just open a transaction with a begin; and do nothing else, and you'll get the ExclusiveLock with locktype='virtualxid' just for that. As soon as you start writing, you'll get another one with locktype='transactionid' because that's the earliest point MVCC needs to start worrying about your actions.

    Is the instance internally running concurrent operations to update the table even though I did not explicitly ask for this to happen in my query?

    In general, it is. You have some level of control over that through asynchronous behaviour settings. Thing is, you're seeing a single pid there. If you did get multiple parallel workers running the query, you'd see multiple instances of the same one query under different pids deployed for the task in pg_stat_activity, all of which would share a leader_pid.