Tags: sql, postgresql, transactions, locking, upsert

Postgres / SQL pattern for atomic SELECT and UPDATE or INSERT within a transaction


I have a simple scenario where I want to atomically read and modify the state of a row. But the row may not exist yet.

For this example, I use the user_group_membership table:

user_id (pk) | group_id (pk) | state
-------------------------------------
1            | 3             | joined

The state value works like a state machine. There's a limited set of transitions:

null (no row present) -> invited, banned
invited -> joined, banned
joined  -> left, banned
left    -> invited, banned
banned  -> invited, left

If a row is already present, I can use a SELECT ... FOR UPDATE to get the current state, validate the transition, update the state and commit the transaction. All other concurrent transactions "wait" for the lock to be released. That's fine - in this case all state transitions run sequentially.
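The row-present path can be sketched like this (values hard-coded for illustration; the transition check itself happens in application code):

```sql
begin;

-- lock the row; concurrent transactions block here until we commit
select state
from user_group_membership
where user_id = 1 and group_id = 3
for update;

-- (application code validates that e.g. joined -> left is allowed)

update user_group_membership
set state = 'left'
where user_id = 1 and group_id = 3;

commit;
```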

But if there is no row in the table, there's nothing to lock. So all concurrent transactions will try to execute an INSERT. The first will succeed and the rest will fail because of the duplicate primary key.

At this point I could just "rerun" the whole code, because now I know that the row exists and the SELECT ... FOR UPDATE will be used for locking/waiting. But I don't want to execute the same code twice. I'm looking for a more elegant solution.
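One way to avoid re-running the whole code path is to fold the retry into a plpgsql block: try the INSERT first, and on a duplicate-key error fall through to the locking SELECT. A sketch only - the function name and parameters here are made up for illustration:

```sql
create function lock_or_insert(p_user int, p_group int, p_initial text)
returns text language plpgsql as $$
declare
  v_state text;
begin
  begin
    -- try to create the row; if we win the race we also hold its lock
    insert into user_group_membership (user_id, group_id, state)
    values (p_user, p_group, p_initial);
    return p_initial;
  exception when unique_violation then
    -- another transaction inserted it first; fall through and lock it
    null;
  end;

  select state into v_state
  from user_group_membership
  where user_id = p_user and group_id = p_group
  for update;

  return v_state;
end $$;
```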

What I came up with so far

This is a replacement for the SELECT ... FOR UPDATE:

INSERT INTO user_group_membership (user_id, group_id, state)
VALUES (2, 3, 'DUMMY_FOR_THE_ROW_LOCK')
ON CONFLICT (user_id, group_id) DO UPDATE
SET user_id = EXCLUDED.user_id
RETURNING *;

-- application code for validating state transition

UPDATE user_group_membership 
SET state = 'INVITED'
WHERE user_id = 2 AND group_id = 3;

Questions

• Is this the right way to handle this scenario?
• Is it "safe"?
• Is there a better / easier solution?

Followup questions

• How to properly handle dummy values? The state column is not nullable.


Solution

  • How to properly handle dummy values? The state column is not nullable

    Don't use them. If you do, make the column nullable and denote the lack of a user-group relation with state is null.
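Making the column nullable is a one-line change, after which state is null can stand for "no relation":

```sql
alter table user_group_membership alter column state drop not null;
```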

    • Is this the right way to handle this scenario?
    • Is it "safe"?
    • Is there a better / easier solution?

    If it works, it works. It's not unsafe, but magic/dummy/flag/trip/rogue/signal/sentinel values aren't very elegant.

    Make the selection, validation and upsert a single operation:
    demo at db<>fiddle

    prepare find_validate_apply(int,int,text) as
    with find as(
      select state
      from user_group_membership as f
      where $1=user_id and $2=group_id
      for update of f
      --limit 1 --unnecessary given the uniqueness and non-nullability
    ),empty_as_null as(--`coalesce()` for rows
     (select state from find)
      union(select null)
      order by 1 nulls last limit 1
    ),validate as(
      select exists(select from allowed_transitions as t
                    where t.source_state is not distinct from found.state
                    and t.target_state is not distinct from $3) 
             as is_transition_allowed
      from empty_as_null as found
    ),apply as(
      insert into user_group_membership
      select $1,$2,$3
      from validate 
      where is_transition_allowed
      on conflict(user_id,group_id)do update
      set state=$3
      returning *)
    select*from apply;
    

    Now each worker can wait for the others to finish their whole transaction before it attempts to apply its desired transition. Note that this doesn't require a dummy value or sentinel row - workers lock each other out either in the initial select or in the final upsert (concurrent writes to the unique pk index), and since the whole thing is one atomic operation it doesn't matter which one exactly.
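For completeness, each session prepares the statement once and then calls it like this (using the demo's values; a disallowed transition simply returns no row and writes nothing):

```sql
execute find_validate_apply(9, 9, 'invited');  -- null -> invited: allowed, inserts the row
execute find_validate_apply(9, 9, 'joined');   -- invited -> joined: allowed, updates the row
execute find_validate_apply(9, 9, 'invited');  -- joined -> invited: not allowed, no-op
```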

    The demo uses an audit trigger to track operations on the table and dblink to run parallel workers - at the end you can see how they queued up and applied their changes one after another:

    create function trg1_f()returns trigger as $f$
    begin
      -- lazily create the history table on first use;
      -- limit 0 copies the column names/types but no rows
      create table if not exists user_group_membership_history
      as select TG_OP,transaction_timestamp(),clock_timestamp()
               ,OLD.user_id as OLD_user_id
               ,OLD.group_id as OLD_group_id
               ,OLD.state as OLD_state
               ,NEW.* 
      limit 0;
    
      -- log the operation with both timestamps and the old/new row images
      insert into user_group_membership_history
      select TG_OP,now(),clock_timestamp(),OLD.*,NEW.*;
      
      return new;
    end $f$ language plpgsql;
    
    create trigger trg1 
    after insert or update or delete
    on user_group_membership
    for each row execute function trg1_f();
    
    create extension dblink;
    select dblink_connect('another_worker1','');--defaults to localhost here
    select dblink_connect('another_worker2','');
    select dblink_send_query('another_worker1',
      $q$ begin;
          select pg_sleep(0.5);
          select find_validate_apply(9,9,'invited');
          select pg_sleep(1);
          commit;
      $q$);
    
    select dblink_send_query('another_worker2',
      $q$ begin;
          select pg_sleep(0.7);
          select find_validate_apply(9,9,'banned');
          select pg_sleep(1);
          commit;
      $q$);
    select pg_sleep(4);
    
    table user_group_membership_history;
    
    user_id | group_id | state
    --------+----------+-------
    9       | 9        | banned

    tg_op  | transaction_timestamp         | clock_timestamp               | old_user_id | old_group_id | old_state | user_id | group_id | state
    -------+-------------------------------+-------------------------------+-------------+--------------+-----------+---------+----------+--------
    INSERT | 2024-11-13 16:18:18.929716+00 | 2024-11-13 16:18:19.94236+00  | null        | null         | null      | 9       | 9        | invited
    UPDATE | 2024-11-13 16:18:18.932864+00 | 2024-11-13 16:18:20.946128+00 | 9           | 9            | invited   | 9       | 9        | banned

    The allowed_transitions lookup table is self-explanatory:

    create table allowed_transitions(source_state,target_state)
    as values
     (null,'invited'),(null, 'banned')
    ,('invited','joined'),('invited','banned')
    ,('joined','left'),('joined','banned')
    ,('left','invited'),('left','banned')
    ,('banned','invited'),('banned','left');
    alter table allowed_transitions 
      add constraint uniq unique(source_state,target_state);
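
Checked in isolation, the validation step from the CTE reduces to a lookup like this; is not distinct from is what lets the null "no row yet" source state match:

```sql
select exists(
  select from allowed_transitions as t
  where t.source_state is not distinct from null      -- current state: no row
    and t.target_state is not distinct from 'invited' -- desired state
) as is_transition_allowed;
-- returns true: (null, 'invited') is in the table
```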