postgresql distributed-system system-design

How do I limit the count of rows in a result set in Postgres without unnecessary locking?


This question is inspired by a 'general admission' variant of the common 'event ticketing' System Design interview question. Critically in this version, the user does not select a seat - they only say they want a ticket.

Given a table Reservations with some columns event_id, user_id, paid, hold_expires_at

I want to know how I can do the following:

  1. Count the number of reservations that are either paid or still on hold
  2. Insert a new reservation row into the table if that count is less than N

I know that with the READ COMMITTED isolation level I can't simply run a SELECT COUNT(*)... query followed by INSERT INTO Reservations VALUES... in a transaction and expect the limit to be enforced. It is not actually clear to me that setting the transaction's isolation level to SERIALIZABLE would even guarantee this in Postgres, since from the DB's perspective I wouldn't expect any dependency to be detected between the queries in 1) and 2). Maybe if I combined the two queries into one and put a predicate on the INSERT based on the COUNT, that would work? The documentation has an example where a serialization error occurs as a result of two different transactions each inserting the sum of a column in a table back into that same table.
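For concreteness, the naive pattern I mean is something like the following (event ID, user ID, limit, and hold duration are made-up values):

```sql
-- Under READ COMMITTED, two concurrent transactions can both see
-- count = 999 here, both pass the application-side check, and both
-- insert, overselling the event.
BEGIN;

SELECT count(*)
FROM reservations
WHERE event_id = 42
  AND (paid OR hold_expires_at > now());

-- application checks that the count is < 1000, then:
INSERT INTO reservations (event_id, user_id, paid, hold_expires_at)
VALUES (42, 1001, false, now() + interval '10 minutes');

COMMIT;
```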

Ideally, if there are, say, 1000 tickets for sale and 1200 requests come in, they would not all have to execute serially - that is, they would execute concurrently without causing serialization errors until there was contention for the 1000th unsold/unheld ticket.

Also curious about ways to do this outside of Postgres.


Solution

  • Using the SERIALIZABLE isolation level will guarantee that your constraint won't be violated. If a transaction succeeds, you are guaranteed that there is an equivalent serial execution order for all transactions, and with serial execution, no anomaly could ever happen.
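    Combining the count and the insert into one statement, as the question suggests, is a reasonable shape for this. A sketch under assumed schema and values (event 42, capacity 1000); the application must be prepared to retry the whole transaction when COMMIT fails with SQLSTATE 40001 (serialization_failure):

    ```sql
    BEGIN ISOLATION LEVEL SERIALIZABLE;

    -- The predicate read in the subquery is tracked by Postgres's
    -- serializable snapshot isolation, so a concurrent transaction
    -- that also inserts for event 42 creates a read/write dependency
    -- and one of the two is aborted at commit time.
    INSERT INTO reservations (event_id, user_id, paid, hold_expires_at)
    SELECT 42, 1001, false, now() + interval '10 minutes'
    WHERE (SELECT count(*)
           FROM reservations
           WHERE event_id = 42
             AND (paid OR hold_expires_at > now())) < 1000;

    COMMIT;  -- may fail with a serialization error under contention; retry
    ```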

    An alternative would be a SHARE lock on the table, but that is a bad idea, since that lock conflicts with VACUUM.
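    For illustration, the explicit-lock variant would look like this (schema and values assumed). Note that plain SHARE is not self-conflicting, so two transactions can both acquire it and then deadlock on each other's INSERT; SHARE ROW EXCLUSIVE avoids that, but both modes conflict with the SHARE UPDATE EXCLUSIVE lock taken by VACUUM and autovacuum:

    ```sql
    BEGIN;
    -- SHARE conflicts with the ROW EXCLUSIVE lock that concurrent
    -- INSERTs need, so no new reservations can appear between the
    -- count and our own insert.
    LOCK TABLE reservations IN SHARE MODE;
    -- ... SELECT count(*) as before, then INSERT if under the limit ...
    COMMIT;
    ```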

    As a third suggestion, you could serialize the operations explicitly. If you want to do that with database means, you could have a second table that holds the current reservation count. That table gets updated by a deferred constraint trigger whenever you make a reservation, so that the counter row is locked no longer than necessary to guarantee serialization. A check constraint on that second table then enforces the limit on the reservations.
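    A sketch of that counter-table approach; all names and the capacity handling are assumptions, and releasing expired holds would need a corresponding decrement, omitted here:

    ```sql
    -- One row per event must be seeded in advance, e.g.
    -- INSERT INTO reservation_counts VALUES (42, 0, 1000);
    CREATE TABLE reservation_counts (
        event_id bigint PRIMARY KEY,
        reserved integer NOT NULL DEFAULT 0,
        capacity integer NOT NULL,
        CHECK (reserved <= capacity)  -- enforces the limit
    );

    CREATE FUNCTION bump_reservation_count() RETURNS trigger
    LANGUAGE plpgsql AS
    $$
    BEGIN
        UPDATE reservation_counts
        SET reserved = reserved + 1
        WHERE event_id = NEW.event_id;
        RETURN NULL;  -- return value is ignored for AFTER triggers
    END;
    $$;

    -- DEFERRABLE INITIALLY DEFERRED: the UPDATE (and hence the row
    -- lock on the counter) is taken at COMMIT, keeping the critical
    -- section as short as possible. If the CHECK fails there, the
    -- whole transaction, including the reservation, is rolled back.
    CREATE CONSTRAINT TRIGGER enforce_capacity
    AFTER INSERT ON reservations
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW EXECUTE FUNCTION bump_reservation_count();
    ```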