Say I have:
SELECT id FROM x WHERE timestamp < y
(returns id = [1, 2, 3]
)DELETE FROM x WHERE id = 1
SELECT timestamp, value FROM x WHERE id = 1
with READ-COMMITTED
isolation.
Does step 3 run the risk of returning an empty result, or does step 1 acquire some sort of lock/snapshot that prevents step 2 from altering the result? (I assume REPEATABLE-READ
will do what I want, but this question is about READ-COMMITTED
).
I am using postgresql, but I am interested in a DB-independent answer. For example, if some databases block the delete while others do not, I'd like to know that. Thank you.
The PostgreSQL case:
In PostgreSQL, reading a row doesn't acquire a lock that prevents the row from being deleted concurrently:
If T2 commits before step 3, T1, will see its results and return an empty result set.
If step 3 runs before T2 commits, the results of T2 are not visible yet, and the query will return the matching row.
The general case:
Database system have different ways to provide transaction isolation, and the behavior will vary depending on the method used.
Some database systems, like DB2 or Microsoft SQL Server, will lock rows when they are read to prevent concurrent updates.
On such database systems, the DELETE
will block, and the row is visible in step 3.
Most database systems use some kind of multi-versioning, i.e. they keep old versions of the rows around while a transaction that modifies the row is in progress.
On such database systems, the DELETE
is not blocked, and the result of step 3 will depend on whether T2 is already committed.
A solution:
If you are looking for a solution to make the behavior deterministic on all database systems, you can either use a higher isolation level, or you can use pessimistic locking by specifying FOR UPDATE
in the SELECT
statement in step 1. Then step 2 will always block.