sql-servert-sqlnot-existssnapshot-isolation

Insert if not exist under RCSI


I have a database with READ_COMMITTED_SNAPSHOT_ISOLATION set ON (cannot change that).

I insert new rows into a table on many parallel sessions, but only if they don't exist there yet (classic left join check).

The inserting code looks like this:

INSERT INTO dbo.Destination(OrderID)
SELECT DISTINCT s.OrderID
FROM dbo.Source s
LEFT JOIN dbo.Destination d ON d.OrderID = s.OrderID
WHERE d.OrderID IS NULL;

If I run this on many parallel sessions I get a lot of duplicate key errors, since different sessions try to insert the same OrderIDs over and over again.

That is expected due to the lack of SHARED locks under RCSI.

The recommended solution here (as per my research) would be to use the READCOMMITTEDLOCK hint like this:

LEFT JOIN dbo.Destination d WITH (READCOMMITTEDLOCK) ON d.OrderID = s.OrderID

This somewhat works, as greatly reduces the duplicate key errors, but (to my surprise) doesn't completely eliminate them.

As an experiment I removed the unique constraint on the Destination table, and saw that many duplicates enters the table in the very same millisecond originated from different sessions.

It seems that despite the table hint, I still get false positive on the existence check, and the redundant insert fires.

I tried different hints (SERIALIZABLE) but it made it worse and swarmed me with deadlocks.

How could I make this insert work under RCSI?


Solution

  • The right lock hint for reading a table you are about to insert into is (UPDLOCK,HOLDLOCK), which will place U locks on the rows as you read them, and also place SERIALIZABLE-style range locks if the row doesn't exist.

    The problem with your approach is that each client is attempting to insert a batch of rows, and each batch has to either succeed completely or fail. If you use row-level locking, you will always have scenarios where a session inserts one row succesfully, but then becomes blocked waiting to read or insert a subsequent row. This inevitably leads to either PK failures or deadlocks, depending on the type of row lock used.

    The solution is to either:

    1) Insert the rows one-by-one, and not hold the locks from one row while you check and insert the next row.

    2) Simply escalate to a tablockx, or an Applciation Lock to force your concurrent sessions to serialize through this bit of code.

    So you can have highly-concurrent loads, or batch loads, but you can't have both. Well mostly.

    3) You could turn on IGNORE_DUP_KEY on the index, which instead of an error will just skip any duplicate when inserting.