I have a question about why some SQL (running on SQL Server 2005) is behaving the way it is. Specifically, I made a change to reduce lock contention during an update and it seems to be working in cases I didn't think it would.
Original code:
We had an update statement like this, that was being applied to a table with more than 3,000,000 records:
UPDATE USER WITH (ROWLOCK)
SET Foo = 'N', Bar = getDate()
WHERE ISNULL(email, '') = ''
AND Foo = 'Y'
As you can probably guess, this seemed to lock up the USER table for a while. Even with the ROWLOCK hint, other jobs running queries and updates against USER would block until this was done. That's not acceptable for this particular application, so I thought I'd apply a trick I read about by having the update statement only update 100 records at a time. This would give other queries a chance to get at the table occasionally.
Improved code:
DECLARE @LOOPAGAIN AS BIT;
SET @LOOPAGAIN = 1;
WHILE @LOOPAGAIN = 1
BEGIN
UPDATE TOP (100) USER WITH (ROWLOCK)
SET Foo = 'N', Bar = getDate()
WHERE ISNULL(email, '') = ''
AND Foo = 'Y'
IF @@ROWCOUNT > 0
SET @LOOPAGAIN = 1
ELSE
SET @LOOPAGAIN = 0
END
This did the trick. Our update did its work and other queries were able to get at the table. All is happiness and light.
The Mystery:
I understand how this improved the performance when there were many records in the table it had to update. By taking a quick run through the loop after every 100 updates, it gave other queries a chance to get at the table. The mystery is that this loop had the same effect even when there were no records affected by the update!
The second time we would run our original query, it would run for only a fraction of the time (say 30 seconds or so), but it would lock up the table during that time even though no records were being changed. But put the query in the loop with the "TOP (100)" clause, and even though it took just as long to do nothing, it freed up the table for other queries!
I'm very surprised at this. Can anyone tell me:
This sounds like a classic case of lock escalation.
In the first scenario you are updating what looks like it could be a lot of records from your 3,000,000 row table. There are two important things to consider:
So based on the above and your description, I would guess that your query, which is attempting to lock rows, is being escalated to a table level lock and is blocking all access to the User table. You notice this blocking because the update is taking a long time since the table is large.
Recommendations to avoid lock escalation are:
See How to resolve blocking problems that are caused by lock escalation in SQL Server for more details.
If you want to verify that lock escalation is what is going on you can use the the SQL Server Profiler and look at the Lock:Escalation event.