sqlsql-serversql-server-2005

Why Does SQL Update Top apparently reduce locking even when no records are updated?


I have a question about why some SQL (running on SQL Server 2005) is behaving the way it is. Specifically, I made a change to reduce lock contention during an update and it seems to be working in cases I didn't think it would.

Original code:

We had an update statement like this, that was being applied to a table with more than 3,000,000 records:

UPDATE  USER WITH (ROWLOCK)
SET Foo = 'N', Bar = getDate()
WHERE ISNULL(email, '') = ''
AND   Foo = 'Y'

As you can probably guess, this seemed to lock up the USER table for a while. Even with the ROWLOCK hint, other jobs running queries and updates against USER would block until this was done. That's not acceptable for this particular application, so I thought I'd apply a trick I read about by having the update statement only update 100 records at a time. This would give other queries a chance to get at the table occasionally.

Improved code:

DECLARE @LOOPAGAIN AS BIT;
SET @LOOPAGAIN = 1;

WHILE @LOOPAGAIN = 1
  BEGIN
    UPDATE TOP (100) USER WITH (ROWLOCK)
    SET Foo = 'N', Bar = getDate()
    WHERE ISNULL(email, '') = ''
    AND   Foo = 'Y'

    IF @@ROWCOUNT > 0
      SET @LOOPAGAIN = 1
    ELSE
      SET @LOOPAGAIN = 0
  END

This did the trick. Our update did its work and other queries were able to get at the table. All is happiness and light.

The Mystery:

I understand how this improved the performance when there were many records in the table it had to update. By taking a quick run through the loop after every 100 updates, it gave other queries a chance to get at the table. The mystery is that this loop had the same effect even when there were no records affected by the update!

The second time we would run our original query, it would run for only a fraction of the time (say 30 seconds or so), but it would lock up the table during that time even though no records were being changed. But put the query in the loop with the "TOP (100)" clause, and even though it took just as long to do nothing, it freed up the table for other queries!

I'm very surprised at this. Can anyone tell me:

  1. If what I just said is at all clear and,
  2. Why the second block of code allows other queries to get at the table even when there are no records being updated?

Solution

  • This sounds like a classic case of lock escalation.

    In the first scenario you are updating what looks like it could be a lot of records from your 3,000,000 row table. There are two important things to consider:

    1. SQL Server 2005 will escalate your lock when 5,000 locks are acquired on a single table or index. There are caveats and exceptions to this so see Lock Escalation (Database Engine) for more information.
    2. Lock hints such as ROWLOCK do not prevent lock escalation.
    3. "The Database Engine does not escalate row or key-range locks to page locks, but escalates them directly to table locks."

    So based on the above and your description, I would guess that your query, which is attempting to lock rows, is being escalated to a table level lock and is blocking all access to the User table. You notice this blocking because the update is taking a long time since the table is large.

    Recommendations to avoid lock escalation are:

    1. Break up large operations to smaller operations (you've done this!).
    2. Tune your query to be as efficient as possible.
    3. As a last resort you can set trace flag 1211 to disable lock escalation (not recommended!).

    See How to resolve blocking problems that are caused by lock escalation in SQL Server for more details.

    If you want to verify that lock escalation is what is going on you can use the the SQL Server Profiler and look at the Lock:Escalation event.