sql-server, t-sql, sql-server-2012

Using WITH (NOLOCK) and Transactions


Say I have a simple query such as

SELECT * FROM MyTable WITH (NOLOCK);

And, at the same time this query is executing, another user is inserting 100 rows into this table within the context of a transaction.

Is it theoretically possible that the SELECT statement, because it's using NOLOCK, could read a subset of the 100 rows being inserted into the table before the transaction is committed or rolled back? If I understand NOLOCK correctly, this seems like it could be a possibility, but I wanted to verify.

I'm using SQL Server 2012.


Solution

  • Certainly, you can read a subset or all of the uncommitted data that is affected between the beginning of a transaction and the commit or rollback. That's kind of the point of NOLOCK - to allow you to read data that is not committed so that you don't have to wait for writers, and to avoid placing most locks so that writers don't have to wait for you.
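
    As an aside, the NOLOCK hint is just the per-table form of the READ UNCOMMITTED isolation level, so the query in the question can be written either way. A quick sketch using the question's table name:

    -- Per-table hint, as in the question:
    SELECT * FROM MyTable WITH (NOLOCK);

    -- Session-level equivalent; applies to every table this session reads:
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    SELECT * FROM MyTable;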

    Proof #1

    This is very easy to prove. In one window, create this table:

    CREATE TABLE dbo.what(id INT);
    

    In a second window, run this query:

    DECLARE @id INT;

    -- Poll under NOLOCK until the dirty read sees the uncommitted row:
    WHILE 1 = 1
    BEGIN
        SELECT @id = id FROM dbo.what WITH (NOLOCK) WHERE id = 2;

        IF @id = 2
        BEGIN
            PRINT @id;
            BREAK;
        END
    END
    

    Now go back to the first window and start an intentionally long-running transaction, but roll it back. (GO 10000 is an SSMS/sqlcmd batch-separator trick: it runs the preceding batch 10,000 times.)

    BEGIN TRANSACTION;
    GO
    
    INSERT dbo.what SELECT 2;
    GO 10000
    
    ROLLBACK TRANSACTION;
    

    As soon as you start this in the first window, the query in the second window will halt and spit out the uncommitted value that was read - even though that insert is ultimately rolled back.
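
    Once the rollback in the first window completes, a plain query under the default READ COMMITTED isolation confirms that the value the loop printed never became committed data:

    SELECT COUNT(*) FROM dbo.what;  -- returns 0 after the rollback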

    Proof #2

    This is primarily to contest @Blam's comment above, which I don't agree with:

    "Actually I think you could read all 100 not just a subset prior to the commit or rollback."

    You most certainly can read subsets of the rows affected by a transaction. Try the following, similar example, this time inserting sets of 100 rows into the table, 1000 times, and retrieving the count in the (NOLOCK) query. Window #1 (only needed if you didn't already create the table for proof #1 above):

    CREATE TABLE dbo.what(id INT);
    

    Window #2:

    DECLARE @c INT;

    -- Poll the count under NOLOCK, printing every non-zero value observed:
    WHILE 1 = 1
    BEGIN
        SELECT @c = COUNT(*) FROM dbo.what WITH (NOLOCK) WHERE id = 2;

        IF @c > 0
            PRINT @c;

        IF @c > 10000
            BREAK;
    END
    

    Back in Window #1:

    BEGIN TRANSACTION;
    GO
    
    INSERT dbo.what SELECT TOP (100) 2 FROM sys.all_objects;
    GO 1000
    
    ROLLBACK TRANSACTION;
    

    Window #2 will spin until you start the transaction in Window #1. As soon as you do, you'll start seeing counts trickle in. But they won't be in multiples of 100 (never mind 100,000, the all-or-nothing claim @Blam seems to be making). Here are my abridged results:

    1
    10
    12
    14
    17
    19
    23
    25
    29
    ...
    85
    87
    91
    95
    98
    100
    100
    ...
    9700
    9700
    9763
    9800
    9838
    9900
    9936
    10000
    10000
    10000
    10080
    

    The NOLOCK query clearly does not wait for any single statement to finish before reading the data, never mind the entire transaction. So you can see the data in any state of flux, regardless of how many rows each statement affects and how many statements the transaction contains.
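
    For contrast, drop the hint and the second window waits instead of reading: under the default READ COMMITTED isolation, the count query blocks on the inserting transaction's locks and returns nothing until the commit or rollback:

    SELECT COUNT(*) FROM dbo.what WHERE id = 2;  -- blocks while the transaction is open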

    Other side effects

    There are also cases where NOLOCK can skip rows, or read the same row twice, depending on the type of scan and when another transaction is generating page splits. Essentially, while the (NOLOCK) query is reading data, other writes can physically move rows to a different location - either moving a row you've already read to a point further ahead in your scan (so you read it twice), or moving a row you haven't read yet to a point behind your scan (so you miss it entirely).
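
    A minimal sketch of a setup that can exhibit this - the table name and row counts are illustrative, and because the anomaly is timing-dependent it may take several attempts to reproduce. Create and fill a clustered table in Window #1:

    CREATE TABLE dbo.splitme
    (
        id     UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY CLUSTERED,
        filler VARCHAR(3000)    NOT NULL DEFAULT 'x'
    );

    INSERT dbo.splitme (filler)
      SELECT TOP (10000) 'x'
      FROM sys.all_objects AS o1
      CROSS JOIN sys.all_objects AS o2;

    Then start this loop in Window #2:

    -- The table always holds exactly 10,000 committed rows, so any other
    -- count means the scan skipped a row or read the same row twice:
    DECLARE @c INT;

    WHILE 1 = 1
    BEGIN
        SELECT @c = COUNT(*) FROM dbo.splitme WITH (NOLOCK);

        IF @c <> 10000
        BEGIN
            PRINT @c;
            BREAK;
        END
    END

    And finally, back in Window #1, widen every row, which forces page splits that physically move rows while the (NOLOCK) scan is underway:

    UPDATE dbo.splitme SET filler = REPLICATE('x', 3000);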

    Advice

    In general, it's bad news, and you should consider READ_COMMITTED_SNAPSHOT instead - it has the same benefit of allowing readers to not block writers and vice-versa, but it gives you a consistent view of the data at a point in time, ignoring all subsequent data modifications (it does shift work to tempdb, though, so be sure to test it).
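
    Turning it on is a single database-level change (the database name here is illustrative; WITH ROLLBACK IMMEDIATE kicks out in-flight transactions so the option can take effect):

    ALTER DATABASE YourDatabase
        SET READ_COMMITTED_SNAPSHOT ON
        WITH ROLLBACK IMMEDIATE;

    -- From then on, plain READ COMMITTED queries get a consistent,
    -- statement-level snapshot without any NOLOCK hints:
    SELECT * FROM dbo.what;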