sql-serverdeadlocklocks

Avoiding deadlocks - How to lock on row level


I am experiencing a deadlock which I am not able to replicate experimentally. I believe from my experiments I figured out how to solve it but I would like to learn the underlying principle

Here is the SQL Server provided graph of the lock. From the object ID you can infer, that both processes are trying to access the same table. enter image description here

The Table looks like this. (I used a global temp table for testing)

CREATE TABLE ##tMyTable(
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Fk_1_Id] [int] NOT NULL,
    [Fk_2_Id] [int] NULL,
    [Fk_3_Id] [int] NULL,
    [Day] [date] NOT NULL,
    [Quantity] [decimal](19, 3) NOT NULL,
 CONSTRAINT [Pk_MyTable_Id] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

+ foreign key stuff. 

The columns with the prefix Fk_ are foreign keys.

There is also an index on the table:

CREATE NONCLUSTERED INDEX [Idx_IndexName] ON ##tMyTable
(
    [Fk_1_Id] ASC,
    [Fk_2_Id] ASC--,
    [Day] ASC,
    [Id] ASC
)
INCLUDE([Fk_3_Id],[Quantity]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

The aborted process was a delete statement:

delete from ##tMyTable where Fk_2_Id = 86416

The process which was allowed to live was an insert. (dates are german format).

insert into ##tMyTable (Fk_1_Id, Fk_2_Id, Day, Quantity) select Fk_1_Id, 86415, Day, Quantity from tOtherTableWithSimilarColumns where Fk_4_Id = 22 and Day >= '01.07.2024' and Day < '01.08.2024'

For testing I set up two sessions with the following code.

Begin transaction

Select @@SPID as FirstTransactionProcessID

Query I want to test

commit

If I execute the code without including commit the transactions are locking their resources and I can simulate the deadlock situation.

With exec sp_lock I can check locks afterwards.

Before starting I inserted the following into the table:

insert into ##tMyTable (Fk_1_Id, Fk_2_Id, Fk_3_Id, Tag, Menge)
Select 1, 1, NULL, '2024-01-01', 22.404

In session 1 I executed this:

insert into ##tMyTable (Fk_1_Id, Fk_2_Id, Fk_3_Id, Tag, Menge)
Select 2, 2, NULL, '2024-01-01', 22.404

And then in session 2 I executed this:

delete from ##tMyTable where Fk_2_Id = 1

The result was this: enter image description here

spid 77 is the insert and spid 69 is the delete.

As you can see, the delete tries to get a U lock on a key but does not get it because the insert X locks the key. However, I expected to see the insert locking a page with an IX lock while the delete wants the same page for a U lock.

Question: What about my testsetup is wrong? Why am I getting different locks compared to my original deadlock? And why doesn't the delete just put an IX lock on the key and operates on the level of the row? Two IX locks are compatible and the queries shouldn't bother each other. They are aiming at different IDs after all and therefore different rows.

What seems to solve my problem: If I create an index that only includes Fk_2_Id there are no locking conflicts in my testsetup. However, I would like to avoid "fixing" something without understanding why my solution works.

Deadlock XML:

<deadlock>
 <victim-list>
  <victimProcess id="process1f4f20e5468" />
 </victim-list>
 <process-list>
  <process id="process1f4f20e5468" taskpriority="0" logused="0" waitresource="PAGE: 8:1:11652007 " waittime="2865" ownerId="213325556" transactionname="DELETE" lasttranstarted="2024-08-04T10:02:44.943" XDES="0x1f4e33c3aa0" lockMode="U" schedulerid="2" kpid="9392" status="suspended" spid="56" sbid="0" ecid="2" priority="0" trancount="0" lastbatchstarted="2024-08-04T10:02:44.937" lastbatchcompleted="2024-08-04T10:02:44.933" lastattention="1900-01-01T00:00:00.933" clientapp=".Net SqlClient Data Provider" hostname="Our-Host" hostpid="11672" isolationlevel="read committed (2)" xactid="213325556" currentdb="8" currentdbname="Our_DB" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
   <executionStack>
    <frame procname="adhoc" line="1" stmtend="152" sqlhandle="0x02000000cff49033589c5bbc920f9e3e6c20dd52ec7b2a6b0000000000000000000000000000000000000000">
unknown    </frame>
    <frame procname="adhoc" line="1" stmtend="152" sqlhandle="0x02000000a28e5e12abc21669e1bfc6fb7c33f25bfee40b1c0000000000000000000000000000000000000000">
unknown    </frame>
   </executionStack>
   <inputbuf>
delete from tMyTable where Fk_2_Id = 86416   </inputbuf>
  </process>
  <process id="process1f4f5011c28" taskpriority="0" logused="9944" waitresource="PAGE: 8:1:11631539 " waittime="4125" ownerId="213327344" transactionname="INSERT" lasttranstarted="2024-08-04T10:02:46.687" XDES="0x1f4e45e4460" lockMode="IX" schedulerid="4" kpid="11516" status="suspended" spid="68" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2024-08-04T10:02:46.657" lastbatchcompleted="2024-08-04T10:02:46.653" lastattention="1900-01-01T00:00:00.653" clientapp=".Net SqlClient Data Provider" hostname="Our-Host" hostpid="11672" loginname="loginname" isolationlevel="read committed (2)" xactid="213327344" currentdb="8" currentdbname="Our_DB" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
   <executionStack>
    <frame procname="adhoc" line="1" stmtend="568" sqlhandle="0x02000000679aaf081f4d1ffe0710dbf9b238a8db3e6adb6b0000000000000000000000000000000000000000">
unknown    </frame>
   </executionStack>
   <inputbuf>
       insert into tMyTable (Fk_1_Id, Fk_2_Id, Day, Quantity) select Fk_1t_Id, 86415, Day, Quantity from tOtherTableWithSimilarColumns where Fk_4_Id = 22 and Day &gt;= '01.07.2024' and Day &lt; '01.08.2024'   </inputbuf>
  </process>
 </process-list>
 <resource-list>
  <pagelock fileid="1" pageid="11652007" dbid="8" subresource="FULL" objectname="Our_DB.dbo.tMyTable" id="lock1f7bf84a200" mode="IX" associatedObjectId="72057661408477184">
   <owner-list>
    <owner id="process1f4f5011c28" mode="IX" />
   </owner-list>
   <waiter-list>
    <waiter id="process1f4f20e5468" mode="U" requestType="wait" />
   </waiter-list>
  </pagelock>
  <pagelock fileid="1" pageid="11631539" dbid="8" subresource="FULL" objectname="Our_DB.dbo.tMyTable" id="lock1f4e3647700" mode="U" associatedObjectId="72057661408477184">
   <owner-list>
    <owner id="process1f4f20e5468" mode="U" />
   </owner-list>
   <waiter-list>
    <waiter id="process1f4f5011c28" mode="IX" requestType="wait" />
   </waiter-list>
  </pagelock>
 </resource-list>
</deadlock>

Solution

  • Each Foreign Key should be supported by an index, and you're missing an index on FK_2_id.

    The DELETE can't make use of the Idx_IndexName index, because the leading key is Fk_1_Id which is not being seeked in the DELETE. So instead it's scanning the whole table, which causes massive deadlocking problems. It's a classic Bookmark Deadlock.

    So you need an additional index.