network-programmingwindbgkernel-modulewdkndis

DPC_WATCHDOG_VIOLATION BSOD with spilock waiting


I have NDIS FilterDriver, running on Windows 10 (BUILD_VERSION_STRING: 18362.1.amd64fre.19h1_release.190318-1202). With one of the test script on the certain test PC, it fails with DPC_WATCHDOG_VIOLATION BSOD. Callstack looks like below:

nt!KeBugCheckEx
nt!KeAccumulateTicks+0x1815bd
nt!KeClockInterruptNotify+0xc07
hal!HalpTimerClockIpiRoutine+0x21
nt!KiCallInterruptServiceRoutine+0xa5
nt!KiInterruptSubDispatchNoLockNoEtw+0xfa
nt!KiInterruptDispatchNoLockNoEtw+0x37
nt!KxWaitForSpinLockAndAcquire+0x33
nt!KeAcquireSpinLockAtDpcLevel+0x5b
MYDRV!FilterReceiveNetBufferLists+0x1c8 [c:\MYDRV_SRC\my_filter.cpp @ 1768]
ndis!ndisCallReceiveHandler+0x60
ndis!ndisInvokeNextReceiveHandler+0x206cf
ndis!NdisMIndicateReceiveNetBufferLists+0x104
e1i65x64!RECEIVE::RxIndicateNBLs+0x12f
e1i65x64!RECEIVE::RxProcessInterrupts+0x20a
e1i65x64!INTERRUPT::MsgIntDpcTxRxProcessing+0x124
e1i65x64!INTERRUPT::MsgIntMessageInterruptDPC+0x1ff
e1i65x64!INTERRUPT::MiniportMessageInterruptDPC+0x28
ndis!ndisInterruptDpc+0x19c
nt!KiExecuteAllDpcs+0x30a
nt!KiRetireDpcList+0x1ef
nt!KxRetireDpcList+0x5
nt!KiDispatchInterruptContinue
nt!KiDpcInterrupt+0x2ee
nt!RtlpHpSegPageRangeShrink+0x2d5
nt!ExFreeHeapPool+0x751
nt!ExFreePool+0x9
MYDRV!StartReqCancel+0x16f [c:\MYDRV_SRC\my_device.cpp @ 1420] 
nt!IoCancelIrp+0x71
nt!IopCancelIrpsInCurrentThreadList+0x104

MYDRV!StartReqCancel is Cancel routine to user's IRP. Execution stuck in FilterReceiveNetBufferLists here:

NdisDprAcquireSpinLock(&pFilter->startIrpLock);

In MYDRV!StartReqCancel cancel routine this spinlock is locked to free the associated resources, like:

    // Clear Cancel routine
    IoSetCancelRoutine(pIrp, NULL);
    // Release the cancel spinlock
    IoReleaseCancelSpinLock(pIrp->CancelIrql);

    NdisDprAcquireSpinLock(&pFilter->startIrpLock);

    // Clear capture data
    //...
    for (int i=0; i<pFilter->dataN; ++i )
       ExFreePool(pFilter->pCaptData[i]); //non-paged data!
    //...

    NdisDprReleaseSpinLock(&pFilter->startIrpLock);
    
    // Complete the request
    pIrp->IoStatus.Status = STATUS_CANCELLED;
    pIrp->IoStatus.Information = 0;
    IoCompleteRequest(pIrp, IO_NO_INCREMENT);//CAPTURE_START_IRP

It looks like during the ExFreePool call, HW driver gets interrupt for incoming packets, and eventually calls my filter driver's FilterReceiveNetBufferLists, where it tries to acquire the pFilter->startIrpLock spinlock (at DISPATCH_LEVEL), which was locked by MYDRV!StartReqCancel.
But it looks like StartReqCancel never returns from ExFreePool. Value of pFilter->dataN is not too big (<64).

Do you have any ideas why it can be?


Solution

  • Just found a reason of this strange accident, thanks to Driver Verifier:

    //....
    
    // Release the cancel spinlock
    // Here we have  IRQL = DISPATCH_LEVEL here
    IoReleaseCancelSpinLock(pIrp->CancelIrql);
    // Now we have  IRQL < DISPATCH_LEVEL !!!
    
    NdisDprAcquireSpinLock(&pFilter->startIrpLock); // !!! No real spinlock acquisition here!!!
    
    // NON LOCKED code here.
    // ....
    // NdisDprReleaseSpinLock(&pFilter->startIrpLock); // Nothing unlocked here