.net clr deadlock windbg hung

WinDbg !analyze -v -hang and .NET not producing the expected result


I'm trying to diagnose an application that is hanging randomly after long intervals.

I read about WinDbg's !analyze -v -hang, but I don't know how to make it work with .NET.

I used this simulated deadlock as a target: https://dotnettutorials.net/lesson/deadlock-in-csharp/

But after attaching the debugger and running the command I get this output:

0:008> !analyze -v -hang
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************


KEY_VALUES_STRING: 1

    Key  : Analysis.CPU.Sec
    Value: 15

    Key  : Analysis.DebugAnalysisProvider.CPP
    Value: Create: 8007007e on MACHINENAME

    Key  : Analysis.DebugData
    Value: CreateObject

    Key  : Analysis.DebugModel
    Value: CreateObject

    Key  : Analysis.Elapsed.Sec
    Value: 15

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 135

    Key  : Analysis.System
    Value: CreateObject

    Key  : CLR.Engine
    Value: CLR

    Key  : CLR.Version
    Value: 4.0.30319.0

    Key  : Timeline.OS.Boot.DeltaSec
    Value: 79604

    Key  : Timeline.Process.Start.DeltaSec
    Value: 3498


NTGLOBALFLAG:  0

PROCESS_BAM_CURRENT_THROTTLED: 0

PROCESS_BAM_PREVIOUS_THROTTLED: 0

APPLICATION_VERIFIER_FLAGS:  0

CONTEXT:  (.cxr;r)
eax=010cb000 ebx=00000000 ecx=7740dd40 edx=7740dd40 esi=7740dd40 edi=7740dd40
eip=773d4e20 esp=0190f9c4 ebp=0190f9f0 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000244
ntdll!DbgBreakPoint:
773d4e20 cc              int     3

EXCEPTION_RECORD:  (.exr -1)
ExceptionAddress: 773d4e20 (ntdll!DbgBreakPoint)
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 1
   Parameter[0]: 00000000

FAULTING_THREAD:  0000521c

PROCESS_NAME:  deadlockExample.exe

WATSON_BKT_EVENT:  AppHang

BLOCKING_THREAD:  0000521c

ERROR_CODE: (NTSTATUS) 0xcfffffff - <Unable to get error code text>

EXCEPTION_CODE_STR:  cfffffff

EXCEPTION_PARAMETER1:  00000000

MISSING_CLR_SYMBOL: 0

DERIVED_WAIT_CHAIN:  

Dl Eid Cid     WaitType
-- --- ------- --------------------------
   2   3328.521c (null)                 

WAIT_CHAIN_COMMAND:  ~2s;k;;

STACK_TEXT:  
0190f9c0 7740dd79 6a75cf20 7740dd40 7740dd40 ntdll!DbgBreakPoint
0190f9f0 75920099 00000000 75920080 0190fa5c ntdll!DbgUiRemoteBreakin+0x39
0190fa00 773c7b6e 00000000 6a75cc8c 00000000 KERNEL32!BaseThreadInitThunk+0x19
0190fa5c 773c7b3e ffffffff 773e8c85 00000000 ntdll!__RtlUserThreadStart+0x2f
0190fa6c 00000000 7740dd40 00000000 00000000 ntdll!_RtlUserThreadStart+0x1b


STACK_COMMAND:  ~2s ; .cxr ; kb

SYMBOL_NAME:  ntdll!DbgBreakPoint+190f9c0

MODULE_NAME: ntdll

IMAGE_NAME:  ntdll.dll

FAILURE_BUCKET_ID:  APPLICATION_HANG_cfffffff_ntdll.dll!DbgBreakPoint

OS_VERSION:  10.0.19041.1

BUILDLAB_STR:  vb_release

OSPLATFORM_TYPE:  x86

OSNAME:  Windows 10

FAILURE_ID_HASH:  {b7187eb3-1b5b-0cf1-721e-1b1a2f6e4ee5}

Followup:     MachineOwner
---------

If I'm reading this correctly, it's detecting the debugger breakpoint as the reason for the hang, and not the deadlock. How can I make !analyze ignore it and show the real reason?


Solution

  • How can I make !analyze ignore it and show the real reason?

    I don't know. Sometimes it works, sometimes not. I got used to that.

    Potential other solutions if it does not work.

    Handle analysis

    The general approach to analyzing a deadlock is wait chain analysis. This can be really tedious and needs quite some knowledge and a sheet of paper. You'll be looking for wait functions on the native call stacks, like WaitForSingleObject, WaitForMultipleObjects, etc. You then try to get information about the handles being waited on (!handle), which needs a crash dump that includes handle information. You draw all threads on a sheet of paper, along with the handles each one is waiting for.

    If you find yourself in a situation where something is waiting for a handle and you can't find a thread owning that handle, consider that the owning thread may have died (exited via an exception, not freeing the handle). It's hard to see something which is not there.
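
    A minimal sketch of that workflow in WinDbg commands, assuming a live attach or a dump that contains handle data. The handle value 1f4 is a made-up placeholder; use whatever value shows up as the first argument to the wait function on your own stacks (on x86 the argument columns of kb are usually usable for this, though not guaranteed).

    ```
    $$ Native stacks of all threads, including the first three arguments per frame
    ~*kb
    $$ Details for one handle taken from a wait call (1f4 is a hypothetical value)
    !handle 1f4 f
    ```

    For some object types, such as mutexes, the !handle output may include the owning thread, which gives you the next link in the chain.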

    Other synchronization object analysis

    Some synchronization objects don't use handles (kernel objects) but work in user mode, like critical sections (someone correct me if I'm wrong). Luckily, WinDbg sometimes has its own commands for such objects, like the !locks command.
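
    A rough sketch of the commands that apply to critical sections. Both are native WinDbg extension commands; what they print depends on whether any critical section is actually held at the time.

    ```
    $$ List locked critical sections together with their owning thread
    !locks
    $$ Alternative view: only the locked critical sections
    !cs -l
    ```

    The OwningThread value is an OS thread ID, so ~~[<id>]s switches to that thread and k shows what it is doing.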

    .NET hangs

    Now, WinDbg is a native debugger by default, which means it's good at assembly code, raw memory, Windows APIs and anything that is physically on the native call stack.

    .NET is a bit different. It uses IL and has its own memory organization due to GC. Therefore, WinDbg knows almost nothing about .NET except that memory is allocated using VirtualAlloc() (the lowest possible level) and that eventually some code is compiled to machine code which is then executed by the CPU. All the .NET stuff is basically unknown to WinDbg and thus invisible.

    That's why you need a .NET specific extension called SOS, which comes with .NET Framework and can be obtained for .NET Core. SOS has some commands that can help you with synchronization objects: !syncblk lists the sync blocks that back held locks, and !threads lists the managed threads. Basically that will help you with the lock(...) statement of .NET.
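
    For the .NET Framework 4 target from the question, a minimal SOS session could look like the sketch below. It assumes the debugger bitness matches the process (32-bit here) and that clr.dll is already loaded; for .NET Core the loading step is different.

    ```
    $$ Load the SOS that ships next to the loaded clr.dll
    .loadby sos clr
    $$ Managed threads, with their OS thread IDs and lock counts
    !threads
    $$ Sync blocks that are currently held, with the owning thread
    !syncblk
    $$ Managed call stack of every thread
    ~*e !clrstack
    ```

    In a classic lock(...) deadlock, !syncblk typically lists two or more held sync blocks with different owners, and the !clrstack output of each owner shows it stuck trying to enter the lock that the other one holds.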

    SOSEx has a !dlk command to detect deadlocks. As far as I can recall, it will detect deadlocks involving .NET lock(...) statements, ReaderWriterLock, ReaderWriterLockSlim and critical sections. Unfortunately it's built for .NET Framework and I'm not sure it'll work with .NET Core.
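
    Usage is short. The path below is a placeholder for wherever sosex.dll was unpacked, and the DLL bitness has to match the debugger and the target.

    ```
    $$ Load SOSEx (hypothetical path)
    .load C:\path\to\sosex.dll
    $$ Search for deadlocks
    !dlk
    ```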