I'm trying to diagnose an application that is hanging randomly after long intervals.
I read about WinDbg's !analyze -v -hang, but I don't know how to make it work with .NET.
I used this simulated deadlock as a target: https://dotnettutorials.net/lesson/deadlock-in-csharp/
But after attaching the debugger and running the command I get this output:
0:008> !analyze -v -hang
*******************************************************************************
* *
* Exception Analysis *
* *
*******************************************************************************
KEY_VALUES_STRING: 1
Key : Analysis.CPU.Sec
Value: 15
Key : Analysis.DebugAnalysisProvider.CPP
Value: Create: 8007007e on MACHINENAME
Key : Analysis.DebugData
Value: CreateObject
Key : Analysis.DebugModel
Value: CreateObject
Key : Analysis.Elapsed.Sec
Value: 15
Key : Analysis.Memory.CommitPeak.Mb
Value: 135
Key : Analysis.System
Value: CreateObject
Key : CLR.Engine
Value: CLR
Key : CLR.Version
Value: 4.0.30319.0
Key : Timeline.OS.Boot.DeltaSec
Value: 79604
Key : Timeline.Process.Start.DeltaSec
Value: 3498
NTGLOBALFLAG: 0
PROCESS_BAM_CURRENT_THROTTLED: 0
PROCESS_BAM_PREVIOUS_THROTTLED: 0
APPLICATION_VERIFIER_FLAGS: 0
CONTEXT: (.cxr;r)
eax=010cb000 ebx=00000000 ecx=7740dd40 edx=7740dd40 esi=7740dd40 edi=7740dd40
eip=773d4e20 esp=0190f9c4 ebp=0190f9f0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000244
ntdll!DbgBreakPoint:
773d4e20 cc int 3
EXCEPTION_RECORD: (.exr -1)
ExceptionAddress: 773d4e20 (ntdll!DbgBreakPoint)
ExceptionCode: 80000003 (Break instruction exception)
ExceptionFlags: 00000000
NumberParameters: 1
Parameter[0]: 00000000
FAULTING_THREAD: 0000521c
PROCESS_NAME: deadlockExample.exe
WATSON_BKT_EVENT: AppHang
BLOCKING_THREAD: 0000521c
ERROR_CODE: (NTSTATUS) 0xcfffffff - <Unable to get error code text>
EXCEPTION_CODE_STR: cfffffff
EXCEPTION_PARAMETER1: 00000000
MISSING_CLR_SYMBOL: 0
DERIVED_WAIT_CHAIN:
Dl Eid Cid WaitType
-- --- ------- --------------------------
2 3328.521c (null)
WAIT_CHAIN_COMMAND: ~2s;k;;
STACK_TEXT:
0190f9c0 7740dd79 6a75cf20 7740dd40 7740dd40 ntdll!DbgBreakPoint
0190f9f0 75920099 00000000 75920080 0190fa5c ntdll!DbgUiRemoteBreakin+0x39
0190fa00 773c7b6e 00000000 6a75cc8c 00000000 KERNEL32!BaseThreadInitThunk+0x19
0190fa5c 773c7b3e ffffffff 773e8c85 00000000 ntdll!__RtlUserThreadStart+0x2f
0190fa6c 00000000 7740dd40 00000000 00000000 ntdll!_RtlUserThreadStart+0x1b
STACK_COMMAND: ~2s ; .cxr ; kb
SYMBOL_NAME: ntdll!DbgBreakPoint+190f9c0
MODULE_NAME: ntdll
IMAGE_NAME: ntdll.dll
FAILURE_BUCKET_ID: APPLICATION_HANG_cfffffff_ntdll.dll!DbgBreakPoint
OS_VERSION: 10.0.19041.1
BUILDLAB_STR: vb_release
OSPLATFORM_TYPE: x86
OSNAME: Windows 10
FAILURE_ID_HASH: {b7187eb3-1b5b-0cf1-721e-1b1a2f6e4ee5}
Followup: MachineOwner
---------
If I'm reading this correctly, it's detecting the debugger breakpoint as the reason for the hang, and not the actual deadlock. How can I make !analyze ignore it and show the real reason?
How can I make !analyze ignore it and show the real reason?
I don't know. Sometimes it works, sometimes it doesn't. I've gotten used to that.
The general approach to analyzing a deadlock is wait chain analysis. This can be really tedious and needs quite some knowledge and a sheet of paper. You'll be looking for particular functions on the call stacks, like WaitForSingleObject, WaitForMultipleObjects, etc. You'd then try to get information about the handles being used (!handle), which needs a crash dump with handle information. You draw all threads on a sheet of paper, along with the handles each one is waiting for.
If you find yourself in a situation where something is waiting for a handle and you can't find a thread owning that handle, consider that the owning thread may have died (exited via an exception, not freeing the handle). It's hard to see something which is not there.
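To make the sheet-of-paper bookkeeping concrete, here is a minimal Python sketch of the same idea. It is not something WinDbg provides: the thread names, lock names and both helper functions are hypothetical, and it assumes you have already written down by hand which thread waits for which handle and which thread owns it.

```python
from typing import Dict, List, Optional, Set


def find_deadlock(waits_for: Dict[str, str],
                  owned_by: Dict[str, str]) -> Optional[List[str]]:
    """Return the threads that form a wait cycle, or None if there is none.

    waits_for: thread -> lock/handle the thread is blocked on
    owned_by:  lock/handle -> thread currently owning it
    """
    for start in waits_for:
        chain = [start]
        current = start
        while True:
            lock = waits_for.get(current)
            if lock is None:
                break  # current thread is not blocked, so the chain ends here
            owner = owned_by.get(lock)
            if owner is None:
                break  # no known owner: not a cycle, maybe an orphaned lock
            if owner in chain:
                # We are back at a thread already in the chain: a cycle.
                return chain[chain.index(owner):]
            chain.append(owner)
            current = owner
    return None


def find_orphaned_waits(waits_for: Dict[str, str],
                        owned_by: Dict[str, str],
                        live_threads: Set[str]) -> Dict[str, str]:
    """Return waits whose owner is unknown or is no longer a live thread."""
    return {thread: lock for thread, lock in waits_for.items()
            if owned_by.get(lock) not in live_threads}


# Hypothetical notes taken from the call stacks of two threads.
waits_for = {"T1": "lockB", "T2": "lockA"}
owned_by = {"lockA": "T1", "lockB": "T2"}
print(find_deadlock(waits_for, owned_by))  # prints ['T1', 'T2']
```

The second helper covers the case described above where the owner has died: any wait whose owner is not in the list of live threads is reported, so the thing that is not there becomes visible.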
Some synchronization objects don't use handles (kernel objects) but work in user mode, like critical sections (someone correct me if I'm wrong). Luckily, WinDbg sometimes has its own commands for such objects, like the !locks command.
Now, WinDbg is a native debugger by default, which means it's good at assembler, raw memory, Windows APIs and stuff that is really on the call stack.
.NET is a bit different. It uses IL and has its own memory organization due to GC. Therefore, WinDbg knows almost nothing about .NET except that memory is allocated using VirtualAlloc() (the lowest possible level) and that eventually some code is compiled to machine code which is then executed by the CPU. All the .NET stuff is basically unknown to WinDbg and thus invisible.
That's why you need a .NET-specific extension called SOS, which comes with .NET Framework and can be obtained for .NET Core. SOS has some commands that can help you with synchronization objects, like !syncblk or !threads. Basically, that will help you with the lock(...) statement of .NET.
SOSEx has a !dlk command to detect deadlocks. As far as I can recall, it will detect .NET lock(...) statements, ReaderWriterLock, ReaderWriterLockSlim and critical sections. Unfortunately, it's built for .NET Framework and I'm not sure it'll work with .NET Core.