I am trying to find the source of an intermittent stack overflow exception on a live ASP.NET 4.0 site. I have captured some crash dumps with ADPlus (adplus_old.vbs, not the new adplus.exe) and a custom configuration that ignores all other types of exception (see the first answer here: Help catching StackOverflowException with WinDbg and ADPlus).
I am running WinDbg on the same server that the application runs on. The server is Intel-based and runs 64-bit Windows Server 2003. The WinDbg version is 6.12 for 64-bit. I generated PDB files for the assemblies that I suspect the exception is coming from and put them in the site's /bin folder (there are several assemblies I have no PDB files for, but I assume they are not implicated in this problem). I pointed the environment variable _NT_SYMBOL_PATH to the /bin folder. WinDbg shows the symbol path as: C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319;D:\InetPub\LiveSites\MySite\bin
After opening the crash dump in WinDbg, I run .loadby sos clr and then !clrstack. The output is minimal, whereas when I set up a deliberate stack overflow on a test site I got a clear indication of the method causing the exception. What is going wrong?
Microsoft (R) Windows Debugger Version 6.12.0002.633 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [F:\ADPlus\Crash_Mode__Date_02-03-2012__Time_14-36-11PM\PID-9212__W3WP.EXE_-MySite-__1st_chance_StackOverflow__full_0140_2012-02-04_00-24-45-123_23fc.dmp]
User Mini Dump File with Full Memory: Only application data is available
Comment: '1st_chance_StackOverflow_exception_in_W3WP.EXE_-MySite-_running_on_MyServer'
Symbol search path is: C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319;D:\InetPub\LiveSites\MySite\bin
Executable search path is:
Windows Server 2003 Version 3790 (Service Pack 2) MP (8 procs) Free x64
Product: Server, suite: Enterprise TerminalServer SingleUserTS
Machine Name:
Debug session time: Sat Feb 4 00:24:50.000 2012 (UTC + 0:00)
System Uptime: 185 days 15:16:43.314
Process Uptime: 0 days 9:49:58.000
................................................................
................................................................
................................................................
................................................................
.....................................
Loading unloaded module list
..
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(23fc.21ec): Stack overflow - code c00000fd (first/second chance not available)
*** ERROR: Symbol file could not be found. Defaulted to export symbols for clr.dll -
clr!LogHelp_TerminateOnAssert+0x49e13:
00000644`7f14d883 894c2448 mov dword ptr [rsp+48h],ecx ss:00000000`0ee55fc8=00000000
0:054> .loadby sos clr
0:054> !clrstack
PDB symbol for clr.dll not loaded
OS Thread Id: 0x21ec (54)
Child SP IP Call Site
000000000eecf138 000006447f14d883 [GCFrame: 000000000eecf138]
000000000eecf178 000006447f14d883 [ContextTransitionFrame: 000000000eecf178]
000000000eecf1b8 000006447f14d883 [GCFrame: 000000000eecf1b8]
000000000eecf3a0 000006447f14d883 [ComMethodFrame: 000000000eecf3a0]
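The "PDB symbol for clr.dll not loaded" line above suggests that symbol resolution is at least part of the picture. A minimal sketch of WinDbg commands that could be used to check what was actually loaded follows. It is a diagnostic outline under the assumption that the dump is already open in the same session, not a confirmed fix, and no output is shown because none was captured.

```
$$ Show the symbol path currently in effect
.sympath
$$ Turn on verbose symbol-loading diagnostics for subsequent loads
!sym noisy
$$ Force a reload of symbols for clr.dll so the diagnostics are printed
.reload /f clr.dll
$$ Show which symbol file, if any, was matched for the clr module
lm v m clr
$$ Switch to the stored exception context and dump the native stack
.ecxr
kb
```

The native stack from kb does not depend on SOS, so it may still show a repeating frame pattern even when !clrstack prints only a few frames.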
This is my ADPlus command:
adplus_old.vbs -p 12345 -c F:\ADPlus\DumpStackOverflow.cfg
(where 12345 is the PID of the w3wp.exe process I am attaching to).
This is my ADPlus config file:
<ADPlus>
  <Settings>
    <RunMode>CRASH</RunMode>
    <OutputDir>F:\ADPlus</OutputDir>
  </Settings>
  <Exceptions>
    <Option>FullDumpOnFirstChance</Option>
    <Option>MiniDumpOnSecondChance</Option>
    <Option>NoDumpOnFirstChance</Option>
    <Option>NoDumpOnSecondChance</Option>
    <Config>
      <Code>AllExceptions</Code>
      <Actions1>Void</Actions1>
      <Actions2>Void</Actions2>
      <ReturnAction1>GN</ReturnAction1>
      <ReturnAction2>GN</ReturnAction2>
    </Config>
    <Config>
      <!--
        av = AccessViolation
        ch = InvalidHandle
        ii = IllegalInstruction
        dz = IntegerDivide
        c000008e = FloatingDivide
        iov = IntegerOverflow
        lsq = InvalidLockSequence
        sov = StackOverflowException
        eh = CPlusPlusEH
        * = UnknownException
        clr = NET_CLR
        bpe = CONTRL_C_OR_Debug_Break
        ld = DLL_Load
        ud = DLL_UnLoad
        epr = Process_Shut_Down
        sbo = Stack_buffer_overflow
      -->
      <Code>sov;sbo</Code>
      <Actions1>Log;Time;Stack;FullDump;EventLog</Actions1>
      <CustomActions1>!runaway</CustomActions1>
      <Actions2>Log;Time;Stack;FullDump;EventLog</Actions2>
      <CustomActions2>!runaway</CustomActions2>
      <!--
        G = go
        GN = go unhandled exception
        GH = go handled exception
        Q = quit
        QD = quit and detach
      -->
      <ReturnAction1>GN</ReturnAction1>
      <ReturnAction2>GN</ReturnAction2>
    </Config>
    <Config>
      <Code>clr</Code>
      <Actions1>Void</Actions1>
      <Actions2>Log;Time;Stack;FullDump;EventLog</Actions2>
      <ReturnAction1>GN</ReturnAction1>
      <ReturnAction2>GN</ReturnAction2>
    </Config>
    <Config>
      <Code>epr</Code>
      <Actions1>Log;Time;Stack;FullDump;EventLog</Actions1>
      <Actions2>Void</Actions2>
      <ReturnAction1>GN</ReturnAction1>
      <ReturnAction2>GN</ReturnAction2>
    </Config>
  </Exceptions>
</ADPlus>
Running "!analyze -v" may provide more information about the exception.