.net windbg

.NET Core Dump - how to get the function param value if not on the stack?


I have a dump of the msbuild process, which crashes with an OutOfMemoryException at the following call stack:

0:053> !mk
Thread 53:
        SP       IP
00:U 0747e070 76fbf3ec ntdll!NtWaitForMultipleObjects+0xc
01:U 0747e074 74d64ae0 KERNELBASE!WaitForMultipleObjectsEx+0xf0
02:U 0747e208 74d649d8 KERNELBASE!WaitForMultipleObjects+0x18
03:U 0747e224 744971f2 kernel32!WerpReportFaultInternal+0x59d
04:U 0747e6a0 74496c36 kernel32!WerpReportFault+0x9b
05:U 0747e6bc 7446e629 kernel32!BasepReportFault+0x19
06:U 0747e6c4 74ded71a KERNELBASE!UnhandledExceptionFilter+0x25a
07:U 0747e75c 76fec6e1 ntdll!__RtlUserThreadStart+0x3be57
08:M 0747f584 718ab33b System.IO.MemoryStream.set_Capacity(Int32)(+0x57 IL,+0x5b Native)
09:M 0747f598 718ab3c3 System.IO.MemoryStream.EnsureCapacity(Int32)(+0x68 IL,+0x53 Native)
0a:M 0747f5a8 718b816a System.IO.MemoryStream.SetLength(Int64)(+0x60 IL,+0xba Native)
0b:M 0747f5c4 6ebe93e6 Microsoft.Build.BackEnd.NodeProviderOutOfProcBase+NodeContext.HeaderReadComplete(System.IAsyncResult)(+0x96 IL,+0x1f6 Native) [/_/src/Build/BackEnd/Components/Communications/NodeProviderOutOfProcBase.cs @ 959,17]
0c:M 0747f600 7032a6d6 System.IO.Pipes.PipeStream.AsyncPSCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)(+0xa6 Native)
0d:M 0747f618 7193565d System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)(+0x78 IL,+0x8d Native)
0e:U 0747fa70 76fb0884 ntdll!_RtlUserThreadStart+0x1b

Inspecting the stack frames does not help:

0:053> !mframe a
0:053> !mdv
Frame 0xa: (System.IO.MemoryStream.SetLength(Int64)):
[A0]:this: <UNAVAILABLE> (System.IO.MemoryStream)
[A1]:value: <UNAVAILABLE> (System.Int64)
[L0]: <UNAVAILABLE> (System.Int32)
[L1]: <UNAVAILABLE> (System.Boolean)

0:053> !mframe 9
0:053> !mdv
Frame 0x9: (System.IO.MemoryStream.EnsureCapacity(Int32)):
[A0]:this: <UNAVAILABLE> (System.IO.MemoryStream)
[A1]:value: <UNAVAILABLE> (System.Int32)
[L0]: <UNAVAILABLE> (System.Int32)

0:053> !mframe 8
0:053> !mdv
Frame 0x8: (System.IO.MemoryStream.set_Capacity(Int32)):
[A0]:this: <UNAVAILABLE> (System.IO.MemoryStream)
[A1]:value: <UNAVAILABLE> (System.Int32)
[L0]: <UNAVAILABLE> (System.Byte[])

There is nothing helpful, as far as I can see, on the stack:

0:053> !mdso
Thread 53:
Location          Object            Type
------------------------------------------------------------
0747e850  01a38ac0  System.Threading.Tasks.TplEtwProvider
0747ef54  01fe1ff0  System.Byte[]
0747ef58  01d09c04  System.Threading.OverlappedData
0747f20c  56301c60  Microsoft.Build.Framework.BuildEventContext
0747f2cc  55ff5128  System.OutOfMemoryException
0747f4ec  020fde88  System.IO.MemoryStream
0747f5ac  01c9f454  Microsoft.Build.BackEnd.NodeProviderOutOfProcBase+NodeContext
0747f730  01d09524  System.Threading.OverlappedData

Now I can inspect the System.IO.MemoryStream object to get its capacity BEFORE the attempt that causes the OOM (I assume that is the instance in question):

0:053> !mdt 020fde88
020fde88 (System.IO.MemoryStream)
    __identity:NULL (System.Object)
    _activeReadWriteTask:NULL (System.IO.Stream+ReadWriteTask)
    _asyncActiveSemaphore:NULL (System.Threading.SemaphoreSlim)
    _buffer:80011010 (System.Byte[])
    _origin:0x0 (System.Int32)
    _position:0x0 (System.Int32)
    _length:0x2d468088 (System.Int32)
    _capacity:0x2d468088 (System.Int32)
    _expandable:true (System.Boolean)
    _writable:true (System.Boolean)
    _exposable:true (System.Boolean)
    _isOpen:true (System.Boolean)
    _lastReadTask:NULL (System.Threading.Tasks.Task`1[[System.Int32, mscorlib]])

It is already quite big: 0x2d468088 bytes, i.e. ~724 MB. Because its _length == _capacity, it makes sense that msbuild is trying to grow it beyond 724 MB, which results in the OOM.

What I cannot figure out is how to get this new desired capacity from the dump, if that is possible at all.

I know it is probably insignificant for the analysis of this OOM, but for the sake of science: is it possible to figure out this value from this dump?


Solution

  • Writing an MRE

    The following code should do as a minimal, reproducible example when compiled as x86.

    Why? A process loads several DLLs, such as kernel32.dll, kernelbase.dll and ntdll.dll, as well as app.exe itself. In the case of .NET, clr.dll and others are loaded too. As a result, the available 4 GB of virtual address space (simplified; assuming an x64 OS) is fragmented. It's quite likely that a 1 GB block still fits somewhere. Then another 2 GB block needs to be allocated, for a total of 3 GB, and that should fail due to the fragmentation.

    using System.IO;
    
    namespace MemoryStreamSizeOom_Framework
    {
        internal class Program
        {
            static void Main()
            {
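                var ms = new MemoryStream();
                // ~1 GB: likely still fits into the fragmented x86 address space
                ms.Capacity = 1000 * 1024 * 1024;
                // ~2 GB on top of that: expected to throw OutOfMemoryException
                ms.Capacity = 2000 * 1024 * 1024;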
            }
        }
    }
    

    .NET Framework

    I compiled this for .NET Framework with an explicit x86 target and ran it. Sure enough, an OutOfMemoryException is thrown.

    0:000> !pe
    Exception object: 02d820e4
    Exception type:   System.OutOfMemoryException
    [...]
    

    The question now is: was it the 1 GB or the 2 GB allocation that caused it? We are looking for the following hex values:

    0:000> ? 0n1000*0n1024*0n1024
    Evaluate expression: 1048576000 = 3e800000
    0:000> ? 0n2000*0n1024*0n1024
    Evaluate expression: 2097152000 = 7d000000
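
    The same conversion can be cross-checked outside the debugger. A minimal C# sketch; the class and method names (SizeHex, ToHex) are made up for illustration:

    ```csharp
    using System;

    internal static class SizeHex
    {
        // Formats a byte count like WinDbg's `?` does: lowercase hex, no prefix.
        internal static string ToHex(int bytes) => bytes.ToString("x");

        private static void Main()
        {
            Console.WriteLine(ToHex(1000 * 1024 * 1024)); // 1 GB request, prints 3e800000
            Console.WriteLine(ToHex(2000 * 1024 * 1024)); // 2 GB request, prints 7d000000
        }
    }
    ```

    Both products still fit into a signed 32-bit int, which is why the MRE compiles without an overflow error.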
    

    For the debug build, I can find them with !clrstack:

    0:000> !clrstack -a
    OS Thread Id: 0x6a18 (0)
    Child SP       IP Call Site
    00cfecf0 779db532 [HelperMethodFrame: 00cfecf0] 
    00cfed7c 65d2175b System.IO.MemoryStream.set_Capacity(Int32)
        PARAMETERS:
            this (<CLR reg>) = 0x02d820a8
            value (<CLR reg>) = 0x7d000000
    [...]
    

    As expected, it's the 2 GB allocation.

    For the release build, the same command works for me:

    0:000> !clrstack -a
    OS Thread Id: 0x394 (0)
    Child SP       IP Call Site
    007df1b4 779db532 [HelperMethodFrame: 007df1b4] 
    007df240 65d2175b System.IO.MemoryStream.set_Capacity(Int32)
        PARAMETERS:
            this (<CLR reg>) = 0x02af20b0
            value (<CLR reg>) = 0x7d000000
    [...]
    

    .NET 8

    The same command also works for me with a .NET 8 app:

    0:000> !clrstack -a
    OS Thread Id: 0x7cf8 (0)
    Child SP       IP Call Site
    0299EF50 779db532 [HelperMethodFrame: 0299ef50] 
    0299EFBC 5cfc690f System.IO.MemoryStream.set_Capacity(Int32) [/_/src/libraries/System.Private.CoreLib/src/System/IO/MemoryStream.cs @ 276]
        PARAMETERS:
            this (<CLR reg>) = 0x04fca3fc
            value (<CLR reg>) = 0x7d000000
    [...]
    

    What does not work reliably?

    kb

    With kb in .NET 8, I saw both values in the "Args to Child" column:

    0:000> kb L5
     # ChildEBP RetAddr      Args to Child              
    00 0299ee78 6105bb7f     e0434352 00000001 00000005 KERNELBASE!RaiseException+0x62
    01 0299eef8 6104733e     00000000 6336325b 0299efa8 coreclr!RaiseTheExceptionInternalOnly+0x177
    02 0299ef34 6112fc5f     633632db 7d000000 04fca3fc coreclr!UnwindAndContinueRethrowHelperAfterCatch+0x35
    03 0299efb4 5cfc690f     3e800000 04fca3fc 0299f05c coreclr!JIT_NewArr1+0x1292df
    04 0299efd0 081c1955     00000000 00000000 00000000 System_Private_CoreLib!System.IO.MemoryStream.set_Capacity+0x4f
    

    It seems that the old size is passed as the first parameter to JIT_NewArr1 and the new size as the second argument to UnwindAndContinueRethrowHelperAfterCatch, although I don't think the calling convention guarantees this.

    That's not the case with .NET Framework:

    0:000> kb L5
     # ChildEBP RetAddr      Args to Child              
    00 00afefc8 66e5b7ff     e0434352 00000001 00000005 KERNELBASE!RaiseException+0x62
    01 00aff064 66f86420     00000000 6a35be77 00aff12c clr!RaiseTheExceptionInternalOnly+0x27c
    02 00aff098 6700ef17     6a35bfd7 7d000000 02c720b0 clr!UnwindAndContinueRethrowHelperAfterCatch+0x7b
    03 00aff138 65d2175b     00aff1f4 02c724c0 00aff170 clr!JIT_NewArr1+0x191
    04 00aff14c 02a8087d     00000000 00000000 00000000 mscorlib_ni!System.IO.MemoryStream.set_Capacity+0x5b
    

    So this is not a reliable way of detecting the size.

    dv /t /v

    For .NET Framework, I also found that dv /t /v worked:

    0:000> .frame 4
    04 00aff14c 02a8087d     mscorlib_ni!System.IO.MemoryStream.set_Capacity+0x5b
    
    0:000> dv /t /v
    @esi              System.IO.MemoryStream this = 0x02c720b0
    @edi              int value = 0n2097152000
    <unavailable>     byte[] slot0 = <value unavailable>
    
    0:000> ? 0n2097152000
    Evaluate expression: 2097152000 = 7d000000
    

    But in .NET 8, it gave a wrong result (value is reported as 0x029deb98 instead of the expected 0x7d000000):

    0:000> .frame 4
    04 029debc0 04d21955     System_Private_CoreLib!System.IO.MemoryStream.set_Capacity+0x4f [/_/src/libraries/System.Private.CoreLib/src/System/IO/MemoryStream.cs @ 276] 
    
    0:000> dv /t /v
    @esi              System.IO.MemoryStream this = 0x029de358
    @edi              int value = 0n43903896
    <unavailable>     byte[] newBuffer = <value unavailable>
    
    0:000> ? 0n43903896
    Evaluate expression: 43903896 = 029deb98
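
    If none of the above yields the argument for the dump in the question, the _capacity field shown by !mdt at least allows a lower bound. The sketch below does not read anything from the dump. It assumes that MemoryStream.EnsureCapacity follows the doubling policy I remember from the .NET Framework reference source, and that the largest allowed byte[] length is 0x7FFFFFC7. The names CapacityEstimate and NextCapacity are made up for illustration, and the policy should be verified against the runtime version in use.

    ```csharp
    using System;

    internal static class CapacityEstimate
    {
        // Assumption: largest byte[] length the CLR allows.
        private const int MaxByteArrayLength = 0x7FFFFFC7;

        // Assumed growth policy of EnsureCapacity(value); it is only reached
        // when value > currentCapacity.
        internal static int NextCapacity(int currentCapacity, int value)
        {
            unchecked
            {
                int doubled = currentCapacity * 2; // may wrap around for huge capacities
                int newCapacity = Math.Max(value, 256);
                if (newCapacity < doubled)
                    newCapacity = doubled;
                if ((uint)doubled > MaxByteArrayLength)
                    newCapacity = Math.Max(value, MaxByteArrayLength);
                return newCapacity;
            }
        }

        private static void Main()
        {
            const int capacityFromDump = 0x2d468088; // _capacity from !mdt
            // The smallest value that still triggers a resize gives the lower bound.
            int lowerBound = NextCapacity(capacityFromDump, capacityFromDump + 1);
            Console.WriteLine($"0x{lowerBound:x} ({lowerBound} bytes)"); // prints 0x5a8d0110 (1519190288 bytes)
        }
    }
    ```

    Under these assumptions, set_Capacity was asked for at least 0x5a8d0110 bytes, i.e. twice the current capacity. The exact value is larger only if SetLength was called with a length beyond that.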