c++ winapi deadlock coredump dbghelp

MiniDumpWriteDump() hangs


MiniDumpWriteDump() from the DbgHelp library hangs if a heap allocation, deallocation, or reallocation is in progress in another thread. DbgHelp suspends the other threads, then waits indefinitely for the locks held by those threads. Here is the call stack of the hung thread:

    ntdll.dll!NtWaitForAlertByThreadId()   Unknown
    ntdll.dll!RtlpWaitOnAddressWithTimeout()    Unknown
    ntdll.dll!RtlpWaitOnAddress()   Unknown
    ntdll.dll!RtlpWaitOnCriticalSection()   Unknown
    ntdll.dll!RtlpEnterCriticalSectionContended()  Unknown
    ntdll.dll!RtlEnterCriticalSection()    Unknown
    ntdll.dll!RtlpReAllocateHeap()  Unknown
    ntdll.dll!RtlpReAllocateHeapInternal()  Unknown
    ntdll.dll!RtlReAllocateHeap()   Unknown
    ntdll.dll!LdrpSetAlternateResourceModuleHandle()    Unknown
    ntdll.dll!LdrResGetRCConfig()   Unknown
    ntdll.dll!LdrpResSearchResourceMappedFile() Unknown
    ntdll.dll!LdrResSearchResource()    Unknown
    KernelBase.dll!FindVersionResourceSafe()   Unknown
>   KernelBase.dll!GetFileVersionInfoSizeExW()  Unknown
    dbgcore.dll!Win32LiveSystemProvider::GetImageVersionInfo(void *,unsigned short const *,unsigned __int64,struct tagVS_FIXEDFILEINFO *)   Unknown
    dbgcore.dll!GenAllocateModuleObject(struct _MINIDUMP_STATE *,struct _INTERNAL_PROCESS *,unsigned short *,unsigned __int64,unsigned long,struct _INTERNAL_MODULE * *)    Unknown
    dbgcore.dll!GenGetProcessInfo(unsigned long,struct _MINIDUMP_STATE *,struct _INTERNAL_PROCESS * *,struct _LIST_ENTRY *) Unknown
    dbgcore.dll!MiniDumpProvideDump()  Unknown
    dbgcore.dll!MiniDumpWriteDump()    Unknown

Do you know of an easy workaround for this situation? One workaround I can see is to inject checks into all other threads in my application, so that each thread notices when a core dump is requested and pauses at a point where it holds no mutexes. But this is a lot of change, and some threads of the application are out of my control because they are launched by libraries I use, for their own internal purposes.


Solution

  • Broadly speaking, MiniDumpWriteDump performs two operations:

    1. Suspend all threads in the target process.
    2. When done, dump the target process' state.

    The first step suspends every thread, irrespective of what it's currently doing. If a thread happens to hold exclusive access to a shared resource, it will keep holding it for as long as it stays suspended. There is only one reliable way to call MiniDumpWriteDump, as documented:

    MiniDumpWriteDump should be called from a separate process if at all possible, rather than from within the target process being dumped. This is especially true when the target process is already not stable. For example, if it just crashed. A loader deadlock is one of many potential side effects of calling MiniDumpWriteDump from within the target process.

    The documentation doesn't list all possible ways this API can cause a deadlock. In your case it appears that a thread was suspended while it was in the middle of allocating memory from the heap. By default, the heap is synchronized. As MiniDumpWriteDump proceeds, it also tries to allocate heap memory. To do that, it requests the heap synchronization object. But that object is never released, because the thread that holds exclusive access to it has just been suspended.

    Again, this is just one way this API can deadlock when called from within the same process it's instructed to dump. There are lots and lots of other opportunities for this to happen.
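The heap-lock deadlock described above can be imitated in portable C++. This is only a simulation under stated assumptions: a `std::timed_mutex` stands in for the heap's synchronization object, a worker parked on a condition variable stands in for a suspended thread, and the function name `dumper_can_take_heap_lock` is made up for illustration. The real code path waits without a timeout, which is why it hangs; the timeout here exists only so the demo terminates.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Simulation only: heap_lock stands in for the heap's lock, and a worker
// parked on a condition variable stands in for a suspended thread.
// Returns true if the "dumper" (the calling thread) could take the heap lock.
bool dumper_can_take_heap_lock(bool worker_frozen_inside_allocator) {
    std::timed_mutex heap_lock;
    std::mutex state_mutex;
    std::condition_variable state_changed;
    bool worker_parked = false;
    bool resume_worker = false;

    std::thread worker([&] {
        std::unique_lock<std::timed_mutex> heap(heap_lock, std::defer_lock);
        if (worker_frozen_inside_allocator) {
            heap.lock();  // the worker is "in the middle of an allocation"
        }
        std::unique_lock<std::mutex> state(state_mutex);
        worker_parked = true;
        state_changed.notify_all();
        // Parked here, the worker behaves like a suspended thread: it makes
        // no progress and releases nothing it holds.
        state_changed.wait(state, [&] { return resume_worker; });
    });

    {
        // Wait until the worker has reached its "suspended" state.
        std::unique_lock<std::mutex> state(state_mutex);
        state_changed.wait(state, [&] { return worker_parked; });
    }

    // The dumper now needs the heap lock itself. A real critical section
    // would be waited on forever; the timeout only lets the demo finish.
    const bool acquired = heap_lock.try_lock_for(std::chrono::milliseconds(200));
    if (acquired) {
        heap_lock.unlock();
    }

    {
        // "Resume" the worker so the demo can clean up.
        std::lock_guard<std::mutex> state(state_mutex);
        resume_worker = true;
    }
    state_changed.notify_all();
    worker.join();
    return acquired;
}
```

Calling `dumper_can_take_heap_lock(true)` models the situation in the question: the lock owner is frozen, so the dumper cannot get the lock. With `false`, the worker is frozen outside the allocator and the dumper proceeds.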

    Solution: Put it in an external process.
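A minimal sketch of what such an external dumper could look like, assuming a small helper executable that receives the target's process ID as text (for example on its command line). The names `parse_pid` and `write_dump_of` are hypothetical, `MiniDumpNormal` is just one possible dump type, and how the application launches the helper is left out. The Windows-specific part is guarded with `_WIN32` so the parsing helper also compiles elsewhere.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>  // link with Dbghelp.lib
#endif

// Hypothetical helper: parses a decimal process ID.
// Rejects empty, signed, non-numeric, out-of-range and zero input.
bool parse_pid(const std::string& text, unsigned long& pid) {
    if (text.empty() || !std::isdigit(static_cast<unsigned char>(text[0]))) {
        return false;
    }
    char* end = nullptr;
    errno = 0;
    const unsigned long value = std::strtoul(text.c_str(), &end, 10);
    if (errno != 0 || *end != '\0' || value == 0) {
        return false;
    }
    pid = value;
    return true;
}

#ifdef _WIN32
// Hypothetical helper: runs in the dumper process, not in the target.
// Any thread suspension done by MiniDumpWriteDump then affects the target
// only, so locks held by the target's threads cannot block the dumper's
// own heap allocations.
bool write_dump_of(unsigned long pid, const wchar_t* dump_path) {
    HANDLE process = OpenProcess(
        PROCESS_QUERY_INFORMATION | PROCESS_VM_READ | PROCESS_DUP_HANDLE,
        FALSE, pid);
    if (process == nullptr) {
        return false;
    }

    HANDLE file = CreateFileW(dump_path, GENERIC_WRITE, 0, nullptr,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) {
        CloseHandle(process);
        return false;
    }

    // No exception, user-stream or callback information in this sketch.
    const BOOL ok = MiniDumpWriteDump(process, pid, file, MiniDumpNormal,
                                      nullptr, nullptr, nullptr);
    CloseHandle(file);
    CloseHandle(process);
    return ok != FALSE;
}
#endif
```

The helper's entry point would pass its first argument to `parse_pid` and, on success, call `write_dump_of` with the resulting ID and an output path. The application itself never calls MiniDumpWriteDump, so none of its threads are suspended by code running inside it.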