winapiwinsockio-completion-ports

Why is my call to GetOverlappedResult hanging after cancelling an accept operation?


I have a thread which repeatedly calls AcceptEx on a listening port until it's told to exit, at which point it attempts to cancel the last initiated accept.

Note: To clarify, the TASSERT & TASSERTF macros abort the process if the first parameter implicitly converts to false (& fails to compile if they don't convert to a boolean)

My code for doing that looks like

        // Cancel the last initiated accept (if it's still in progress)
        if (!::CancelIoEx((HANDLE)g_unpipeListenSocket, &overlapped)) {
            const DWORD error(::GetLastError());

            // This should _never_ fail
            TASSERTF(ERROR_NOT_FOUND == error, "CancelIoEx() failed: %s", Win32ErrorToString(error).c_str());
        }
        else {
            // The accept was still in progress, wait until it's complete
            DWORD unused;
            TASSERTF(
                ::GetOverlappedResult((HANDLE)g_unpipeListenSocket, &overlapped, &unused, TRUE),
                "GetOverlappedResult() failed: %s",
                Win32ErrorToString(::GetLastError()).c_str());
        }

        sock_close(acceptingSocket);
        sock_close(g_unpipeListenSocket);

So what's happening is that CancelIoEx is succeeding, but then the call to GetOverlappedResult is simply hanging forever.

My understanding is that simply closing the socket & allowing overlapped to be deallocated (as it's on the stack of the thread which is about to return) would leave the door open for a use-after-free as that's not enough to cancel the I/O operation in progress (or at least, not enough to wait for the cancellation to take effect)

There's only a single accept operation happening on g_unpipeListenSocket at any one time; Here's the code which I use to start them & handle their associated completions:

static void
BeginAccept(SOCKET& io_acceptingSocket, OVERLAPPED& in_overlapped, AcceptExBuff& in_acceptExBuff) noexcept {
    memset(&in_overlapped, 0, sizeof(in_overlapped));

    if (INVALID_SOCKET == io_acceptingSocket) {
        io_acceptingSocket = sockit(g_unpipeInAddr.inx_family, SOCK_STREAM, IPPROTO_TCP);
        if (INVALID_SOCKET == io_acceptingSocket) {
            char errorString[MAX_SOCK_ERROR_LEN];
            sock_error_string(getSockErr(), errorString);

            // TODO: Don't abort (exit thread & notify waiters somehow so they can start it again next time?)
            TABORT("sockit() failed: %s", errorString);
        }
    }

    DWORD bytesReceived;
    TASSERTF(!AcceptEx(
        g_unpipeListenSocket,
        io_acceptingSocket,
        &in_acceptExBuff,
        0,
        sizeof(in_acceptExBuff) / 2,
        sizeof(in_acceptExBuff) / 2,
        &bytesReceived,
        &in_overlapped),
        "AcceptEx() completed synchronously?");

    const DWORD error(::WSAGetLastError());

    // TODO: Don't abort (exit thread & notify waiters somehow so they can start it again next time?)
    TASSERTF(
        ERROR_IO_PENDING == error,
        "AcceptEx() failed synchronously: (%u) %s",
        error,
        Win32ErrorToString(error).c_str());
}

static void
HandleAccept(
    SOCKET& io_acceptingSocket,
    OVERLAPPED& in_overlapped,
    AcceptExBuff& in_acceptExBuff) noexcept {
    TASSERT(io_acceptingSocket != INVALID_SOCKET);

    const SOCKET connectedSocket(io_acceptingSocket);
    io_acceptingSocket = INVALID_SOCKET;

    // Begin the next accept...
    BeginAccept(io_acceptingSocket, in_overlapped, in_acceptExBuff);

    // Finish setting up accepted socket.
    // TODO: Don't abort (exit thread & notify waiters somehow so they can start it again next time?)
    TASSERTF(
        !::setsockopt(connectedSocket, SOL_SOCKET, SO_UPDATE_ACCEPT_CONTEXT, (PCHAR)&g_unpipeListenSocket, sizeof(SOCKET)),
        "setsockopt(SO_UPDATE_ACCEPT_CONTEXT) failed:");

    // Consume the connected socket
}

I call BeginAccept once at the beginning of the thread, and then HandleAccept every time I get a completion for a previous accept. Here's how I call GetQueuedCompletionStatus if my error-handling code is the problem:

if (!::GetQueuedCompletionStatus(
            g_unpipeListenThreadIOCP,
            &numBytes,
            &completionKey,
            &overlappedPtr,
            INFINITE))
        {
            const DWORD error(::GetLastError());

            TASSERTF(
                (&overlapped == overlappedPtr) && (ACCEPT_THREAD_ACCEPT_COMPLETION_KEY == completionKey),
                "&overlapped=%p overlappedPtr=%p completionKey=%" FSIZE "u GetLastError()='%s'",
                &overlapped,
                overlappedPtr,
                completionKey,
                Win32ErrorToString(error).c_str());
            SIMBA_TRACE(
                "Retrieved completion status about error on accept operation: %s",
                Win32ErrorToString(error).c_str());

            // Start another accept (Should we consider this a fatal error?)
            BeginAccept(acceptingSocket, overlapped, acceptExBuff);

            continue;
        }

Solution

  • It looks like the issue was that I was using GetOverlappedResult instead of WSAGetOverlappedResult; when I switched to the latter (& set a handle to an event to the OVERLAPPED structure I was using, as documented is necessary for passing fWait==TRUE), things started working.

    My current code on thread cleanup looks like

            // Cancel the last initiated accept (if it's still in progress)
            if (!::CancelIoEx((HANDLE)g_unpipeListenSocket, &overlapped)) {
                const DWORD error(::GetLastError());
    
                // This should _never_ fail
                TASSERTF(ERROR_NOT_FOUND == error, "CancelIoEx() failed: %s", Win32ErrorToString(error).c_str());
            }
            else {
                // The accept was still in progress, wait until it's complete
                DWORD bytesTransferred;
                DWORD flags;
    
                if (!::WSAGetOverlappedResult(g_unpipeListenSocket, &overlapped, &bytesTransferred, TRUE, &flags))
                    // TODO: Figure out how to distinguish between errors in WSAGetOverlappedResult vs AcceptEx
                    SIMBA_TRACE(
                        "WSAGetOverlappedResult() failed, assume it's because we just cancelled the accept: %s",
                        Win32ErrorToString(::WSAGetLastError()).c_str());
            }
    
            sock_close(acceptingSocket);
            sock_close(g_unpipeListenSocket);
            TASSERTF(
                ::CloseHandle(event),
                "CloseHandle(event) failed: %s",
                Win32ErrorToString(::GetLastError()).c_str());