python python-asyncio cancellation resource-leak resource-cleanup

Do asynchronous context managers need to protect their cleanup code from cancellation?


The problem (I think)

The contextlib.asynccontextmanager documentation gives this example:

@asynccontextmanager
async def get_connection():
    conn = await acquire_db_connection()
    try:
        yield conn
    finally:
        await release_db_connection(conn)

It looks to me like this can leak resources. If this code's task is cancelled while this code is on its await release_db_connection(conn) line, the release could be interrupted. The asyncio.CancelledError will propagate up from somewhere within the finally block, preventing subsequent cleanup code from running.

So, in practical terms, if you're implementing a web server that handles requests with a timeout, a timeout firing at the exact wrong time could cause a database connection to leak.

Full runnable example

import asyncio
from contextlib import asynccontextmanager

async def acquire_db_connection():
    await asyncio.sleep(1)
    print("Acquired database connection.")
    return "<fake connection object>"

async def release_db_connection(conn):
    await asyncio.sleep(1)
    print("Released database connection.")

@asynccontextmanager
async def get_connection():
    conn = await acquire_db_connection()
    try:
        yield conn
    finally:
        await release_db_connection(conn)

async def do_stuff_with_connection():
    async with get_connection() as conn:
        await asyncio.sleep(1)
        print("Did stuff with connection.")

async def main():
    task = asyncio.create_task(do_stuff_with_connection())

    # Cancel the task just as the context manager running
    # inside of it is executing its cleanup code.
    await asyncio.sleep(2.5)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass

    print("Done.")

asyncio.run(main())

Output on Python 3.7.9:

Acquired database connection.
Did stuff with connection.
Done.

Note that "Released database connection." is never printed.

My questions


Solution

  • Focusing on protecting the cleanup from cancellation is a red herring. There is a multitude of things that can go wrong, and the context manager has no way to know which of them will happen.

    It is the responsibility of the resource handling utilities to properly handle errors.

    async def release_db_connection(conn):
        """
        Cancellation-safe variant of `release_db_connection`.

        Internally protects against cancellation by delaying it until cleanup is done.
        """
        # The cleanup runs in a separate task so that it
        # cannot be cancelled from the outside.
        shielded_release = asyncio.create_task(asyncio.sleep(1))
        # Wait for the cleanup to complete. Awaiting the task directly would
        # forward a cancellation to it, so await it through `asyncio.shield`.
        # Unlike a bare `asyncio.shield`, delay the cancellation until we are done.
        try:
            await asyncio.shield(shielded_release)
        except asyncio.CancelledError:
            await asyncio.shield(shielded_release)
            # propagate the cancellation once the cleanup is done
            raise
        finally:
            print("Released database connection.")
    

    Note: Asynchronous cleanup is tricky. For example, a simple asyncio.shield is not sufficient if the event loop does not wait for shielded tasks. Avoid inventing your own protection and rely on the underlying frameworks to do the right thing.
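To illustrate the point about a simple asyncio.shield, here is a minimal sketch. The names cleanup, release_with_bare_shield and events are made up for this illustration, and the timings are arbitrary. The cancelled caller resumes immediately, main() returns, and asyncio.run() then cancels the cleanup that is still pending, so it never finishes.

```python
import asyncio

events = []  # illustrative log, not part of the original example

async def cleanup():
    # Stand-in for a slow release operation (assumed to take 0.2 s).
    await asyncio.sleep(0.2)
    events.append("cleanup finished")

async def release_with_bare_shield():
    # Bare shield: on cancellation the caller resumes at once, while
    # cleanup() keeps running only for as long as the event loop does.
    await asyncio.shield(cleanup())

async def main():
    task = asyncio.create_task(release_with_bare_shield())
    await asyncio.sleep(0.1)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        events.append("caller saw cancellation")
    # main() returns here; asyncio.run() cancels whatever is still pending.

asyncio.run(main())
print(events)
```

In this sketch "cleanup finished" is never recorded, because nothing waits for the shielded task before the loop shuts down.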


    The cancellation of a task is a graceful shutdown that a) still allows async operations and b) may be delayed/suppressed. Coroutines being prepared to handle the CancelledError for cleanup is explicitly allowed.

    Task.cancel

    The coroutine then has a chance to clean up or even deny the request by suppressing the exception with a try … … except CancelledError … finally block. […] Task.cancel() does not guarantee that the Task will be cancelled, although suppressing cancellation completely is not common and is actively discouraged.
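As a rough sketch of what the quoted documentation allows (worker and log are hypothetical names, and the sleeps stand in for real work), a coroutine can catch the CancelledError, still await during its cleanup, and then re-raise so that the task ends up cancelled:

```python
import asyncio

log = []  # illustrative log, not part of the original example

async def worker():
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        # Awaiting is still allowed while handling the cancellation.
        await asyncio.sleep(0.01)
        log.append("cleaned up")
        raise  # re-raise so the task is reported as cancelled

async def main():
    task = asyncio.create_task(worker())
    await asyncio.sleep(0.01)  # let the worker reach its first await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("cancelled")
    return task.cancelled()

was_cancelled = asyncio.run(main())
print(log, was_cancelled)
```

The cleanup here only delays the cancellation; re-raising keeps the behaviour in line with the documentation's advice not to suppress it.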

    A forceful shutdown is coroutine.close/GeneratorExit. This corresponds to an immediate, synchronous shutdown and forbids suspension via await, async for or async with.

    coroutine.close

    […] it raises GeneratorExit at the suspension point, causing the coroutine to immediately clean itself up.
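The difference to cancellation can be sketched by driving coroutines by hand, without an event loop. The helper suspend and the two coroutines below are hypothetical and exist only for this illustration. Synchronous cleanup works under close(), whereas a cleanup that tries to suspend again makes close() fail with a RuntimeError.

```python
import types

log = []  # illustrative log, not part of the original example

@types.coroutine
def suspend():
    # Hypothetical helper: a bare suspension point that needs no event loop.
    yield

async def sync_cleanup():
    try:
        await suspend()
    finally:
        log.append("sync cleanup ran")  # no await: fine during close()

async def async_cleanup():
    try:
        await suspend()
    finally:
        await suspend()  # tries to suspend again during close()

good = sync_cleanup()
good.send(None)  # run up to the first suspension point
good.close()     # raises GeneratorExit inside the coroutine

bad = async_cleanup()
bad.send(None)
try:
    bad.close()
except RuntimeError as exc:
    log.append(f"close() failed: {type(exc).__name__}")

print(log)
```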