python iterator python-asyncio async-iterator

What happens to uniterated async iterators?


Say I have the following function:

async def f1():
    async for item in asynciterator():
        return

What happens to the async iterator after the following call?

await f1()

Should I worry about cleaning up, or will the generator somehow be garbage collected when it goes out of scope?


Solution

  • Should I worry about cleaning up, or will the generator somehow be garbage collected when it goes out of scope?

    TL;DR: Python's garbage collector and asyncio together ensure eventual cleanup of incompletely iterated async generators.

    "Cleanup" here refers to running the code in a finally block around the yield, or the __aexit__ part of a context manager used in an async with statement around the yield. For example, the print in this simple generator is invoked by the same mechanism that an aiohttp.ClientSession relies on to close its resources:

    import asyncio

    async def my_gen():
        try:
            yield 1
            yield 2
            yield 3
        finally:
            await asyncio.sleep(0.1)  # make it interesting by awaiting
            print('cleaned up')
    

    If you run a coroutine that iterates through the whole generator, the cleanup will be executed immediately:

    >>> async def test():
    ...     gen = my_gen()
    ...     async for _ in gen:
    ...         pass
    ...     print('test done')
    ... 
    >>> asyncio.get_event_loop().run_until_complete(test())
    cleaned up
    test done
    

    Note how the cleanup runs immediately after the loop, even though gen is still in scope and cannot have been garbage collected yet. This is because exhausting an async generator runs its body to completion, finally block included, before the last __anext__ call raises StopAsyncIteration and ends the async for loop.
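
    To make the exhaustion case concrete, here is a minimal sketch that drives a generator by hand instead of using async for. The log list and the main coroutine are illustrative names, not part of the original example; the point is only that the finally block runs inside the last __anext__ call, before StopAsyncIteration is raised.

```python
import asyncio

log = []  # illustrative: records the order of events

async def my_gen():
    try:
        yield 1
    finally:
        log.append('cleaned up')

async def main():
    gen = my_gen()
    log.append(await gen.__anext__())  # first item: 1
    try:
        # No items remain: the generator body resumes, runs its finally
        # block, and only then is StopAsyncIteration raised.
        await gen.__anext__()
    except StopAsyncIteration:
        log.append('exhausted')

asyncio.run(main())
print(log)  # [1, 'cleaned up', 'exhausted']
```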

    The question is what happens when the loop is not exhausted:

    >>> async def test():
    ...     gen = my_gen()
    ...     async for _ in gen:
    ...         break  # exit at once
    ...     print('test done')
    ... 
    >>> asyncio.get_event_loop().run_until_complete(test())
    test done
    

    Here gen went out of scope, but the cleanup simply didn't occur. If you tried this with an ordinary generator, the reference counter would trigger the cleanup immediately (though still after test exits, because that is when the generator is no longer referenced). This is possible because gen does not participate in a reference cycle:

    >>> def my_gen():
    ...     try:
    ...         yield 1
    ...         yield 2
    ...         yield 3
    ...     finally:
    ...         print('cleaned up')
    ... 
    >>> def test():
    ...     gen = my_gen()
    ...     for _ in gen:
    ...         break
    ...     print('test done')
    ... 
    >>> test()
    test done
    cleaned up
    

    With my_gen being an asynchronous generator, its cleanup is asynchronous as well. This means the garbage collector cannot simply execute it; it has to be run by an event loop. To make this possible, asyncio registers an asyncgen finalizer hook (via sys.set_asyncgen_hooks) that schedules the generator's aclose() on the loop. Here the scheduled cleanup never gets a chance to execute, because run_until_complete stops the loop as soon as the coroutine finishes.
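
    The hook registration can be observed directly. The following is a small sketch, assuming the stock asyncio event loop on CPython (alternative loop implementations may behave differently); hooks_installed is a hypothetical helper name.

```python
import asyncio
import sys

async def hooks_installed():
    # While an asyncio loop is running, it has installed its own
    # firstiter and finalizer hooks for async generators.
    hooks = sys.get_asyncgen_hooks()
    return hooks.firstiter is not None, hooks.finalizer is not None

inside = asyncio.run(hooks_installed())
print(inside)  # (True, True) with the stock asyncio loop
```

    The finalizer is what the interpreter calls when an unfinished async generator is about to be collected, which gives the loop a chance to schedule the asynchronous cleanup.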

    If we spin the same event loop a little longer, long enough to cover the 0.1-second sleep in the finally block, we see the cleanup executed:

    >>> asyncio.get_event_loop().run_until_complete(asyncio.sleep(0.2))
    cleaned up
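
    The same behaviour can be reproduced as a standalone script. This is a sketch under a few assumptions: it targets CPython with the stock asyncio loop, it creates its own loop with asyncio.new_event_loop() rather than relying on get_event_loop(), and it collects events in an illustrative log list instead of printing them.

```python
import asyncio

log = []  # illustrative: records the order of events

async def my_gen():
    try:
        yield 1
        yield 2
    finally:
        await asyncio.sleep(0.01)  # asynchronous cleanup step
        log.append('cleaned up')

async def test():
    gen = my_gen()
    async for _ in gen:
        break                      # abandon the generator early
    log.append('test done')

loop = asyncio.new_event_loop()
try:
    loop.run_until_complete(test())
    after_first_run = list(log)    # cleanup is scheduled but has not run
    # Spin the same loop long enough for the pending aclose() to finish.
    loop.run_until_complete(asyncio.sleep(0.1))
    after_second_run = list(log)
finally:
    loop.close()

print(after_first_run)   # ['test done']
print(after_second_run)  # ['test done', 'cleaned up']
```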
    

    In a normal asyncio application this does not lead to problems because the event loop typically runs as long as the application. If there is no event loop to clean up the async generators, it likely means the process is exiting anyway.
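
    If the cleanup must happen at a predictable point rather than eventually, one option is to close the generator explicitly. The sketch below is one way to do that, not the only one; log is again an illustrative name. It awaits aclose() in a finally block so the generator's own finally runs before test continues.

```python
import asyncio

log = []  # illustrative: records the order of events

async def my_gen():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        await asyncio.sleep(0)     # asynchronous cleanup step
        log.append('cleaned up')

async def test():
    gen = my_gen()
    try:
        async for _ in gen:
            break                  # exit at once
    finally:
        # Closing explicitly runs the generator's finally block right here,
        # without waiting for garbage collection or the finalizer hook.
        await gen.aclose()
    log.append('test done')

asyncio.run(test())
print(log)  # ['cleaned up', 'test done']
```

    On Python 3.10 and later, contextlib.aclosing wraps the same pattern in an async with block.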