Tags: python, async-await, python-asyncio, coroutine

Why does merely creating a task run the coroutine in Python?


There is something I can't understand about this code:

import asyncio


async def fetch_data(param):
    print(f"Do something with {param}...")
    await asyncio.sleep(param)
    print(f"Done with {param}")
    return f"Result of {param}"


async def main():
    task1 = asyncio.create_task(fetch_data(1))
    task2 = asyncio.create_task(fetch_data(2))
    result2 = await task2
    print("Task 2 fully completed")
    result1 = await task1
    print("Task 1 fully completed")
    return [result1, result2]


results = asyncio.run(main())
print(results)

The output is

Do something with 1...
Do something with 2...
Done with 1
Done with 2
Task 2 fully completed
Task 1 fully completed
['Result of 1', 'Result of 2']

I expected to see

Do something with 2...

on the first line, but the output starts with Do something with 1... instead. It seems that just creating the tasks runs the coroutines, whereas from what I have read and seen, create_task only registers them with the event loop. The flow should be

Given this flow, I expected to see Do something with 2... on the first line. Why is Do something with 1... printed first?

To verify, I ran this:

import asyncio


async def fetch_data(param):
    print(f"Do something with {param}...")
    await asyncio.sleep(param)
    print(f"Done with {param}")
    return f"Result of {param}"


async def main():
    task1 = asyncio.create_task(fetch_data(1))
    task2 = asyncio.create_task(fetch_data(2))

results = asyncio.run(main())
print(results)

and the output is

Do something with 1...
Do something with 2...
None

Why are the coroutines running even without awaiting the tasks? And why does print(f"Done with {param}") not run in this version?


Solution

  • Let's insert a new line in main following the creation of task2:

    import asyncio
    
    async def fetch_data(param):
        print(f"Do something with {param}...")
        await asyncio.sleep(param)
        print(f"Done with {param}")
        return f"Result of {param}"
    
    
    async def main():
        task1 = asyncio.create_task(fetch_data(1))
        task2 = asyncio.create_task(fetch_data(2))
        print('tasks created')  # New statement
        result2 = await task2
        print("Task 2 fully completed")
        result1 = await task1
        print("Task 1 fully completed")
        return [result1, result2]
    
    
    results = asyncio.run(main())
    print(results)
    

    The first line of output will be:

    tasks created
    

    By default, a call such as asyncio.create_task(fetch_data(1)) creates the task and schedules it on the event loop, but it does not start running it. The current task (main) keeps running, and no other task can run until main either suspends at an await that hands control back to the event loop or completes. The newly created tasks therefore cannot start until main reaches await task2. Once main is suspended there, the event loop runs the ready tasks one at a time, in the order in which they were scheduled. Since task1 was created before task2, it runs first: the task for fetch_data(1) executes until its first await (the call to asyncio.sleep), where it suspends. task2 (fetch_data(2)) runs next, again up to its first await. main stays suspended until task2 completes, and by then task1, which sleeps for only 1 second, has already finished, so await task1 returns immediately. Thus the complete output will be:

    tasks created
    Do something with 1...
    Do something with 2...
    Done with 1
    Done with 2
    Task 2 fully completed
    Task 1 fully completed
    ['Result of 1', 'Result of 2']
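
The same rule accounts for the second snippet in the question, where main never awaits the tasks. Below is a minimal sketch of what happens there, assuming a module-level events list (a hypothetical stand-in for the print calls, so the order can be inspected) and an added except clause that is not in the original code. When main returns, the loop still runs the already scheduled first step of each task, and asyncio.run() then cancels whatever is still pending before closing the loop. That is why "Do something with ..." appears but "Done with ..." never does.

```python
import asyncio

events = []  # hypothetical log used instead of print(), for illustration only


async def fetch_data(param):
    events.append(f"start {param}")
    try:
        await asyncio.sleep(param)
    except asyncio.CancelledError:
        # Reached when asyncio.run() cancels tasks still pending at shutdown.
        events.append(f"cancelled {param}")
        raise
    events.append(f"done {param}")  # never reached in this sketch


async def main():
    # Keep references to the tasks, as in the question's code.
    task1 = asyncio.create_task(fetch_data(1))
    task2 = asyncio.create_task(fetch_data(2))
    events.append("tasks created")
    # No await: main() returns here, and only then do the tasks get to run.


asyncio.run(main())
print(events)
```

The first three entries should be "tasks created", "start 1" and "start 2", in that order. The two "cancelled" entries follow, but their relative order is not something to rely on. No "done" entry is recorded, and the script finishes without sleeping.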
    

    Update

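    If you are running Python >= 3.12, you can have the event loop create tasks eagerly by installing asyncio.eager_task_factory. A task then starts running inside the create_task call itself and continues until its first await that actually suspends: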

    import asyncio
    
    async def fetch_data(param):
        print(f"Do something with {param}...")
        await asyncio.sleep(param)
        print(f"Done with {param}")
        return f"Result of {param}"
    
    
    async def main():
        loop = asyncio.get_running_loop()
        loop.set_task_factory(asyncio.eager_task_factory)
    
        task1 = asyncio.create_task(fetch_data(1))
        task2 = asyncio.create_task(fetch_data(2))
        print('tasks created')
        result2 = await task2
        print("Task 2 fully completed")
        result1 = await task1
        print("Task 1 fully completed")
        return [result1, result2]
    
    
    results = asyncio.run(main())
    print(results)
    

    Prints:

    Do something with 1...
    Do something with 2...
    tasks created
    Done with 1
    Done with 2
    Task 2 fully completed
    Task 1 fully completed
    ['Result of 1', 'Result of 2']
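
To compare the two modes side by side, here is a small sketch that records the order of events with and without the eager factory. The worker and demo functions and the events list are hypothetical names introduced for illustration. The eager run is skipped on interpreters older than 3.12, where asyncio.eager_task_factory does not exist.

```python
import asyncio


async def worker(events):
    events.append("task started")
    await asyncio.sleep(0)  # first suspension point
    events.append("task finished")


async def demo(eager):
    events = []
    if eager:
        # Only available on Python 3.12+.
        loop = asyncio.get_running_loop()
        loop.set_task_factory(asyncio.eager_task_factory)
    task = asyncio.create_task(worker(events))
    events.append("create_task returned")
    await task
    return events


lazy_events = asyncio.run(demo(eager=False))
print(lazy_events)

# Guarded so the sketch also runs on Python versions before 3.12.
if hasattr(asyncio, "eager_task_factory"):
    eager_events = asyncio.run(demo(eager=True))
else:
    eager_events = None
print(eager_events)
```

With the default factory, "create_task returned" is recorded before "task started". With the eager factory, the order of those two entries is reversed, which matches the outputs shown above.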