pythonpython-asyncioaiohttp

How to time an aiohttp request


I have an async script that looks partially like:

def main(endpoints):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(_main(endpoints, loop))
    loop.close()

async def _main(endpoints, loop):
    connector = aiohttp.TCPConnector(limit=3)
    async with aiohttp.ClientSession(connector=connector, loop=loop) as session:
        await asyncio.gather(*[process(session, endpoint, uid) for endpoint, uid in endpoints])

async process(session, endpoint, uid):
    timer = time.perf_counter()
    print(f'{uid}: Start request at {timer}')
    async with session.get(endpoint) as r:
        do_some_quick_stuff(r)
    print(f'{uid}: Finished in {time.perf_counter() - timer} seconds')

As you can see, I limit simultaneous connections to 3. If I run the script with an endpoints list of length 5, I get:

1: Start request at 23.726745857
2: Start request at 23.770105453
3: Start request at 23.770374826
4: Start request at 23.770555491
5: Start request at 23.770689525
1: Finished in 6.4488191379999975 seconds
2: Finished in 6.409452765000001 seconds
3: Finished in 6.410782424999997 seconds
5: Finished in 12.524688972999996 seconds
4: Finished in 12.526368037999998 seconds

The endpoints in this example are to "https://httpstat.us/200?sleep=6000", so each request should take exactly 6 seconds to return a response. This output makes sense, since the timer is started (and messages printed) for all 5 requests at once, then the first 3 requests are made, they finish in 6 seconds, then the final 2 requests are made, finishing in 6 seconds plus the initial 6 second delay. However, I want all the timers to start just before their respective request, so that the final value should be around 6 for all endpoints.

If I move the timer and print statements to within the session.get block, I get this output:

1: Start request at 33.393075797
1: Finished in 0.0001522129999997901 seconds
2: Start request at 33.39833408
2: Finished in 3.703800000209867e-05 seconds
3: Start request at 33.399544624
3: Finished in 3.701299999647745e-05 seconds
4: Start request at 39.511201437
4: Finished in 8.197600000414695e-05 seconds
5: Start request at 39.513046895
5: Finished in 4.47689999987233e-05 seconds

As you can see, the timers for endpoints 4 and 5 correctly start about 6 seconds after those of 1-3, but the completion time for each is well under a second, meaning the timers didn't start until after the request completed.

How can I ensure that the timers begin only immediately before their respective request is made?


Solution

  • I would use a asyncio.Semaphore instance initialized to the number of concurrent connections you want to have. In that way your process task instancess will not issue a get request until a connection is available.

    import asyncio
    import time
    
    import aiohttp
    
    LIMIT = 3
    
    limiter = asyncio.Semaphore(value=LIMIT)
    
    def main(endpoints):
        asyncio.run(_main(endpoints))
    
    async def _main(endpoints):
        loop = asyncio.get_running_loop()
        connector = aiohttp.TCPConnector(limit=LIMIT)
        async with aiohttp.ClientSession(connector=connector, loop=loop) as session:
            await asyncio.gather(*[process(session, endpoint, uid) for endpoint, uid in endpoints])
    
    def do_some_quick_stuff(r):
        pass
    
    async def process(session, endpoint, uid):
        async with limiter:
            timer = time.perf_counter()
            print(f'{uid}: Start request at {timer}')
            async with session.get(endpoint) as r:
                do_some_quick_stuff(r)
            print(f'{uid}: Finished in {time.perf_counter() - timer} seconds')
    
    if __name__ == '__main__':
        endpoints = [
            ('https://httpstat.us/200?sleep=6000', 1),
            ('https://httpstat.us/200?sleep=6000', 2),
            ('https://httpstat.us/200?sleep=6000', 3),
            ('https://httpstat.us/200?sleep=6000', 4),
            ('https://httpstat.us/200?sleep=6000', 5)
        ]
        main(endpoints)
    

    Prints:

    1: Start request at 525942.124032
    2: Start request at 525942.2952959
    3: Start request at 525942.2957118
    3: Finished in 6.206655899994075 seconds
    4: Start request at 525948.502542
    2: Finished in 6.2612683000043035 seconds
    5: Start request at 525948.5568499
    1: Finished in 6.433378600049764 seconds
    4: Finished in 6.058796099969186 seconds
    5: Finished in 6.0536968999076635 seconds