python aiohttp

How to properly handle a request with aiohttp


I use aiohttp and send a request like this:

async with session.get(url=url, headers=headers) as res:
    src = await res.text()

After that, I parse the HTML with BeautifulSoup4.

Is it necessary to keep the processing of the response inside the async with block? I see two options.

First:

async with session.get(url=url, headers=headers) as res:
    src = await res.text()
    soup = BeautifulSoup(src, "lxml")
    # handle further

Second:

async with session.get(url=url, headers=headers) as res:
    src = await res.text()
soup = BeautifulSoup(src, "lxml")
# handle further

Solution

  • No, you don't need it in the context manager.

    The context manager manages the lifecycle of the response object and the underlying connection. Your code never references the response object after reading the body, so you are finished with it by that point.

    async with session.get(url=url, headers=headers) as res:
        # This reads the entire body from the connection.
        src = await res.text()
        # At this point, src is just a string and you have no further need for res.
    # The only thing used here is a str. Obviously, a str does not need a context manager.
    soup = BeautifulSoup(src, "lxml")
    

    If BeautifulSoup has a streaming API, you could instead pass it res.content (the response's StreamReader) and have it parse the data in chunks as they arrive from the connection. That would avoid holding the whole body in memory at once, which matters for large responses. In that case the parsing would need to be inside the context manager, because you would be passing in the response's stream, which is only usable while the response is open, rather than an immutable string.
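
To illustrate the streaming case, here is a minimal sketch that swaps in the standard library's html.parser.HTMLParser (which accepts input incrementally through feed()) in place of BeautifulSoup. The names LinkCollector and collect_links are hypothetical, and the sketch assumes the chunks come from something like res.content.iter_chunked(...) while the response is still open.

```python
import asyncio
import codecs
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical incremental parser that records every <a href="...">."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


async def collect_links(chunks, encoding="utf-8"):
    """Feed an async iterable of byte chunks to the parser as they arrive.

    Assumption: the body is encoded as `encoding`. The incremental decoder
    copes with multi-byte characters that are split across two chunks.
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    parser = LinkCollector()
    async for chunk in chunks:
        parser.feed(decoder.decode(chunk))
    # Flush whatever the decoder and the parser are still buffering.
    parser.feed(decoder.decode(b"", final=True))
    parser.close()
    return parser.links


# With aiohttp, the call has to stay inside the block, because the stream
# is only readable while the response is open:
#
#     async with session.get(url=url, headers=headers) as res:
#         links = await collect_links(res.content.iter_chunked(8192))
```

Only one chunk plus the parser's small internal buffer is held at a time, which is the memory benefit described above. For pages that fit comfortably in memory, reading res.text() and parsing outside the block is simpler.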