Tags: python, connection-pooling, httpx

Connection pooling with httpx.Client


The clients section of the httpx docs mentions connection pooling and generally recommends using httpx.Client.

However, I cannot tell from the docs (or from anywhere else) whether connection pooling is simply tied to an httpx.Client instance, or whether calling open/close or entering/exiting a client context manager affects connection pooling.

E.g. if connection pooling were simply tied to a client instance, the following should be able to benefit from pooling:

import httpx

shared_client = httpx.Client()

with shared_client as client:
    # do stuff

with shared_client as client:
    # do more stuff

If connection pooling were affected by close or by exiting a client context, the above would not be able to utilize pooling.

I would appreciate any help on this.

Edit:

The above example violates a fundamental restriction of httpx clients: a client cannot be reopened once it has been closed. Sorry, I should have tried running something like this before posting.
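For reference, a minimal reproduction of that restriction (the exact error message may vary between httpx versions):

import httpx

client = httpx.Client()

with client:
    pass  # first use: fine

with client:  # raises RuntimeError: httpx refuses to reopen a closed client
    pass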

My actual use case is that I would like to allow users of a class that internally uses an httpx.Client/httpx.AsyncClient to provide the client themselves; in that case they can reuse the client but are also responsible for closing it.

The following is a rough idea for accomplishing this:

import warnings

import httpx


warnings.simplefilter("always")


class _ClientWrapper:
    def __init__(self, client: httpx.Client) -> None:
        self.client = client

    def __getattr__(self, value):
        return getattr(self.client, value)

    def __enter__(self):
        return self.client.__enter__()

    def __exit__(self, exc_type, exc_value, traceback):
        # Do not close the wrapped client; only warn if it is still open.
        if not self.client.is_closed:
            warnings.warn(f"httpx.Client instance '{self.client}' is still open.")


class Getter:
    def __init__(self, client: httpx.Client | None = None) -> None:
        self.client = httpx.Client() if client is None else _ClientWrapper(client)

    def get(self, url: str) -> httpx.Response:
        with self.client:
            response = self.client.get(url)
            response.raise_for_status()

        return response


client = httpx.Client()
getter = Getter(client=client)

response = getter.get("https://www.example.com")
print(response)  # 200

print(getter.client.is_closed)  # False
client.close()
print(getter.client.is_closed)  # True

The idea of the wrapper is to delegate all attribute access to the wrapped client but override __exit__ to only warn and not actually close the client; so if the client is provided, users are responsible for managing/closing it.

Another option to achieve this would of course be subclassing, but then users would need to use my httpx.Client subclass rather than any client.
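For illustration only, a minimal sketch of what that subclassing alternative might look like (NonClosingClient is a made-up name; it mirrors the wrapper's behaviour of warning instead of closing on context exit):

import warnings

import httpx


class NonClosingClient(httpx.Client):
    """Hypothetical subclass whose context-manager exit does not close the pool."""

    def __exit__(self, exc_type=None, exc_value=None, traceback=None) -> None:
        # Deliberately skip httpx.Client.__exit__ so leaving a `with` block
        # keeps the connection pool open; the owner must call .close() itself.
        if not self.is_closed:
            warnings.warn(f"httpx.Client instance '{self!r}' is still open.")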


Solution

  • You should create and open the client only once, and then re-use it everywhere:

    import concurrent.futures
    
    import httpx
    
    URLS = [f"http://example.com/{i}" for i in range(20)]
    
    def worker(client: httpx.Client, url: str):
        response = client.get(url)
        return response.text
    
    
    with httpx.Client() as client:
        with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
            futures = {executor.submit(worker, client=client, url=url): url for url in URLS}
            for future in concurrent.futures.as_completed(futures):
                future.result()
                print(f"{futures[future]} done")
    
    

    You can see that the idea of re-using a shared client across multiple context managers is wrong using this example:

    import concurrent.futures
    
    import httpx
    
    URLS = [f"http://example.com/{i}" for i in range(20)]
    
    def worker(shared_client: httpx.Client, url: str):
        with shared_client as client:
            response = client.get(url)
            return response.text
    
    
    shared_client = httpx.Client()
    
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        futures = {executor.submit(worker, shared_client=shared_client, url=url): url for url in URLS}
        for future in concurrent.futures.as_completed(futures):
            future.result()
            print(f"{futures[future]} done")
    
    # The code above raises RuntimeError:
    # RuntimeError: Cannot open a client instance more than once.
    
    

    Or, even simpler:

    import httpx
    
    shared_client = httpx.Client()
    
    with shared_client as client:
        with shared_client as client_2:
            pass
    
    # The code above raises RuntimeError:
    # RuntimeError: Cannot open a client instance more than once.
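
    As a hedged sketch of how this advice could map onto the question's Getter class (the `_owns_client` flag is just an illustrative name, not part of httpx): never open/close the client per request, and only close a client the instance created itself:

    import httpx


    class Getter:
        def __init__(self, client: httpx.Client | None = None) -> None:
            # Remember whether we created the client, so we know whether to close it.
            self._owns_client = client is None
            self.client = client if client is not None else httpx.Client()

        def get(self, url: str) -> httpx.Response:
            # No `with` block here: pooled connections stay open across calls.
            response = self.client.get(url)
            response.raise_for_status()
            return response

        def close(self) -> None:
            # Close only a client this instance created; a user-provided client
            # remains the caller's responsibility.
            if self._owns_client:
                self.client.close()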