Consider this function that makes a simple GET request to an API endpoint:
import httpx

def check_status_without_session(url: str) -> int:
    response = httpx.get(url)
    return response.status_code
Running this function will open a new TCP connection every time check_status_without_session is called. Now, this section of the HTTPX documentation recommends using the Client API when making multiple requests to the same URL. The following function does that:
import httpx

def check_status_with_session(url: str) -> int:
    with httpx.Client() as client:
        response = client.get(url)
        return response.status_code
According to the docs, using Client will ensure that:
... a Client instance uses HTTP connection pooling. This means that when you make several requests to the same host, the Client will reuse the underlying TCP connection, instead of recreating one for every single request.
My question is: in the second case, I have wrapped the Client context manager in a function. If I call check_status_with_session multiple times with the same URL, wouldn't that just create a new pool of connections each time the function is called? That would imply it's not actually reusing the connections. Since the function's stack frame is destroyed once the function returns, the Client object should be destroyed as well, right? Is there any advantage in doing it like this or is there a better way?
Is there any advantage in doing it like this or is there a better way?
No, there is no advantage to using httpx.Client in the way you've shown. In fact, the httpx.<method> API, e.g. httpx.get, does exactly the same thing!
The "pool" is a feature of the transport held by the Client, which is HTTPTransport by default. The transport is created at Client initialization time and stored as the instance attribute self._transport.
Creating a new Client instance means a new HTTPTransport instance, and transport instances have their own TCP connection pool. By creating a new Client instance each time and using it only once, you get no benefit over using e.g. httpx.get directly.
And that might be OK! Connection pooling is an optimization over creating a new TCP connection for each request. Your application may not need that optimization; it may already be performant enough for your needs.
If you are making many requests to the same endpoint in a tight loop, running the loop inside a single Client context may give you some throughput gains, e.g.
import httpx

# One Client for the whole loop, so requests can reuse pooled connections.
with httpx.Client(base_url="https://example.com") as client:
    results = [client.get(f"/api/resource/{idx}") for idx in range(100)]
For such I/O-heavy workloads you may do even better by executing the requests concurrently, e.g. using httpx.AsyncClient.