Tags: python, python-requests, google-cloud-storage, httpx

Refactor from requests HTTPAdapter to httpx HTTPTransport


Previously, we were using requests.adapters.HTTPAdapter per this answer with requests==2.32.3 and google-cloud-storage==2.16.0:

from google.cloud import storage
from requests.adapters import HTTPAdapter

gcs_client = storage.Client()

adapter = HTTPAdapter(pool_connections=30, pool_maxsize=30)
gcs_client._http.mount("https://", adapter)
gcs_client._http._auth_request.session.mount("https://", adapter)

We are migrating our code base to httpx. This GitHub issue comment recommends using custom transports. I tried something like the code below with httpx==0.27.0, but it doesn't work:

import google.auth
import httpx
from google.cloud import storage

transport = httpx.HTTPTransport(
    limits=httpx.Limits(
        max_connections=30, max_keepalive_connections=30
    )
)
http = httpx.Client(transport=transport)
http.is_mtls = False  # Emulating https://github.com/googleapis/google-auth-library-python/blob/v2.29.0/google/auth/transport/requests.py#L400
gcs_client = storage.Client(
    _http=http,
    # Emulating https://github.com/googleapis/python-cloud-core/blob/v2.4.1/google/cloud/client/__init__.py#L178
    credentials=google.auth.default(scopes=storage.Client.SCOPE)[0],
)

This implementation throws an Unauthorized error:

google.api_core.exceptions.Unauthorized: 401 GET https://storage.googleapis.com/storage/v1/b/foo?projection=noAcl&prettyPrint=false: Anonymous caller does not have storage.buckets.get access to the Google Cloud Storage bucket. Permission 'storage.buckets.get' denied on resource (or it may not exist).

How can one move from requests.adapters.HTTPAdapter to httpx.HTTPTransport?


This question is similar to How to refactor a request HTTPAdapter for use with aiohttp?, but for httpx as opposed to aiohttp.


Solution

  • I can't test this at the moment, but you can try a custom transport that wraps httpx.HTTPTransport and adds the auth headers to every request sent through it.

    import google.auth
    import google.auth.transport.requests
    import httpx
    from google.cloud import storage
    
    credentials, project = google.auth.default(scopes=storage.Client.SCOPE)
    
    
    def get_httpx_client(credentials):
        class AuthenticatedTransport(httpx.BaseTransport):
            def __init__(self, transport):
                self.transport = transport
                self.auth_request = google.auth.transport.requests.Request()
    
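            def handle_request(self, request):
                # Copy the headers into a plain dict so google-auth can mutate it.
                headers = dict(request.headers)
    
                # Refreshes the token if needed and adds the Authorization header.
                credentials.before_request(
                    self.auth_request, request.method, str(request.url), headers
                )
                request.headers.update(headers)
                return self.transport.handle_request(request)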
    
        transport = httpx.HTTPTransport(
            limits=httpx.Limits(max_connections=30, max_keepalive_connections=30)
        )
        authenticated_transport = AuthenticatedTransport(transport)
    
        return httpx.Client(transport=authenticated_transport)
    
    
    custom_client = get_httpx_client(credentials)
    
    gcs_client = storage.Client(_http=custom_client, project=project)
    

    UPDATE

    For streaming responses, use the client's stream() method as described in https://www.python-httpx.org/quickstart/#streaming-responses

    Something like:

    def fetch_stream_file(url):
        with custom_client.stream("GET", url) as response:
            response.raise_for_status()
            for chunk in response.iter_bytes():
                yield chunk