python docker network-programming google-cloud-storage docker-for-mac

Request with IPv4 from python to gcs emulator


I'm trying to make a request from a Python application to a GCS emulator over the docker-compose bridge network in Docker for Mac. The GCS client library appears to make the request to the emulator over IPv6, and it fails because Docker for Mac does not support IPv6.

I've applied the answer referenced below to force IPv4, but the request still seems to go out over IPv6.

How can I make a successful request to the GCS emulator from Python inside a docker-compose network?

I have confirmed that requests from a local Python script to the GCS emulator succeed when docker-compose is not involved.

docker-for-mac problem: https://github.com/docker/for-mac/issues/1432

referenced answer: Force requests to use IPv4 / IPv6

gcs emulator: https://github.com/fsouza/fake-gcs-server

sample docker-compose.yaml:

version: '3'
services:
  run:
    build: .
    container_name: run
    ports:
      - 9090:8080
    env_file: 
      - ./.env
    environment:
      - PORT=8080
  gcs:
    image: fsouza/fake-gcs-server:latest
    container_name: fake-gcs-server
    ports:
      - 4443:4443
    env_file: 
      - ./.env    

sample implementation:

from google.cloud import storage
from google.api_core.client_options import ClientOptions
from google.auth.credentials import AnonymousCredentials
from unittest.mock import patch
from multijob_sample import variables as vs
import requests
import urllib3
import urllib3.util.connection
import traceback

import socket

# Keep a reference to the real resolver, then patch socket.getaddrinfo so that
# every lookup is restricted to IPv4 (AF_INET), whatever family the caller requests.
orig_getaddrinfo = socket.getaddrinfo
def getaddrinfoIPv4(host, port, family=0, type=0, proto=0, flags=0):
    print('running patched getaddrinfo')
    return orig_getaddrinfo(host=host, port=port, family=socket.AF_INET, type=type, proto=proto, flags=flags)
patcher = patch('socket.getaddrinfo', side_effect=getaddrinfoIPv4)
patcher.start()


# for fake-gcs-emulator
http_ssl_disabled = requests.Session()
http_ssl_disabled.verify = False
urllib3.disable_warnings(
    urllib3.exceptions.InsecureRequestWarning
)  # suppress warnings about unverified HTTPS certificates

client = storage.Client(
    credentials=AnonymousCredentials(),
    project=vs.project_id,
    client_options=ClientOptions(api_endpoint='https://gcs:4443'), 
    _http=http_ssl_disabled,
)

def put_file(bucket_id: str, file, blobname: str):
    file.seek(0)
    try:
        client.get_bucket(bucket_id).blob(blob_name=blobname).upload_from_file(file)
        print(f'file {blobname} uploaded')
    except Exception as e:
        print(f'failed to put file: {blobname}')
        print(f'error: {e}')
        print(f'trace: {traceback.format_exc()}')


put_file("bucketid", file, "blobname")  # `file` is an already-open binary file object

error message:

run              | running patched getaddrinfo
run              | failed to put file: test.csv
run              | error: HTTPSConnectionPool(host='::', port=4443): Max retries exceeded with url: /upload/resumable/efbbcde9c49cda2ff78e8da24371ea03 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f8fb0765be0>: Failed to establish a new connection: [Errno -9] Address family for hostname not supported'))
run              | trace: Traceback (most recent call last):
run              |   File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 169, in _new_conn
run              |     conn = connection.create_connection(
run              |   File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 73, in create_connection
run              |     for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
run              |   File "/usr/local/lib/python3.9/unittest/mock.py", line 1093, in __call__
run              |     return self._mock_call(*args, **kwargs)
run              |   File "/usr/local/lib/python3.9/unittest/mock.py", line 1097, in _mock_call
run              |     return self._execute_mock_call(*args, **kwargs)
run              |   File "/usr/local/lib/python3.9/unittest/mock.py", line 1158, in _execute_mock_call
run              |     result = effect(*args, **kwargs)
run              |   File "/app/multijob_sample/storage.py", line 26, in getaddrinfoIPv4
run              |     return orig_getaddrinfo(host=host, port=port, family=socket.AF_INET, type=type, proto=proto, flags=flags)
run              |   File "/usr/local/lib/python3.9/socket.py", line 954, in getaddrinfo
run              |     for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
run              | socket.gaierror: [Errno -9] Address family for hostname not supported

Solution

  • This was, by far, one of the most annoying issues I've had to troubleshoot in a while. The solution is to run the emulator with the -external-url http://<your docker compose service name>:<port> option.

    This issue shows up only with file uploads because it affects only resumable uploads. For a resumable upload, the GCS client first "initiates" the upload with the server, and the server's response includes a URL that subsequent requests must go to (not sure why, but it seems like a reasonable part of a complex API). The problem is that the emulator doesn't know its own URL! If you look at the emulator's logs, you'll see it print lines like server started at http://[::]:4443. That :: is the same :: you see in the error. So the emulator responds with its :: URL, and a little later the client fails when it tries to connect to that URL.

    I'm still not sure why it works when running outside of docker-compose; I guess there's some special-casing somewhere around "", "localhost" or "::".
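
    The failure described above can be reproduced without the emulator. The following is only a minimal sketch using the standard library: the session URL is a hypothetical one modelled on the error message in the question, and resolves_as_ipv4 is an illustrative helper, not part of the GCS client. It shows that the host taken from such a URL is the literal ::, and that this host cannot be resolved once lookups are restricted to IPv4, which is what the patched getaddrinfo in the question does.

```python
import socket
from urllib.parse import urlsplit

# Hypothetical resumable-upload session URL, modelled on the error in the
# question: the emulator advertises its listen address ([::]) instead of a
# hostname the client can reach.
session_url = "https://[::]:4443/upload/resumable/efbbcde9c49cda2ff78e8da24371ea03"

parts = urlsplit(session_url)
host, port = parts.hostname, parts.port  # brackets are stripped from the IPv6 literal


def resolves_as_ipv4(host: str, port: int) -> bool:
    """Return True if host resolves when only IPv4 (AF_INET) results are allowed."""
    try:
        socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        return False


print(host, port)
print(resolves_as_ipv4(host, port))
```

    If the emulator is started with -external-url pointing at the compose service name, the returned URL should carry that hostname instead of ::, so the same lookup would go through the compose network's DNS.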