
Python3 BigQuery or Google Cloud Python through HTTP Proxy


How can I route BigQuery client calls through an HTTP proxy?

Before posting this, I tried the following, but the traffic is still not routed through the HTTP proxy. The Google Cloud service credentials are set through the shell environment variable GOOGLE_APPLICATION_CREDENTIALS.

import httplib2
import socks
import google.auth
from google.cloud import bigquery

credentials, _ = google.auth.default()
# Ask httplib2 to send requests through the HTTP proxy at someproxy:80
http_client = httplib2.Http(proxy_info=httplib2.ProxyInfo(socks.PROXY_TYPE_HTTP, 'someproxy', 80))

bigquery_client = bigquery.Client(credentials=credentials, _http=http_client)

Outgoing traffic (172.217.x.x belongs to googleapis.com) is not routed through the HTTP proxy:

$ netstat -nputw
Local Address           Foreign Address
x.x.x.x                 172.217.6.234:443       SYN_SENT
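
As an alternative to reading netstat output, one way to check whether an HTTP client honours a proxy setting is to point it at a throwaway local "proxy" that only records the request line it receives. The sketch below is a minimal illustration using the standard library only. It exercises urllib rather than the BigQuery client, and the names start_recording_proxy, request_via_proxy and the host example.invalid are hypothetical placeholders, not part of any Google library. For HTTPS traffic a real proxy would see a CONNECT request line instead of GET.

```python
import socket
import threading
import urllib.request


def start_recording_proxy():
    """Start a one-shot fake proxy on localhost; return (port, seen, thread)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    seen = []  # request lines received by the fake proxy

    def serve_once():
        conn, _ = server.accept()
        with conn:
            data = conn.recv(4096)
            # Keep only the first line, e.g. "GET http://host/ HTTP/1.1"
            seen.append(data.split(b"\r\n", 1)[0].decode("latin-1"))
            conn.sendall(
                b"HTTP/1.1 200 OK\r\n"
                b"Content-Length: 2\r\n"
                b"Connection: close\r\n\r\nok"
            )
        server.close()

    thread = threading.Thread(target=serve_once, daemon=True)
    thread.start()
    return port, seen, thread


def request_via_proxy(url, proxy_port):
    """Fetch url with urllib, explicitly routed through the local fake proxy."""
    proxy = urllib.request.ProxyHandler(
        {"http": "http://127.0.0.1:%d" % proxy_port}
    )
    opener = urllib.request.build_opener(proxy)
    with opener.open(url, timeout=5) as response:
        return response.read()


if __name__ == "__main__":
    port, seen, thread = start_recording_proxy()
    request_via_proxy("http://example.invalid/", port)
    thread.join(timeout=5)
    # A non-empty list means the request really went to the proxy
    print(seen)
```

If the list stays empty for the client under test, the client is connecting directly, which matches the SYN_SENT to 172.217.x.x shown above.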

Solution

  • I am answering my own question because I found the reason and a workaround.

    Reason:

    The google-cloud-python library uses httplib2. As of this writing, httplib2 has two separate code bases, one for Python 2 and one for Python 3, and the Python 3 version does not implement SOCKS/proxy support. See __init__.py of the Python 3 code base in httplib2's repo.

    Work Around:

    There is a discussion about moving google-cloud-python from httplib2 to urllib3, but in the meantime one can use httplib2shim:

    import os

    import google.auth
    import google_auth_httplib2
    import httplib2shim
    from google.cloud import bigquery

    # A more declarative way exists, but this is kept for simplicity
    os.environ["HTTP_PROXY"] = "someproxy:80"
    os.environ["HTTPS_PROXY"] = "someproxy:80"
    http_client = httplib2shim.Http()
    credentials, _ = google.auth.default()

    # IMO, the following two steps should be done inside google-cloud-python.
    # This exposes client-specific logic, and the library already does that itself.
    credentials = google.auth.credentials.with_scopes_if_required(
        credentials, bigquery.Client.SCOPE)
    authed_http = google_auth_httplib2.AuthorizedHttp(credentials, http_client)

    bigquery_client = bigquery.Client(credentials=credentials, _http=authed_http)
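
Since the workaround changes os.environ for the whole process, it can help to scope the change and to confirm that the variables are actually visible before building the client. The following is a minimal sketch using only the standard library. The helper name proxy_env is hypothetical, the "http://" scheme prefix on the proxy URL is an assumption (the code above uses a bare host:port), and I have not verified at which point httplib2shim reads these variables, so constructing the client inside the block is the conservative choice.

```python
import contextlib
import os
import urllib.request


@contextlib.contextmanager
def proxy_env(proxy_url):
    """Temporarily set the HTTP(S) proxy variables, restoring old values on exit."""
    # Set both spellings, since tools differ in which case they read
    names = ["HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"]
    saved = {name: os.environ.get(name) for name in names}
    try:
        for name in names:
            os.environ[name] = proxy_url
        yield
    finally:
        for name, value in saved.items():
            if value is None:
                os.environ.pop(name, None)  # variable did not exist before
            else:
                os.environ[name] = value  # put the previous value back


if __name__ == "__main__":
    with proxy_env("http://someproxy:80"):
        # getproxies() reports the proxy settings urllib sees in this process;
        # the httplib2shim.Http() and bigquery.Client(...) calls would go here.
        print(urllib.request.getproxies())
```

The context manager restores the previous values even if client construction raises, so unrelated code in the same process is not silently routed through the proxy.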