python-requests fiddler ntlm-authentication

Using Python requests library through Fiddler and NTLM authentication is inconsistently successful


I have been working on and researching this problem for almost 20 hours now.

All web traffic on my machine is routed through Fiddler, which then connects to our corporate proxy. Everything works fine except Python applications trying to reach remote servers over HTTPS (HTTP always works).

I exported the corporate certificate and pasted it into the file C:\anaconda2\envs\py36\Lib\site-packages\certifi\cacert.pem. I also passed it explicitly to my requests.get call using verify=. Neither made any difference in behaviour.
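For readers repeating the certificate step, here is a minimal sketch of an alternative to editing certifi's cacert.pem in place: write a combined bundle to a separate file and point verify= at it. The helper name build_ca_bundle and its arguments are hypothetical, and it assumes the corporate certificate was exported in Base-64 (PEM) form rather than DER.

```python
from pathlib import Path


def build_ca_bundle(base_bundle, extra_cert, out_path):
    """Write base_bundle followed by extra_cert to out_path; return the path.

    Hypothetical helper: base_bundle would be a copy of certifi's cacert.pem,
    extra_cert the exported corporate certificate in PEM form.
    """
    base = Path(base_bundle).read_text(encoding="utf-8")
    extra = Path(extra_cert).read_text(encoding="utf-8")
    if "-----BEGIN CERTIFICATE-----" not in extra:
        raise ValueError("extra_cert does not look like a PEM certificate")
    # Keep the PEM blocks separated by a blank line and end with a newline.
    combined = base.rstrip("\n") + "\n\n" + extra.strip() + "\n"
    out = Path(out_path)
    out.write_text(combined, encoding="utf-8")
    return str(out)
```

The returned path can be passed as requests.get(url, verify=path) or exported through the REQUESTS_CA_BUNDLE environment variable. Note that this only addresses certificate verification; it would not be expected to change a 407 response from a proxy.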

I set up the local Fiddler proxy information as environment variables. Fiddler is also configured to Automatically Authenticate. Using HTTP works without any problems.
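As a rough illustration of the environment-variable setup, the sketch below sets the variables for the current process and prints what Python's standard library resolves from them. It assumes Fiddler is listening on its default local endpoint, 127.0.0.1:8888; the helper name point_process_at_proxy is hypothetical.

```python
import os
from urllib.request import getproxies_environment

# Assumption: Fiddler listens on its default endpoint, 127.0.0.1:8888.
FIDDLER_PROXY = "http://127.0.0.1:8888"


def point_process_at_proxy(proxy_url):
    """Set the proxy variables for this process and return what urllib sees."""
    # The https_proxy value still uses an http:// URL: the client talks plain
    # HTTP to the local proxy and tunnels HTTPS through it with CONNECT.
    os.environ["http_proxy"] = proxy_url
    os.environ["https_proxy"] = proxy_url
    return getproxies_environment()


proxies = point_process_at_proxy(FIDDLER_PROXY)
print(proxies.get("http"), proxies.get("https"))
```

requests reads the same variables by default, so no proxies= argument is needed once they are set.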

I seem to be able to connect to remote servers over HTTPS only if I request http://www.google.com first and then quickly make the HTTPS request. A subsequent attempt yields the error below.

requests.get('http://www.google.com') # always works for any website
<Response [200]>

requests.get('https://www.anaconda.com') # works after visiting http://www.google.com
<Response [200]>

requests.get('https://www.anaconda.com') # always fails, unless visiting http://www.google.com first

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
C:\anaconda2\envs\py36\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    593             if is_new_proxy_conn:
--> 594                 self._prepare_proxy(conn)
    595

C:\anaconda2\envs\py36\lib\site-packages\urllib3\connectionpool.py in _prepare_proxy(self, conn)
    804         conn.set_tunnel(self._proxy_host, self.port, self.proxy_headers)
--> 805         conn.connect()
    806

C:\anaconda2\envs\py36\lib\site-packages\urllib3\connection.py in connect(self)
    307             # self._tunnel_host below.
--> 308             self._tunnel()
    309             # Mark this connection as not reusable

C:\anaconda2\envs\py36\lib\http\client.py in _tunnel(self)
    918             raise OSError("Tunnel connection failed: %d %s" % (code,
--> 919                                                                message.strip()))
    920         while True:

OSError: Tunnel connection failed: 407 Proxy Authentication Required

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
C:\anaconda2\envs\py36\lib\site-packages\requests\adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    448                     retries=self.max_retries,
--> 449                     timeout=timeout
    450                 )

C:\anaconda2\envs\py36\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    637             retries = retries.increment(method, url, error=e, _pool=self,
--> 638                                         _stacktrace=sys.exc_info()[2])
    639             retries.sleep()

C:\anaconda2\envs\py36\lib\site-packages\urllib3\util\retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    397         if new_retry.is_exhausted():
--> 398             raise MaxRetryError(_pool, url, error or ResponseError(cause))
    399

MaxRetryError: HTTPSConnectionPool(host='www.google.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',)))

During handling of the above exception, another exception occurred:

ProxyError                                Traceback (most recent call last)
<ipython-input-49-df48f2544f7e> in <module>
----> 1 requests.get('https://www.google.com')

C:\anaconda2\envs\py36\lib\site-packages\requests\api.py in get(url, params, **kwargs)
     73
     74     kwargs.setdefault('allow_redirects', True)
---> 75     return request('get', url, params=params, **kwargs)
     76
     77

C:\anaconda2\envs\py36\lib\site-packages\requests\api.py in request(method, url, **kwargs)
     58     # cases, and look like a memory leak in others.
     59     with sessions.Session() as session:
---> 60         return session.request(method=method, url=url, **kwargs)
     61
     62

C:\anaconda2\envs\py36\lib\site-packages\requests\sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    531         }
    532         send_kwargs.update(settings)
--> 533         resp = self.send(prep, **send_kwargs)
    534
    535         return resp

C:\anaconda2\envs\py36\lib\site-packages\requests\sessions.py in send(self, request, **kwargs)
    644
    645         # Send the request
--> 646         r = adapter.send(request, **kwargs)
    647
    648         # Total elapsed time of the request (approximately)

C:\anaconda2\envs\py36\lib\site-packages\requests\adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    508
    509             if isinstance(e.reason, _ProxyError):
--> 510                 raise ProxyError(e, request=request)
    511
    512             if isinstance(e.reason, _SSLError):

ProxyError: HTTPSConnectionPool(host='www.google.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',)))

One of the network folks watched the corporate proxy log as I made the requests. When an HTTPS request failed, he saw no corresponding connection to the corporate proxy in its log.
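One way to look more closely at which hop answers with the 407 is to send a bare CONNECT to the local proxy and inspect the raw status line and headers it returns. The following is only a diagnostic sketch using the standard library; probe_connect is a hypothetical helper, and the host and port passed to it are whatever the local proxy listens on.

```python
import socket


def probe_connect(proxy_host, proxy_port, target_host, target_port=443,
                  timeout=5.0):
    """Send an unauthenticated CONNECT and return (status_line, headers)."""
    request = (
        f"CONNECT {target_host}:{target_port} HTTP/1.1\r\n"
        f"Host: {target_host}:{target_port}\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((proxy_host, proxy_port),
                                  timeout=timeout) as sock:
        sock.sendall(request)
        data = b""
        # Read until the end of the response headers or until the peer closes.
        while b"\r\n\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    head = data.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    headers = [
        tuple(part.strip() for part in line.split(":", 1))
        for line in header_lines
        if ":" in line
    ]
    return status_line, headers
```

Calling probe_connect("127.0.0.1", 8888, "www.google.com") (assuming Fiddler's default port) and looking at any Proxy-Authenticate headers in the result may hint at whether the challenge is being passed back to the client instead of being answered by the local proxy.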

Other things tried:

Thanks.


Solution

  • For those having the same issue.

    Further research led me to download Px (px.exe), a Python application available on GitHub, and to drop Fiddler, which only worked intermittently for Python apps trying to get out to the Internet.

    Px itself required ZERO configuration in my case. I just had to set the http_proxy and https_proxy environment variables so that any Python app would know where to send its traffic. Then I ran Px and everything worked.

    Hope this helps people out.
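A small sanity check that can help before pointing Python apps at Px: confirm that something is actually accepting connections on the local proxy port. This is a sketch only; proxy_is_listening is a hypothetical helper, and the default of 127.0.0.1:3128 is an assumption about where Px listens, so adjust it to match your setup.

```python
import socket


def proxy_is_listening(host="127.0.0.1", port=3128, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Assumption: 3128 is the port Px listens on; change it if yours differs.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

If this returns False, setting http_proxy and https_proxy will not help until the proxy process is running.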