
Python requests library added an additional header "Accept-Encoding: identity"


This is my code.

import requests
from sys import exit
proxies = {
    "http": "127.0.0.1:8888",
    "https": "127.0.0.1:8888",
}

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0",
    "Accept-Encoding": "gzip, deflate",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Connection": "keep-alive"
}


login_page = "http://www.test.com/login/"
r = requests.get(login_page, proxies=proxies, headers=headers)
original_cookies = r.cookies
exit(0)

This is what I got from Fiddler2. As you can see, the request contains an additional header, Accept-Encoding: identity, that I did not set.

GET http://www.test.com/login/ HTTP/1.1
Accept-Encoding: identity
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Host: www.test.com
Connection: keep-alive
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0

I'm using Python 3.3.2 on Windows 7 64 bit and requests 1.2.3.

Can anyone give some suggestions?

Thanks.


Solution

  • This originates deep within the bowels of http.client, which is used by urllib3, which in turn is used by requests.

    http.client actually checks whether the headers dictionary it receives already contains an accept-encoding entry, and if it does, it skips adding the identity header. The only problem is that the headers dictionary it receives looks something like this:

    CaseInsensitiveDict({b'Accept-Encoding': 'gzip, deflate, compress', ...})
    

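    So why is it not working? requests encodes the header names to bytes, and since in Python 3 comparing a str object with a bytes object for equality is always False, the check performed in http.client never finds the existing accept-encoding entry. As a result, the identity header is added anyway.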
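To see the mismatch without involving requests at all, here is a minimal standard-library sketch. It assumes http.client lower-cases the header names it is given and then looks for the str 'accept-encoding' among them. CapturingConnection and accept_encoding_lines are names made up for this illustration, and nothing is sent over the network.

```python
import http.client


class CapturingConnection(http.client.HTTPConnection):
    """Hypothetical helper: records the request bytes instead of sending them."""

    def __init__(self, host):
        super().__init__(host)
        self.sent = b""

    def send(self, data):
        # Overriding send() means no socket is ever opened.
        self.sent += data


def accept_encoding_lines(headers):
    """Return the Accept-Encoding header lines http.client would emit."""
    conn = CapturingConnection("www.test.com")
    conn.request("GET", "/login/", headers=headers)
    return [
        line
        for line in conn.sent.split(b"\r\n")
        if line.lower().startswith(b"accept-encoding:")
    ]


# Header name as str: http.client finds it and skips the identity default.
print(accept_encoding_lines({"Accept-Encoding": "gzip, deflate"}))

# Header name as bytes (what requests 1.2.3 passes down, per the answer above):
# the str lookup misses it, so the identity header is added as well.
print(accept_encoding_lines({b"Accept-Encoding": "gzip, deflate"}))
```

On a Python 3 interpreter I would expect the first call to list only the gzip, deflate line and the second to list the identity line followed by the gzip, deflate line, which is the same order as in the Fiddler capture. Details may differ between Python versions.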

    If you really want to get rid of the additional header, the quickest way would be to either comment out line 340 in requests/models.py (the line that encodes the header names), or monkeypatch requests.models.PreparedRequest.prepare_headers.
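A rough sketch of the monkeypatch route, assuming prepare_headers is where the names get encoded and that it only needs to store the headers on the prepared request. prepare_headers_keep_str is a hypothetical name, and the replacement skips anything else the original method may do in your version of requests (newer releases validate headers there), so treat it as a workaround for the affected 1.x versions rather than a general fix.

```python
import requests
import requests.models
from requests.structures import CaseInsensitiveDict


def prepare_headers_keep_str(self, headers):
    """Replacement that stores header names unchanged (no encoding to bytes)."""
    self.headers = CaseInsensitiveDict(headers or {})


# Monkeypatch: every PreparedRequest created after this line uses the replacement.
requests.models.PreparedRequest.prepare_headers = prepare_headers_keep_str

# Build a request without sending it, just to inspect the prepared headers.
prepared = requests.Request(
    "GET",
    "http://www.test.com/login/",
    headers={"Accept-Encoding": "gzip, deflate"},
).prepare()
print(list(prepared.headers.items()))
```

Apply the patch once, before the first requests.get call. With the names left as str, the check in http.client should find the existing accept-encoding entry again.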

    Edit: this seems to be fixed in the (not yet released) 2.0 branch of requests.