
Python (requests) encoding trouble (UTF-8 - CP1251)


I am trying to request a URL like http://example.com/?param=%DD%CC%C0-15 with the Python requests library, like this:

group = "ЭМА-15".encode('cp1251')
r = requests.get('http://example.com/?param=' + group)
r.encoding = "cp1251"

(because the site uses the windows-1251 (cp1251) encoding)

But I get an error at line 2: UnicodeDecodeError: 'utf8' codec can't decode byte 0xdd in position 82: invalid continuation byte. This sequence of bytes (0xDD (%DD)...) is exactly what I need, though. How can I fix this?


Solution

  • There are two things to fix. 1. The Python interpreter needs to know the encoding of the "ЭМА-15" string literal in the source file, so declare the source encoding and use a unicode literal. 2. requests normally percent-encodes query parameters for you, but since you are building the URL by hand, it is best to quote the value yourself.

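    # -*- coding: utf-8 -*-
    # Python 2 code: in Python 3, quote_plus lives in urllib.parse.
    import urllib
    import requests
    
    # Encode the text to the bytes the site expects (windows-1251).
    group = u"ЭМА-15".encode('cp1251')
    # Percent-encode those bytes for use in the query string.
    param = urllib.quote_plus(group)
    print(param)
    r = requests.get('http://example.com/?param=' + param)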
    

    Output

    %DD%CC%C0-15
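
For readers on Python 3, the same idea can be sketched with the standard library alone. There, `quote_plus` lives in `urllib.parse` and accepts an `encoding` argument, so the manual `.encode('cp1251')` step is not needed. The helper names `encode_query_value` and `build_url` are made up for this illustration and are not part of requests or urllib.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_query_value(text, encoding="cp1251"):
    """Percent-encode text using the bytes of the given encoding."""
    return quote_plus(text, encoding=encoding)


def build_url(base, name, text, encoding="cp1251"):
    """Build a URL with one query parameter (hypothetical helper)."""
    return f"{base}?{name}={encode_query_value(text, encoding)}"


param = encode_query_value("ЭМА-15")
print(param)                 # %DD%CC%C0-15

# For comparison, the default (UTF-8) quoting yields different bytes,
# which a windows-1251 site would not interpret as intended.
print(quote_plus("ЭМА-15"))  # %D0%AD%D0%9C%D0%90-15

url = build_url("http://example.com/", "param", "ЭМА-15")
print(url)                   # http://example.com/?param=%DD%CC%C0-15
```

The resulting `url` can then be passed to `requests.get(url)`. requests is expected to leave sequences that are already percent-encoded untouched, but that is worth checking against the version you have installed.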