python-3.x python-3.8 brotli

Python 3.8 and brotli: why do I get "brotli.error: BrotliDecompress failed"?


My environment is Python 3.8.

brotli==1.0.9

I fetch a URL with requests and send an Accept-Encoding header that includes "br", so I need to decode the Brotli response. I want to accept br because I think it is the better encoding.

import brotli
import requests

# Browser-like request headers; "br" tells the server that Brotli is accepted.
headers = {}
headers['Accept'] = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
headers['Accept-Encoding'] = "gzip, deflate, br"
headers['Host'] = "book.douban.com"
headers['Referer'] = "book.douban.com"
headers['Sec-Fetch-Dest'] = "document"
headers['Sec-Fetch-Mode'] = "navigate"
headers['Upgrade-Insecure-Requests'] = "1"

def fetch(url):
    s = requests.Session()
    try:
        response = s.get(url, headers=headers)
    except requests.RequestException:
        return ""
    if response.status_code == 200:
        print(response.headers)
        if response.headers.get('Content-Encoding') == 'br':
            # This is the call that raises brotli.error
            data = brotli.decompress(response.content)
            data1 = data.decode('utf-8')
            return data1
        else:
            return response.text
    return ""

url = "https://book.douban.com/tag/%E5%B0%8F%E8%AF%B4"
fetch(url)

This raises the following error:

data = brotli.decompress(response.content)
brotli.error: BrotliDecompress failed

Solution

  • This is easy to miss in the Requests documentation, but once brotli is installed, Requests handles Brotli decoding itself.

    This means that response.content is decoded automatically (as with gzip), so you don't need to call brotli.decompress(response.content). Calling it on the already-decoded body is what raises BrotliDecompress failed.

    If brotli is not installed, you won't get any error message. Instead, response.content stays encoded.

    Edit:

    Digging into the Requests code, I found that Requests uses urllib3.response, which implements the Brotli support.

    When it is loaded, urllib3.response tries to import brotli:

    try:
        import brotli
    except ImportError:
        brotli = None
    

    Then, when decoding a response, it picks the appropriate decoder:

    def _get_decoder(mode):
        if "," in mode:
            return MultiDecoder(mode)
    
        if mode == "gzip":
            return GzipDecoder()
    
        if brotli is not None and mode == "br":
            return BrotliDecoder()
    
        return DeflateDecoder()
    

    Thus, if Brotli is installed, decoding happens automatically; otherwise nothing happens and the user gets no warning.

    Edit 2: In fact, this is mentioned in https://docs.python-requests.org/en/latest/user/quickstart/#binary-response-content
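
Since a missing brotli package fails silently, one way to avoid receiving a body that nothing can decode is to advertise "br" only when the package is importable. The following is a minimal sketch under that assumption, not part of Requests or urllib3; the helper names brotli_available and accept_encoding are hypothetical and chosen for illustration.

```python
import importlib.util


def brotli_available() -> bool:
    """Return True if the 'brotli' package can be imported.

    Assumption: this mirrors the 'import brotli' check quoted above;
    it only looks the module up and does not import it.
    """
    return importlib.util.find_spec("brotli") is not None


def accept_encoding() -> str:
    """Build an Accept-Encoding value (hypothetical helper).

    'br' is only advertised when a Brotli decoder is importable, so the
    server is not asked for an encoding that would be left undecoded.
    """
    encodings = ["gzip", "deflate"]
    if brotli_available():
        encodings.append("br")
    return ", ".join(encodings)


if __name__ == "__main__":
    print(accept_encoding())
```

The result can be assigned to headers['Accept-Encoding'] before the request is sent. The body is then read from response.text or response.content as usual, without an explicit brotli.decompress call.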