My setup: Python 3.8, brotli 1.0.9.
I fetch a URL with requests and send Accept-Encoding with "br" in the headers, so I think I need to decode the Brotli response myself. I want to accept br because I believe it is the better encoding.
import brotli
import requests

# Function name assumed: the original snippet starts inside a function body.
def fetch_page():
    headers = {}
    headers['Accept'] = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
    headers['Accept-Encoding'] = "gzip, deflate, br"
    headers['Host'] = "book.douban.com"
    headers['Referer'] = "book.douban.com"
    headers['Sec-Fetch-Dest'] = "document"
    headers['Sec-Fetch-Mode'] = "navigate"
    headers['Upgrade-Insecure-Requests'] = "1"
    s = requests.Session()
    url = "https://book.douban.com/tag/%E5%B0%8F%E8%AF%B4"
    try:
        response = s.get(url, headers=headers)
    except:
        return ""
    if response.status_code == 200:
        print(response.headers)
        if response.headers.get('Content-Encoding') == 'br':
            # This is the call that fails.
            data = brotli.decompress(response.content)
            data1 = data.decode('utf-8')
            return data1
        else:
            return response.text
    return ""
It raises this error:
    data = brotli.decompress(response.content)
brotli.error: BrotliDecompress failed
I could not find this in the Requests documentation, but once brotli is installed, Requests handles Brotli decoding itself.
This means response.content is already decoded automatically (as with gzip), so you don't need to call brotli.decompress(response.content). Calling it on the already-decoded body is what fails with "BrotliDecompress failed".
If brotli is not installed, you won't get any error message. Instead, response.content simply stays encoded.
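One way to avoid ending up with a body that silently stays encoded is to offer "br" only when the brotli module can actually be imported. Here is a minimal sketch using only the standard library; build_accept_encoding is a hypothetical helper name, not part of Requests or urllib3:

```python
import importlib.util


def build_accept_encoding():
    """Return an Accept-Encoding value that offers "br" only if brotli is importable.

    Hypothetical helper for illustration; not part of Requests or urllib3.
    """
    encodings = ["gzip", "deflate"]
    # find_spec returns None when the top-level module cannot be found.
    if importlib.util.find_spec("brotli") is not None:
        encodings.append("br")
    return ", ".join(encodings)


# Use the result in place of the hard-coded "gzip, deflate, br" value.
headers = {"Accept-Encoding": build_accept_encoding()}
```

With this header the server should only send a Brotli body when the client can decode it, so response.text can be used directly without calling brotli.decompress.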
Edit:
Digging into the Requests code, I found that Requests uses urllib3.response, which implements the Brotli support.
When it is loaded, urllib3.response tries to import brotli:
try:
    import brotli
except ImportError:
    brotli = None
Then, when decoding a response, it picks the appropriate decoder:
def _get_decoder(mode):
    if "," in mode:
        return MultiDecoder(mode)

    if mode == "gzip":
        return GzipDecoder()

    if brotli is not None and mode == "br":
        return BrotliDecoder()

    return DeflateDecoder()
So if brotli is installed, decoding happens; otherwise nothing happens, and the user gets no warning.
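To make the "optional import, then dispatch on Content-Encoding" pattern concrete, here is a simplified stand-alone sketch. It is not urllib3's actual code: decode_body and its (data, handled) return value are assumptions for illustration, and unlike urllib3 it reports explicitly when a body was left encoded.

```python
import gzip
import zlib

try:
    import brotli  # optional third-party package
except ImportError:
    brotli = None


def decode_body(raw, content_encoding):
    """Decode raw bytes according to a single Content-Encoding value.

    Hypothetical helper. Returns (data, handled); handled is False when
    the body had to be returned still encoded.
    """
    encoding = (content_encoding or "").strip().lower()
    if encoding in ("", "identity"):
        return raw, True  # nothing to decode
    if encoding == "gzip":
        return gzip.decompress(raw), True
    if encoding == "deflate":
        # Handles zlib-wrapped deflate only; raw deflate streams are not covered here.
        return zlib.decompress(raw), True
    if encoding == "br" and brotli is not None:
        return brotli.decompress(raw), True
    # Unsupported encoding, or "br" without brotli installed.
    return raw, False
```

A caller can check the second value and log a warning when it is False, which is the signal that is missing in the behaviour described above.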
Edit 2: In fact, this is mentioned in https://docs.python-requests.org/en/latest/user/quickstart/#binary-response-content