How can I download a webpage with a user agent other than the default one on urllib2.urlopen?
urllib2.urlopen is not available in Python 3.x; the 3.x equivalent is urllib.request.urlopen. See Changing User Agent in Python 3 for urrlib.request.urlopen to set the user agent in 3.x with the standard library HTTP facilities.
See "Setting the User-Agent" in everyone's favorite, Dive Into Python.
The short story: You can use Request.add_header to do this.
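A minimal sketch of the add_header approach follows. The URL and the user agent string are placeholders chosen for illustration, and the try/except import is only there so the same snippet runs on both 2.x (urllib2) and 3.x (urllib.request). Nothing is fetched here; the Request is only built and inspected.

```python
try:
    from urllib2 import Request  # Python 2.x
except ImportError:
    from urllib.request import Request  # Python 3.x equivalent

# Hypothetical URL and user agent string, for illustration only.
url = "http://www.example.com/"
user_agent = "MyScript/1.0"

request = Request(url)
request.add_header("User-Agent", user_agent)

# Request stores header names via str.capitalize(), so the key
# to look up afterwards is "User-agent", not "User-Agent".
print(request.get_header("User-agent"))

# To actually download the page, pass the Request object to urlopen
# (needs network access, so it is left commented out here):
# html = urlopen(request).read()
```

The point is that urlopen accepts a Request object in place of a plain URL string, so any headers set on the Request are sent with it.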
You can also pass the headers as a dictionary when creating the Request itself, as the docs note:
headers should be a dictionary, and will be treated as if add_header() was called with each key and value as arguments. This is often used to “spoof” the User-Agent header, which is used by a browser to identify itself – some HTTP servers only allow requests coming from common browsers as opposed to scripts. For example, Mozilla Firefox may identify itself as "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11", while urllib2’s default user agent string is "Python-urllib/2.6" (on Python 2.6).
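The dictionary form described in the docs could look roughly like this. The helper name build_request and the example URL are hypothetical; the Firefox string is the one quoted above. As before, the import fallback is just so the sketch runs on either 2.x or 3.x, and no network request is made.

```python
try:
    from urllib2 import Request  # Python 2.x
except ImportError:
    from urllib.request import Request  # Python 3.x equivalent


def build_request(url, user_agent):
    """Hypothetical helper: build a Request carrying a custom User-Agent."""
    # Passing headers= is treated as if add_header() were called
    # once per key/value pair in the dictionary.
    return Request(url, headers={"User-Agent": user_agent})


firefox_ua = "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11"
request = build_request("http://www.example.com/", firefox_ua)

# Header names are stored capitalized as "User-agent".
print(request.get_header("User-agent"))
```

Either form should produce the same Request; the dictionary is mostly a convenience when several headers are set at once.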