I'm writing a battle.net screen scraper in Python, and I want to scrape this page.
The problem is that my script gets a 404 when it tries to download the page, even though the same page loads fine in a web browser.
Here is the code I'm using, in case it helps (it requires the requests library):
    import requests

    def download(url, max_retries=10):
        # Try the request up to max_retries times; return the body on the first 200.
        for i in range(max_retries):
            print('Downloading: ' + url)
            r = requests.get(url)
            print('Status code: ' + str(r.status_code))
            if r.status_code == requests.codes.ok:
                return r.content
        return None

    download('http://us.battle.net/sc2/en/game/unit')
Thanks for any answers.
Try this. Apparently the trailing slash is necessary: the only change from your code is that the URL passed to download() now ends in "/".
    import requests

    def download(url, max_retries=10):
        # Try the request up to max_retries times; return the body on the first 200.
        for i in range(max_retries):
            print('Downloading: ' + url)
            r = requests.get(url)
            print('Status code: ' + str(r.status_code))
            if r.status_code == requests.codes.ok:
                return r.content
        return None

    download('http://us.battle.net/sc2/en/game/unit/')  # note the trailing slash
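If you build these URLs in code rather than typing them by hand, it may help to normalize them before downloading. Here is a minimal sketch using only the standard library. The helper name with_trailing_slash is made up for illustration, and the rule it applies (skip paths whose last segment contains a dot, since those look like files) is my own assumption, not something battle.net documents.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url):
    """Return url with '/' appended to its path, unless it already ends in one.

    Hypothetical helper: paths whose last segment contains a dot
    (e.g. 'page.html') are assumed to be files and are left unchanged.
    """
    parts = urlsplit(url)
    path = parts.path
    last_segment = path.rsplit('/', 1)[-1]
    if path.endswith('/') or '.' in last_segment:
        return url
    # Rebuild the URL so any query string or fragment stays after the new slash.
    return urlunsplit(parts._replace(path=path + '/'))
```

You could then call download(with_trailing_slash(url)) instead of download(url). Whether other pages on the site need the same treatment is something you would have to check page by page.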