
Python urllib returns an empty page for specific URLs


I am having trouble with specific links with urllib. Below is the code sample I use:

from urllib.request import Request, urlopen
import re

url = ""
req = Request(url)
html_page = urlopen(req).read()

print(len(html_page))

Here are the results I get for two links:

url = "https://www.dafont.com"
Length: 0

url = "https://www.stackoverflow.com"
Length: 196673

Anyone got any idea why this happens?


Solution

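  • Try sending a browser-like User-Agent header with the request, and you will get the response. Some websites reject requests, or return an empty body, when they do not recognize the User-Agent, and urllib identifies itself as Python-urllib by default.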

    from urllib.request import Request, urlopen
    
    url = "https://www.dafont.com"
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"}
    req = Request(url, headers=headers)
    html_page = urlopen(req).read()
    
    print(len(html_page))
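
To check what urllib sends by default and to tell an empty body apart from an HTTP error, something like the following sketch may help. The helper names (default_user_agent, build_request, fetch) and the 10-second timeout are illustrative assumptions, not part of urllib, and the User-Agent string is just the one from the snippet above.

```python
from urllib.error import HTTPError
from urllib.request import Request, build_opener, urlopen

# Assumed browser-like value, copied from the answer above.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"
)


def default_user_agent():
    """Return the User-Agent urllib sends when none is set explicitly."""
    # A fresh opener carries its default headers in `addheaders`.
    return dict(build_opener().addheaders)["User-agent"]


def build_request(url, user_agent=BROWSER_UA):
    """Hypothetical helper: a Request with an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Hypothetical helper: return (status, body) for url.

    On an HTTP error status (for example 403) the body is returned as b"".
    Connection-level failures (URLError) are left to propagate.
    """
    try:
        with urlopen(build_request(url), timeout=timeout) as resp:
            return resp.status, resp.read()
    except HTTPError as err:
        # The server answered, but with an error status.
        return err.code, b""
```

Printing default_user_agent() shows the Python-urllib identifier that some sites filter on, and calling fetch("https://www.dafont.com") returns the status code together with the body, so a blocked request is easier to distinguish from a page that is genuinely empty.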