I am having trouble fetching certain URLs with urllib. Below is the code sample I use:
from urllib.request import Request, urlopen
import re
url = ""  # placeholder; set to each of the URLs listed below
req = Request(url)
html_page = urlopen(req).read()
print(len(html_page))
Here are the results I get for two links:
url = "https://www.dafont.com"
Length: 0
url = "https://www.stackoverflow.com"
Length: 196673
Anyone got any idea why this happens?
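Try sending a browser-like User-Agent header, as in the code below; you should then get a response. Some websites only respond properly to user agents they recognize, and urllib identifies itself as "Python-urllib/3.x" by default.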
from urllib.request import Request, urlopen
url = "https://www.dafont.com"
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"}
req = Request(url, headers=headers)
html_page = urlopen(req).read()
print(len(html_page))
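If you want to reuse this, here is a minimal sketch that wraps the same idea in a helper and also shows what urllib sends when you do not set the header. The names BROWSER_UA, default_user_agent, build_request and fetch are made up for illustration, and the exact way a given site reacts to an unknown user agent (empty body, 403, something else) is an assumption you would need to check per site.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, build_opener, urlopen

# Assumed browser-like value (copied from the answer above); any current
# browser User-Agent string should serve the same purpose.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"
)


def default_user_agent():
    """Return the User-Agent urllib sends when none is set explicitly."""
    return dict(build_opener().addheaders)["User-agent"]


def build_request(url, user_agent=BROWSER_UA):
    """Hypothetical helper: build a Request carrying the given User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_UA, timeout=10):
    """Hypothetical helper: return (status, body); status is None if unreachable."""
    try:
        with urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status, resp.read()
    except HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        return err.code, err.read()
    except URLError:
        # DNS failure, refused connection, and similar network problems.
        return None, b""


if __name__ == "__main__":
    print("urllib default:", default_user_agent())
    status, body = fetch("https://www.dafont.com")
    print(status, len(body))
```

Returning the status code alongside the body makes it easier to tell an empty page apart from a blocked request, which the bare length in the question could not show.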