I was trying to do some web scraping with the following code:
from bs4 import BeautifulSoup
import requests
import pandas as pd
page = requests.get('https://www.google.com/search?q=phagwara+weather')
soup = BeautifulSoup(page.content, 'html-parser')
day = soup.find(id='wob_wc')
print(day.find_all('span'))
But I keep getting the following error:
File "C:\Users\myname\Desktop\webscraping.py", line 6, in <module>
soup = BeautifulSoup(page.content, 'html-parser')
File "C:\Users\myname\AppData\Local\Programs\Python\Python38-32\lib\site-packages\bs4\__init__.py", line 225, in __init__
raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: html-parser. Do you need to install a parser library?
I installed lxml and html5lib, but the issue persists.
It helps to name the tag as well, so instead of soup.find(id="wob_wc"), use soup.find("div", id="wob_wc"). Searching by id alone also works, so this is not what raises the error.
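The error itself comes from the parser name: it is html.parser, not html-parser. The difference is the dot. Installing lxml or html5lib doesn't help, because html-parser isn't the name of any parser.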
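To see the difference without hitting Google at all, here is a minimal sketch that parses a small made-up HTML string (the markup is only an assumption for illustration, not what Google actually returns). It should raise FeatureNotFound for the hyphenated name and parse fine with the dotted one:

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Made-up sample markup, only for illustration
html = '<div id="wob_wc"><span>21</span></div>'

error = None
try:
    # Hyphenated name: bs4 has no tree builder registered under it
    BeautifulSoup(html, "html-parser")
except FeatureNotFound as exc:
    error = str(exc)

# Dotted name: Python's built-in parser, no extra install needed
soup = BeautifulSoup(html, "html.parser")
widget = soup.find(id="wob_wc")
spans = [span.get_text() for span in widget.find_all("span")]

print(error)
print(spans)
```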
Also, Google will usually respond with status 200 even when it has blocked the request, so the status code won't tell you whether you were blocked. You usually have to check r.content instead.
I've added a User-Agent header, and now it works:
import requests
from bs4 import BeautifulSoup
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0'}
r = requests.get(
"https://www.google.com/search?q=phagwara+weather", headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')
print(soup.find("div", id="wob_wc"))
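Since a 200 status doesn't guarantee the weather block is in the page, it may be worth guarding against soup.find returning None. The following is only a sketch: extract_weather_spans is a hypothetical helper name, and it assumes the widget is still a div with id wob_wc, which Google can change at any time.

```python
from bs4 import BeautifulSoup


def extract_weather_spans(html):
    """Return the text of each <span> inside the weather widget, or None if it is missing.

    Hypothetical helper; assumes the widget is a <div id="wob_wc">.
    """
    soup = BeautifulSoup(html, "html.parser")
    widget = soup.find("div", id="wob_wc")
    if widget is None:
        # The page loaded, but the widget is not there:
        # possibly a block/consent page, or the markup changed
        return None
    return [span.get_text(strip=True) for span in widget.find_all("span")]
```

You would call it as extract_weather_spans(r.content) with the response from the request above, and treat a None result as a sign to inspect r.content by hand.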