
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: html-parser. Do you need to install a parser library?


I was trying to do some web scraping with the following code:

from bs4 import BeautifulSoup
import requests
import pandas as pd

page = requests.get('https://www.google.com/search?q=phagwara+weather')
soup = BeautifulSoup(page.content, 'html-parser')
day = soup.find(id='wob_wc')

print(day.find_all('span'))

But I keep getting the following error:

 File "C:\Users\myname\Desktop\webscraping.py", line 6, in <module>
    soup = BeautifulSoup(page.content, 'html-parser')
  File "C:\Users\myname\AppData\Local\Programs\Python\Python38-32\lib\site-packages\bs4\__init__.py", line 225, in __init__
    raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: html-parser. Do you need to install a parser library?

I installed lxml and html5lib, but the issue persists.


Solution

  • It helps to name the tag so the lookup is more specific: instead of soup.find(id="wob_wc"), use soup.find("div", id="wob_wc").

    The parser name is html.parser, not html-parser; the difference is the dot. That misspelled name is what raises FeatureNotFound, so installing lxml or html5lib does not help.

    Also, Google usually returns a 200 status even when it has blocked the request, so the status code alone will not tell you whether you were blocked. You usually have to check r.content.

    I've added a User-Agent header, and now it works.

    import requests
    from bs4 import BeautifulSoup
    
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0'}
    r = requests.get(
        "https://www.google.com/search?q=phagwara+weather", headers=headers)
    soup = BeautifulSoup(r.content, 'html.parser')
    
    print(soup.find("div", id="wob_wc"))
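The two points above (the parser name and checking what actually came back) can be tried offline with a minimal sketch. The inline SAMPLE_HTML string and the helper names parse and span_texts are made up for illustration; the markup is a stand-in, not Google's real page. Only BeautifulSoup, FeatureNotFound, find, find_all and get_text come from bs4.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Stand-in for a fetched page (illustrative only, not Google's real markup).
SAMPLE_HTML = '<div id="wob_wc"><span>Mon</span><span>Tue</span></div>'


def parse(html, parser="html.parser"):
    """Parse html; return None if bs4 does not recognise the parser name."""
    try:
        return BeautifulSoup(html, parser)
    except FeatureNotFound:
        return None


def span_texts(soup, element_id="wob_wc"):
    """Return the text of each <span> in the div with this id, or [] if it is missing."""
    container = soup.find("div", id=element_id)
    if container is None:
        # find() returns None when nothing matches, e.g. on a block page.
        return []
    return [span.get_text() for span in container.find_all("span")]


bad = parse(SAMPLE_HTML, "html-parser")   # hyphen: unknown parser name
good = parse(SAMPLE_HTML, "html.parser")  # dot: Python's built-in parser

print(bad is None)
print(span_texts(good))
print(span_texts(parse("<p>blocked</p>")))
```

Guarding against None matters here because a blocked or changed page simply will not contain the wob_wc element, and calling find_all on None raises AttributeError.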