python web-scraping bots tracker

Unable to get the price of a product on Amazon when using Beautiful Soup in Python


I am trying to track the price of a product using Beautiful Soup, but whenever I run this code I get a 6-digit number, which I assume has something to do with reCAPTCHA. I have tried numerous times and checked the headers, the URL and the tags, but nothing seems to work.

from bs4 import BeautifulSoup
import requests
from os import environ
import lxml


headers = {
    "User-Agent": environ.get("User-Agent"),
    "Accept-Language": environ.get("Accept-Language")
}

amazon_link_address = (
    "https://www.amazon.in/Razer-Basilisk-Wired-Gaming-RZ01-04000100-R3M1/dp/B097F8H1MC/"
    "?_encoding=UTF8&pd_rd_w=6H9OF&content-id=amzn1.sym.1f592895-6b7a-4b03-9d72-1a40ea8fbeca"
    "&pf_rd_p=1f592895-6b7a-4b03-9d72-1a40ea8fbeca&pf_rd_r=1K6KK6W05VTADEDXYM3C"
    "&pd_rd_wg=IobLb&pd_rd_r=9fcac35b-b484-42bf-ba79-a6fdd803abf8&ref_=pd_gw_ci_mcx_mr_hp_atf_m"
)
response = requests.get(url=amazon_link_address, headers=headers)

soup = BeautifulSoup(response.content, features="lxml").prettify()

price = soup.find("a-price-whole")
print(price)

Solution

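  • "a-price-whole" is a class on a span tag, not a tag name, so BS4 cannot find it with find("a-price-whole"). This solution worked for me: I changed your "find" to a "find_all" and scan through all of the spans until one has the class you are searching for, then use get_text() on that span to get the price. I also removed .prettify(), because it returns a plain string, and calling find on a string returns a character index, which would explain the number you were seeing. Hope this helps!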

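    soup = BeautifulSoup(response.content, features="lxml")
    
    itemPrice = None  # stays None if no matching span is found
    for span in soup.find_all("span"):
        # get() returns None instead of raising KeyError on spans without a class
        if span.get("class") == ["a-price-whole"]:
            # drop the trailing decimal point; amazon.in lists prices in rupees
            itemPrice = f"₹{span.get_text().rstrip('.')}"
            break
    
    print(itemPrice)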
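
If you would rather not loop over every span, find can also filter on the class directly. The following is only a minimal sketch: SAMPLE_HTML is a made-up stand-in for the price markup (an assumption, not copied from Amazon), and it uses the built-in "html.parser" so it runs without lxml. It also shows what find does when it is called on the prettified string.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the price markup on a product page (an assumption);
# a live page may look different or return a CAPTCHA page instead.
SAMPLE_HTML = """
<span class="a-price">
  <span class="a-price-whole">4,299<span class="a-price-decimal">.</span></span>
</span>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# On the parsed soup, find() takes a tag name plus attribute filters.
tag = soup.find("span", class_="a-price-whole")
price_text = tag.get_text(strip=True).rstrip(".") if tag else None

# On the prettified *string*, find() is str.find and returns a character index.
index = soup.prettify().find("a-price-whole")

print(price_text)
print(type(index).__name__)
```

On a real page, check that tag is not None before using it, since the span will be missing if Amazon serves a CAPTCHA page instead of the product page.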