I am creating a Python program that uses web scraping to check if an item is in stock. The code is a Python 3.9 script, using Beautiful Soup 4 and requests to scrape for the item's availability. I would eventually like to make the program search multiple websites and multiple links within each site so I don't have to have a bunch of scripts running at once. The expected result of the program is this:
200
0
In Stock
But I am getting:
200
[]
Out Of Stock
The '200' is the HTTP status code, which shows whether the code can reach the server; 200 is the expected result. The '0' is meant to be a boolean showing whether the item is in stock; the expected response is '0' for In Stock. I have given it both in-stock and out-of-stock items, and they both give the same response of `200`, `[]`, `Out Of Stock`. I have a feeling there is something wrong with `out_of_stock_divs` within `def check_item_in_stock`, because that's where I am getting the `[]` result when it looks up the availability of the item.
I had the code working correctly yesterday, but I kept adding features (like scraping multiple links and different websites), and that broke it. I can't get it back to a working condition.
Here's the program code. (I based this code on Mr. Arya Boudaie's code on his website, https://aryaboudaie.com/. I got rid of his text notifications, though, because I plan to run this on a spare computer next to me and have it play a loud sound, which will be implemented later.)
```python
import time  # needed for time.sleep() in the polling loop below

from bs4 import BeautifulSoup
import requests


def get_page_html(url):
    headers = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36"}
    page = requests.get(url, headers=headers)
    print(page.status_code)
    return page.content


def check_item_in_stock(page_html):
    soup = BeautifulSoup(page_html, 'html.parser')
    out_of_stock_divs = soup.findAll("text", {"class": "product-inventory"})
    print(out_of_stock_divs)
    return len(out_of_stock_divs) != 0


def check_inventory():
    url = "https://www.newegg.com/hp-prodesk-400-g5-nettop-computer/p/N82E16883997492?Item=9SIA7ABC996974"
    page_html = get_page_html(url)
    if check_item_in_stock(page_html):
        print("In stock")
    else:
        print("Out of stock")


while True:
    check_inventory()
    time.sleep(60)
```
The product inventory status is located inside a `<div>` tag, not a `<text>` tag:
```python
import requests
from bs4 import BeautifulSoup


def get_page_html(url):
    headers = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36"}
    page = requests.get(url, headers=headers)
    print(page.status_code)
    return page.content


def check_item_in_stock(page_html):
    soup = BeautifulSoup(page_html, 'html.parser')
    out_of_stock_divs = soup.findAll("div", {"class": "product-inventory"}) # <--- change "text" to div
    print(out_of_stock_divs)
    return len(out_of_stock_divs) != 0


def check_inventory():
    url = "https://www.newegg.com/hp-prodesk-400-g5-nettop-computer/p/N82E16883997492?Item=9SIA7ABC996974"
    page_html = get_page_html(url)
    if check_item_in_stock(page_html):
        print("In stock")
    else:
        print("Out of stock")


check_inventory()
```
Prints:
200
[<div class="product-inventory"><strong>In stock.</strong></div>]
In stock
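
To see why the original call returned `[]`, here is a minimal sketch that runs both lookups against a static snippet (the HTML is copied from the printed output above, so no request is made). The first argument to `find_all` is the tag name, and the snippet contains no `<text>` element. The variable names `SAMPLE_HTML`, `by_text_tag`, and `by_div_tag` are only illustrative.

```python
from bs4 import BeautifulSoup

# Static snippet taken from the printed output above; no network access needed.
SAMPLE_HTML = '<div class="product-inventory"><strong>In stock.</strong></div>'

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Looks for <text class="product-inventory">, which does not exist in the snippet.
by_text_tag = soup.find_all("text", {"class": "product-inventory"})

# Looks for <div class="product-inventory">, which does exist.
by_div_tag = soup.find_all("div", {"class": "product-inventory"})

print(by_text_tag)  # []
print(by_div_tag)   # [<div class="product-inventory"><strong>In stock.</strong></div>]
```

`find_all` is the current spelling of `findAll`; both names behave the same way here.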
Note: The HTML markup of that site has probably changed at some point. I'd also modify the `check_item_in_stock` function so that it checks the element's text instead of only its presence:
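```python
def check_item_in_stock(page_html):
    soup = BeautifulSoup(page_html, 'html.parser')
    out_of_stock_div = soup.find("div", {"class": "product-inventory"})
    # find() returns None when the element is missing, so guard before reading its text
    if out_of_stock_div is None:
        return False
    return out_of_stock_div.get_text(strip=True) == "In stock."
```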
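
Since the question mentions eventually checking several links from one script, here is one possible sketch of that, not a definitive design. It assumes every page marks availability the same way as the Newegg page above, which will not hold for other sites; the `tag` and `css_class` parameters are there so each site could be given its own selector. The names `is_in_stock`, `check_all`, `CANNED`, and `PAGES`, and the `example.com` URLs, are all hypothetical. Canned HTML stands in for real responses so the example runs without network access.

```python
from bs4 import BeautifulSoup


def is_in_stock(page_html, tag="div", css_class="product-inventory"):
    """Return True if the stock element exists and its text starts with 'in stock'."""
    soup = BeautifulSoup(page_html, "html.parser")
    node = soup.find(tag, {"class": css_class})
    if node is None:  # element missing: treat the item as not in stock
        return False
    return node.get_text(strip=True).lower().startswith("in stock")


def check_all(pages, fetch):
    """pages maps a label to a URL; fetch is any callable returning HTML for a URL."""
    return {label: is_in_stock(fetch(url)) for label, url in pages.items()}


# Hypothetical canned responses, keyed by made-up URLs.
CANNED = {
    "https://example.com/a": '<div class="product-inventory"><strong>In stock.</strong></div>',
    "https://example.com/b": '<div class="product-inventory"><strong>Out of stock.</strong></div>',
    "https://example.com/c": "<p>No inventory element on this page</p>",
}

PAGES = {
    "item-a": "https://example.com/a",
    "item-b": "https://example.com/b",
    "item-c": "https://example.com/c",
}

results = check_all(PAGES, CANNED.get)
print(results)  # {'item-a': True, 'item-b': False, 'item-c': False}
```

For live pages, `get_page_html` from the code above could be passed as `fetch` in place of `CANNED.get`, with the whole call wrapped in the `while True` / `time.sleep(60)` loop from the question.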