I am using the code from this topic to try and get the HTML of a website, using a headless browser. Here is my full code:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
binary = r'C:\Program Files\Mozilla Firefox\firefox.exe'
options = Options()
options.add_argument("-headless")
options.add_argument("headless")
options.add_argument("--headless")
options.headless = True
options.binary = binary
cap = DesiredCapabilities().FIREFOX
driver = webdriver.Firefox(options=options, executable_path="C:\\Users\\Aviv\\Downloads\\geckodriverv0.26.0-win64\\geckodriver.exe")#,capabilities=cap)
driver.get("https://coronaupdates.health.gov.il/corona-updates/grid/public-transport")
print(driver.find_elements_by_class_name("mat-row"))
When I just let the code run and print, the result is an empty list: []. But if I use the PyCharm evaluator, I get a list with 20 elements.
It looks like the evaluator and the actual command in the code are executed differently, or the assignment did not go well.
I would appreciate any help. Thank you!
I suspect it prints [] because the elements with that class have not been loaded into the DOM yet at the moment find_elements_by_class_name runs. When you use the evaluator, the script is paused, so by the time you evaluate the expression the DOM update has most likely completed. That is why, in the latter case, it returns the elements that match the given class.
Try adding a sleep call to your script, between loading the page and looking up the elements:
import time
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

binary = r'C:\Program Files\Mozilla Firefox\firefox.exe'
options = Options()
options.headless = True
options.binary = binary
driver = webdriver.Firefox(options=options, executable_path="C:\\Users\\Aviv\\Downloads\\geckodriverv0.26.0-win64\\geckodriver.exe")
driver.get("https://coronaupdates.health.gov.il/corona-updates/grid/public-transport")
time.sleep(5)  # Fixed delay: give the page 5 seconds to finish rendering the rows.
print(driver.find_elements_by_class_name("mat-row"))
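Also compare explicit and implicit waits, in both headless and non-headless mode. Either is usually more reliable than a fixed sleep, because the script continues as soon as the elements appear.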
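To illustrate what an explicit wait does, here is a minimal sketch that polls for the rows instead of sleeping for a fixed time. The helper name wait_for_elements and its timeout and poll parameters are my own for illustration and are not part of Selenium. The comments at the bottom show the equivalent call using Selenium's own WebDriverWait, which I would expect to be the better choice in a real script.

```python
import time


def wait_for_elements(find, timeout=10.0, poll=0.5):
    """Call find() repeatedly until it returns a non-empty list.

    find    -- zero-argument callable that returns a list of elements
    timeout -- maximum number of seconds to keep trying
    poll    -- pause between attempts, in seconds
    Returns the elements found, or [] if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = find()
        if found:
            return found
        if time.monotonic() >= deadline:
            return []
        time.sleep(poll)


# Usage with the driver from the question (not executed here):
#   rows = wait_for_elements(lambda: driver.find_elements_by_class_name("mat-row"))
#   print(rows)
#
# Selenium's built-in explicit wait does the same kind of polling:
#   from selenium.webdriver.common.by import By
#   from selenium.webdriver.support.ui import WebDriverWait
#   from selenium.webdriver.support import expected_conditions as EC
#   rows = WebDriverWait(driver, 10).until(
#       EC.presence_of_all_elements_located((By.CLASS_NAME, "mat-row")))
#
# An implicit wait is set once on the driver and applies to every lookup:
#   driver.implicitly_wait(10)
```

Unlike time.sleep(5), this returns as soon as the rows exist and only waits the full timeout when they never show up. Note that WebDriverWait raises a TimeoutException when the condition is not met in time, whereas this sketch returns an empty list.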