Tags: javascript, python, selenium-webdriver, screen-scraping

StaleElementReferenceException - Page randomly scrolls back up?


url: https://www.wunderground.com/history/monthly/at/vienna/LOWW/date/2025-1

I seem to be getting this exception because, at a random point while I iterate through the table, the page scrolls back up. The table is then no longer in the DOM, so the table element can't be found.

Any advice on how to avoid this? Thanks

Code

element = driver2.find_element(By.XPATH, '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table')
driver2.execute_script('arguments[0].scrollIntoView()', element)
t = driver2.find_element(By.XPATH, '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table/tbody/tr/td[1]/table')
rows = t.find_elements(By.XPATH, './/tr')

data = []
for row in rows:
    days = row.find_element(By.XPATH, './/td').text
    print(days)
    data.append({'days':days})

Error

StaleElementReferenceException            Traceback (most recent call last)
Cell In[14], line 32
     26 data = []
     31 for row in rows:
---> 32     days = row.find_element(By.XPATH, './/td').text
     33     driver2.execute_script('arguments[0].scrollIntoView()', element)
     34     print(days)

Solution

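  • The working code below is explained in its comments:

    A StaleElementReferenceException is raised when a previously located element is no longer attached to the DOM. To avoid that, the code uses JavaScript to scroll down the page until all of the content has loaded, and only then locates the cells with Selenium's explicit waits.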

    import time
    
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    driver = webdriver.Chrome()
    driver.maximize_window()
    
    driver.get("https://www.wunderground.com/history/monthly/at/vienna/LOWW/date/2025-1")
    wait = WebDriverWait(driver, 20)
    
    # The lines below switch into the consent iframe, click the `Accept all` button,
    # and switch back to the main document.
    # If you do not see this pop-up, leave these lines out.
    wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "sp_message_iframe_1225696")))
    wait.until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Accept all']"))).click()
    driver.switch_to.default_content()
    
    # Scroll to the bottom of the page repeatedly until the page height
    # stops growing, i.e. until all the data has loaded
    scroll_js = "window.scrollTo(0, document.body.scrollHeight); return document.body.scrollHeight;"
    len_of_page = driver.execute_script(scroll_js)
    while True:
        last_count = len_of_page
        time.sleep(3)
        len_of_page = driver.execute_script(scroll_js)
        if last_count == len_of_page:
            break
    
    # Wait until the cells of the first 'Days data' table are visible and collect them in a list
    days = wait.until(EC.visibility_of_all_elements_located((By.XPATH, "(//table[@aria-labelledby='Days data'])[1]//td")))
    
    # Print the days
    for day in days:
        print(day.text)
    

    Result:

    Jan
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    
    Process finished with exit code 0
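
A possible variation on the approach above, offered as a sketch rather than a tested fix for this site: read all the cell texts in a single `execute_script` call, so that no `WebElement` references are held in Python while the page may still be re-rendering. The helper names (`scroll_until_stable`, `read_texts`, `to_day_records`) and the `max_rounds` cap are my own assumptions, not part of the original code. Note that `textContent` also returns hidden text, so its output can differ from Selenium's `.text`.

```python
import time

# JavaScript that evaluates an XPath and returns the trimmed text of every match.
# Everything is read in one call, so no element references live on the Python side.
TEXTS_BY_XPATH_JS = """
const snapshot = document.evaluate(
    arguments[0], document, null,
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
const texts = [];
for (let i = 0; i < snapshot.snapshotLength; i++) {
    texts.push(snapshot.snapshotItem(i).textContent.trim());
}
return texts;
"""

SCROLL_JS = "window.scrollTo(0, document.body.scrollHeight); return document.body.scrollHeight;"


def scroll_until_stable(driver, pause=3.0, max_rounds=20):
    """Scroll to the bottom until the page height stops changing.

    max_rounds (a hypothetical safety cap) prevents an endless loop on pages
    that keep growing. Returns the last observed page height.
    """
    height = driver.execute_script(SCROLL_JS)
    for _ in range(max_rounds):
        time.sleep(pause)
        new_height = driver.execute_script(SCROLL_JS)
        if new_height == height:
            break
        height = new_height
    return height


def read_texts(driver, xpath):
    """Return the text of every node matching xpath, read in a single script call."""
    return driver.execute_script(TEXTS_BY_XPATH_JS, xpath)


def to_day_records(texts):
    """Keep only numeric cells (dropping the month header) as {'days': ...} dicts."""
    return [{"days": text} for text in texts if text.isdigit()]
```

With the `driver` from the code above, this could be used as `scroll_until_stable(driver)` followed by `data = to_day_records(read_texts(driver, "(//table[@aria-labelledby='Days data'])[1]//td"))`, which builds the `data` list from the question without iterating over row elements.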