I am currently practicing web scraping with Selenium and keep running into a StaleElementReferenceException. I am trying to scrape phone information from a retail website across 3 pages. I used a for loop, and it worked fine for the 1st page, but then I hit the error. I have tried WebDriverWait, time.sleep, etc., but nothing worked. Please help me with this. Below is my code:
driver = webdriver.Chrome()
driver.get('https://tiki.vn/')
category = driver.find_element(By.XPATH,"//a[@title='Điện Thoại - Máy Tính Bảng']").click()
phone_information = []
for page in range(1,4):
    next_page = driver.find_element(By.XPATH, '//a[@data-view-label="{}"]'.format(page)).get_attribute('href')
    driver.get(next_page)
    element = (By.XPATH, '//div[@class="inner"]')
    WebDriverWait(driver, 30).until(EC.visibility_of_element_located(element))
    phone_names = driver.find_elements(By.XPATH , '//div[@class="info"]')
    for phone in phone_names:
        print(phone.text)
    WebDriverWait(driver, 60)
driver.quit()
This is the output:
StaleElementReferenceException Traceback (most recent call last)
Cell In[7], line 16
14 time.sleep(10)
15 for phone in phone_names:
---> 16 print(phone.text)
17 time.sleep(20)
19 WebDriverWait(driver, 60)
Just enclose it in a try/except block and re-find the element. It is a quick and dirty solution, and error-prone.
from selenium.common.exceptions import StaleElementReferenceException  # import the exception type
# ...
for phone in phone_names:
    try:
        print(phone.text)
    except StaleElementReferenceException:
        # Wait for the container again, then re-find the elements.
        # Note: reassigning phone_names does not restart the loop that is already running.
        element = (By.XPATH, '//div[@class="inner"]')
        WebDriverWait(driver, 30).until(EC.visibility_of_element_located(element))
        phone_names = driver.find_elements(By.XPATH , '//div[@class="info"]')
        continue
# ...
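One way to make the "re-find the element" idea less error-prone is to redo the lookup and the read together whenever a stale reference shows up, instead of patching the list in the middle of a loop. The following is only a minimal sketch: `read_texts_with_retry`, its parameters, and the retry counts are hypothetical names and values chosen for illustration, and the exception type is passed in so the helper does not depend on Selenium itself.

```python
import time


def read_texts_with_retry(find_elements, stale_errors, attempts=3, pause=0.5):
    """Return the text of every element, re-finding the whole collection on a stale error.

    find_elements: a zero-argument callable that performs a fresh lookup.
    stale_errors:  the exception type (or tuple of types) that signals a stale reference.
    """
    for attempt in range(1, attempts + 1):
        try:
            # Look the elements up and read them in one go, so the references are fresh.
            return [element.text for element in find_elements()]
        except stale_errors:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(pause)  # give the page a moment to settle before retrying


# Hypothetical usage with Selenium (assumes `driver` and `By` from the question):
# from selenium.common.exceptions import StaleElementReferenceException
# texts = read_texts_with_retry(
#     lambda: driver.find_elements(By.XPATH, '//div[@class="info"]'),
#     StaleElementReferenceException,
# )
```

Because the helper returns plain strings, nothing it hands back can go stale later, even if the page re-renders again.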
Hi. The problem is that the reference you created when calling find_element() points to an element that is no longer present in the DOM. Note that the CSS selector or XPath you pass to describe the structure of an element is distinct from the element per se. As stated in Selenium's docs for the WebDriver API, the error you're getting:
exception selenium.common.exceptions.StaleElementReferenceException(msg: Optional[str] = None, screen: Optional[str] = None, stacktrace: Optional[Sequence[str]] = None)
Bases: selenium.common.exceptions.WebDriverException
Thrown when a reference to an element is now “stale”.
Stale means the element no longer appears on the DOM of the page.
Possible causes of StaleElementReferenceException include, but not limited to:
- You are no longer on the same page, or the page may have refreshed since the element was located.
- The element may have been removed and re-added to the screen, since it was located. Such as an element being relocated.
- Element may have been inside an iframe or another context which was refreshed.
Although a more robust solution could be achieved with some refactoring, knowing that the problem is in the reference tells you that you only need to look up the element again and it should work fine. You've stated that the problem occurs when refreshing, and as you haven't provided the relevant HTML I'll just assume that the XPath for the new element stays the same.
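To make the difference between a locator and an element reference concrete, here is a toy model in plain Python. It is not how Selenium is implemented; `FakePage`, `FakeElement`, and `StaleError` are hypothetical stand-ins that only mimic the behaviour described in the docs quoted above: a reference is tied to one render of the page, while a fresh lookup always works.

```python
class StaleError(Exception):
    """Stand-in for StaleElementReferenceException in this toy model."""


class FakePage:
    def __init__(self, items):
        self._items = list(items)
        self.generation = 0  # bumped each time the page re-renders

    def refresh(self):
        self.generation += 1

    def find_elements(self):
        # Each lookup hands out references tied to the current render.
        return [FakeElement(self, i) for i in range(len(self._items))]


class FakeElement:
    def __init__(self, page, index):
        self._page = page
        self._index = index
        self._generation = page.generation  # remember which render we belong to

    @property
    def text(self):
        if self._generation != self._page.generation:
            raise StaleError("element belongs to an older render of the page")
        return self._page._items[self._index]


page = FakePage(["Phone A", "Phone B"])
old_refs = page.find_elements()
page.refresh()  # the DOM is rebuilt; the content is the same, the references are not

try:
    old_refs[0].text  # old reference: raises StaleError
except StaleError as exc:
    print("stale:", exc)

print([element.text for element in page.find_elements()])  # a fresh lookup works
```

The content of the page never changed in this model; only the references did, which is why repeating the same lookup is enough.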
To make this easier to work with, we should divide the task into its constituent parts, i.e. finding the elements and then operating on them. The traceback shows the exception is raised when the elements are read (phone.text), that is, when references found earlier are used. Furthermore, since what you look up is the phone_names collection rather than each phone individually, you don't need to guard every phone, just the collection.
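from selenium import webdriver
from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def print_phones(phone_names_element):
    # Operate on the collection: print the text of every phone element.
    for phone in phone_names_element:
        print(phone.text)

driver = webdriver.Chrome()
driver.get('https://tiki.vn/')
category = driver.find_element(By.XPATH,"//a[@title='Điện Thoại - Máy Tính Bảng']").click()
phone_information = []
for page in range(1,4):
    next_page = driver.find_element(By.XPATH, '//a[@data-view-label="{}"]'.format(page)).get_attribute('href')
    driver.get(next_page)
    # .until() returns the element once it is visible
    element = WebDriverWait(driver, 30).until(EC.visibility_of_element_located((By.XPATH, '//div[@class="inner"]')))
    phone_names = driver.find_elements(By.XPATH , '//div[@class="info"]')
    try:
        print_phones(phone_names)
    except StaleElementReferenceException:
        # Skip this page if the collection went stale while printing.
        continue
    WebDriverWait(driver, 60)
driver.quit()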
In the above code snippet I've extracted the print_phones functionality, inlined the element lookup (as .until() returns the element), and enclosed the call in a try block to avoid stale references.
Notes:
- sleep calls and explicit time waits with a driver-configured implicit wait
Hope it helps.