python selenium-webdriver web-scraping webdriverwait

Selenium Issues With Waiting Until Multiple Dropdowns Populate


I am having a great deal of trouble using Python Selenium for a multi-dropdown URL retrieval algorithm. In short, I'd like my program to go to the link (https://www.xpel.com/clearbra-installers/) and iterate through each of the three location dropdowns to retrieve every possible final link (e.g. https://www.xpel.com/clearbra-installers/united-states/california/bakersfield).

I have tried several different methods of doing this, including clicking on the table options, iterating through the dropdowns, and even collecting dataframes of each dropdown's options, then building URLs and visiting those links. No matter how I approach the problem, I always hit a snag related to wait times and "stale element not found" errors.

Here is the html I'm trying to scrape:

<select class="dealer-locator-select" id="dealer-locator-country-select" name="dealer-locator-country-select" data-action="country-select" data-filter="">
<option value=""> Region, Country or Area </option>
<option value="albania"> Albania </option>
<option value="algeria"> Algeria </option>
<option value="argentina"> Argentina </option>
<option value="armenia"> Armenia </option>
<option value="australia"> Australia </option>
<option value="austria"> Austria </option>
<option value="bahrain"> Bahrain </option>
<option value="belgium"> Belgium </option>
<option value="bosnia-and-herzegovina"> Bosnia and Herzegovina </option>
...
<option value="vietnam"> Vietnam </option>
</select>

Here is some code I've tried for retrieving countries:

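from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait

# set wait (driver is an already-started WebDriver on the installers page)
delay = 10 # seconds
wait = WebDriverWait(driver, delay)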

# iterate through country options
for i in range(2,96):
    wait.until(EC.presence_of_element_located((By.XPATH, "//select[@id='dealer-locator-country-select']/option[" + str(i) + "]")))
    country_name = driver.find_element(By.XPATH, "//select[@id='dealer-locator-country-select']/option[" + str(i) + "]").text.lower().replace(' ', '-')
    country = Select(driver.find_element(By.XPATH, "//select[@id='dealer-locator-country-select']")).select_by_value(country_name)

print("done")

The code seems to work fine until the selection; for instance, if I just retrieve the country name and print it, everything works perfectly. I've also tried using try/except/finally, which I haven't had any luck with either:

from time import sleep

# set wait
delay = 10 # seconds
wait = WebDriverWait(driver, delay)

# iterate through each country option
for i in range(2,96):
    wait.until(EC.presence_of_element_located((By.XPATH, "//select[@id='dealer-locator-country-select']/option[" + str(i) + "]")))
    country_name = driver.find_element(By.XPATH, "//select[@id='dealer-locator-country-select']/option[" + str(i) + "]").text.lower().replace(' ', '-')
    try:
        country = Select(driver.find_element(By.XPATH, "//select[@id='dealer-locator-country-select']"))
    except:
        sleep(5)
        country = Select(driver.find_element(By.XPATH, "//select[@id='dealer-locator-country-select']"))
    finally:
        country.select_by_value(country_name)

print("done")

Solution

  • You are getting a StaleElementReferenceException because the dropdown is re-rendered each time a new option is selected.

    Because the script doesn't wait for that re-render to finish, the next iteration can pick up a reference to the old dropdown, which is no longer attached to the DOM and can't be interacted with.

    To handle this, wait for the old dropdown element to become stale after each selection, and only then look it up again:

    # uses `driver` and `wait` from the question's setup
    option_xpath = "//select[@id='dealer-locator-country-select']/option[{}]"

    for i in range(2, 96):
        # wait for the i-th option to be present, then derive its value from the visible text
        option = wait.until(EC.presence_of_element_located((By.XPATH, option_xpath.format(i))))
        country_name = option.text.lower().replace(' ', '-')
        # look the dropdown up again on every iteration; the previous reference is stale
        country_element = driver.find_element(By.ID, "dealer-locator-country-select")
        Select(country_element).select_by_value(country_name)
        # block until the old dropdown is detached, i.e. the page has re-rendered it
        wait.until(EC.staleness_of(country_element))
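
    The loop above rebuilds each option's value from its visible text. Since every option in the HTML from the question already carries a value attribute, one possible alternative is to read those values up front as plain strings, which cannot go stale the way element references do. Below is a minimal sketch using only the standard library, assuming the page HTML is available as a string (for example from driver.page_source). The names option_values and installer_url are hypothetical helpers, and the URL pattern is only inferred from the example link in the question, not confirmed against the site.

```python
from html.parser import HTMLParser

# Assumption: final links follow <base>/<country>/<state>/<city>,
# as suggested by the example URL in the question.
BASE_URL = "https://www.xpel.com/clearbra-installers/"


class _OptionCollector(HTMLParser):
    """Collect non-empty `value` attributes of <option> tags inside one <select>."""

    def __init__(self, select_id):
        super().__init__()
        self.select_id = select_id
        self.inside = False  # True while parsing inside the target <select>
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select" and attrs.get("id") == self.select_id:
            self.inside = True
        elif tag == "option" and self.inside and attrs.get("value"):
            # The placeholder option has value="" and is skipped here.
            self.values.append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self.inside = False


def option_values(page_source, select_id):
    """Return the option values of the <select> with the given id, in page order."""
    parser = _OptionCollector(select_id)
    parser.feed(page_source)
    return parser.values


def installer_url(*slugs):
    """Join location slugs onto the base URL (hypothetical helper)."""
    return BASE_URL + "/".join(slugs)
```

    With Selenium this could be called as option_values(driver.page_source, "dealer-locator-country-select") after each re-render. The ids of the state and city dropdowns are not shown in the question, so they would need to be looked up on the page before applying the same approach to them.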