I am a complete newbie to programming in the Windows environment with Python, Selenium, and ChromeDriver, although I am familiar with doing analogous tasks in the UNIX environment using Perl and HTML. All of my packages are at their latest versions. I am doing this just for fun to keep my mind active in my retirement years.
I have written a Python program to scrape data from the website https://www.kijijiautos.ca/cars/gmc/canyon/#ml=%3A&ms=9900%3B13&od=up&p=5000%3A&sb=p&yc=2024%3A2024. I want to use the "Set location" button to enter my postal code (a Canadian thing). I am trying to use Chromedriver commands to emulate what I do manually. Manually I would click the "Set location" button which brings up a new little window. Then I would enter my postal code and hit the Enter key to enable it. Next I click on the "Radius" field and choose the "1000" option. Then I would click on the "Limit radius to province" checkbox to enable it. Next I would click on the "nnn results" field and the data would be entered, the window would disappear, and the webpage would update the results for my search.
I am trying to emulate this in my program, but I cannot see that my data is updating the webpage at all. When I dump the element data before and after my updates, the data never changes. This is borne out by the fact that it fails with "You may not select a disabled option" for the Radius field, since that field is disabled until data is successfully entered into the "postal code" field.
I have no idea what I am doing wrong. Any input would be very much appreciated. Note that my program saves a log of the webdriver execution so I can see any errors there - none are showing that I can see.
Program code:
# type: ignore
print('\nStarting program ...')
# Bring in basic libraries
import os
import time
# File names to be created
fnames = {
'web': 'logging-web.txt', # Webdriver console messages
}
# Check if the output files are in use - remove file if it exists and is not open
for fn in fnames:
    if os.path.isfile(fnames[fn]):
        try:
            os.remove(fnames[fn])
        except Exception as err:
            print(f"\nERROR: {err} - ABORTING ...",flush=True)
            exit(False)
# Bring in additional libraries
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# Set up ChromeDriver options
chrome_options = Options()
chrome_options.add_argument('--remote-debugging-pipe')
chrome_options.add_argument('--incognito')
chrome_options.add_argument('--headless=new')
chrome_options.add_argument('--accept-all-cookies')
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_argument('--ignore-ssl-errors')
chrome_options.add_argument('--disable-infobars')
chrome_options.add_argument('--disable-notifications')
# DEBUG chrome_options.add_argument('--disable-blink-features=AutomationControlled')
# DEBUG chrome_options.add_argument('--enable-popup-blocking')
chrome_options.add_argument('--log-level=3')
chrome_options.page_load_strategy = 'normal' # 'eager' does not always get all of the data on a page
chrome_options.add_experimental_option('excludeSwitches',['enable-logging','enable-automation'])
chrome_options.add_experimental_option('prefs',{'profile.managed_default_content_settings.images':2}) # Don't render images
# Set up ChromeDriver logging
wservice = webdriver.ChromeService(service_args=['--append-log','--readable-timestamp'],log_output=f"{fnames['web']}")
# Start the webdriver
driver = webdriver.Chrome(options=chrome_options,service=wservice)
driver.maximize_window()
driver.set_script_timeout(30)
winhancurr = driver.current_window_handle
url = 'https://www.kijijiautos.ca/cars/gmc/canyon/#ml=%3A&ms=9900%3B13&od=up&p=5000%3A&sb=p&yc=2024%3A2024'
print(f"\nGetting webpage {url} ...")
print(f"\nINFO: original window handle={driver.current_window_handle}")
driver.get(url)
WebDriverWait(driver,30).until(EC.visibility_of_element_located((By.TAG_NAME,'body')))
# Find "Set location" button and click on it. Note that there are 2 identical fields and only the 2nd one works.
# Also note that clicking this button opens another window on top of the current webpage.
elem = WebDriverWait(driver,30).until(EC.element_to_be_clickable((By.XPATH,'//*[@id="root"]/div[3]/div/section[5]/div/div/div[1]/div/div[1]/div/button')))
print(f"\nINFO: elem.before={elem.get_attribute("outerHTML")}")
elem.click()
print(f"\nINFO: elem.after={driver.find_element(By.XPATH,'//*[@id="root"]/div[3]/div/section[5]/div/div/div[1]/div/div[1]/div/button').get_attribute("outerHTML")}")
# Find new window and move to it
for winhan in driver.window_handles:
    print(f"\nINFO: window handles={winhan}")
    if winhan != winhancurr:
        driver.switch_to.window(winhan)
        print(f"\nINFO: new window handle={driver.current_window_handle}")
        break
if driver.current_window_handle == winhancurr:
    print(f"\nERROR: No new window created ...")
# Put postalcode data into field and end data input
elem2 = WebDriverWait(driver,30).until(EC.visibility_of_element_located((By.ID,'LocationAutosuggest')))
print(f"\nINFO: elem2.before={driver.find_element(By.ID,'LocationAutosuggest').get_attribute("outerHTML")}")
elem2.send_keys('T1J 4C8')
elem2.send_keys(Keys.ENTER)
print(f"\nINFO: elem2.after={driver.find_element(By.ID,'LocationAutosuggest').get_attribute("outerHTML")}")
# Find "Radius" field and enter data
elem3 = WebDriverWait(driver,30).until(EC.visibility_of_element_located((By.ID,'rd')))
print(f"\nINFO: elem3.before={elem3.get_attribute("outerHTML")}")
drop = Select(elem3)
drop.select_by_value('1000')
print(f"\nINFO: elem3.after={elem3.get_attribute("outerHTML")}")
# Click on "Limit radius to province" checkbox
elem4 = WebDriverWait(driver,30).until(EC.element_to_be_clickable((By.NAME,'limitedToProvince')))
print(f"\nINFO: elem4={elem4.get_attribute("outerHTML")}")
elem4.click()
# Apply all changes
elem5 = WebDriverWait(driver,30).until(EC.element_to_be_clickable((By.LINK_TEXT,'Apply')))
print(f"\nINFO: elem5={elem5.get_attribute("outerHTML")}")
elem5.click()
# Switch back to original window
driver.switch_to.window(winhancurr)
# End the driver
driver.quit()
# End of main
*** Execution log ***:
Starting program ...
Getting webpage https://www.kijijiautos.ca/cars/gmc/canyon/#ml=%3A&ms=9900%3B13&od=up&p=5000%3A&sb=p&yc=2024%3A2024 ...
INFO: original window handle=5DCA8B95024DB03D50D0A6302BA77DE3
INFO: elem.before=<button class="hDZZVr bDZZVr b2MH0T f3yLIj" type="button" data-testid="LocationLabelLink">Set location</button>
INFO: elem.after=<button class="hDZZVr bDZZVr b2MH0T f3yLIj" type="button" data-testid="LocationLabelLink">Set location</button>
INFO: window handles=5DCA8B95024DB03D50D0A6302BA77DE3
ERROR: No new window created ...
INFO: elem2.before=<input aria-invalid="false" tabindex="0" autocomplete="off" id="LocationAutosuggest" placeholder="Type the location" type="text" value="" class="focus-visible" data-focus-visible-added="">
INFO: elem2.after=<input aria-invalid="false" tabindex="0" autocomplete="off" id="LocationAutosuggest" placeholder="Type the location" type="text" value="" class="focus-visible" data-focus-visible-added="">
INFO: elem3.before=<select id="rd" class="pnNiF7 cnNiF7" disabled="" name="rd" required="" tabindex="0" data-testid="LocationRadiusDropdown"><option value="5">+ 5 km</option><option value="10">+ 10 km</option><option value="15">+ 15 km</option><option value="25">+ 25 km</option><option value="50">+ 50 km</option><option value="100">+ 100 km</option><option value="150">+ 150 km</option><option value="250">+ 250 km</option><option value="500">+ 500 km</option><option value="750">+ 750 km</option><option value="1000">+ 1,000 km</option></select>
Traceback (most recent call last):
  File "C:\Users\david.DESKTOP-IH066BB\Documents\Software\CarSearch\testx.py", line 95, in <module>
    drop.select_by_value('1000')
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^
  File "C:\Users\david.DESKTOP-IH066BB\AppData\Local\Programs\Python\Python313\Lib\site-packages\selenium\webdriver\support\select.py", line 79, in select_by_value
    self._set_selected(opt)
    ~~~~~~~~~~~~~~~~~~^^^^^
  File "C:\Users\david.DESKTOP-IH066BB\AppData\Local\Programs\Python\Python313\Lib\site-packages\selenium\webdriver\support\select.py", line 213, in _set_selected
    raise NotImplementedError("You may not select a disabled option")
NotImplementedError: You may not select a disabled option
I am expecting the data to update the webpage as I explained above.
Overall your code is pretty good. You are using WebDriverWait() for waits and the Select() class for SELECT elements, which is better than most I've seen. More will come with experience.
There are a couple of issues that I would address:
There is no additional browser window; what you see is just a dialog on the existing page, so there is nothing to switch to with window handles. One way to tell is to inspect the dialog and then scroll through the rest of the HTML, hovering over various elements. If you ever see an area on the main page highlighted, you know it's a dialog and not a new browser window. Also, a real browser window can be dragged outside the current browser window, whereas a dialog cannot.
In general, XPaths that are long, contain indices (e.g. div[3]), etc. should be avoided. They are considered brittle: even minor changes to the page are very likely to break them, which means your script will break.
You don't need to repeatedly instantiate WebDriverWait(driver, 30). You can instead just instantiate it once, assign it to a variable, and then reuse that variable. Also, 30 seconds is a very long time... you don't need that long for this, or most scenarios.
Replace
WebDriverWait(driver,30).until(...)
WebDriverWait(driver,30).until(...)
with
wait = WebDriverWait(driver, 10)
wait.until(...)
wait.until(...)
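Going back to the first point, you can also confirm from the script itself whether a click opened a new browser window by comparing driver.window_handles before and after the click. Here is a rough sketch of that check. The name new_window_handles is a hypothetical helper of mine, and it assumes only that the driver exposes a window_handles list, as Selenium's WebDriver does:

```python
def new_window_handles(driver, handles_before):
    """Return the handles that exist now but were not in handles_before.

    An empty list means the click opened no new browser window, so whatever
    appeared is part of the current page and no switch_to.window() is needed.
    """
    before = set(handles_before)
    return [h for h in driver.window_handles if h not in before]


# Usage with a real driver (not run here; "button" is a placeholder element):
#   before = driver.window_handles
#   button.click()
#   if not new_window_handles(driver, before):
#       print("No new window: the popup is a dialog on the same page")
```

In your log only one handle is ever listed, which is what this check would report for a dialog.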
Here's working code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.wait import WebDriverWait
url = 'https://www.kijijiautos.ca/cars/gmc/canyon/#ml=%3A&ms=9900%3B13&od=up&p=5000%3A&sb=p&yc=2024%3A2024'
driver = webdriver.Chrome()
driver.maximize_window()
driver.get(url)
wait = WebDriverWait(driver, 10)
wait.until(EC.visibility_of_any_elements_located((By.CSS_SELECTOR, "[data-testid='LocationLabelLink']")))[0].click()
wait.until(EC.visibility_of_element_located((By.ID, "LocationAutosuggest"))).send_keys("T1J 4C8")
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "[data-testid='Autosuggest-Menu'] li"))).click()
Select(wait.until(EC.element_to_be_clickable((By.ID, "rd")))).select_by_value("1000")
driver.find_element(By.CSS_SELECTOR, "[data-testid='LocationRadiusDropdownContainer']").click()
driver.find_element(By.CSS_SELECTOR, "[for='limitedToProvince']").click()
driver.find_element(By.CSS_SELECTOR, "[data-testid='LocationModalSubmitButton']").click()
driver.quit()
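The order of the steps matters here. In the log from the question, the rd select carries a disabled attribute, and it stays that way until a location has actually been chosen, which is why the autosuggest item is clicked before the radius is set. If you want to make that dependency explicit, a custom wait condition is one option. This is only a sketch: radius_enabled is a hypothetical name, and it relies on WebDriverWait.until() accepting any callable that takes the driver and returns a truthy value once the condition is met.

```python
def radius_enabled(driver):
    """Wait condition: return the radius <select> once it is enabled, else False."""
    # "id" is the string value of By.ID; it is used directly so that this
    # sketch needs no imports. In a real script, By.ID reads better.
    select = driver.find_element("id", "rd")
    return select if select.is_enabled() else False


# Usage with the wait object from the code above (not run here):
#   Select(wait.until(radius_enabled)).select_by_value("1000")
```

This does roughly the same job as the EC.element_to_be_clickable((By.ID, "rd")) wait in the working code, just with the "must be enabled first" requirement spelled out.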