Yesterday I could navigate archive.org with Selenium without any problems, but today none of my Selenium calls work on the site. Even my code to click a simple search button does not work. Is there a solution for this?
I tried undetected_chromedriver, and I also tried the Playwright library as an alternative to Selenium, but neither worked.
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import undetected_chromedriver as uc
chrome_driver_path = "chromedriver"
keyword = "photo"
url_photo = f"https://archive.org/search?query={keyword}&and%5B%5D=mediatype%3A%22image%22"
chrome_options = Options()
# chrome_options.add_argument('--headless')
service = Service('chromedriver')
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = uc.Chrome(options=options)
driver.get(url_photo)
WebDriverWait(driver, 100).until(EC.element_to_be_clickable((
    By.XPATH,
    "/html/body/app-root//main/div/router-slot/search-page//div/div[2]/collection-browser//div/div[3]/infinite-scroller//section/article[1]/tile-dispatcher//div/a/item-tile//div/div/div/image-block//div/item-image//div/img"
))).click()
print("request successful")
The Search field on https://archive.org/search?query=photo&and%5B%5D=mediatype%3A%22image%22 is nested inside several #shadow-root (open) elements. A locator evaluated from the document, such as the absolute XPath in the question, cannot cross a shadow boundary, so it never matches.
To send a character sequence to the Search field, you have to step through each shadow root with shadowRoot.querySelector(). You can use the following locator strategy:
Code Block:
driver.get("https://archive.org/search?query=photo&and%5B%5D=mediatype%3A%22image%22")
# Optional chaining (?.) yields no element instead of throwing while a shadow host is still rendering, so the wait keeps polling
search_input_js = "return document.querySelector('app-root')?.shadowRoot?.querySelector('search-page')?.shadowRoot?.querySelector('collection-search-input')?.shadowRoot?.querySelector('ia-clearable-text-input')?.shadowRoot?.querySelector('input#text-input')"
WebDriverWait(driver, 20).until(lambda d: d.execute_script(search_input_js)).send_keys("xtrabyte")
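The chain of shadowRoot.querySelector() calls above gets long and is easy to mistype, so it can be generated from a list of selectors instead. The following is a minimal sketch, not part of Selenium: build_shadow_query and find_in_shadow are hypothetical helper names, and the code assumes every shadow root on the path is open and that the browser supports JavaScript optional chaining (recent Chrome versions do).

```python
import json


def build_shadow_query(selectors):
    """Return a JavaScript snippet that walks nested open shadow roots.

    Every selector except the last is treated as a shadow host; the last
    one is looked up inside the innermost shadow root.
    """
    if not selectors:
        raise ValueError("at least one selector is required")
    expr = "document"
    for host in selectors[:-1]:
        # ?. makes the chain evaluate to undefined (None in Python) instead
        # of throwing when a host has not been rendered yet.
        expr += f".querySelector({json.dumps(host)})?.shadowRoot?"
    expr += f".querySelector({json.dumps(selectors[-1])})"
    return f"return {expr};"


def find_in_shadow(driver, selectors, timeout=20):
    """Poll until the element behind the shadow hosts exists, then return it."""
    # Imported lazily so build_shadow_query works without Selenium installed.
    from selenium.webdriver.support.ui import WebDriverWait

    script = build_shadow_query(selectors)
    # until() retries while the script returns None and raises
    # TimeoutException once `timeout` seconds have passed.
    return WebDriverWait(driver, timeout).until(
        lambda d: d.execute_script(script)
    )
```

With the selector path from this answer, the call would look like find_in_shadow(driver, ["app-root", "search-page", "collection-search-input", "ia-clearable-text-input", "input#text-input"]).send_keys("xtrabyte"). If archive.org changes its component tree, only the list needs updating.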
You can find a couple of relevant discussions in: