I'm trying to use Selenium to scroll to a specific section on a webpage and retrieve the text from that section.
Context:
I’m working with a webpage that disables text highlighting through CSS properties like user-select: none and -webkit-user-select: none. I can disable these properties with JavaScript, but my main challenge right now is automatically scrolling down to the "Production / Artist" section in the DOM and then fetching the text.
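For readers who have not done the JavaScript override mentioned above, here is a minimal sketch of one way it could look. The names `ENABLE_SELECTION_JS` and `enable_text_selection` are hypothetical, and the blanket `*` selector is an assumption; the real page may need a narrower rule. The function only relies on the driver having an `execute_script` method, which Selenium WebDriver instances provide.

```python
# JavaScript that appends a stylesheet overriding the selection-blocking rules.
# The "*" selector is an assumption; a specific page may need a narrower rule.
ENABLE_SELECTION_JS = """
const style = document.createElement('style');
style.textContent =
  '* { user-select: text !important; -webkit-user-select: text !important; }';
document.head.appendChild(style);
"""


def enable_text_selection(driver):
    """Run the override script in the page currently loaded by `driver`."""
    driver.execute_script(ENABLE_SELECTION_JS)
```

As far as I know, `user-select` only restricts manual selection in the browser, so reading an element's `.text` through Selenium should not depend on this override; it is mostly useful when copying by hand.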
Here’s the URL of the webpage I’m working with:
Webpage Link
I’ve tried using Selenium to scroll to the "Production / Artist" section, but I’m not sure if I’m using the correct method for this particular page structure.
My Current Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
# Initialize WebDriver
driver = webdriver.Chrome()
# Open the URL
url = "https://www.art-mate.net/doc/78492?name=%E6%A8%82%E3%83%BB%E8%AA%BC%E7%8D%A8%E5%A5%8F%E5%AE%B6%E6%A8%82%E5%9C%98%E2%94%80%E2%94%80%E5%A4%A7%E6%8F%90%E7%90%B4%E8%88%87%E9%A6%AC%E7%89%B9%E8%AB%BE%E7%90%B4%E3"
driver.get(url)
# Scroll to the "Production / Artist" section
element = driver.find_element(By.XPATH, "//h2[text()='Production / Artist']")
driver.execute_script("arguments[0].scrollIntoView();", element)
# Now attempt to copy the text from the section
production_artist_section = driver.find_element(By.XPATH, "//div[contains(text(), 'Production / Artist')]")
print(production_artist_section.text)
# Close the driver
driver.quit()
The Issue:
My Question:
How do I ensure that Selenium scrolls smoothly and accurately to the "Production / Artist" section on the page before I attempt to fetch the text?
Any help or advice on how to optimize the scrolling behavior would be greatly appreciated!
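For reference, the scrolling step asked about here can be sketched as a small helper around `scrollIntoView` with explicit options. The names `SCROLL_JS` and `scroll_to` are hypothetical, and the XPath in the usage comment is an assumption about the page structure, not something confirmed for this site.

```python
# scrollIntoView accepts an options object; "block" controls vertical alignment
# and "behavior" controls whether the scroll is animated ("smooth") or not.
SCROLL_JS = (
    "arguments[0].scrollIntoView("
    "{block: arguments[1], behavior: arguments[2]});"
)


def scroll_to(driver, element, block="center", behavior="auto"):
    """Scroll `element` into view and return it so calls can be chained."""
    driver.execute_script(SCROLL_JS, element, block, behavior)
    return element


# Usage with a real Selenium driver (the heading XPath is an assumption):
# heading = driver.find_element(
#     By.XPATH, "//h2[normalize-space()='Production / Artist']")
# scroll_to(driver, heading)
```

Passing `behavior="smooth"` animates the scroll, so any code that depends on the final scroll position may need a short wait afterwards. Reading `.text` generally requires the element to be displayed rather than inside the viewport, so scrolling mainly matters when the page loads content lazily.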
The working code below extracts the values from the "Production / Artist" section and stores them in two lists:
Code:
import time
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
url = "https://www.art-mate.net/doc/78492?name=%E6%A8%82%E3%83%BB%E8%AA%BC%E7%8D%A8%E5%A5%8F%E5%AE%B6%E6%A8%82%E5%9C%98%E2%94%80%E2%94%80%E5%A4%A7%E6%8F%90%E7%90%B4%E8%88%87%E9%A6%AC%E7%89%B9%E8%AB%BE%E7%90%B4%E3"
driver.get(url)
driver.maximize_window()
wait = WebDriverWait(driver, 10)
# Click on 'En' element
wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@class='cms_lang cms_lang_en']"))).click()
# Fixed pause so the English version of the page can finish loading after the click
time.sleep(5)
people = wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//span[@class='people_cell people_role']")))
name = wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='people_box']//a")))
people_roles = []
people_names = []
# Collect the visible text of each web element into the corresponding list
for p in people:
    people_roles.append(p.text)
for n in name:
    people_names.append(n.text)
print("People roles:", people_roles)
print("People names:", people_names)
Console result:
People roles: ['Presented by', 'Artistic Director / Cello', 'Ondes Martenot', 'Composer', 'Viola', 'Performed by']
People names: ['Musicus Society', 'Trey Lee', 'Nadia Ratsimandresy', 'Seung-Won Oh', 'Aurélie Entringer', 'Musicus Soloists Hong Kong']
Process finished with exit code 0
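Since the two lists come back in the same order, they can be combined into role/name pairs. The sketch below uses the values from the console result above; `pair_roles_with_names` is a hypothetical helper, and it assumes exactly one name per role, which may not hold for every page on the site.

```python
def pair_roles_with_names(roles, names):
    """Return (role, name) tuples, assuming the lists line up one-to-one."""
    if len(roles) != len(names):
        # A mismatch suggests some role has zero or several names.
        raise ValueError(
            f"{len(roles)} roles but {len(names)} names; cannot pair by position"
        )
    return list(zip(roles, names))


# Values copied from the console result above.
people_roles = ['Presented by', 'Artistic Director / Cello', 'Ondes Martenot',
                'Composer', 'Viola', 'Performed by']
people_names = ['Musicus Society', 'Trey Lee', 'Nadia Ratsimandresy',
                'Seung-Won Oh', 'Aurélie Entringer', 'Musicus Soloists Hong Kong']

for role, person in pair_roles_with_names(people_roles, people_names):
    print(f"{role}: {person}")
```

Raising on a length mismatch is a deliberate choice here: plain `zip` would silently drop the extra entries and could attach names to the wrong roles.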