I've been using Selenium to take screenshots of Reddit posts and comments, and I've run into an issue that I can't find a fix for online. My code gives Selenium the ID of the element I want to screenshot, and for the main Reddit post itself this works great. For a comment, though, it always either times out (when using EC.presence_of_element_located()) or reports that it can't find the element (when using driver.find_element()).
Here's the code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def getScreenshotOfPost(header, ID, url):
    driver = webdriver.Chrome()  # Using Chrome to define a web driver
    driver.get(url)  # Plugs the Reddit URL into the web driver
    driver.set_window_size(width=400, height=1600)
    wait = WebDriverWait(driver, 30)
    driver.execute_script("window.focus();")
    method = By.ID  # ID is what I've found to be the most reliable method of look-up
    # The header is "t3_" for posts and "t1_" for comments; ID is the ID of the post or comment.
    handle = f"{header}{ID}"
    element = wait.until(EC.presence_of_element_located((method, handle)))
    driver.execute_script("window.focus();")
    with open(f"Post_{header}{ID}.png", "wb") as fp:
        fp.write(element.screenshot_as_png)
I've tried searching by ID, CLASS_NAME, CSS_SELECTOR, and XPATH, and none of them work. I've double-checked, and t1_{the id of the comment} is the correct ID for the comment, regardless of the Reddit post. Increasing the wait time on my web driver doesn't help. I'm not sure what the issue would be.
Thanks in advance for any help!
I see what the problem is: there are a TON of nested shadow roots on the page. If you are familiar with IFRAMEs, they behave similarly, in that Selenium can't see the DOM inside one from the top-level document. Rather than switching context as you would for an IFRAME, you find the shadow host element, get its shadow_root, and search from that root. You will have to go into each shadow root, one at a time, and keep diving until you get to the element you want.
Some example code:
from selenium.webdriver import Chrome
from selenium.webdriver.common.by import By

def test_recommended_code():
    driver = Chrome()
    driver.get('http://watir.com/examples/shadow_dom.html')
    # Find the element that hosts the shadow root, then search inside that root
    shadow_host = driver.find_element(By.CSS_SELECTOR, '#shadow_host')
    shadow_root = shadow_host.shadow_root
    shadow_content = shadow_root.find_element(By.CSS_SELECTOR, '#shadow_content')
    assert shadow_content.text == 'some text'
    driver.quit()
You can read more about it in this article.
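Since the page nests several shadow roots, the "keep diving" step can be wrapped in a small helper. This is only a sketch: the function name find_in_nested_shadow_roots and its parameters are made up for illustration, and you would have to work out the actual chain of shadow host selectors by inspecting the page in your browser's dev tools. It relies only on find_element and the shadow_root property (available in Selenium 4.1+), and it uses the string "css selector", which is the value of By.CSS_SELECTOR.

```python
CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def find_in_nested_shadow_roots(driver, host_selectors, target_selector):
    """Walk a chain of shadow hosts, then find the target in the innermost root.

    host_selectors is an ordered list of CSS selectors, outermost host first.
    (Hypothetical helper; the selectors must come from inspecting the page.)
    """
    context = driver  # start searching from the top-level document
    for selector in host_selectors:
        host = context.find_element(CSS, selector)
        context = host.shadow_root  # later lookups are scoped to this shadow root
    return context.find_element(CSS, target_selector)
```

With placeholder selectors, a call would look something like find_in_nested_shadow_roots(driver, ["outer-host", "inner-host"], "#t1_...") and the returned element can then be passed to the screenshot code. An empty host_selectors list is just a plain lookup on the document.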