I want to scrape the words that are inside the <li> elements, but the result is an empty list. Are they inside a frame? As far as I can see, they are not within any <iframe></iframe> elements. If they are, how do you access the frame or find the frame id in this case? Here is the site and the code:
from playwright.sync_api import sync_playwright, expect

def test_fetch_paperrater():
    path = r"https://www.paperrater.com/page/lists-of-adjectives"
    with sync_playwright() as playwright:
        browser = playwright.chromium.launch()
        page = browser.new_page()
        page.goto(path)
        texts = page.locator("div#header-container article.page ul li").all_inner_texts()
        print(texts)
        browser.close()
The elements were not in div#header-container but in div#wrapper, so the original selector matched nothing. There were also multiple ul elements, and the best way I found to access these was with nth(), as follows:
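from playwright.sync_api import sync_playwright

path = "https://www.paperrater.com/page/lists-of-adjectives"

with sync_playwright() as playwright:
    browser = playwright.chromium.launch()
    page = browser.new_page()
    page.goto(path)
    words = []
    # Only the odd-indexed <ul> elements (1, 3, ..., 21) are read
    for i in range(1, 22, 2):
        all_texts = page.locator("div#wrapper article.page ul").nth(i).all_inner_texts()
        # nth(i) narrows the locator to one element, so the list holds one string
        texts = all_texts[0].split("\n")
        for text in texts:
            words.append(text)
    browser.close()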
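The splitting step in the loop above can be pulled out into a small helper, which also makes it easy to check without opening a browser. This is only a sketch: split_words is a hypothetical name, the sample strings are made up rather than taken from the site, and dropping blank lines is an extra assumption that the original loop does not make.

```python
def split_words(inner_texts):
    """Flatten a list of inner-text blocks into individual, non-empty lines."""
    words = []
    for block in inner_texts:
        for line in block.split("\n"):
            line = line.strip()
            if line:  # skip blank lines between list items
                words.append(line)
    return words


# Made-up sample shaped like what all_inner_texts() might return for two <ul> elements
sample = ["able\nbad\nbest", "calm\n\ncareful "]
print(split_words(sample))
```

Inside the loop you could then write words.extend(split_words(all_texts)) instead of indexing all_texts[0], which should also avoid an IndexError if a given nth(i) matches nothing and the list comes back empty.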