
In Playwright for Python, how do I get elements relative to an ElementHandle (children, parent, grandparent, siblings)?


In playwright-python I know I can get an elementHandle using querySelector().

Example (sync):

from playwright import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch()
        page = browser.newPage()
        page.goto('https://duckduckgo.com/')
        element = page.querySelector('input[id="search_form_input_homepage"]')
    

How do I get an element relative to this elementHandle, i.e. the handles of its parent, grandparent, siblings, and children?


Solution

  • Original answer:

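    Call querySelector() or querySelectorAll() on the elementHandle itself and pass an XPath selector (prefixed with 'xpath='). The query is evaluated relative to that element, so querySelector() returns a single related elementHandle and querySelectorAll() returns a list of handles. XPath (XML Path Language) is a query language for navigating the elements and attributes of an XML or HTML document: '..' selects the parent, '../..' the grandparent, 'child::*' the direct children, and 'following-sibling::*' the siblings that come after the element. Siblings that come before it are selected with 'preceding-sibling::*'.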

    # Written against an early playwright-python release (camelCase API).
    from playwright import sync_playwright

    with sync_playwright() as p:
        for browser_type in [p.chromium, p.firefox, p.webkit]:
            browser = browser_type.launch(headless=False)
            page = browser.newPage()
            page.goto('https://duckduckgo.com/')
            element = page.querySelector('input[id="search_form_input_homepage"]')

            # XPath selectors are evaluated relative to `element`.
            parent = element.querySelector('xpath=..')
            grandparent = element.querySelector('xpath=../..')
            # Only the siblings that follow `element` in the document.
            siblings = element.querySelectorAll('xpath=following-sibling::*')
            children = element.querySelectorAll('xpath=child::*')

            browser.close()
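
The selectors used above follow a simple pattern, so they can be generated instead of typed by hand. The following is a minimal sketch, not part of the Playwright API: ancestor_selector and RELATIVE_SELECTORS are hypothetical names introduced here for illustration. It only builds the selector strings, so it runs without a browser.

```python
def ancestor_selector(levels: int = 1) -> str:
    """Build an 'xpath=' selector that climbs `levels` ancestors up.

    Hypothetical helper: 1 gives the parent, 2 the grandparent, and so on.
    """
    if levels < 1:
        raise ValueError("levels must be at least 1")
    return "xpath=" + "/".join([".."] * levels)


# Relationship name -> selector string, meant to be evaluated relative
# to an element handle (names are illustrative, not Playwright's).
RELATIVE_SELECTORS = {
    "parent": ancestor_selector(1),
    "grandparent": ancestor_selector(2),
    "children": "xpath=child::*",
    "next_siblings": "xpath=following-sibling::*",
    "previous_siblings": "xpath=preceding-sibling::*",
}
```

Each string can then be passed to the element's query method, for example the selector from ancestor_selector(3) to reach the great-grandparent.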
    

    Update (2022-07-22):

    Newer versions of playwright-python use snake_case method names, so browser.newPage() is now browser.new_page(), and querySelector() / querySelectorAll() are now query_selector() / query_selector_all(). The import also changed to playwright.sync_api.

    Optionally create a browser context first (and close it afterwards) and call new_page() on that context.

    The XPath selectors for the children/parent/grandparent/siblings stay the same.

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        for browser_type in [p.chromium, p.firefox, p.webkit]:
            browser = browser_type.launch(headless=False)
            context = browser.new_context()
            page = context.new_page()
            page.goto('https://duckduckgo.com/')
            # query_selector() returns None if nothing matches the selector.
            element = page.query_selector('input[id="search_form_input_homepage"]')

            # XPath selectors are evaluated relative to `element`.
            parent = element.query_selector('xpath=..')
            grandparent = element.query_selector('xpath=../..')
            # Only the siblings that follow `element` in the document.
            siblings = element.query_selector_all('xpath=following-sibling::*')
            children = element.query_selector_all('xpath=child::*')

            context.close()
            browser.close()
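
If you need these lookups in several places, they can be wrapped in one function. The sketch below is an illustration under stated assumptions, not part of Playwright: collect_relatives is a hypothetical name, and it assumes the object passed in exposes query_selector() and query_selector_all(), as element handles do in newer playwright-python releases. It also guards against None, which query_selector() returns when nothing matches, and it gathers siblings on both sides of the element by running two queries.

```python
from typing import Any, Dict, List, Optional


def collect_relatives(element: Optional[Any]) -> Dict[str, Any]:
    """Return the parent, grandparent, children and siblings of `element`.

    Hypothetical helper: `element` is assumed to offer query_selector()
    and query_selector_all(), like a Playwright element handle.
    """
    if element is None:
        # The original lookup found nothing, so there is nothing to walk from.
        raise ValueError("no element matched the selector")

    def all_of(selector: str) -> List[Any]:
        return list(element.query_selector_all(selector))

    return {
        "parent": element.query_selector("xpath=.."),
        "grandparent": element.query_selector("xpath=../.."),
        "children": all_of("xpath=child::*"),
        # Siblings before the element, then siblings after it.
        "siblings": all_of("xpath=preceding-sibling::*")
        + all_of("xpath=following-sibling::*"),
    }
```

With a real page this would be called as collect_relatives(page.query_selector('input[id="search_form_input_homepage"]')), and the result read as relatives["parent"], relatives["siblings"], and so on.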