python xpath

Is there a simpler way to get all nested text inside an ElementTree element?


I am currently using the xml.etree Python library to parse HTML.

After finding a target DOM element, I am trying to extract its text. Unfortunately, the .text attribute returns only the element's immediate inner text, not the text of anything nested inside it. Do I really have to loop through all the children of the element? Or is there a more elegant solution?


Solution

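  • The descendant XPath axis returns all descendant nodes, so descendant::text() selects every nested text node (including whitespace-only ones). Note that xml.etree.ElementTree implements only a limited XPath subset, so these expressions need a full XPath 1.0 engine such as lxml.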

    For example:

    //body/descendant::text() or //body/descendant::*/text()

    In the general case:

    //xpath/to/target/element/descendant::text()
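
If staying with the standard library is preferred, a comparable result can probably be had without XPath at all. The following is a minimal sketch that swaps in Element.itertext() from xml.etree.ElementTree in place of the descendant axis; the sample markup and the helper name nested_text are made up for illustration and are not from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, used only for illustration.
SAMPLE = "<html><body><p>Hello <b>nested</b> world</p><div>!</div></body></html>"


def nested_text(element):
    """Return all text inside `element`, including text of nested children."""
    # itertext() walks the element and its descendants in document order,
    # yielding each .text and each child's .tail, so nothing nested is skipped.
    return "".join(element.itertext())


root = ET.fromstring(SAMPLE)
body = root.find("body")
all_text = nested_text(body)
print(all_text)
```

Like descendant::text(), this keeps whitespace exactly as it appears in the source, so the result may need a strip() or a whitespace-collapsing step depending on the markup.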