I have a badly structured HTML template, where my <section> elements contain multiple elements (p, figure, a, etc.), but also raw text in between. How can I access all those snippets of text and edit them in place? (What I need is to replace all $$code$$ with tags.)
Both section.text and section.tail return empty strings...
Examine the .tail of the complete tag that immediately precedes the text. So, in <section>A<p>B</p>C<p>D</p>E</section>, the .tails of the two <p> elements will contain C and E. (The leading A, which comes before the first child, is in section.text.)
Example:
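from lxml import etree
root = etree.fromstring('<root><section>A<p>B</p>C<p>D</p>E</section></root>')
for section_child in root.find('section'):
    # .tail is None when no text follows the child, so guard before calling .lower()
    if section_child.tail:
        section_child.tail = section_child.tail.lower()
# etree.tounicode() is deprecated; tostring(..., encoding='unicode') is the current spelling
print(etree.tostring(root, encoding='unicode'))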
Result:
<root><section>A<p>B</p>c<p>D</p>e</section></root>
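
For the actual goal in the question (turning $$code$$ into tags), the same .text/.tail idea can be extended roughly as follows. This is only a sketch under a few assumptions that are not in the original post: the helper name wrap_markers is made up, the replacement tag is assumed to be <code>, and markers are assumed not to span across child elements. The import falls back to the standard library's xml.etree.ElementTree, which uses the same .text/.tail model, in case lxml is not installed.

```python
import re

try:
    from lxml import etree
except ImportError:  # the stdlib module shares the .text/.tail model
    import xml.etree.ElementTree as etree

# Non-greedy match of $$...$$; the group captures the part between the markers.
MARKER = re.compile(r"\$\$(.+?)\$\$")


def wrap_markers(parent, tag="code"):
    """Hypothetical helper: replace $$...$$ in parent's direct text with <tag> elements."""
    # Snapshot the children first, because new elements are inserted while looping.
    children = list(parent)
    # Each direct text node is either parent.text (anchor None) or a child's .tail.
    for anchor in [None] + children:
        text = parent.text if anchor is None else anchor.tail
        if not text or not MARKER.search(text):
            continue
        # With one capture group, split() yields [plain, code, plain, code, ..., plain].
        parts = MARKER.split(text)
        if anchor is None:
            parent.text = parts[0]
            position = 0
        else:
            anchor.tail = parts[0]
            position = list(parent).index(anchor) + 1
        for i in range(1, len(parts), 2):
            new = etree.Element(tag)
            new.text = parts[i]      # the text that was between the markers
            new.tail = parts[i + 1]  # the plain text that follows it
            parent.insert(position, new)
            position += 1


root = etree.fromstring(
    "<root><section>A $$x$$ B<p>C</p>D $$y$$ and $$z$$ E</section></root>"
)
wrap_markers(root.find("section"))
result = etree.tostring(root, encoding="unicode")
print(result)
```

This prints <root><section>A <code>x</code> B<p>C</p>D <code>y</code> and <code>z</code> E</section></root>. The helper only looks at the direct text of one element; to cover nested elements as well, it could be called on every element in list(root.iter()).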