I'm trying to use pyquery to parse HTML, and I've run into a confusing issue. My code is below:
from pyquery import PyQuery as pq
document = pq('<p id="hello">Hello</p><p id="world">World !!</p>')
p = document('p')
print(p.filter("#hello"))
I expected the printed result to be:
<p id="hello">Hello</p>
But the actual output is:
<p id="hello">Hello</p><p id="world">World !!</p></div></html>
If I only want the HTML of the selected element, without the rest of the document, how should I write it?
Thanks
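You can use the built-in ElementTree library. Note that it is an XML parser, so the markup has to be well-formed with a single root element: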
import xml.etree.ElementTree as ET
html = '''<html><p id="hello">Hello</p><p id="world">World !!</p></html>'''
root = ET.fromstring(html)
p = root.find('.//p[@id="hello"]')
print(ET.tostring(p))
Output:
b'<p id="hello">Hello</p>'
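If you would rather get a str than bytes, passing encoding="unicode" to ET.tostring should do it. Here is a minimal sketch, assuming the input is well-formed XML as above; outer_xml is a hypothetical helper name I made up for illustration, not part of ElementTree:

```python
import xml.etree.ElementTree as ET

html = '<html><p id="hello">Hello</p><p id="world">World !!</p></html>'
root = ET.fromstring(html)


def outer_xml(tree_root, element_id):
    """Return the markup of the first element with the given id, or None.

    Hypothetical helper for illustration only.
    """
    # iter() walks the root and all of its descendants in document order
    for el in tree_root.iter():
        if el.get("id") == element_id:
            # encoding="unicode" makes tostring return str instead of bytes
            return ET.tostring(el, encoding="unicode")
    return None


print(outer_xml(root, "hello"))  # <p id="hello">Hello</p>
```

Be aware that ET.tostring also serializes any text that follows the element's closing tag (its tail), so with real-world markup you may see trailing text after the element.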