In all the examples and tutorials I have seen of BeautifulSoup, an existing HTML/XML document is passed in and a soup object is returned, which can then be used to modify the document. However, how can I use BeautifulSoup to create an HTML/XML document from scratch? In other words, I have data that I would like to put in an XML file, but the file does not exist yet and I would like to build it from nothing. How can I go about it?
Just create an empty BeautifulSoup() object:
from bs4 import BeautifulSoup
soup = BeautifulSoup()
and start adding elements:
soup.append(soup.new_tag("a", href="http://www.example.com"))
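As a slightly fuller sketch of the same idea, the snippet below builds a small HTML tree from an empty soup. The tag names and link text are only illustrative, and it passes "html.parser" explicitly (an assumption on my part, chosen because that parser ships with Python) rather than letting BeautifulSoup pick one.

```python
from bs4 import BeautifulSoup

# Start from an empty document; "html.parser" is the parser bundled with Python.
soup = BeautifulSoup("", "html.parser")

# Create tags with new_tag() and attach them with append(), top-down.
html = soup.new_tag("html")
soup.append(html)
body = soup.new_tag("body")
html.append(body)

# Keyword arguments to new_tag() become attributes; .string sets the text.
link = soup.new_tag("a", href="http://www.example.com")
link.string = "Example"
body.append(link)

result = str(soup)
print(result)
```

Calling soup.prettify() instead of str(soup) should give an indented version of the same markup.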
For XML, you can start out with an XML header (the <?xml ... ?> declaration) by using the xml tree builder:
soup = BeautifulSoup(features='xml')
This requires lxml to be installed first. Using the xml tree builder sets the .is_xml flag on the BeautifulSoup object (which can also be set manually).
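To tie this back to the question, here is a minimal sketch of turning some data into an XML document. The element names (catalog, book), the price attribute, and the sample records are all made up for illustration, and it assumes lxml is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample data to serialize.
books = [("Dune", "9.99"), ("Emma", "4.50")]

# An empty soup backed by the xml tree builder (needs lxml).
soup = BeautifulSoup(features="xml")

# Add a single root element, then one child element per record.
root = soup.new_tag("catalog")
soup.append(root)
for title, price in books:
    book = soup.new_tag("book", price=price)
    book.string = title
    root.append(book)

xml_text = str(soup)
print(xml_text)
```

From there the string can be written to disk with an ordinary open(..., "w") call.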