
Write ElementTree as canonicalized XML (c14n)


The ElementTree class in the Python xml.etree API has a write() method that documents the optional method argument:

*method* -- either "xml" (default), "html, "text", or "c14n"

At least with Python 3.5 (on Fedora 25), writing with method='c14n'

import xml.etree.ElementTree as ET
# ... create some elements ...
tree = ET.ElementTree(root)
tree.write(filename, method='c14n')

just raises a ValueError:

File "/usr/lib64/python3.5/xml/etree/ElementTree.py", line 751, in write
  raise ValueError("unknown method %r" % method)
ValueError: unknown method 'c14n'

And the ElementTree code contains this note:

_serialize = {
    "xml": _serialize_xml,
    "html": _serialize_html,
    "text": _serialize_text,
# this optional method is imported at the end of the module
#   "c14n": _serialize_c14n,
}

What is that supposed to mean, exactly?

How can I serialize an ElementTree to a c14n XML file in Python?


Solution

  • The C14N serialisation was only ever implemented as an experimental extension to the original ElementTree stand-alone library developed by Fredrik Lundh, and was never part of an official release.

    When the 1.3 branch of the library was imported into the Python standard library, the support code for the optional library was included, but the C14N serialisation itself never was. Later on, the optional-component import code was removed altogether, but several mentions of C14N remain in the code. The documentation never mentioned C14N, and the remaining mentions should simply be removed from Python.

    However, if you must make it work, you could always download the original ElementC14N.py module and update it to work with the current ElementTree library. This does require some work; the codebase is Python 2 specific and will need to be updated to handle Unicode strings instead.

    As of Python 3.8, C14N support (C14N 2.0) has been added to ElementTree, so you can now use the canonicalize() function to output canonicalized XML. This function works on XML text (or a file), not on an ElementTree object, so you have to serialize the tree to a string first. Sadly, the broken c14n support was not removed from ElementTree.write().
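
For Python 3.8 or later, the serialize-then-canonicalize route could look roughly like the sketch below. The helper names to_c14n() and write_c14n() and the sample tree are made up for illustration; only ET.tostring() and ET.canonicalize() come from the standard library.

```python
import xml.etree.ElementTree as ET


def to_c14n(tree):
    """Return the canonical (C14N 2.0) form of an ElementTree as a str."""
    # canonicalize() takes XML text, so serialize the tree to a string first.
    xml_text = ET.tostring(tree.getroot(), encoding="unicode")
    return ET.canonicalize(xml_text)


def write_c14n(tree, filename):
    """Write the canonical form of an ElementTree to a UTF-8 file."""
    xml_text = ET.tostring(tree.getroot(), encoding="unicode")
    with open(filename, "w", encoding="utf-8") as out:
        # With out= given, canonicalize() writes to the file object
        # instead of returning the text.
        ET.canonicalize(xml_text, out=out)


# Hypothetical sample tree, only for demonstration.
root = ET.Element("root", {"b": "2", "a": "1"})
ET.SubElement(root, "child")
tree = ET.ElementTree(root)
```

Calling to_c14n(tree) on the sample tree should give attributes in sorted order and the empty element written as a start/end tag pair; write_c14n(tree, filename) would replace the failing tree.write(filename, method='c14n') call from the question.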