
How to write namespaced element attributes with LXML?


I'm using lxml (2.2.8) to create and write out some XML (specifically XGMML). The app which will be reading it is apparently fairly fussy and wants to see a top level element with:

<graph label="Test"
       xmlns:dc="http://purl.org/dc/elements/1.1/"
       xmlns:xlink="http://www.w3.org/1999/xlink"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns:cy="http://www.cytoscape.org"
       xmlns="http://www.cs.rpi.edu/XGMML"
       directed="1">

How do I set up those xmlns: attributes with lxml? If I try the obvious

root.attrib['xmlns:dc']='http://purl.org/dc/elements/1.1/'
root.attrib['xmlns:xlink']='http://www.w3.org/1999/xlink'
root.attrib['xmlns:rdf']='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
root.attrib['xmlns:cy']='http://www.cytoscape.org'
root.attrib['xmlns']='http://www.cs.rpi.edu/XGMML'

lxml throws a ValueError: Invalid attribute name u'xmlns:dc'

I've used XML and lxml a fair amount in the past for simple stuff, but managed to avoid needing to know anything about namespaces so far.


Solution

  • Unlike ElementTree or other serializers that would allow this, lxml needs you to set up these namespaces beforehand:

    from lxml.etree import Element, SubElement

    NSMAP = {"dc" : 'http://purl.org/dc/elements/1.1/',
             "xlink" : 'http://www.w3.org/1999/xlink'}

    root = Element("graph", nsmap = NSMAP)

    (and so on for the rest of the declarations; the default xmlns= namespace goes in the map under the key None)

    You can then refer to a namespace by putting its full URI in braces before the tag name:

    n = SubElement(root, "{http://purl.org/dc/elements/1.1/}foo")

    The URI must match the NSMAP entry character for character, including the trailing slash. Since this gets annoying to type, it is generally worth assigning the URIs to short constant names:

    DCNS = "http://purl.org/dc/elements/1.1/"

    And then use that constant in both the NSMAP and the SubElement calls:

    n = SubElement(root, "{%s}foo" % DCNS)
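
Putting the pieces together, here is a minimal sketch of how the whole `<graph>` root from the question might be built. The namespace URIs are copied from the question. The constant names, the `build_graph` helper, the `dc:title` and `node` children, and the `http://example.org/` link are illustrative assumptions, not part of XGMML or of the original answer.

```python
from lxml import etree

# Namespace URIs copied from the <graph> element in the question.
XGMML_NS = "http://www.cs.rpi.edu/XGMML"
DC_NS = "http://purl.org/dc/elements/1.1/"
XLINK_NS = "http://www.w3.org/1999/xlink"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CY_NS = "http://www.cytoscape.org"

# The key None declares the default namespace (plain xmlns="...").
NSMAP = {
    None: XGMML_NS,
    "dc": DC_NS,
    "xlink": XLINK_NS,
    "rdf": RDF_NS,
    "cy": CY_NS,
}


def build_graph(label):
    """Hypothetical helper: build the root <graph> with all declarations."""
    # The tag lives in the default namespace, so it serializes unprefixed.
    root = etree.Element("{%s}graph" % XGMML_NS, nsmap=NSMAP)
    # Ordinary (un-namespaced) attributes are set by plain name.
    root.set("label", label)
    root.set("directed", "1")
    return root


root = build_graph("Test")

# A child in the dc namespace picks up the "dc" prefix from the root's map.
title = etree.SubElement(root, "{%s}title" % DC_NS)
title.text = "Test"

# Namespaced attributes use the same {uri}name form as namespaced tags.
# The node element and its link target are made up for illustration.
node = etree.SubElement(root, "{%s}node" % XGMML_NS)
node.set("{%s}href" % XLINK_NS, "http://example.org/")

xml_text = etree.tostring(root, pretty_print=True).decode("ascii")
print(xml_text)
```

The `xmlns:` declarations are never set through `attrib`; lxml writes them out itself from the `nsmap` passed when the element is created, which is why the map has to be complete at that point.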