python xml

How can I parse XML and get instances of a particular node attribute?


I have many rows in XML and I'm trying to get instances of a particular node attribute.

<foo>
   <bar>
      <type foobar="1"/>
      <type foobar="2"/>
   </bar>
</foo>

How do I access the values of the attribute foobar? In this example, I want "1" and "2".


Solution

  • I suggest ElementTree. There are other compatible implementations of the same API, such as lxml and, in older Python versions, cElementTree in the standard library (deprecated since Python 3.3 and removed in 3.9, because xml.etree.ElementTree now uses the C accelerator automatically when it is available). In this context, what they chiefly add is speed; the ease of programming depends on the API, which ElementTree defines.

    First build an Element instance root from the XML, e.g. from a string with the XML function (or its equivalent, fromstring), or by parsing a file with something like:

    import xml.etree.ElementTree as ET
    root = ET.parse('thefile.xml').getroot()
    

    Or use any of the many other ways shown in the ElementTree documentation. Then do something like:

    for type_tag in root.findall('bar/type'):
        value = type_tag.get('foobar')
        print(value)
    

    Output:

    1
    2
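
For a version that runs without a file on disk, here is a minimal sketch that parses the XML from the question as a string. The helper names get_foobar_values and get_foobar_values_anywhere are made up for illustration; only ET.fromstring, findall, iter and get come from ElementTree itself.

```python
import xml.etree.ElementTree as ET

# The sample document from the question, inlined as a string.
XML_TEXT = """<foo>
   <bar>
      <type foobar="1"/>
      <type foobar="2"/>
   </bar>
</foo>
"""


def get_foobar_values(xml_text):
    """Return the foobar attribute of every <type> directly under <bar>."""
    root = ET.fromstring(xml_text)  # root is the <foo> element
    # The path is relative to root, so it starts at 'bar', not 'foo'.
    return [type_tag.get('foobar') for type_tag in root.findall('bar/type')]


def get_foobar_values_anywhere(xml_text):
    """Return foobar from <type> elements at any depth, skipping those without it."""
    root = ET.fromstring(xml_text)
    return [
        type_tag.get('foobar')
        for type_tag in root.iter('type')
        if type_tag.get('foobar') is not None
    ]


print(get_foobar_values(XML_TEXT))
```

Note that get returns attribute values as strings (or None when the attribute is missing), so convert with int() if you need numbers. The iter variant is handy when you do not know how deeply the type elements are nested.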