
Using Python, how can I get the text of an XML element when a sibling element's tag is the string I am looking for?


I hope this is an easy question; I will try to be clear about what I am trying to accomplish. Below is a small snippet of what my XML file looks like. First I check whether the <cs> element structure exists. If it does, the code proceeds. I then loop through all of the <c> elements, and if the child element <test> is False, I would like to get the text of the <id> element. The code I have works if the <id> element comes before the <test> element. I want to make sure that, wherever <id> is listed (before or after <test>), I get the id belonging to the appropriate parent. Currently I am using ElementTree.

<data>
<cs>
    <c>
        <id>1</id>
        <test>True</test>
        <test2>False</test2>
        <test3>False</test3>
        <test4>True</test4>
    </c>
    <c>
        <test>False</test>
        <test2>False</test2>
        <test3>False</test3>
        <id>2</id>
        <test4>True</test4>
    </c>
</cs>
</data>

elementTree = self.param2
isCS = elementTree.find('./cs')
getCS = elementTree.findall('./cs')
CIDs = []

if isCS is None:
    raise Exception("Unable to find the 'cs' element structure under <data>. Failed to build a list of CID's.")
else:
    # Build the list of CID's.
    for cs in getCS:
        for c in cs:
            for child in c:
                if str(child.tag).lower() == 'id':
                    myid = child.text
                elif str(child.tag).lower() == 'test' and str(child.text).lower() == 'false':
                    CIDs.append(myid)

    print(CIDs)

What I am getting (depending on the order in which the <id> element is listed) is the following output:

1

When I am really expecting the following:

2

I just need to know how I can run specific tests on the subelements of <c> and get data depending on what I find in the text of <test>.


Solution

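  • The original loop reuses myid from the previous <c> whenever <test> appears before <id>. Looking up both children by tag inside each <c> removes the dependence on order. Here is one way to do it:

    cids = []
    # 'tree' is the parsed document (the ElementTree from the question).
    for c_node in tree.findall('.//cs/c'):
        test_node = c_node.find('test')
        if test_node is not None and test_node.text == 'False':
            # find() locates <id> by tag, wherever it sits among the children.
            id_node = c_node.find('id')
            if id_node is not None:
                cids.append(id_node.text)

    print(cids)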
    
