xpath sgml

XPath for all child nodes with different names


I have a parent element with various child elements that I need to number. The problem is that each child element has a different name, so every time I use count(*) the numbering restarts. I need the numbering to go 1.1, 1.2, 1.3...

The parent tag <application> would be 1, <ident> would be 1.1, <kitapplic> would be 1.2, and <tctoproof> would be 1.3.

I thought I could just use count(child::application), but that didn't work. Any help is appreciated.

<application>
    <ident>
        <para>This Technical Order is applicable.</para>
    </ident>
    <kitapplic>
        <kitapptbl>
            <kitapptblrow>
                <model>Model</model>
                <serialno>Serial Number</serialno>
                <kitno>Kit Required</kitno>
            </kitapptblrow>
        </kitapptbl>
    </kitapplic>
    <tctoproof>
        <para>Time Compliance Technical Order (TCTO) verification, in accordance
            with TO 00-5-15, was accomplished 28 August 2019 at Nellis Air Force
        Base with GCS serial number 5147.</para>
    </tctoproof>
</application>

Solution

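  • With XPath, you can combine count() over the preceding-sibling axis with concat() to get the desired numbers. Because preceding-sibling::* matches siblings of any name, the count does not restart when the element name changes. Example with kitapplic: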

    concat("1.",count(application/kitapplic/preceding-sibling::*)+1)
    

    Output: 1.2

    If you need the full list (1.1, 1.2, 1.3) for every child of the application element, you can loop over the children and evaluate the expression relative to each one (example in Python with lxml):

    data = """<application>
        <ident>
            <para>This Technical Order is applicable.</para>
        </ident>
        <kitapplic>
            <kitapptbl>
                <kitapptblrow>
                    <model>Model</model>
                    <serialno>Serial Number</serialno>
                    <kitno>Kit Required</kitno>
                </kitapptblrow>
            </kitapptbl>
        </kitapplic>
        <tctoproof>
            <para>Time Compliance Technical Order (TCTO) verification, in accordance
                with TO 00-5-15, was accomplished 28 August 2019 at Nellis Air Force
            Base with GCS serial number 5147.</para>
        </tctoproof>
    </application>"""
    
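    from lxml import etree
    
    # Parse as XML rather than HTML so element names are kept exactly as written.
    tree = etree.fromstring(data)
    
    # For each child of <application>, count its preceding siblings of any name.
    for el in tree.xpath("//application/*"):
        print(el.xpath("concat(name(.),' 1.',count(./preceding-sibling::*)+1)"))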
    

    Output:

    ident 1.1
    kitapplic 1.2
    tctoproof 1.3
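
If lxml is not available, the same numbering can be sketched with the standard library alone. This is not the XPath approach above but a swapped-in equivalent: count(preceding-sibling::*)+1 is simply the element's 1-based position among its siblings, so enumerate() gives the same result. The names SAMPLE, number_children and prefix are made up for this illustration, and the sample document is a shortened copy of the one in the question.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document from the question (assumption: the
# inner content does not matter for numbering the direct children).
SAMPLE = """<application>
    <ident><para>This Technical Order is applicable.</para></ident>
    <kitapplic><kitapptbl/></kitapplic>
    <tctoproof><para>TCTO verification text.</para></tctoproof>
</application>"""


def number_children(parent, prefix="1"):
    """Return (tag, number) pairs such as ("ident", "1.1") for each child element.

    enumerate(..., start=1) plays the role of count(preceding-sibling::*) + 1:
    the position is counted across all siblings, whatever their names are.
    """
    return [(child.tag, f"{prefix}.{pos}") for pos, child in enumerate(parent, start=1)]


root = ET.fromstring(SAMPLE)
numbered = number_children(root)
for tag, number in numbered:
    print(tag, number)
```

Passing a different prefix (for example number_children(root, prefix="2")) would number the children of a second top-level section the same way.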