python xml-parsing attributes xml-namespaces sax

Getting Attributes from Python SAX Parsing with Namespace


I've tried countless ways to get attribute values out of Python's SAX parser when namespace processing is enabled, and I cannot find one that works. There must be a simple solution, but for the life of me I cannot find it. I have a fairly straightforward XML file that uses namespaces. It's easy enough to get this data with DOM parsing, but I'm trying to write a SAX parser instead. Here's the XML:

<?xml version="1.0" encoding="UTF-8"?>

<cdf:Benchmark style="1.2" resolved="1" xmlns:cdf="http://checklists.nist.gov/xccdf/1.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <cdf:TestResult end-time="2024-06-14T17:55:55" id="test_id1" start-time="2024-06-14T17:55:53">
            <cdf:rule-result idref="rule_1" role="full" time="2024-06-14T17:55:53">
                  <cdf:result>pass</cdf:result>
                  <cdf:message severity="info">Result: true</cdf:message>
            </cdf:rule-result>
            <cdf:rule-result idref="rule_2" role="full" time="2024-06-14T17:55:54">
                  <cdf:result>fail</cdf:result>
                  <cdf:message severity="info">Result : false</cdf:message>
            </cdf:rule-result>
      </cdf:TestResult>
</cdf:Benchmark>

And here's a simple attempt at getting the attribute value for 'idref'. I've also tried using get() and getValue() with countless namespace combinations and nothing works. I'm getting a KeyError, stating that 'idref' is not a valid key. Here's my code:

import xml.sax

class CustomHandler(xml.sax.ContentHandler):
    
    def startElementNS(self, name, qname, attrs):
        (cdf, self.localname) = name
        if self.localname == 'rule-result':
            attributes = attrs['idref']
            print(attributes)

    def characters(self, content):
        if self.localname == 'rule-result':
            self.rule_result = content

    def endElementNS(self, name, qname):
        (cdf, self.localname) = name
        if self.localname == 'rule-result':
            print(self.rule_result)
        self.localname = ''

handler = CustomHandler()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)
parser.setFeature(xml.sax.handler.feature_namespaces, True)
parser.parse('test_xml_ns.xml')

Solution

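  • attrs is an AttributesNSImpl object. Its keys are (namespaceURI, localname) tuples, not plain strings, which is why attrs['idref'] raises a KeyError. The idref attribute has no prefix, and unprefixed attributes are not in any namespace (the element's cdf namespace does not apply to them), so the namespace part of the key is None.

    It works if you change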

    attrs['idref']
    

    to

    attrs[(None, 'idref')]
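
For reference, here is a minimal self-contained sketch of the same fix, run against a trimmed copy of the XML from the question. The names RuleResultHandler, collect_rule_results and XML are made up for illustration and do not come from the original code. It also reads the role attribute with getValueByQName(), which looks an attribute up by its name as written in the document and so avoids the tuple key.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

# Trimmed copy of the question's document (hypothetical test data).
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<cdf:Benchmark xmlns:cdf="http://checklists.nist.gov/xccdf/1.2">
  <cdf:TestResult id="test_id1">
    <cdf:rule-result idref="rule_1" role="full">
      <cdf:result>pass</cdf:result>
    </cdf:rule-result>
    <cdf:rule-result idref="rule_2" role="full">
      <cdf:result>fail</cdf:result>
    </cdf:rule-result>
  </cdf:TestResult>
</cdf:Benchmark>
"""


class RuleResultHandler(ContentHandler):
    """Collects (idref, role) pairs from every rule-result element."""

    def __init__(self):
        super().__init__()
        self.results = []

    def startElementNS(self, name, qname, attrs):
        uri, localname = name
        if localname == "rule-result":
            # Unprefixed attributes are in no namespace, so the key is (None, name).
            idref = attrs[(None, "idref")]
            # Alternative lookup by the qualified name as written in the XML.
            role = attrs.getValueByQName("role")
            self.results.append((idref, role))


def collect_rule_results(data):
    handler = RuleResultHandler()
    parser = xml.sax.make_parser()
    # Namespace processing must be on for startElementNS to be called.
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.results


if __name__ == "__main__":
    for idref, role in collect_rule_results(XML):
        print(idref, role)
```

If an attribute may be missing, attrs.get((None, 'idref')) returns None instead of raising a KeyError.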