I've tried countless ways to get attribute values out of Python's SAX parser when using namespaces and cannot find a way to do it. There must be a simple solution here, but for the life of me I cannot find it. I have a fairly straightforward XML file that uses namespaces. It's easy enough to get this data with DOM parsing, but I'm trying to write a SAX parser instead. Here's the XML:
<?xml version="1.0" encoding="UTF-8"?>
<cdf:Benchmark style="1.2" resolved="1" xmlns:cdf="http://checklists.nist.gov/xccdf/1.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <cdf:TestResult end-time="2024-06-14T17:55:55" id="test_id1" start-time="2024-06-14T17:55:53">
    <cdf:rule-result idref="rule_1" role="full" time="2024-06-14T17:55:53">
      <cdf:result>pass</cdf:result>
      <cdf:message severity="info">Result: true</cdf:message>
    </cdf:rule-result>
    <cdf:rule-result idref="rule_2" role="full" time="2024-06-14T17:55:54">
      <cdf:result>fail</cdf:result>
      <cdf:message severity="info">Result : false</cdf:message>
    </cdf:rule-result>
  </cdf:TestResult>
</cdf:Benchmark>
And here's a simple attempt at getting the attribute value for 'idref'. I've also tried using get() and getValue() with countless namespace combinations and nothing works. I'm getting a KeyError, stating that 'idref' is not a valid key. Here's my code:
import xml.sax

class CustomHandler(xml.sax.ContentHandler):
    def startElementNS(self, name, qname, attrs):
        (cdf, self.localname) = name
        if self.localname == 'rule-result':
            attributes = attrs['idref']
            print(attributes)

    def characters(self, content):
        if self.localname == 'rule-result':
            self.rule_result = content

    def endElementNS(self, name, qname):
        (cdf, self.localname) = name
        if self.localname == 'rule-result':
            print(self.rule_result)
        self.localname = ''

handler = CustomHandler()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)
parser.setFeature(xml.sax.handler.feature_namespaces, True)
parser.parse('test_xml_ns.xml')
attrs is an AttributesNSImpl object. The keys are (namespaceURI, localname) tuples.
It works if you change attrs['idref'] to attrs[(None, 'idref')]. The idref attribute has no prefix, so it belongs to no namespace and its namespaceURI is None.
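
To make the tuple-key lookup concrete, here is a minimal sketch of a namespace-aware handler. It assumes a trimmed copy of the XML from the question embedded as bytes; the names RuleResultHandler, parse_rule_results, SAMPLE and XCCDF_NS are made up for illustration and are not part of xml.sax. It also collects the text of each cdf:result element, which may or may not be what the original handler was after.

```python
import io
import xml.sax
from xml.sax.handler import feature_namespaces

# Namespace URI bound to the "cdf" prefix in the question's XML.
XCCDF_NS = "http://checklists.nist.gov/xccdf/1.2"

# Trimmed copy of the question's document (illustrative only).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<cdf:Benchmark xmlns:cdf="http://checklists.nist.gov/xccdf/1.2">
  <cdf:TestResult id="test_id1">
    <cdf:rule-result idref="rule_1" role="full">
      <cdf:result>pass</cdf:result>
    </cdf:rule-result>
    <cdf:rule-result idref="rule_2" role="full">
      <cdf:result>fail</cdf:result>
    </cdf:rule-result>
  </cdf:TestResult>
</cdf:Benchmark>
"""


class RuleResultHandler(xml.sax.ContentHandler):
    """Hypothetical handler that pairs each idref with its result text."""

    def __init__(self):
        super().__init__()
        self.results = []   # collected (idref, result) pairs
        self._idref = None  # idref of the rule-result being read
        self._text = None   # text chunks while inside <cdf:result>

    def startElementNS(self, name, qname, attrs):
        # name is a (namespaceURI, localname) tuple.
        if name == (XCCDF_NS, "rule-result"):
            # Unprefixed attributes have no namespace, hence the None.
            self._idref = attrs[(None, "idref")]
        elif name == (XCCDF_NS, "result"):
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._text is not None:
            self._text.append(content)

    def endElementNS(self, name, qname):
        if name == (XCCDF_NS, "result"):
            self.results.append((self._idref, "".join(self._text).strip()))
            self._text = None


def parse_rule_results(data):
    """Parse XML bytes and return a list of (idref, result) tuples."""
    handler = RuleResultHandler()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.results


print(parse_rule_results(SAMPLE))
```

Comparing the whole name tuple against the namespace URI, rather than only the local name, keeps the handler from matching a same-named element in another namespace. If you would rather look an attribute up by its qualified name, AttributesNSImpl also offers getValueByQName('idref').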