I've got a question about parsing a rather complicated XML document in Python with xml.etree.ElementTree. The XML is scap-security-guide-0.1.75/ssg-ubuntu2204-ds.xml from https://github.com/ComplianceAsCode/content/releases/download/v0.1.75/scap-security-guide-0.1.75.zip and the root tag and its attributes are:
<ds:data-stream-collection xmlns:cat="urn:oasis:names:tc:entity:xmlns:xml:catalog" xmlns:cpe-dict="http://cpe.mitre.org/dictionary/2.0" xmlns:cpe-lang="http://cpe.mitre.org/language/2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:ds="http://scap.nist.gov/schema/scap/source/1.2" xmlns:html="http://www.w3.org/1999/xhtml" xmlns:ind="http://oval.mitre.org/XMLSchema/oval-definitions-5#independent" xmlns:linux="http://oval.mitre.org/XMLSchema/oval-definitions-5#linux" xmlns:ocil="http://scap.nist.gov/schema/ocil/2.0" xmlns:oval="http://oval.mitre.org/XMLSchema/oval-common-5" xmlns:oval-def="http://oval.mitre.org/XMLSchema/oval-definitions-5" xmlns:unix="http://oval.mitre.org/XMLSchema/oval-definitions-5#unix" xmlns:xccdf-1.2="http://checklists.nist.gov/xccdf/1.2" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" id="scap_org.open-scap_collection_from_xccdf_ssg-ubuntu2204-xccdf.xml" schematron-version="1.3">
When I load the document with ET.parse(...).getroot() and look at the root element, I can only see the attributes without a namespace:
id='scap_org.open-scap_collection_from_xccdf_ssg-ubuntu2204-xccdf.xml'
schematron-version='1.3'
I don't really need the other attributes but I'm curious why I don't get them all. What if I needed one of the other attributes? How would I access them?
Those other "attributes" are namespace declarations and are "reserved attributes" that behave a little differently.
A namespace (or more precisely, a namespace binding) is declared using a family of reserved attributes. Such an attribute's name must either be xmlns or begin xmlns:. These attributes, like any other XML attributes, may be provided directly or by default.
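To illustrate what that means in ElementTree, here is a minimal sketch. The SAMPLE string is a made-up, cut-down stand-in for the real root element, and the xlink:type attribute is added purely for illustration (it is not in the original file). The parser consumes the xmlns:* declarations and uses them to expand prefixed names, so they never show up in attrib:

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the real root element.
# xlink:type is NOT in the original document; it is here only to
# show what a genuinely namespaced attribute looks like after parsing.
SAMPLE = """<ds:data-stream-collection
    xmlns:ds="http://scap.nist.gov/schema/scap/source/1.2"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    id="example" xlink:type="simple"/>"""

root = ET.fromstring(SAMPLE)

# The ds: prefix has been replaced by its URI in {uri}local form.
print(root.tag)

# Only real attributes remain; the xmlns:* declarations are gone, and
# the prefixed attribute is keyed by its expanded {uri}local name.
print(root.attrib)
```

So a namespaced attribute, if you ever need one, is looked up by its expanded name, e.g. root.get("{http://www.w3.org/1999/xlink}type").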
There are no attribute nodes corresponding to attributes that declare namespaces. In XPath they are referred to as namespace nodes and can be selected using the namespace:: axis:
/*/namespace::*
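As far as I know, the limited XPath subset in xml.etree.ElementTree does not implement the namespace:: axis, so that expression will not work with find() or findall(). If you want the declarations with the standard library only, one option is to ask iterparse for "start-ns" events, which yield (prefix, uri) pairs. A rough sketch, where the helper name namespace_declarations and the SAMPLE bytes are my own inventions for illustration:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the real document.
SAMPLE = b"""<ds:data-stream-collection
    xmlns:ds="http://scap.nist.gov/schema/scap/source/1.2"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    id="example" schematron-version="1.3"/>"""


def namespace_declarations(source):
    """Collect {prefix: uri} from every namespace declaration in source.

    source may be a filename or a binary file object. If a prefix is
    declared more than once, the first binding seen is kept.
    """
    declarations = {}
    # With events=("start-ns",) iterparse yields only namespace
    # declarations, each as ("start-ns", (prefix, uri)).
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        declarations.setdefault(prefix, uri)
    return declarations


ns = namespace_declarations(io.BytesIO(SAMPLE))
print(ns)
```

For the real file you would pass the path instead of the BytesIO object. Note that this reads the whole document, because declarations can appear on nested elements too, not just on the root. The resulting dict can also be passed as the namespaces argument to find() and findall().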