python xml minidom

How to access the child elements of an element with minidom?


I am reading an XML/OWL file (generated from Protege) in a Jupyter notebook.

I can read the root element, but for its children I get an error or blank output.

from xml.dom.minidom import parse

DOMTree = parse("pressman.owl")
collection = DOMTree.documentElement

if collection.hasAttribute("shelf"):
   print("Root element : %s" % collection.getAttribute("owl:ObjectProperty"))

for objectprop in collection.getElementsByTagName("owl:ObjectProperty"):
    if objectprop.hasAttribute("rdf:about"):
            propertytext = objectprop.getAttribute("rdf:about")
            property = propertytext.split('#',2)
            print ("Property: %s" % property[1])
            type = objectprop.getElementsByTagName('rdf:resource')
            print ("Type: %s" % type)

and the pressman.owl file (abridged):

<rdf:RDF xmlns="http://www.semanticweb.org/sraza/ontologies/2021/4/untitled-ontology-6#"
     xml:base="http://www.semanticweb.org/sraza/ontologies/2021/4/untitled-ontology-6"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:PressmanOntology="urn:absolute:PressmanOntology#"
     xmlns:UniversityOntology="http://www.semanticweb.org/sraza/ontologies/2021/4/UniversityOntology#">
    <owl:Ontology rdf:about="urn:absolute:PressmanOntology"/>
    
    <!-- Object Properties -->

    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasAdvice"/>

    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
        <rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
        <rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
    </owl:ObjectProperty>

    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDiagram">
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
        <rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
        <rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
    </owl:ObjectProperty>

    <!-- more entries... -->    
</rdf:RDF>

The output is

Property: hasAdvice
Type: []
Property: hasDefinition
Type: []
Property: hasDiagram
Type: []

Solution

  • You have this structure

    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
        <rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
        <rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
    </owl:ObjectProperty>
    

    and you're using

    type = objectprop.getElementsByTagName('rdf:resource')
    

    This cannot work, because rdf:resource is not an element, it's an attribute. I assume the one you're interested in belongs to <rdf:type>. So we need to go one more level down:

    rdf_type = objectprop.getElementsByTagName('rdf:type')
    

    Now rdf_type is a node list - after all, the method is called "get elements by tag name", and minidom cannot know that there might only be a single <rdf:type> in your case. We take the first one, if it exists:

    rdf_type = rdf_type[0] if len(rdf_type) > 0 else None
    

    Now rdf:resource is an attribute on that element. Attributes are accessed through .getAttribute() in minidom.

    In theory, the rdf:resource attribute could be missing in the XML, so let's make sure it exists before using it:

    if rdf_type is not None and rdf_type.hasAttribute('rdf:resource'):
        rdf_resource = rdf_type.getAttribute('rdf:resource')
    else:
        rdf_resource = None
    
    print(rdf_resource)
    

    All that being said, instead of manually wrestling with RDF files, it might be worthwhile to check out libraries that were written for RDF, such as rdflib, or even for OWL specifically, such as pyLODE.
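
Putting the minidom steps above together, a minimal sketch might look like the following. It parses an inline, abridged copy of the XML instead of pressman.owl so that it runs on its own; the names SAMPLE, first_resource and list_object_properties are made up for this illustration and are not part of minidom.

```python
from xml.dom.minidom import parseString

# Abridged sample mirroring the structure of the OWL file in the question
# (an assumption made so the example is self-contained).
SAMPLE = """<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasAdvice"/>
    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
        <rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
    </owl:ObjectProperty>
</rdf:RDF>"""


def first_resource(element, tag_name):
    """Hypothetical helper: rdf:resource of the first <tag_name> descendant, or None."""
    matches = element.getElementsByTagName(tag_name)
    if len(matches) > 0 and matches[0].hasAttribute("rdf:resource"):
        return matches[0].getAttribute("rdf:resource")
    return None


def list_object_properties(xml_text):
    """Hypothetical helper: (property name, rdf:type resource) pairs from OWL/XML text."""
    root = parseString(xml_text).documentElement
    results = []
    for prop in root.getElementsByTagName("owl:ObjectProperty"):
        if not prop.hasAttribute("rdf:about"):
            continue
        # Keep only the part after '#', as the question's code does.
        name = prop.getAttribute("rdf:about").split("#", 1)[-1]
        results.append((name, first_resource(prop, "rdf:type")))
    return results


for name, rdf_type in list_object_properties(SAMPLE):
    print("Property: %s" % name)
    print("Type: %s" % rdf_type)
```

For the real file, parse("pressman.owl").documentElement can be used in place of parseString(...). A property without an <rdf:type> child, such as hasAdvice, yields None rather than an empty list.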