I am reading an XML/OWL file (generated from Protege) in a Jupyter notebook.
I can read the root element, but for its children I get an error or blank output.
from xml.dom.minidom import parse
DOMTree = parse("pressman.owl")
collection = DOMTree.documentElement
if collection.hasAttribute("shelf"):
    print("Root element : %s" % collection.getAttribute("owl:ObjectProperty"))
for objectprop in collection.getElementsByTagName("owl:ObjectProperty"):
    if objectprop.hasAttribute("rdf:about"):
        propertytext = objectprop.getAttribute("rdf:about")
        property = propertytext.split('#',2)
        print ("Property: %s" % property[1])
        type = objectprop.getElementsByTagName('rdf:resource')
        print ("Type: %s" % type)
and the pressman.owl file (abridged):
<rdf:RDF xmlns="http://www.semanticweb.org/sraza/ontologies/2021/4/untitled-ontology-6#"
xml:base="http://www.semanticweb.org/sraza/ontologies/2021/4/untitled-ontology-6"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:xml="http://www.w3.org/XML/1998/namespace"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:PressmanOntology="urn:absolute:PressmanOntology#"
xmlns:UniversityOntology="http://www.semanticweb.org/sraza/ontologies/2021/4/UniversityOntology#">
<owl:Ontology rdf:about="urn:absolute:PressmanOntology"/>
<!-- Object Properties -->
<owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasAdvice"/>
<owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
<rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
<rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDiagram">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
<rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
<rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
</owl:ObjectProperty>
<!-- more entries... -->
</rdf:RDF>
The output is:
Property: hasAdvice
Type: []
Property: hasDefinition
Type: []
Property: hasDiagram
Type: []
You have this structure
<owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
<rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
<rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
</owl:ObjectProperty>
and you're using
type = objectprop.getElementsByTagName('rdf:resource')
This cannot work, because rdf:resource is not an element; it is an attribute. I assume the one you're interested in belongs to <rdf:type>. So we need to go one level further down:
rdf_type = objectprop.getElementsByTagName('rdf:type')
Now rdf_type is a node list. After all, the method is called "get elements by tag name", and minidom cannot know that there might be only a single <rdf:type> in your case. We take the first one, if it exists:
rdf_type = rdf_type[0] if len(rdf_type) > 0 else None
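To see the difference between looking up an attribute name and an element name, here is a minimal sketch. It assumes a trimmed-down, inline copy of one property from the file; the names snippet, demo_prop, by_attribute_name and by_element_name are made up for illustration.

```python
from xml.dom.minidom import parseString

# Illustrative stand-in for a single entry of pressman.owl (not the full file).
snippet = """<owl:ObjectProperty xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    rdf:about="urn:absolute:PressmanOntology#hasDefinition">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
</owl:ObjectProperty>"""

demo_prop = parseString(snippet).documentElement

# 'rdf:resource' is an attribute name, so no element matches it.
by_attribute_name = demo_prop.getElementsByTagName("rdf:resource")
# 'rdf:type' is an element name, so the child element is found.
by_element_name = demo_prop.getElementsByTagName("rdf:type")

print(len(by_attribute_name))
print(len(by_element_name))
print(by_element_name[0].tagName)
```

The empty node list from the first lookup is what shows up as Type: [] in your output.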
Now rdf:resource is an attribute on that element. Attributes are accessed through .getAttribute() in minidom.
In theory, the rdf:resource attribute could be missing in the XML, so let's make sure it exists before using it:
if rdf_type is not None and rdf_type.hasAttribute('rdf:resource'):
    rdf_resource = rdf_type.getAttribute('rdf:resource')
else:
    rdf_resource = None
print(rdf_resource)
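Put together inside your loop, the steps above could look roughly like this. It is only a sketch: SAMPLE is a shortened inline stand-in for pressman.owl, and object_property_types is a hypothetical helper name, not something minidom provides.

```python
from xml.dom.minidom import parseString

# Shortened stand-in for pressman.owl (illustrative, not the full file).
SAMPLE = """<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasAdvice"/>
    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
        <rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
    </owl:ObjectProperty>
</rdf:RDF>"""


def object_property_types(root):
    """Return (property name, rdf:type resource or None) pairs."""
    results = []
    for objectprop in root.getElementsByTagName("owl:ObjectProperty"):
        if not objectprop.hasAttribute("rdf:about"):
            continue
        # Keep only the part after '#', e.g. 'hasDefinition'.
        name = objectprop.getAttribute("rdf:about").split("#", 1)[-1]
        # Look up the <rdf:type> child element, then read its attribute.
        rdf_types = objectprop.getElementsByTagName("rdf:type")
        rdf_type = rdf_types[0] if len(rdf_types) > 0 else None
        if rdf_type is not None and rdf_type.hasAttribute("rdf:resource"):
            rdf_resource = rdf_type.getAttribute("rdf:resource")
        else:
            rdf_resource = None
        results.append((name, rdf_resource))
    return results


root = parseString(SAMPLE).documentElement
pairs = object_property_types(root)
for name, rdf_resource in pairs:
    print("Property: %s, Type: %s" % (name, rdf_resource))
```

For the real file you would swap parseString(SAMPLE) for parse("pressman.owl"). Properties without an <rdf:type> child, such as hasAdvice, come back with None instead of an empty list.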
All that being said, instead of manually wrestling with RDF files, it might be worthwhile to check out libraries that were written for RDF, such as rdflib, or even for OWL specifically, such as pyLODE.