I have a .ttl file and want to extract all distinct predicates from it. I am using Apache Jena. For this, I have used this SPARQL query:
"SELECT DISTINCT ?property WHERE {" +
" ?s ?property ?o ."
+ "}";
And I get a result, something like this:
<http://something.dk/ontology/business/name>
<http://something.dk/ontology/business/id>
What I want is to get rid of the prefix <http://something.dk/ontology/business/> and get only name and id as predicates, which will then be used to get their object values. For now, I'm doing this:
"prefix j.0: <http://something.dk/ontology/business/> " +
"select ?a ?b where {" +
" ?Name j.0:name ?a ."
+ " ?Name j.0:id ?b ."
+ "}";
But this does not scale, as there might be other properties. How can I get all predicates from the model without prefixes and use those predicates to get the object values?
Your predicate URIs all contain the word "ontology"... do you actually have an ontology? Do you understand that an ontology is different from just any free-form linked data triples? Where are the class <http://something.dk/ontology/business/village> and the predicate <http://something.dk/ontology/business/population> defined?
In other words, for these data triples:
prefix : <http://something.dk/ontology/business/>
<http://something.dk/resource/business/community/326> :name "Akalia" ;
a :village ;
:id "326" ;
:population "2000" ;
:area "30" .
I would expect to see at least the following minimal ontology:
@prefix : <http://something.dk/ontology/business/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:madman.owl rdf:type owl:Ontology .
:area rdf:type owl:DatatypeProperty ;
rdfs:label "area" .
:id rdf:type owl:DatatypeProperty ;
rdfs:label "id" .
:name rdf:type owl:DatatypeProperty ;
rdfs:label "name" .
:area rdf:type owl:DatatypeProperty ;
rdfs:label "area" .
:village rdf:type owl:Class ;
rdfs:label "village" .
If you load both the data and the ontology into a triplestore like Jena Fuseki, this query:
PREFIX : <http://something.dk/ontology/business/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?v ?l ?o
WHERE
{ ?v a :village ;
?p ?o .
?p rdfs:label ?l
}
returns this result:
+-----------------------------------------------------+------+--------+
| v | l | o |
+-----------------------------------------------------+------+--------+
| http://something.dk/resource/business/community/326 | id | 326 |
| http://something.dk/resource/business/community/326 | area | 30 |
| http://something.dk/resource/business/community/326 | name | Akalia |
+-----------------------------------------------------+------+--------+
If you're using one of Jena's other ways of accessing RDF content, you could use the same query, but you would have to use a different method for combining the data triples and the triples from the ontology.
@AKSW's comment is one way of doing a sub-string removal for this particular task. Specifically, we are removing the namespace bound to the default : prefix from every URI. A more general function is replace().
I have never seen @AKSW give bad advice, but I would really urge you to get into the habit of using a proper ontology, not a string manipulation workaround.
PREFIX : <http://something.dk/ontology/business/>
SELECT ?v ?extrLabel ?o
WHERE
{ ?v a :village ;
?p ?o
BIND(strafter(str(?p), str(:)) AS ?extrLabel)
}
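To make the behaviour of strafter concrete, here is a minimal plain-Java sketch of the same string operation, with no Jena involved. The class and method names (StrAfterSketch, strAfter) are made up for illustration. It also shows one consequence worth knowing: a predicate outside the namespace, such as rdf:type from the "a :village" triple, yields an empty string rather than a label.

```java
final class StrAfterSketch {

    // Mirrors SPARQL STRAFTER: the part of value after the first
    // occurrence of marker, or "" when marker does not occur.
    static String strAfter(String value, String marker) {
        int start = value.indexOf(marker);
        return start < 0 ? "" : value.substring(start + marker.length());
    }

    public static void main(String[] args) {
        String ns = "http://something.dk/ontology/business/";

        // A predicate inside the namespace keeps only its local part.
        System.out.println(strAfter(ns + "name", ns)); // prints "name"

        // rdf:type is not in the namespace, so nothing is left over.
        String rdfType = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
        System.out.println(strAfter(rdfType, ns).isEmpty()); // prints "true"
    }
}
```

If the empty labels are unwanted, one option is to filter them out in the query or to post-process the result rows.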
@Stanislav also knows his stuff. It looks to me like afn:localname() is a convenience function, so you don't have to type out this regular expression replacement: REPLACE(STR(?x), "^(.*)(/|#)([^#/]*)$", "$3")
PREFIX : <http://something.dk/ontology/business/>
PREFIX afn: <http://jena.apache.org/ARQ/function#>
SELECT ?v ?extrLabel ?o
WHERE
{ ?v a :village ;
?p ?o
BIND(afn:localname(?p) AS ?extrLabel)
}
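For anyone who wants to see what that regular expression does outside of SPARQL, the following plain-Java sketch applies the same pattern with String.replaceAll. The names LocalNameSketch and localName are hypothetical. This only illustrates the regex; I have not checked that it agrees with afn:localname() on every IRI, so treat the two as possibly different in edge cases.

```java
final class LocalNameSketch {

    // Same pattern as in the REPLACE call: group 3 is whatever
    // follows the last '/' or '#' in the string.
    private static final String PATTERN = "^(.*)(/|#)([^#/]*)$";

    // Returns the text after the last '/' or '#'. A string with
    // neither character does not match and comes back unchanged.
    static String localName(String iri) {
        return iri.replaceAll(PATTERN, "$3");
    }

    public static void main(String[] args) {
        // Slash-terminated namespace
        System.out.println(localName("http://something.dk/ontology/business/name"));
        // Hash-terminated namespace
        System.out.println(localName("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"));
    }
}
```

Unlike the strafter approach, this does not depend on one particular namespace, so predicates from other vocabularies get a label too.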
A fun exercise would be obtaining or synthesizing many thousands of triples like you provided and timing the performance of these three different labeling methods.
Also, with an ontology you could set the domains and ranges for your datatype properties, like population. In my opinion, that should take an xsd:integer, not an untyped string.