How can i query a remote endpoint (like endpoints of DBPedia or Wikidata) and insert resulting triples in a local graph? So far, i know that there are commands like INSERT, ADD, COPY etc. which can be used for such tasks. What i don't understand is how to address a remote endpoint while updating my local graph. Could someone provide a minimum example or the main steps?
I'm using Apache Jena Fuseki v2 on Windows and this is my query so far:
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX wd: <http://www.wikidata.org/entity/>
INSERT
{ GRAPH <???> { ?s ?p ?o } } #don't know what to insert here for "GRAPH"
WHERE
{ GRAPH <???> #don't know what to insert here for "GRAPH" either
{ #a working example query for wikidata:
?s wdt:P31 wd:Q5. #humans
?s wdt:P54 wd:Q43310. #germans
?s wdt:P1344 wd:Q79859. #part of world cup 2014
?s ?p ?o.
}
}
My local endpoint i'm querying is http://localhost:3030/mylocaldb/update
. I've read that /update
is necessary to edit the database (i'm not sure if i understood that correctly though).
Is my approach correct so far? Or is more stuff like additional scripting outside SPARQL needed?
Taken from the SPARQL 1.1 Update W3C specs:
The syntax is
( WITH IRIref )?
INSERT QuadPattern
( USING ( NAMED )? IRIref )*
WHERE GroupGraphPattern
If the INSERT template specifies GRAPH blocks then these will be the graphs affected. Otherwise, the operation will be applied to the default graph, or, respectively, to the graph specified in the WITH clause, if one was specified. If no USING (NAMED) clause is present, then the pattern in the WHERE clause will be matched against the Graph Store, otherwise against the dataset specified by the USING (NAMED) clauses. The matches against the WHERE clause create bindings to be applied to the template for determining triples to be inserted (following the same rules as for DELETE/INSERT).
So this basically means, you can omit the GRAPH
definition from the INSERT
part if you want to store it in the default graph, otherwise it will be the graph in which you want to store the data.
Regarding the WHERE
clause, usually you would have to use the SERVICE
keyword here to apply federated querying on the Wikidata endpoint (https://query.wikidata.org/bigdata/namespace/wdq/sparql):
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX wd: <http://www.wikidata.org/entity/>
INSERT
{ ?s ?p ?o }
WHERE
{ SERVICE <https://query.wikidata.org/bigdata/namespace/wdq/sparql>
{ #a working example query for wikidata:
?s wdt:P31 wd:Q5. #humans
?s wdt:P54 wd:Q43310. #germans
?s wdt:P1344 wd:Q79859. #part of world cup 2014
?s ?p ?o.
}
}
I tested it with Apache Jena and it inserts 4462 triples into my local dataset.