I have set up a local SPARQL endpoint with the DBPedia database using OpenLink Virtuoso, following this guide. I then tried to query my database from Python with the help of RDFLib and SPARQLWrapper.
The problem is that each query made through Python takes a long time, usually 2 to 3 seconds before I get a result back. But when I query the endpoint directly from my browser (going to localhost in Chrome), the same type of query returns instantly.
I don't think the problem is in my Python code, because if I keep the same code and just switch to the DBPedia endpoint, I get a query result within 0.1 to 0.2 seconds. My database file is around 6GB, and I have configured the ini file to use more memory as instructed.
Can anyone help me troubleshoot this? I suspect I need to tweak some parameter on the Virtuoso server, but I don't know where to start. Thanks!
Here's what my query looks like (DBPedia endpoint or local endpoint directly from Chrome: almost instant result; local endpoint through Python: 2+ seconds):
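from SPARQLWrapper import SPARQLWrapper, CSV

# Local Virtuoso SPARQL endpoint
sparql = SPARQLWrapper('http://localhost:8890/sparql')
sparql.setQuery('''
SELECT ?id
WHERE {
    ?linkto dbo:wikiPageID ?id.
    ?origin dbo:wikiPageWikiLink ?linkto.
    ?origin dbo:wikiPageID 9186.
}
''')
sparql.setReturnFormat(CSV)
# For CSV results, convert() returns raw bytes, so decode them to text
qres = sparql.query().convert().decode('utf-8')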
EDIT: I printed out the runtime of several hundred queries (on the local endpoint), and every single one took around 2.01 to 2.05 seconds to complete; not one was below 2 seconds. So I suspect there is a fixed 2-second delay somewhere along the pipeline, and the actual query only takes 10 to 50 ms.
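For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a timing loop. The helper names `time_calls` and `summarize` are made up for illustration, and the `time.sleep` call is only a stand-in for the real query function. A minimum that never drops below some floor, like the 2 seconds above, points to a fixed per-call cost rather than slow query execution.

```python
import statistics
import time


def time_calls(fn, repeats=5):
    """Call fn() `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (min, median, max); a high minimum hints at a fixed per-call cost."""
    return min(durations), statistics.median(durations), max(durations)


if __name__ == "__main__":
    # Stand-in workload (assumption); swap in a function that runs the SPARQL query.
    timings = time_calls(lambda: time.sleep(0.01), repeats=5)
    print("min=%.3fs median=%.3fs max=%.3fs" % summarize(timings))
```

Passing something like `lambda: sparql.query().convert()` as `fn` would time the real request end to end, including connection setup.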
As I suggested in the comments, which did the trick:
You might try changing localhost to 127.0.0.1. DNS is often the cause of weirdly slow things like this.
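To check whether name resolution is involved on a given machine, a rough sketch like the following compares how `localhost` and `127.0.0.1` resolve and builds the adjusted endpoint URL. The helpers `resolve_timed` and `with_host` are hypothetical names, not part of SPARQLWrapper, and port 8890 is taken from the question. One possible explanation, which is an assumption rather than something confirmed here, is that `localhost` resolves to the IPv6 address `::1` first while the server only listens on IPv4. In that case the stall would be in the failed connection attempt rather than the lookup, so a fast lookup alone does not rule out this cause.

```python
import socket
import time
from urllib.parse import urlsplit, urlunsplit


def resolve_timed(host, port):
    """Return (seconds, addresses) for one getaddrinfo lookup of host:port."""
    start = time.perf_counter()
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    elapsed = time.perf_counter() - start
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return elapsed, [info[4][0] for info in infos]


def with_host(url, new_host):
    """Return url with its hostname swapped for new_host, keeping the port."""
    parts = urlsplit(url)
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    for host in ("localhost", "127.0.0.1"):
        seconds, addresses = resolve_timed(host, 8890)
        print(f"{host}: {seconds * 1000:.1f} ms -> {addresses}")
    # URL to pass to SPARQLWrapper instead of the localhost one
    print(with_host("http://localhost:8890/sparql", "127.0.0.1"))
```

If `localhost` lists `::1` ahead of `127.0.0.1`, using the numeric IPv4 address in the endpoint URL bypasses that ordering entirely.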