python sparql virtuoso rdflib

Querying local SPARQL endpoint is very slow


I have set up a local SPARQL endpoint with the DBpedia dataset using OpenLink Virtuoso, following this guide. I then tried to query the database from Python using RDFLib and SPARQLWrapper.

The problem is that each query issued from Python takes a long time, usually 2 to 3 seconds, before I get a result back. When I query the endpoint directly from my browser (opening localhost in Chrome), the same type of query returns instantly.

I don't think the problem is in my Python code: if I keep the same code and only switch to the public DBpedia endpoint, I get a result within 0.1 to 0.2 seconds. My database file is around 6 GB, and I have configured the ini file to use more memory as instructed.

Can anyone help me troubleshoot this? I suspect I need to tweak some parameter of the Virtuoso server, but I don't know where to start. Thanks!

Here's what my query looks like (public DBpedia endpoint, or local endpoint queried directly from Chrome: almost instant result; local endpoint through Python: 2+ seconds):

from SPARQLWrapper import SPARQLWrapper, CSV

sparql = SPARQLWrapper('http://localhost:8890/sparql')
sparql.setQuery('''
    SELECT ?id
    WHERE {
     ?linkto dbo:wikiPageID ?id.
     ?origin dbo:wikiPageWikiLink ?linkto.
     ?origin dbo:wikiPageID 9186.
    }
''')
sparql.setReturnFormat(CSV)
# With the CSV return format, convert() yields raw bytes, so decode them to text.
qres = sparql.query().convert().decode('utf-8')

EDIT: I printed out the runtime of several hundred queries (on the local endpoint), and every single one took around 2.01 to 2.05 seconds to complete; not one was below 2 seconds. So I suspect there is a fixed 2-second delay somewhere along the pipeline, and the actual query only takes 10 to 50 ms.
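For anyone who wants to reproduce that measurement, here is a minimal sketch of a timing helper. The names `time_calls` and `summarize` are hypothetical, not part of SPARQLWrapper or RDFLib. The idea is that a minimum duration sitting just above a round number (here, 2 seconds) points to a fixed delay rather than slow query execution.

```python
import statistics
import time


def time_calls(fn, repeats=20):
    """Call fn() `repeats` times and return each call's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (min, median, max); a high minimum suggests a fixed per-call delay."""
    return min(durations), statistics.median(durations), max(durations)
```

With the `sparql` object from the question, this could be called as `summarize(time_calls(lambda: sparql.query().convert()))`.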


Solution

  • As I suggested in the comments, and it did the trick:

    You might try changing localhost to 127.0.0.1. DNS is often the cause of weirdly slow things like this.
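    In the question's code, that means constructing the wrapper with SPARQLWrapper('http://127.0.0.1:8890/sparql'). To check whether name resolution is a factor on a given machine, a rough sketch like the one below can help. The helper name `resolve_time` is hypothetical, and port 8890 is taken from the question. One plausible mechanism (an assumption, not confirmed in this thread) is that `localhost` resolves to an IPv6 address first and the client only falls back to IPv4 after a timeout, so the returned address list is worth inspecting as well as the elapsed time.

```python
import socket
import time


def resolve_time(host, port=8890):
    """Resolve host and return (elapsed_seconds, list_of_addresses)."""
    start = time.perf_counter()
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    elapsed = time.perf_counter() - start
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    addresses = [info[4][0] for info in infos]
    return elapsed, addresses
```

    Comparing resolve_time('localhost') with resolve_time('127.0.0.1') shows how long the lookup itself takes and which addresses the client would try. If the lookup is fast but `localhost` lists an IPv6 address first, the delay may come from the connection attempt rather than from the lookup.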