sparql fuseki sparqlwrapper

SPARQLWrapper can't make a CONSTRUCT query return anything other than XML


I'm using SPARQLWrapper to query a local SPARQL endpoint (using apache-jena-fuseki), and some of my queries are CONSTRUCT queries.

The query gives valid results in web-based SPARQL interfaces such as YASGUI. With SPARQLWrapper, the default query method gives me this error:

Response:
b'Error 400: Failed to write output in RDF/XML: Only well-formed absolute URIrefs can be included in RDF/XML output: <arcp://uuid,00000000-0000-0000-0000-000000000000/> Code: 28/NOT_DNS_NAME in HOST: The host component did not meet the restrictions on DNS names.\n'

(I have replaced the UUID with 0.)

I found this post. Unfortunately, the source data is out of my control, so I cannot easily change its content: it is CWL-Prov, and its standard requires this representation. Therefore, I need to use another return format. I tried the N-Triples and Turtle formats on YASGUI, and they work there.

However, a problem occurs when I set the return format in SPARQLWrapper. If I set it to anything other than SPARQLWrapper.XML, the server returns this error (using N3 as an example):

Response:
b"Error 400: Can't determine output content type: n3\n"

(JSON is not supported for CONSTRUCT queries.)

If I use a custom string other than the predefined ones, it automatically falls back to XML (as described in its documentation).

The error is generated by Fuseki, so I suspect I have done something wrong. Has anyone experienced this, and how can it be solved?


The code snippet I'm using to do the query:

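import SPARQLWrapper

def run_query(query):  # function name is illustrative
    # query: the CONSTRUCT query string
    sparql = SPARQLWrapper.SPARQLWrapper('http://localhost:3030/prov')
    sparql.setQuery(query)
    # Any format other than SPARQLWrapper.XML triggers the error above
    sparql.setReturnFormat(SPARQLWrapper.N3)
    return sparql.query().convert()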

As suggested by @AndyS, I replaced N3 with Turtle, but the error still occurs. Running Fuseki with -v, here is what I get:

[2020-11-04 17:02:22] Fuseki     INFO  [1]   => User-Agent:          sparqlwrapper 1.8.5 (rdflib.github.io/sparqlwrapper)
[2020-11-04 17:02:22] Fuseki     INFO  [1]   => Connection:          close
[2020-11-04 17:02:22] Fuseki     INFO  [1]   => Host:                127.0.0.1:3030
[2020-11-04 17:02:22] Fuseki     INFO  [1]   => Accept-Encoding:     identity
[2020-11-04 17:02:22] Fuseki     INFO  [1]   => Accept:              application/turtle,text/turtle
[2020-11-04 17:02:22] Fuseki     WARN  SPARQL Query: Unrecognize request parameter (ignored): results
[2020-11-04 17:02:22] Fuseki     INFO  [1] Query = 

MY-ORIGINAL-QUERY-OMITTED

[2020-11-04 17:02:22] Fuseki     INFO  [1]   <= Vary:                Accept,Accept-Encoding,Accept-Charset
[2020-11-04 17:02:22] Fuseki     INFO  [1] 400 Can't determine output content type: turtle (165 ms)

I copied the printed query, and it works on YASGUI. There are also some errors about URI/IRI scheme violations, which I omitted here.

I saw these extra query parameters at the end of the query URL:

&format=turtle&output=turtle&results=turtle

Maybe they are related to the error? But why doesn't Fuseki complain about format and output (as it does for results), or print them (as it does for query)?


Solution

  • SPARQLWrapper defaults to adding

    &format=turtle&output=turtle&results=turtle

    to the request.

    SPARQLWrapper has a method, setOnlyConneg, that turns off these additional query string parameters.

    1. The WARN SPARQL Query: Unrecognize request parameter (ignored): results happens because Fuseki does not understand results and logs a warning about it. It is just a warning.

    2. format is a mechanism to override proper HTTP content negotiation, because in some situations it is hard to set the HTTP headers. This does not apply to SPARQLWrapper, which correctly sets Accept:.

    3. format=turtle isn't in the list of names for a CONSTRUCT query. ttl is. (turtle can be added to a future version of Fuseki for completeness.)

    The best approach is to use setOnlyConneg so that the non-standard query string parameters are not sent. SPARQLWrapper correctly sets the "Accept:" header in the request, and Fuseki supports content negotiation and will work with that header.
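
As a minimal sketch of the above, assuming a SPARQLWrapper version that provides setOnlyConneg: the first function applies the fix to a CONSTRUCT query, and the second is a purely illustrative helper (not part of SPARQLWrapper) that builds the GET URL with and without the extra parameters quoted in the question. The names construct_as_turtle and request_url are made up for this example, and the helper does not claim to reproduce SPARQLWrapper's exact URL construction.

```python
from urllib.parse import urlencode


def construct_as_turtle(endpoint, query):
    """Run a CONSTRUCT query, asking for Turtle via the Accept header only."""
    # Imported here so the URL helper below also works without SPARQLWrapper.
    from SPARQLWrapper import SPARQLWrapper, TURTLE

    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(query)
    sparql.setReturnFormat(TURTLE)
    # Drop the non-standard format/output/results URL parameters and
    # rely on HTTP content negotiation instead.
    sparql.setOnlyConneg(True)
    return sparql.query().convert()


def request_url(endpoint, query, only_conneg):
    """Illustrative only: the GET URL with and without the extra parameters."""
    params = [("query", query)]
    if not only_conneg:
        params += [("format", "turtle"), ("output", "turtle"), ("results", "turtle")]
    return endpoint + "?" + urlencode(params)
```

Calling construct_as_turtle('http://localhost:3030/prov', query) should then send only the query parameter plus the Accept header. What convert() returns for Turtle depends on the SPARQLWrapper version, so inspect the result before parsing it.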