sparql rdf semantic-web sesame blank-nodes

How to get a concise bounded description of a resource with Sesame?


I've been testing Sesame 2.7.2 and was surprised to find that DESCRIBE queries do not include the blank-node closure. [EDIT: the right term for this is CBD, for concise bounded description.]

If I understand correctly, the SPARQL spec is quite loose on this and says that what is returned is up to the provider, but I'm still surprised by the choice, since bnodes in the results of the DESCRIBE query cannot be used in subsequent SPARQL queries.

So the question is: how can I get a closed description of a resource <uri1> without doing:

  1. query DESCRIBE <uri1>
  2. iterate over the result to determine which objects are blank nodes
  3. then DESCRIBE ?b WHERE { <uri1> pred_relating_to_bnode_ ?b }
  4. repeat recursively, extending the chain for as long as bnodes are found

If I'm not mistaken, depth-2 bnodes would have to be described with

DESCRIBE ?b2 WHERE { <uri1> <p1> ?b . ?b <p2> ?b2 }

unless there is a simpler way to do this?
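
To make the chained approach concrete, the per-depth queries could be generated rather than written by hand. This is a minimal Python sketch, not a Sesame feature: the function name `describe_bnodes_at_depth` is hypothetical, and it assumes predicates are left as variables (`?p1`, `?p2`, ...) instead of the fixed `<p1>`, `<p2>` above, with an `isBlank` filter on every node along the path.

```python
def describe_bnodes_at_depth(uri: str, depth: int) -> str:
    """Build a DESCRIBE query for blank nodes exactly `depth` hops from `uri`.

    Hypothetical helper: it only assembles query text and sends nothing.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    patterns = []
    subject = f"<{uri}>"
    for i in range(1, depth + 1):
        node = f"?b{i}"
        # Predicates are variables so that any property is followed.
        patterns.append(f"{subject} ?p{i} {node} .")
        # Every node on the path must itself be a blank node.
        patterns.append(f"FILTER(isBlank({node}))")
        subject = node
    body = " ".join(patterns)
    return f"DESCRIBE ?b{depth} WHERE {{ {body} }}"


if __name__ == "__main__":
    for d in (1, 2):
        print(describe_bnodes_at_depth("urn:sites#1", d))
```

The caller would still have to run one query per depth and stop when a depth returns nothing, so this only automates the chaining; it does not remove it.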

Finally, would it not be better and simpler to let DESCRIBE return a closed description of a resource where you can still obtain the currently returned result with something like the following?

CONSTRUCT {<uri1> ?p ?o} WHERE {<uri1> ?p ?o}

EDIT: here is an example of a closed result I want to get back from Sesame

<urn:sites#1> a my:WebSite .
<urn:sites#1> my:domainName _:autos1 .
<urn:sites#1> my:online "true"^^xsd:boolean .
_:autos1 a rdf:Alt .
_:autos1 rdf:_1 _:autos2 .
_:autos2 my:url "192.168.2.111:15001"@fr .
_:autos2 my:url "192.168.2.111:15002"@en .

Currently, DESCRIBE <urn:sites#1> returns the same result as the query CONSTRUCT WHERE {<urn:sites#1> ?p ?o}, so I get only this:

<urn:sites#1> a my:WebSite .
<urn:sites#1> my:domainName _:autos1 .
<urn:sites#1> my:online "true"^^xsd:boolean .

Solution

  • Partial solutions using SPARQL

    Based on your comments, this isn't an exact solution yet, but note that you can describe multiple things in a given describe query. For instance, given the data:

    @prefix : <http://example.org/> .
    
    :Alice :named "Alice" ;
           :likes :Bill, [ :named "Carl" ;
                           :likes [ :named "Daphne" ]].
    :Bill :likes :Elaine ;
          :named "Bill" .
    

    you can run the query:

    PREFIX : <http://example.org/>
    
    describe :Alice ?object where {
      :Alice :likes* ?object .
      FILTER( isBlank( ?object ) )
    }
    

    and get the results:

    @prefix :        <http://example.org/> .
    
    :Alice
          :likes        :Bill ;
          :likes        [ :likes        [ :named        "Daphne"
                                        ] ;
                          :named        "Carl"
                        ] ;
          :named        "Alice" .
    

    That's not a complete description of course, because it's only following :likes out from :Alice, not arbitrary predicates. But it does get the blank nodes named "Carl" and "Daphne", which is a start.

    The larger issue in Sesame

    It looks like you're going to have to do something like what's described above, possibly with multiple searches, or you're going to have to modify Sesame. The alternative to writing some creative SPARQL is to change the way that Sesame implements describe queries. Some endpoints make this relatively easy, but Sesame doesn't seem to be one of them. There's a mailing list thread from 2011, Custom SPARQL DESCRIBE Implementation, that seems to address this same problem.

    Roberto García asks:

    I'm trying to customise the behaviour of SPARQL DESCRIBE queries. I'm willing to get something similar to CBD (i.e. all properties and values for the described resource plus all properties and values for the blank nodes connected to it).

    I have tried to reproduce a similar behaviour using a CONSTRUCT query but the performance is not good and the query gets quite complex if I try to consider long chains of properties pointing to blank nodes starting from the described resource.

    Jeen Broekstra replies:

    The implementation of DESCRIBE in Sesame is hardcoded in the query parser. It can only be changed by adapting the parser itself, and even then it will be tricky, as the query model has no easy way to express it either: it needs an extension of the algebra.

    > If this is not possible, any advice about how to implement it using CONSTRUCT queries?

    I'm not sure it's technically possible to do this in a single query. CBDs are recursive in nature, and while SPARQL does have some support for recursivity (property chains), the problem is that you have to do an intermediate check in every step of the property chain to see if the bound value is a blank node or not. This is not something that SPARQL supports out of the box: property chains are defined to have only length of the path as the stop condition.

    Perhaps something is possible using a convoluted combination of subqueries, unions and optionals, but I doubt it.

    I think the best workaround is instead to use the standard DESCRIBE format that Sesame supports, and for each blank node value in that result do a separate consecutive query. In other words: you solve it by hand.

    The only other option is to log a feature request for support of CBDs in Sesame. I can't give any guarantees about if/when that will be followed up on though.
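
The "solve it by hand" workaround amounts to a worklist loop: fetch the outgoing statements of the resource, then do the same for every blank node that turns up as an object, and stop at named resources and literals. Below is a minimal Python sketch over an in-memory copy of the Alice data. It is not Sesame code. The `outgoing` callable stands in for whatever lookup the store offers and assumes the store can look a blank node up by its internal identity. The labels `_:carl` and `_:daphne` are made up for illustration, and reified statements, which the CBD definition also covers, are left out.

```python
from collections import deque
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def is_blank(term: str) -> bool:
    # Assumption: blank nodes are written with a "_:" prefix, as in N-Triples.
    return term.startswith("_:")


def concise_bounded_description(
    start: str, outgoing: Callable[[str], Iterable[Triple]]
) -> Set[Triple]:
    """Collect the statements about `start`, following blank-node objects."""
    result: Set[Triple] = set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for triple in outgoing(node):
            result.add(triple)
            obj = triple[2]
            # Only blank nodes are expanded; IRIs and literals end the walk.
            if is_blank(obj) and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return result


# The Alice data from above, with hypothetical labels for the blank nodes.
DATA: Set[Triple] = {
    (":Alice", ":named", '"Alice"'),
    (":Alice", ":likes", ":Bill"),
    (":Alice", ":likes", "_:carl"),
    ("_:carl", ":named", '"Carl"'),
    ("_:carl", ":likes", "_:daphne"),
    ("_:daphne", ":named", '"Daphne"'),
    (":Bill", ":likes", ":Elaine"),
    (":Bill", ":named", '"Bill"'),
}


def outgoing(node: str) -> Iterable[Triple]:
    # Stand-in for one store lookup per node (one "consecutive query").
    return [t for t in DATA if t[0] == node]


if __name__ == "__main__":
    for triple in sorted(concise_bounded_description(":Alice", outgoing)):
        print(triple)
```

On this data the loop returns the six statements about :Alice and the two blank nodes, and none of the statements whose subject is :Bill, because :Bill is a named resource. The `seen` set keeps the loop from running forever if blank nodes point back at each other.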