Tags: collections, group-by, sparql, rdf, shacl

SPARQL CONSTRUCT to generate RDF collection from GROUP BY bindings


I would like to generate an RDF collection (i.e. an rdf:first/rdf:rest linked list) using a SPARQL CONSTRUCT query, putting all grouped bindings for a variable into one collection.

So for the data

@prefix ex: <https://example.com/ns#> .
ex:example1  a          ex:Example ;
             ex:name    "Example1" ;
             ex:even    false .

ex:example2  a          ex:Example ;
             ex:name    "Example2" ;
             ex:even    true .

ex:example3  a          ex:Example ;
             ex:name    "Example3" ;
             ex:even    false .

ex:example4  a          ex:Example ;
             ex:name    "Example4" ;
             ex:even    true .

ex:example5  a          ex:Example ;
             ex:name    "Example5" ;
             ex:even    false .

if the SELECT query

PREFIX ex: <https://example.com/ns#>
select (group_concat(?name) as ?names) where {
  ?a ex:even ?even;
  ex:name ?name 
} group by ?even

yields

names
Example1 Example3 Example5
Example2 Example4

what would a corresponding CONSTRUCT query look like that returns the bindings for ?names as an RDF collection, i.e. something like

( "Example1" "Example3" "Example5" )
( "Example2" "Example4" )

(assuming a Turtle interpretation of the above)?
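
For reference, the Turtle collection syntax above is shorthand for a chain of rdf:first/rdf:rest triples ending in rdf:nil. The following is a minimal Python sketch of that expansion; the function name `collection_triples` and the blank node labels `_:l0`, `_:l1`, ... are made up for illustration (a real parser chooses its own labels).

```python
def collection_triples(items, head="_:l"):
    """Expand a Turtle collection ( i1 i2 ... ) into rdf:first/rdf:rest triples.

    The node labels (_:l0, _:l1, ...) are illustrative only.
    """
    if not items:
        return []  # the empty collection is simply rdf:nil and needs no triples
    triples = []
    for i, item in enumerate(items):
        node = f"{head}{i}"
        # the last node points to rdf:nil, every other node to its successor
        nxt = f"{head}{i + 1}" if i + 1 < len(items) else "rdf:nil"
        triples.append((node, "rdf:first", item))
        triples.append((node, "rdf:rest", nxt))
    return triples


for s, p, o in collection_triples(['"Example2"', '"Example4"']):
    print(s, p, o, ".")
```

So the CONSTRUCT query has to produce one rdf:first and one rdf:rest triple per list element, with each rdf:rest pointing at the node of the following element.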

Background: I would like to generate SHACL shapes using SHACL-AF SPARQLRules, and one thing I am struggling with is to generate sh:in (...) where the list is generated as an aggregate over multiple solutions of the query.


Solution

  • I have recently seen a solution that handles this by minting a unique IRI for each node of the RDF collection.

    For example, the SPARQL query would be:

    PREFIX ex:  <https://example.com/ns#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

    CONSTRUCT {
      # Attach the head of the list to a per-group subject.
      ?evenValue   ex:names  ?names .
      ?names       a         rdf:List ;
                   rdf:first ?minName .
      # Inner links: each item points to its successor.
      ?currentItem a         rdf:List ;
                   rdf:first ?currentName ;
                   rdf:rest  ?nextItem .
      # The last item terminates the list.
      ?lastItem    a         rdf:List ;
                   rdf:first ?maxName ;
                   rdf:rest  rdf:nil .
    }
    WHERE {
      # ?x is the resource that carries the smallest name of its group.
      ?x ex:name ?minName .
      ?x ex:even ?even .
      {
        # Smallest and largest name per group: the head and tail of the list.
        SELECT ?even (MIN(?name) AS ?minName) (MAX(?name) AS ?maxName)
        WHERE {
          ?y ex:name ?name .
          ?y ex:even ?even .
        } GROUP BY ?even
      }
      {
        # For each name, the next-larger name in the same group (its successor).
        SELECT ?currentName ?even (MIN(?otherName) AS ?nextName)
        WHERE {
          ?x ex:name ?currentName .
          ?x ex:even ?even .

          ?y ex:name ?otherName .
          ?y ex:even ?even2 .

          FILTER (STR(?otherName) > STR(?currentName) && ?even = ?even2)
        } GROUP BY ?currentName ?even
      }
      # Mint deterministic IRIs so that the rdf:rest links line up across solutions.
      BIND(IRI(CONCAT(STR(?x), '-li-', STR(?minName))) AS ?names)
      BIND(IRI(CONCAT(STR(?x), '-li-', STR(?currentName))) AS ?currentItem)
      BIND(IRI(CONCAT(STR(?x), '-li-', STR(?nextName))) AS ?nextItem)
      BIND(IRI(CONCAT(STR(?x), '-li-', STR(?maxName))) AS ?lastItem)
      BIND(IRI(CONCAT(STR(?x), '-',    STR(?even))) AS ?evenValue)
    }
    

    and the results of running the query with the question's data would be:

    @prefix ex:  <https://example.com/ns#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    
    ex:example1-false  ex:names  ex:example1-li-Example1 .
    ex:example2-true  ex:names  ex:example2-li-Example2 .
    
    ex:example1-li-Example1
            rdf:type   rdf:List ;
            rdf:first  "Example1" ;
            rdf:rest   ex:example1-li-Example3 .
    
    ex:example1-li-Example3
            rdf:type   rdf:List ;
            rdf:first  "Example3" ;
            rdf:rest   ex:example1-li-Example5 .
    
    ex:example1-li-Example5
            rdf:type   rdf:List ;
            rdf:first  "Example5" ;
            rdf:rest   () .
    
    ex:example2-li-Example2
            rdf:type   rdf:List ;
            rdf:first  "Example2" ;
            rdf:rest   ex:example2-li-Example4 .
    
    ex:example2-li-Example4
            rdf:type   rdf:List ;
            rdf:first  "Example4" ;
            rdf:rest   () .
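
To make the linking logic of the query easier to follow, here is a rough Python mirror of it, using only the standard library. It is a sketch, not an RDF implementation: triples are plain tuples of prefixed-name strings, `ROWS` and `build_list_triples` are names invented for this illustration, the `a rdf:List` triples are left out for brevity, and names are assumed to be unique within a group.

```python
from collections import defaultdict

# (subject, name, even) rows taken from the question's data
ROWS = [
    ("ex:example1", "Example1", False),
    ("ex:example2", "Example2", True),
    ("ex:example3", "Example3", False),
    ("ex:example4", "Example4", True),
    ("ex:example5", "Example5", False),
]


def build_list_triples(rows):
    """Mirror the query: head = MIN(name), tail = MAX(name), and every item
    links to the smallest name greater than itself within the same group."""
    groups = defaultdict(dict)
    for subject, name, even in rows:
        groups[even][name] = subject  # assumes names are unique per group

    triples = []
    for even, by_name in groups.items():
        names = sorted(by_name)
        min_name, max_name = names[0], names[-1]
        x = by_name[min_name]  # ?x: the resource carrying the minimum name

        def item(name):
            # same scheme as IRI(CONCAT(STR(?x), '-li-', STR(?name)))
            return f"{x}-li-{name}"

        # With unique, sorted names the "smallest greater name" is the next one.
        pairs = list(zip(names, names[1:]))
        if not pairs:
            continue  # no successor rows, so the join in the query is empty

        triples.append((f"{x}-{str(even).lower()}", "ex:names", item(min_name)))
        for current, nxt in pairs:
            triples.append((item(current), "rdf:first", current))
            triples.append((item(current), "rdf:rest", item(nxt)))
        triples.append((item(max_name), "rdf:first", max_name))
        triples.append((item(max_name), "rdf:rest", "rdf:nil"))
    return triples


for triple in build_list_triples(ROWS):
    print(*triple, ".")
```

One thing the sketch makes visible: a group with a single name has no successor pair, so nothing is emitted for it. As far as I can tell the query behaves the same way, because the second subquery returns no row for such a group and the join comes back empty.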
    

    One drawback of this solution is that the list nodes are IRIs rather than blank nodes. Blank nodes would be nicer, but I am not sure that is possible: BNODE returns a fresh blank node for every solution, so the rdf:rest link built in one solution could not point at the node created in another.

    Credits: the solution above was shown to me by Jerven Bolleman.
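
The blank node concern can be illustrated with a small simulation. This is not SPARQL engine code, only a model under the assumption stated in the answer, namely that each BNODE call in a different solution yields a new blank node. `fresh_bnode` and `minted_iri` are hypothetical stand-ins for BNODE() and the IRI(CONCAT(...)) expressions.

```python
import itertools

_counter = itertools.count()


def fresh_bnode():
    """Stand-in for BNODE(): every call returns a new label (modelling assumption)."""
    return f"_:b{next(_counter)}"


def minted_iri(x, name):
    """Stand-in for IRI(CONCAT(STR(?x), '-li-', STR(?name)))."""
    return f"{x}-li-{name}"


# The two successor solutions for the 'false' group: (currentName, nextName).
solutions = [("Example1", "Example3"), ("Example3", "Example5")]

# Minted IRIs: the rdf:rest target of solution 1 equals the subject of solution 2.
iri_links = [(minted_iri("ex:example1", c), minted_iri("ex:example1", n))
             for c, n in solutions]
chain_ok_iri = iri_links[0][1] == iri_links[1][0]

# Fresh blank nodes per solution: the two nodes never coincide, so the chain breaks.
bnode_links = [(fresh_bnode(), fresh_bnode()) for _ in solutions]
chain_ok_bnode = bnode_links[0][1] == bnode_links[1][0]

print(chain_ok_iri, chain_ok_bnode)
```

The deterministic IRIs are what lets independent solutions agree on the identity of a list node, which is why the query mints them from ?x and the name.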