rdf sparql jena n-triples

How to merge RDF subjects with the same properties, summing their values?


Given the following triples:

s1 nameProperty "Bozo"
s1 laughProperty "Haha"
s1 valueProperty "2.00"^^xsd:double

s2 nameProperty "Clown"
s2 laughProperty "hehe"
s2 valueProperty "3.00"^^xsd:double

s3 nameProperty "Bozo"
s3 laughProperty "Haha"
s3 valueProperty "1.00"^^xsd:double

I'd like to merge subjects that have the same name and laugh and sum their values, giving a result something like:

s1 nameProperty "Bozo"
s1 laughProperty "Haha"
s1 valueProperty "3.00"^^xsd:double

s2 nameProperty "Clown"
s2 laughProperty "hehe"
s2 valueProperty "3.00"^^xsd:double

What is the most efficient way to do this with SPARQL? (There is no need to retain the original subjects. New ones can be inserted, as long as the subject holding the merged value has the same nameProperty and laughProperty.)


Solution

  • It's usually helpful if you provide data that we can actually run queries over. Here's data analogous to yours, in a form we can work with:

    @prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
    @prefix : <urn:ex:> .
    
    :s1 :nameProperty "Bozo".
    :s1 :laughProperty "Haha".
    :s1 :valueProperty "2.00"^^xsd:double.
    
    :s2 :nameProperty "Clown".
    :s2 :laughProperty "hehe".
    :s2 :valueProperty "3.00"^^xsd:double.
    
    :s3 :nameProperty "Bozo".
    :s3 :laughProperty "Haha".
    :s3 :valueProperty "1.00"^^xsd:double.
    

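    This is a fairly straightforward construct query. The only tricky part is that the aggregation needs a group by, so the sum and sample aggregate functions have to go in a nested select query. Note that sample picks an arbitrary subject from each group, so the merged "Bozo" resource may come out as :s1 or :s3.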

    prefix : <urn:ex:>
    
    construct {
      ?clown :nameProperty ?name ;
             :laughProperty ?laugh ;
             :valueProperty ?total
    }
    where {
      { select (sample(?s) as ?clown) ?name ?laugh (sum(?value) as ?total) where {
          ?s :nameProperty ?name ;
             :laughProperty ?laugh ;
             :valueProperty ?value
        }
        group by ?name ?laugh }
    }
    

    Results (in N3 and N-Triples, just to be sure that the 3.0e0 is actually an xsd:double):

    @prefix :      <urn:ex:> .
    @prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
    
    :s3     :laughProperty  "Haha" ;
            :nameProperty   "Bozo" ;
            :valueProperty  3.0e0 .
    
    :s2     :laughProperty  "hehe" ;
            :nameProperty   "Clown" ;
            :valueProperty  "3.00"^^xsd:double .
    

    <urn:ex:s2> <urn:ex:laughProperty> "hehe" .
    <urn:ex:s2> <urn:ex:nameProperty> "Clown" .
    <urn:ex:s2> <urn:ex:valueProperty> "3.00"^^<http://www.w3.org/2001/XMLSchema#double> .
    <urn:ex:s3> <urn:ex:laughProperty> "Haha" .
    <urn:ex:s3> <urn:ex:nameProperty> "Bozo" .
    <urn:ex:s3> <urn:ex:valueProperty> "3.0e0"^^<http://www.w3.org/2001/XMLSchema#double> .
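
For readers less familiar with SPARQL aggregates, the grouping done by the nested select above can be mirrored in a few lines of plain Python. This is only a sketch of the semantics, not how Jena evaluates the query: the `rows` list is a hypothetical in-memory copy of the example data, `merge_by_name_and_laugh` is an illustrative name, and picking the first subject seen is just one of the choices that `sample` is allowed to make.

```python
from collections import defaultdict

# Hypothetical in-memory copy of the example data:
# (subject, nameProperty, laughProperty, valueProperty)
rows = [
    ("s1", "Bozo", "Haha", 2.0),
    ("s2", "Clown", "hehe", 3.0),
    ("s3", "Bozo", "Haha", 1.0),
]


def merge_by_name_and_laugh(rows):
    """Group rows by (name, laugh), sum the values, keep one subject per group."""
    groups = defaultdict(lambda: {"subjects": [], "total": 0.0})
    for subject, name, laugh, value in rows:
        group = groups[(name, laugh)]       # plays the role of GROUP BY ?name ?laugh
        group["subjects"].append(subject)
        group["total"] += value             # plays the role of sum(?value)
    # sample(?s) may return any member of the group; this sketch assumes
    # the first subject seen, which is an arbitrary but deterministic choice.
    return {key: (g["subjects"][0], g["total"]) for key, g in groups.items()}


merged = merge_by_name_and_laugh(rows)
for (name, laugh), (subject, total) in merged.items():
    print(subject, name, laugh, total)
```

Each key of `merged` corresponds to one row of the subquery, and each value to the `?clown` and `?total` bindings that the construct template then turns into triples.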