
How to impose the uniqueness of data property values via SHACL


I cannot figure out how to impose the uniqueness of a data property value via SHACL.

The following excerpts are simplified versions of the examples presented by Henriette Harmse on her personal blog.

Suppose we have the following data:

@prefix ex: <http://example.com/ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:Alice
    a ex:Person ;
    ex:ssn "987-65-4321" .
      
ex:Bob
    a ex:Person ;
    ex:ssn "987-65-4321" .

And below is the corresponding shape definition, which permits at most one value for the property ex:ssn and constrains its format:

@prefix dash: <http://datashapes.org/dash#> .
@prefix ex: <http://example.com/ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ; 
    sh:property
        [
            sh:path ex:ssn ; 
            sh:maxCount 1 ;
            sh:datatype xsd:string ;
            sh:pattern "^\\d{3}-\\d{2}-\\d{4}$" ;
        ] .

The data above conforms, so no violations are reported. However, I would also like to require that SSNs be unique across persons, i.e. that no two persons share the same ex:ssn value. Under that constraint the data would be invalid, since ex:Alice and ex:Bob have the same SSN. How can this be expressed in SHACL?
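
For contrast, here is a minimal sketch of data that should still conform once such a uniqueness constraint is in place. Bob's SSN is a made-up value, chosen only so that it differs from Alice's:

```turtle
@prefix ex: <http://example.com/ns#> .

ex:Alice
    a ex:Person ;
    ex:ssn "987-65-4321" .

# Hypothetical value for illustration: any well-formed SSN other than Alice's
ex:Bob
    a ex:Person ;
    ex:ssn "123-45-6789" .
```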


Solution

  • Using the DASH constraint components library:

    @prefix sh: <http://www.w3.org/ns/shacl#> .
    @prefix dash: <http://datashapes.org/dash#> .
    @prefix ex: <http://example.com/ns#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    
    ex:PersonShape
        a sh:NodeShape ;
        sh:targetClass ex:Person ;
        sh:property [
            sh:path ex:ssn ;
            sh:datatype xsd:string ;
            sh:maxCount 1 ;
            dash:uniqueValueForClass ex:Person ;
        ] .
    

    If DASH is not supported, use a SPARQL-based constraint (this requires a processor that implements SHACL-SPARQL):

    @prefix sh: <http://www.w3.org/ns/shacl#> .
    @prefix ex: <http://example.com/ns#> .
    
    ex:PersonShape
        a sh:NodeShape ;
        sh:targetClass ex:Person ;
        sh:sparql [
            sh:message "SSN is not unique" ;
            # Yields one result for each other person that shares the focus node's ex:ssn value
            sh:select """
                PREFIX ex: <http://example.com/ns#>
                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                SELECT DISTINCT $this ?value ?other WHERE {
                    $this ex:ssn ?value .
                    ?other ex:ssn ?value .
                    FILTER (?other != $this) .
                    ?other a/rdfs:subClassOf* ex:Person .
                }
                """ ;
        ] .
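
    SHACL-SPARQL also lets prefixes be declared once with sh:declare and referenced from the constraint via sh:prefixes, instead of repeating PREFIX lines inside the query string. Below is a sketch of the same constraint in that style. Using the namespace IRI itself (ex:) as the node that carries the declarations is an assumption made for this example; any IRI can hold them. Declaring rdfs explicitly is a precaution, since not every processor predefines it.

    ```turtle
    @prefix sh: <http://www.w3.org/ns/shacl#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    @prefix ex: <http://example.com/ns#> .

    # Prefix declarations that SHACL-SPARQL queries can reuse
    # (assumption: the namespace IRI doubles as the declaring node)
    ex:
        a owl:Ontology ;
        sh:declare [
            sh:prefix "ex" ;
            sh:namespace "http://example.com/ns#"^^xsd:anyURI ;
        ] , [
            sh:prefix "rdfs" ;
            sh:namespace "http://www.w3.org/2000/01/rdf-schema#"^^xsd:anyURI ;
        ] .

    ex:PersonShape
        a sh:NodeShape ;
        sh:targetClass ex:Person ;
        sh:sparql [
            a sh:SPARQLConstraint ;
            sh:message "SSN is not unique" ;
            # Pull in the ex: and rdfs: prefixes declared above
            sh:prefixes ex: ;
            sh:select """
                SELECT DISTINCT $this ?value ?other WHERE {
                    $this ex:ssn ?value .
                    ?other ex:ssn ?value .
                    FILTER (?other != $this) .
                    ?other a/rdfs:subClassOf* ex:Person .
                }
                """ ;
        ] .
    ```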