I'm seeing strange behavior with AllegroGraph 4.13.
This is the data for the test case:
prefix : <http://example.com/example#>
INSERT DATA {
  :A rdfs:label "A" .
  :A :hasProp :Prop1 .
  :Prop1 :Key "1" .
  :Prop1 :Value "AA" .
  :B :hasProp :Prop2 .
  :Prop2 :Key "1" .
  :Prop2 :Value "AA" .
  :C :hasProp :Prop3 .
  :C :hasProp :Prop4 .
  :Prop3 :Key "1" .
  :Prop3 :Value "AA" .
  :Prop4 :Key "2" .
  :Prop4 :Value "BB" .
}
Given :A, I need to find resources that have exactly the same properties. That is, I want to find :B but not :C, because :C has one more property (Key "2" and Value "BB").
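To make the requirement concrete independently of any SPARQL engine, here is a minimal sketch in plain Python. It treats each resource as a set of (Key, Value) pairs and compares the sets for equality. The dictionary layout and the function name same_properties are illustrative assumptions, not part of AllegroGraph or any SPARQL API.

```python
# Test data from the question: resource -> set of (Key, Value) pairs.
# This dict layout is an illustrative assumption, not a triple-store API.
props = {
    "A": {("1", "AA")},
    "B": {("1", "AA")},
    "C": {("1", "AA"), ("2", "BB")},
}


def same_properties(target, data):
    """Return the resources whose (Key, Value) set equals the target's set."""
    wanted = data[target]
    return sorted(name for name, pairs in data.items() if pairs == wanted)


matches = same_properties("A", props)
print(matches)  # ['A', 'B']
```

:C is excluded because its set contains the extra pair ("2", "BB"), which is the outcome I expect from the queries as well.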
See also this question: Find individuals in SPARQL based on other relations / Compare sets.
The following query, kindly provided by Joshua Taylor, uses the resource :A directly and does exactly what I want:
prefix : <http://example.com/example#>
select ?other ?k ?v {
  :A :hasProp [ :Key ?k ; :Value ?v ] .
  ?other :hasProp [ :Key ?k ; :Value ?v ] .
  filter not exists {
    {
      :A :hasProp [ :Key ?kk ; :Value ?vv ] .
      filter not exists { ?other :hasProp [ :Key ?kk ; :Value ?vv ] . }
    }
    union
    {
      ?other :hasProp [ :Key ?kk ; :Value ?vv ] .
      filter not exists { :A :hasProp [ :Key ?kk ; :Value ?vv ] . }
    }
  }
}
Result:
----------------------
| other | k   | v    |
| A     | "1" | "AA" |
| B     | "1" | "AA" |
----------------------
The second query uses a variable ?a, because I first need to find :A according to some criterion (rdfs:label in this example).
Query using variable ?a:
prefix : <http://example.com/example#>
select ?other ?k ?v {
  ?a rdfs:label "A" .
  ?a :hasProp [ :Key ?k ; :Value ?v ] .
  ?other :hasProp [ :Key ?k ; :Value ?v ] .
  filter not exists {
    {
      ?a :hasProp [ :Key ?kk ; :Value ?vv ] .
      filter not exists { ?other :hasProp [ :Key ?kk ; :Value ?vv ] . }
    }
    union
    {
      ?other :hasProp [ :Key ?kk ; :Value ?vv ] .
      filter not exists { ?a :hasProp [ :Key ?kk ; :Value ?vv ] . }
    }
  }
}
returns:
----------------------
| other | k   | v    |
| A     | "1" | "AA" |
| B     | "1" | "AA" |
| C     | "1" | "AA" |
----------------------
This query also returns :C, which is wrong in my opinion.
Can anybody explain this behavior, or verify this test case with other triple stores / SPARQL engines?
As per request in the comments, I added the prefix for rdfs and also substituted the blank nodes with variables. This seems to have no effect.
prefix : <http://example.com/example#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?a ?pr1 ?pr2 ?other ?k ?v {
  ?a rdfs:label "A" .
  # bind (:A as ?a) .
  ?a :hasProp ?pr1 .
  ?pr1 :Key ?k ; :Value ?v .
  ?other :hasProp ?pr2 .
  ?pr2 :Key ?k ; :Value ?v .
  filter not exists {
    {
      ?a :hasProp ?pp1 .
      ?pp1 :Key ?kk ; :Value ?vv .
      filter not exists {
        ?other :hasProp ?pp2 .
        ?pp2 :Key ?kk ; :Value ?vv .
      }
    }
    union
    {
      ?other :hasProp ?pp3 .
      ?pp3 :Key ?kk ; :Value ?vv .
      filter not exists {
        ?a :hasProp ?pp4 .
        ?pp4 :Key ?kk ; :Value ?vv .
      }
    }
  }
}
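------------------------------------------
| a | pr1   | pr2   | other | k   | v    |
| A | Prop1 | Prop1 | A     | "1" | "AA" |
| A | Prop1 | Prop2 | B     | "1" | "AA" |
| A | Prop1 | Prop3 | C     | "1" | "AA" |
------------------------------------------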
If I use BIND (commented out above) instead of the line with rdfs:label, the result looks the same.
I think that you've found a bug in AllegroGraph. It seems like adding the pattern ?a rdfs:label "A" should restrict the value of ?a to :A, and that's the behavior we see with Jena.
Jena: VERSION: 2.11.0
Jena: BUILD_DATE: 2013-09-12T10:49:49+0100
ARQ: VERSION: 2.11.0
ARQ: BUILD_DATE: 2013-09-12T10:49:49+0100
RIOT: VERSION: 2.11.0
RIOT: BUILD_DATE: 2013-09-12T10:49:49+0100
prefix : <http://example.com/example#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?other ?k ?v {
  ?a rdfs:label "A" .
  ?a :hasProp [ :Key ?k ; :Value ?v ] .
  ?other :hasProp [ :Key ?k ; :Value ?v ] .
  filter not exists {
    {
      ?a :hasProp [ :Key ?kk ; :Value ?vv ] .
      filter not exists { ?other :hasProp [ :Key ?kk ; :Value ?vv ] . }
    }
    union
    {
      ?other :hasProp [ :Key ?kk ; :Value ?vv ] .
      filter not exists { ?a :hasProp [ :Key ?kk ; :Value ?vv ] . }
    }
  }
}
----------------------
| other | k | v |
======================
| :B | "1" | "AA" |
| :A | "1" | "AA" |
----------------------
It probably makes sense to come up with a minimal example that reproduces this behavior, and to submit a bug report.
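For a bug report it may help to state the expected result without relying on any one engine. The following is a rough sketch in plain Python that walks the question's triples and mirrors the query's structure: bind ?a by label, join on shared (Key, Value) pairs, then drop a row if either UNION branch would match. The helper names (objects, pairs, expected_rows) and the shortened predicate strings are assumptions made for illustration; this is not how Jena or AllegroGraph evaluate the query internally.

```python
# Triples from the question's INSERT DATA, as (subject, predicate, object).
# Predicate names are shortened to plain strings for illustration.
triples = {
    ("A", "label", "A"),
    ("A", "hasProp", "Prop1"), ("Prop1", "Key", "1"), ("Prop1", "Value", "AA"),
    ("B", "hasProp", "Prop2"), ("Prop2", "Key", "1"), ("Prop2", "Value", "AA"),
    ("C", "hasProp", "Prop3"), ("C", "hasProp", "Prop4"),
    ("Prop3", "Key", "1"), ("Prop3", "Value", "AA"),
    ("Prop4", "Key", "2"), ("Prop4", "Value", "BB"),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


def pairs(resource):
    """(Key, Value) pairs matching: resource :hasProp [ :Key ?k ; :Value ?v ]."""
    return {(k, v)
            for prop in objects(resource, "hasProp")
            for k in objects(prop, "Key")
            for v in objects(prop, "Value")}


def expected_rows(label):
    """Rows (?other, ?k, ?v) the query should return, with ?a bound by label."""
    rows = set()
    candidates_a = {s for (s, p, o) in triples if p == "label" and o == label}
    candidates_other = {s for (s, p, o) in triples if p == "hasProp"}
    for a in candidates_a:
        for other in candidates_other:
            # First UNION branch: a pair that ?a has and ?other lacks.
            only_a = pairs(a) - pairs(other)
            # Second UNION branch: a pair that ?other has and ?a lacks.
            only_other = pairs(other) - pairs(a)
            # The outer FILTER NOT EXISTS keeps the row only if neither matches.
            if only_a or only_other:
                continue
            for (k, v) in pairs(a) & pairs(other):
                rows.add((other, k, v))
    return sorted(rows)


rows = expected_rows("A")
print(rows)  # [('A', '1', 'AA'), ('B', '1', 'AA')]
```

Under these assumptions the sketch yields rows for :A and :B only, which agrees with the Jena output above; :C is dropped by the second branch because of its extra pair ("2", "BB").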