I am struggling to write a SPARQL query that fetches the list of products belonging to a given owner, along with a count of other owners for each product.
This is the query I expected to give that result:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema:<http://schema.org/>
SELECT distinct ?uri ?label ?r ?ownership ?rating ?comments ?allOwners
FROM <http://xxxx.net/>
WHERE {
?r rdf:type <http://schema.org/Relation> .
?r schema:property ?uri.
?r schema:owner ?owner .
?r schema:ownership ?ownership .
?uri rdfs:label ?label .
OPTIONAL {?r schema:comments ?comments .}
OPTIONAL {?r schema:rating ?rating .}
filter (?owner =<http://xxxx.net/resource/37654824-334f-4e57-a40c-4078cac9c579>)
{
SELECT (count(distinct ?owner) as ?allOwners)
FROM <http://xxxx.net/>
where {
?relation rdf:type <http://schema.org/Relation> .
?relation schema:owner ?owner .
?relation schema:property ?uri .
} group by ?uri
}
}
But it returns duplicated rows with seemingly random count values.
How should I write such a query? I know the inner query runs before the outer one, but how can I use the ?uri (subject) from the inner query for each record of the outer result?
SPARQL query semantics specify how portions of the query are joined together. Your sub-query does not project any variables that are shared with the outer query. It only SELECTs the ?allOwners variable, which does not appear in the rest of the query.
This means that you get a cross product of all the counts and all your other results; this is why you get duplicate rows and no correlations between the counts and rows.
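To make the join behaviour concrete, here is a minimal Python sketch of how two sets of solutions are combined. It is only an illustration of the semantics, not how any particular SPARQL engine is implemented; the `join` helper and the sample rows are hypothetical.

```python
from itertools import product


def join(left, right):
    """Join two lists of solution mappings (dicts of variable -> value).

    Two solutions are compatible when they agree on every variable they
    share. With no shared variables, every pair is compatible, so the
    result is a full cross product.
    """
    joined = []
    for a, b in product(left, right):
        shared = a.keys() & b.keys()
        if all(a[v] == b[v] for v in shared):
            joined.append({**a, **b})
    return joined


# Hypothetical outer-query rows: two products for the filtered owner.
outer_rows = [
    {"uri": "product/A", "label": "A"},
    {"uri": "product/B", "label": "B"},
]

# A sub-query that projects only the count: nothing ties a count to a product.
counts_only = [{"allOwners": 3}, {"allOwners": 1}]

# A sub-query that also projects ?uri: each count is tied to its product.
counts_with_uri = [
    {"uri": "product/A", "allOwners": 3},
    {"uri": "product/B", "allOwners": 1},
]

uncorrelated = join(outer_rows, counts_only)    # every row paired with every count
correlated = join(outer_rows, counts_with_uri)  # one matching count per row
```

With only the count projected, each product is paired with every count, which is the duplication described above. Once the sub-query also projects ?uri, the shared variable restricts the join to matching pairs.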
This kind of query can be achieved if you structure it correctly. Since you haven't provided example results you desire, I'm having to make a best guess of what you want. Something like the following may have the desired results:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema:<http://schema.org/>
SELECT distinct ?uri ?label ?r ?ownership ?rating ?comments ?allOwners
FROM <http://xxxx.net/>
WHERE
{
?r rdf:type <http://schema.org/Relation> .
?r schema:property ?uri.
?r schema:owner ?owner .
?r schema:ownership ?ownership .
?uri rdfs:label ?label .
FILTER (?owner = <http://xxxx.net/resource/37654824-334f-4e57-a40c-4078cac9c579>)
{
SELECT ?uri (count(distinct ?owner) as ?allOwners)
FROM <http://xxxx.net/>
WHERE
{
?relation rdf:type <http://schema.org/Relation> .
?relation schema:owner ?owner .
?relation schema:property ?uri .
} GROUP BY ?uri
}
OPTIONAL { ?r schema:comments ?comments . }
OPTIONAL { ?r schema:rating ?rating . }
}
This differs from your original query as follows:
- It moves the FILTER on ?owner earlier in the query to help the query engine apply it sooner.
  - FILTER position is usually pretty flexible except when you are using nested graph patterns (like OPTIONAL or MINUS), in which case placing it after those clauses may apply it later than you intend.
  - It is generally best to place FILTER clauses as soon as possible after all the variables you need are introduced.
- It adds the GROUP BY variable ?uri from your sub-query to the SELECT line of your sub-query.
  - This lets the query engine correlate the ?allOwners count with the ?uri to which it pertains.
This may or may not be the query you are after, but hopefully it helps point you in the right direction.