I am new to SPARQL and to graph database querying as a whole, so please excuse any ignorance. I am trying to produce some basic output from data stored in Fuseki, and I am struggling to understand the best practice for handling the duplicated rows caused by the cardinality between the various concepts.
I will use a simple example to hopefully demonstrate my point.
This is a representative sample of the types of data and relationships I am currently working with;
Based on this structure I have produced the following triples (N-Triple format);
<http://www.test.com/ontologies/Author/JohnGrisham> <http://www.test.com/ontologies/property#firstName> "John" .
<http://www.test.com/ontologies/Author/JohnGrisham> <http://www.test.com/ontologies/property#lastName> "Grisham" .
<http://www.test.com/ontologies/Author/JohnGrisham> <http://www.test.com/ontologies/property#hasWritten> <http://www.test.com/ontologies/Book/TheClient> .
<http://www.test.com/ontologies/Author/JohnGrisham> <http://www.test.com/ontologies/property#hasWritten> <http://www.test.com/ontologies/Book/TheFirm> .
<http://www.test.com/ontologies/Book/TheFirm> <http://www.test.com/ontologies/property#name> "The Firm" .
<http://www.test.com/ontologies/Book/TheFirm> <http://www.test.com/ontologies/property#soldBy> <http://www.test.com/ontologies/Retailer/Foyles> .
<http://www.test.com/ontologies/Book/TheFirm> <http://www.test.com/ontologies/property#soldBy> <http://www.test.com/ontologies/Retailer/Waterstones> .
<http://www.test.com/ontologies/Book/TheClient> <http://www.test.com/ontologies/property#name> "The Client" .
<http://www.test.com/ontologies/Book/TheClient> <http://www.test.com/ontologies/property#soldBy> <http://www.test.com/ontologies/Retailer/Amazon> .
<http://www.test.com/ontologies/Book/TheClient> <http://www.test.com/ontologies/property#soldBy> <http://www.test.com/ontologies/Retailer/Waterstones> .
<http://www.test.com/ontologies/Retailer/Amazon> <http://www.test.com/ontologies/property#name> "Amazon" .
<http://www.test.com/ontologies/Retailer/Waterstones> <http://www.test.com/ontologies/property#name> "Waterstones" .
<http://www.test.com/ontologies/Retailer/Foyles> <http://www.test.com/ontologies/property#name> "Foyles" .
Now what I am trying to do is render a page where all authors are displayed, showing details of all their books and the retailers through which each book is sold. Something like this (pseudo code):
for-each:Author
    <h1>Author.firstName + Author.lastName</h1>
    for-each:Author.Book
        <h2>Book.Name</h2>
        Sold By:
        for-each:Book.Retailer
            <h2>Retailer.name</h2>
For the rendering to work, my thinking was that I would need the author's first name and last name, then the names of all their books and the names of the retailers each book is sold through. I therefore came up with the following SPARQL:
PREFIX p: <http://www.test.com/ontologies/property#>

SELECT ?authorfirstname
       ?authorlastname
       ?bookname
       ?retailername
WHERE {
    ?author p:firstName ?authorfirstname;
            p:lastName ?authorlastname;
            p:hasWritten ?book .
    OPTIONAL {
        ?book p:name ?bookname;
              p:soldBy ?retailer .
        ?retailer p:name ?retailername .
    }
}
This provides the following results;
Unfortunately, due to the duplication of rows, my basic rendering attempt cannot produce the expected output; in fact it renders a new "Author" section for every row returned from the query.
I guess what I'm trying to understand is how this type of rendering should be done.
Is it the renderer that is supposed to regroup the data back into the graph form it wants to traverse? (I honestly cannot see how this can be the case.)
Is the SPARQL invalid - is there a way to do what I want in the SPARQL language itself?
Am I just doing something completely wrong?
GROUP_CONCAT
When reviewing the options available to me I came across GROUP_CONCAT, but after a bit of playing with it I decided it probably wasn't going to give me what I wanted and probably wasn't the best route. The reasons for this are:
Whilst the data set in this post is small, spanning only 3 concepts and a very restricted set of data, the actual concepts and data I am running against in the real world are far larger. Concatenating results there will produce extremely long delimited strings, especially for free-format columns such as descriptions.
Whilst trying out group_concat I quickly realised that I couldn't tell how the various data elements across the group_concat columns related to each other. I can show that using the book example above.
PREFIX p: <http://www.test.com/ontologies/property#>

select ?authorfirstname
       ?authorLastName
       (group_concat(distinct ?bookname; separator = ";") as ?booknames)
       (group_concat(distinct ?retailername; separator = ";") as ?retailernames)
where {
    ?author p:firstName ?authorfirstname;
            p:lastName ?authorLastName;
            p:hasWritten ?book .
    OPTIONAL {
        ?book p:name ?bookname;
              p:soldBy ?retailer .
        ?retailer p:name ?retailername .
    }
}
group by ?authorfirstname ?authorLastName
This produced the following output;
firstname = "John"
lastname = "Grisham"
booknames = "The Client;The Firm"
retailernames = "Amazon;Waterstones;Foyles"
As you can see this has produced one result row but you can no longer work out how the various data elements relate. Which Retailers are for which Book?
Any help/guidance would be greatly appreciated.
Based on the recommended solution below I have used the concept of keys to bring the various data sets together. However, I have tweaked it slightly so that I run one query per concept (e.g. author, book and retailer) and then use the keys to bring the results together in my renderer.
    firstname  lastname  books
--------------------------------------------------------------------------------
1   John       Grisham   ontologies/Book/TheClient|ontologies/Book/TheFirm

    id                         name        retailers
-------------------------------------------------------------------------------------------------------
1   ontologies/Book/TheClient  The Client  ontologies/Retailer/Waterstones|ontologies/Retailer/Amazon
2   ontologies/Book/TheFirm    The Firm    ontologies/Retailer/Waterstones|ontologies/Retailer/Foyles

    id                               name
--------------------------------------------------
1   ontologies/Retailer/Amazon       Amazon
2   ontologies/Retailer/Waterstones  Waterstones
3   ontologies/Retailer/Foyles       Foyles
What I then do in my renderer is use the IDs to pull results from the various result sets...
for-each author a : authors
    output(a.firstname)
    for-each book b : a.books.split("|")
        book = books.get(b) // get the result for book b (e.g. Id to Foreign key)
        output(book.name)
        for-each retailer r : book.retailers.split("|")
            retailer = retailers.get(r)
            output(retailer.name)
So effectively you are stitching together what you want from the various different result sets and presenting it.
This seems to be working OK for the moment.
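For anyone wanting something runnable, here is a minimal JavaScript sketch of that stitching step. The data is hard-coded and only shaped like the three result sets; the `stitch` function and the field names are hypothetical, not part of any library, and a real renderer would emit HTML rather than indented lines.

```javascript
// Hypothetical, hard-coded stand-ins for the three query results.
// Books and retailers are keyed by ID so they can be looked up directly.
const authors = [
  {
    firstname: 'John',
    lastname: 'Grisham',
    books: 'ontologies/Book/TheClient|ontologies/Book/TheFirm',
  },
];

const books = new Map([
  ['ontologies/Book/TheClient', {
    name: 'The Client',
    retailers: 'ontologies/Retailer/Waterstones|ontologies/Retailer/Amazon',
  }],
  ['ontologies/Book/TheFirm', {
    name: 'The Firm',
    retailers: 'ontologies/Retailer/Waterstones|ontologies/Retailer/Foyles',
  }],
]);

const retailers = new Map([
  ['ontologies/Retailer/Amazon', { name: 'Amazon' }],
  ['ontologies/Retailer/Waterstones', { name: 'Waterstones' }],
  ['ontologies/Retailer/Foyles', { name: 'Foyles' }],
]);

// Walk author -> book -> retailer, resolving each "|"-separated key
// against the matching result set. Unknown keys are skipped.
function stitch(authors, books, retailers) {
  const lines = [];
  for (const a of authors) {
    lines.push(`${a.firstname} ${a.lastname}`);
    for (const bookId of a.books.split('|')) {
      const book = books.get(bookId);
      if (!book) continue;
      lines.push(`  ${book.name}`);
      for (const retailerId of book.retailers.split('|')) {
        const retailer = retailers.get(retailerId);
        if (retailer) lines.push(`    ${retailer.name}`);
      }
    }
  }
  return lines;
}

const output = stitch(authors, books, retailers);
console.log(output.join('\n'));
```

The lookups are exact string matches, so the keys in the concatenated columns have to match the IDs in the other result sets character for character, including case.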
I find it easier to construct objects out of the SPARQL results in code rather than trying to form a query that returns only a single row per the relevant resource.
I would use the URI of the resources to identify which rows belong to which resource (author in this case), and then merge the result rows based on said URI.
For JS applications I use the code here to construct objects out of SPARQL results.
For complex values I use __ in the variable name to denote that an object should be constructed from the value. For example, all values with variables prefixed with ?book__ would be turned into an object, with the remainder of the variable's name as the name of the object's attribute and each object identified by ?book__id. So having values for ?book__id and ?book__name would result in an attribute book for the author, such that author.book = { id: '<book-uri>', name: 'book name' } (or a list of such objects if there are multiple books).
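To make the convention concrete, here is a minimal sketch of such a merge. It is not the code linked above: `mergeRows` and `addValue` are hypothetical helpers written for this example, and the rows are assumed to be already flattened to plain values (a real SPARQL JSON result wraps each binding in an object with `type` and `value`). For simplicity this sketch always stores nested objects as a list, and turns a plain attribute into a list once a second distinct value shows up.

```javascript
// Set obj[key], turning it into a list when a second distinct value arrives.
function addValue(obj, key, value) {
  const current = obj[key];
  if (current === undefined) {
    obj[key] = value;
  } else if (Array.isArray(current)) {
    if (!current.includes(value)) current.push(value);
  } else if (current !== value) {
    obj[key] = [current, value];
  }
}

// Merge flat result rows into one object per distinct `id`.
// Variables named prefix__attr are collected into nested objects,
// each identified by its prefix__id value.
function mergeRows(rows) {
  const byId = new Map();
  for (const row of rows) {
    if (!byId.has(row.id)) byId.set(row.id, { id: row.id });
    const target = byId.get(row.id);
    const nested = {}; // prefix -> attributes found in this row
    for (const [key, value] of Object.entries(row)) {
      if (key === 'id' || value === undefined) continue;
      const [prefix, attr] = key.split('__');
      if (attr === undefined) {
        addValue(target, key, value);
      } else {
        if (!nested[prefix]) nested[prefix] = {};
        nested[prefix][attr] = value;
      }
    }
    for (const [prefix, partial] of Object.entries(nested)) {
      if (partial.id === undefined) continue; // unbound OPTIONAL part
      if (!target[prefix]) target[prefix] = [];
      let existing = target[prefix].find((o) => o.id === partial.id);
      if (!existing) {
        existing = { id: partial.id };
        target[prefix].push(existing);
      }
      for (const [attr, value] of Object.entries(partial)) {
        if (attr !== 'id') addValue(existing, attr, value);
      }
    }
  }
  return [...byId.values()];
}

// Sample rows, one per author/book/retailer combination in the question's data.
const A = 'http://www.test.com/ontologies/Author/JohnGrisham';
const CLIENT = 'http://www.test.com/ontologies/Book/TheClient';
const FIRM = 'http://www.test.com/ontologies/Book/TheFirm';
const row = (bookId, bookName, retailer) => ({
  id: A, firstName: 'John', lastName: 'Grisham',
  book__id: bookId, book__name: bookName, book__retailer: retailer,
});
const merged = mergeRows([
  row(CLIENT, 'The Client', 'Amazon'),
  row(CLIENT, 'The Client', 'Waterstones'),
  row(FIRM, 'The Firm', 'Foyles'),
  row(FIRM, 'The Firm', 'Waterstones'),
]);
console.log(JSON.stringify(merged, null, 2));
```

Because rows are matched on URIs rather than on names, two authors or two books that happen to share a name stay separate.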
For example in this case I would use the following query:
PREFIX p: <http://www.test.com/ontologies/property#>

SELECT ?id ?firstName ?lastName ?book__id ?book__name
       ?book__retailer
WHERE {
    ?id p:firstName ?firstName;
        p:lastName ?lastName;
        p:hasWritten ?book__id .
    OPTIONAL {
        ?book__id p:name ?book__name;
                  p:soldBy/p:name ?book__retailer .
    }
}
And in the application code I would construct Author objects that look like this (JavaScript notation):
[{
    id: '<http://www.test.com/ontologies/Author/JohnGrisham>',
    firstName: 'John',
    lastName: 'Grisham',
    book: [
        {
            id: '<http://www.test.com/ontologies/Book/TheFirm>',
            name: 'The Firm',
            retailer: ['Foyles', 'Waterstones']
        },
        {
            id: '<http://www.test.com/ontologies/Book/TheClient>',
            name: 'The Client',
            retailer: ['Amazon', 'Waterstones']
        }
    ]
}]
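Once the data is in that nested shape, the rendering loop from the question becomes a straightforward walk over the objects. The following is only a sketch: `renderAuthors`, `asList` and `escapeHtml` are hypothetical helpers, and the markup (a list of retailers under each book) is my assumption about the desired page.

```javascript
// Treat a missing value as an empty list and a single value as a one-item list.
function asList(value) {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Minimal HTML escaping for text nodes.
function escapeHtml(text) {
  return String(text === undefined ? '' : text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// One <h1> per author, one <h2> per book, one <li> per retailer.
function renderAuthors(authors) {
  const out = [];
  for (const author of authors) {
    out.push(`<h1>${escapeHtml(author.firstName + ' ' + author.lastName)}</h1>`);
    for (const book of asList(author.book)) {
      out.push(`<h2>${escapeHtml(book.name)}</h2>`);
      out.push('<p>Sold by:</p>');
      out.push('<ul>');
      for (const retailer of asList(book.retailer)) {
        out.push(`<li>${escapeHtml(retailer)}</li>`);
      }
      out.push('</ul>');
    }
  }
  return out.join('\n');
}

// Sample input matching the question's data.
const html = renderAuthors([{
  id: '<http://www.test.com/ontologies/Author/JohnGrisham>',
  firstName: 'John',
  lastName: 'Grisham',
  book: [
    { id: '<http://www.test.com/ontologies/Book/TheFirm>', name: 'The Firm',
      retailer: ['Foyles', 'Waterstones'] },
    { id: '<http://www.test.com/ontologies/Book/TheClient>', name: 'The Client',
      retailer: ['Amazon', 'Waterstones'] },
  ],
}]);
console.log(html);
```

The `asList` helper is there because, with the convention described above, an attribute may be either a single value or a list depending on how many values the query returned.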