I would like to use a query to deduplicate resources that share a unique ID. A plain insert/delete query doesn't seem to work, because fewer nodes have to be created than are deleted. Is it possible to use something similar to this?
insert {
?new a mails:Account.
?new mails:hasID ?id.
?new rdfs:label ?label
}
where {
{
select distinct ?id ?label where {
?account a mails:Account.
?account mails:hasID ?id.
?account rdfs:label ?label
}
}
bind(bnode() as ?new)
{
delete where {
?account mails:hasID ?id
}
}
}
The fact that fewer nodes have to be created than are deleted doesn't necessarily mean that you can't use a normal insert/delete. RDF is a set-based representation: inserting the same triple multiple times is the same as inserting it once. If you want to normalize a bunch of triples, you can create the same blank node for multiple query results by using bnode with an argument. The SPARQL 1.1 specification describes the function as follows:
The BNODE function constructs a blank node that is distinct from all blank nodes in the dataset being queried and distinct from all blank nodes created by calls to this constructor for other query solutions. If the no argument form is used, every call results in a distinct blank node. If the form with a simple literal is used, every call results in distinct blank nodes for different simple literals, and the same blank node for calls with the same simple literal within expressions for one solution mapping.
This means that you can do:
delete {
?account mails:hasID ?id
}
insert {
?new a mails:Account.
?new mails:hasID ?id.
?new rdfs:label ?label
}
where {
?account a mails:Account.
?account mails:hasID ?id.
?account rdfs:label ?label
#-- One new bnode is created for each *distinct*
#-- ?id value. If two accounts have the same
#-- ?id value, then they will get the same bnode().
bind (bnode(str(?id)) as ?new)
}
If you're trying to merge all accounts into one, even if they have different IDs, then you could just pass a constant value into the bnode function, e.g.,
bind (bnode("") as ?new)
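If the merged accounts may be named nodes rather than blank nodes, a variant that does not depend on how a particular engine scopes bnode labels is to mint an IRI from the ID. The following is only a sketch under assumptions: the mails: namespace IRI and the http://example.org/account/ base are placeholders that are not part of the original question, and unlike the update above it also deletes the type and label triples of the old nodes.

```sparql
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
# Assumed placeholder namespace; replace with the real mails: IRI.
PREFIX mails: <http://example.org/mails#>

# In SPARQL Update the DELETE template comes before the INSERT template.
DELETE {
  ?account a mails:Account .
  ?account mails:hasID ?id .
  ?account rdfs:label ?label .
}
INSERT {
  ?new a mails:Account .
  ?new mails:hasID ?id .
  ?new rdfs:label ?label .
}
WHERE {
  ?account a mails:Account .
  ?account mails:hasID ?id .
  ?account rdfs:label ?label .
  # Equal ?id values always produce the same IRI, so accounts that
  # share an ID collapse into one node. The base IRI is hypothetical.
  BIND (IRI(CONCAT("http://example.org/account/",
                   ENCODE_FOR_URI(STR(?id)))) AS ?new)
}
```

Because the inserted triples form a set, two accounts with the same ID and the same label yield a single merged account; accounts with the same ID but different labels yield one node carrying both labels.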