In my database, I want certain nodes and relationships returned if items in 2 aggregated lists do not match. To be specific, a user has certain accesses and role. I want to pinpoint those users whose current accesses do not match with the accesses required by their roles. This is my main query to achieve this:
match(u:User)-[r1:HAS_ATTRIBUTE]->(ri:RoleInfo)-[r2:CURRENT]->(rr:Role{roleName:'Claims Specialist'})-[rel:REQUIRES_ACCESS_TO]->(e:Entitlement)
match (u)-[r3:HAS_ACCOUNT]->(ua:UserAccount)-[r4:HAS_ATTRIBUTE]->(ei:EntitlementInfo)-[r5:CURRENT]->(en:Entitlement)
with apoc.coll.sort(collect(distinct e.entitlementName)) as reqdAccesses, e,u,r1, ri, r2, rr,r3,ua,r4,ei,r5,en, rel, apoc.coll.sort(collect(distinct en.entitlementName)) as currentAccesses
WHERE any(item in currentAccesses where not item in reqdAccesses) and size(reqdAccesses) <> size(currentAccesses)
return e,u,r1,ri,r2,r3,ua,r4,ei,r5,en,rel,rr
Now as per the data both the lists for this role have a mismatch as shown by below query and output.
match(u:User)-[r1:HAS_ATTRIBUTE]->(ri:RoleInfo)-[r2:CURRENT]->(rr:Role{roleName:'Claims Specialist'})-[rel:REQUIRES_ACCESS_TO]->(e:Entitlement)
match (u)-[r3:HAS_ACCOUNT]->(ua:UserAccount)-[r4:HAS_ATTRIBUTE]->(ei:EntitlementInfo)-[r5:CURRENT]->(en:Entitlement)
return apoc.coll.sort(collect(distinct en.entitlementName)) as currentAccesses, apoc.coll.sort(collect(DISTINCT e.entitlementName)) as reqdAccesses
output:
currentAccesses
["Claims Adjustment Portal", "Claims Management System", "Customer Relationship Management System", "Electronic Health Records System", "Fraud Detection Platform", "Patient Portal"]
reqdAccesses
["Claims Adjustment Portal", "Claims Management System", "Customer Feedback System", "Fraud Detection Platform", "MEDSTAR CRM"]
I still do not get the data returned. Is there anything wrong in my main query ?
Aggregating functions like COLLECT
aggregate on grouping keys.
Your first query uses a very large set of grouping keys (e,u,r1,ri,r2,rr,r3,ua,r4,ei,r5,en,rel):
with apoc.coll.sort(collect(distinct e.entitlementName)) as reqdAccesses, e,u,r1, ri, r2, rr,r3,ua,r4,ei,r5,en, rel, apoc.coll.sort(collect(distinct en.entitlementName)) as currentAccesses
Because that clause includes e
and en
as grouping keys, the collect(distinct e.entitlementName)
and collect(distinct en.entitlementName)
aggregated lists will each only contain a single item.
Your second query specifies no grouping keys, and therefore works as you intended:
return apoc.coll.sort(collect(distinct en.entitlementName)) as currentAccesses, apoc.coll.sort(collect(DISTINCT e.entitlementName)) as reqdAccesses
This query should identify every user that has accesses that they are not entitled to, and return them and a list of their violations:
MATCH (u:User)-[r1:HAS_ATTRIBUTE]->(ri:RoleInfo)-[r2:CURRENT]->(rr:Role{roleName:'Claims Specialist'})-[rel:REQUIRES_ACCESS_TO]->(e:Entitlement)
MATCH (u)-[r3:HAS_ACCOUNT]->(ua:UserAccount)-[r4:HAS_ATTRIBUTE]->(ei:EntitlementInfo)-[r5:CURRENT]->(en:Entitlement)
WITH u, COLLECT(DISTINCT e.entitlementName) AS entitledAccesses, COLLECT(DISTINCT en.entitlementName) AS currentAccesses
WITH u, [c IN currentAccesses WHERE NOT c IN entitledAccesses] AS violations
WHERE SIZE(violations) > 0
RETURN u, violations