neo4jcyphergraph-databasesneo4j-apoc

How to return nodes and relationships only if a certain condition is true in cypher


In my database, I want certain nodes and relationships returned if items in 2 aggregated lists do not match. To be specific, a user has certain accesses and role. I want to pinpoint those users whose current accesses do not match with the accesses required by their roles. This is my main query to achieve this:

match(u:User)-[r1:HAS_ATTRIBUTE]->(ri:RoleInfo)-[r2:CURRENT]->(rr:Role{roleName:'Claims Specialist'})-[rel:REQUIRES_ACCESS_TO]->(e:Entitlement)
match (u)-[r3:HAS_ACCOUNT]->(ua:UserAccount)-[r4:HAS_ATTRIBUTE]->(ei:EntitlementInfo)-[r5:CURRENT]->(en:Entitlement)
with apoc.coll.sort(collect(distinct e.entitlementName)) as reqdAccesses, e,u,r1, ri, r2, rr,r3,ua,r4,ei,r5,en, rel, apoc.coll.sort(collect(distinct  en.entitlementName)) as currentAccesses
WHERE any(item in currentAccesses where not item in reqdAccesses) and size(reqdAccesses) <> size(currentAccesses)
return e,u,r1,ri,r2,r3,ua,r4,ei,r5,en,rel,rr

Now as per the data both the lists for this role have a mismatch as shown by below query and output.

match(u:User)-[r1:HAS_ATTRIBUTE]->(ri:RoleInfo)-[r2:CURRENT]->(rr:Role{roleName:'Claims Specialist'})-[rel:REQUIRES_ACCESS_TO]->(e:Entitlement)
match (u)-[r3:HAS_ACCOUNT]->(ua:UserAccount)-[r4:HAS_ATTRIBUTE]->(ei:EntitlementInfo)-[r5:CURRENT]->(en:Entitlement)
return apoc.coll.sort(collect(distinct en.entitlementName)) as currentAccesses, apoc.coll.sort(collect(DISTINCT e.entitlementName)) as reqdAccesses

output:

currentAccesses 
["Claims Adjustment Portal", "Claims Management System", "Customer Relationship Management System", "Electronic Health Records System", "Fraud Detection Platform", "Patient Portal"]
reqdAccesses
["Claims Adjustment Portal", "Claims Management System", "Customer Feedback System", "Fraud Detection Platform", "MEDSTAR CRM"]

I still do not get the data returned. Is there anything wrong in my main query ?


Solution

  • Aggregating functions like COLLECT aggregate on grouping keys.

    Your first query uses a very large set of grouping keys (e,u,r1,ri,r2,rr,r3,ua,r4,ei,r5,en,rel):

    with apoc.coll.sort(collect(distinct e.entitlementName)) as reqdAccesses, e,u,r1, ri, r2, rr,r3,ua,r4,ei,r5,en, rel, apoc.coll.sort(collect(distinct  en.entitlementName)) as currentAccesses
    

    Because that clause includes e and en as grouping keys, the collect(distinct e.entitlementName) and collect(distinct en.entitlementName) aggregated lists will each only contain a single item.

    Your second query specifies no grouping keys, and therefore works as you intended:

    return apoc.coll.sort(collect(distinct en.entitlementName)) as currentAccesses, apoc.coll.sort(collect(DISTINCT e.entitlementName)) as reqdAccesses
    

    This query should identify every user that has accesses that they are not entitled to, and return them and a list of their violations:

    MATCH (u:User)-[r1:HAS_ATTRIBUTE]->(ri:RoleInfo)-[r2:CURRENT]->(rr:Role{roleName:'Claims Specialist'})-[rel:REQUIRES_ACCESS_TO]->(e:Entitlement)
    MATCH (u)-[r3:HAS_ACCOUNT]->(ua:UserAccount)-[r4:HAS_ATTRIBUTE]->(ei:EntitlementInfo)-[r5:CURRENT]->(en:Entitlement)
    WITH u, COLLECT(DISTINCT e.entitlementName) AS entitledAccesses, COLLECT(DISTINCT en.entitlementName) AS currentAccesses
    WITH u, [c IN currentAccesses WHERE NOT c IN entitledAccesses] AS violations
    WHERE SIZE(violations) > 0
    RETURN u, violations