neo4j cypher neo4j-apoc

Neo4j count relationships of each type for given node


I want to count the relationships for each type of a given start node. I have constructed two possible queries to achieve that, but I don't know which one is going to be more efficient when dealing with lots of relationships.

  1. Cypher-only query with count()
MATCH (n) WHERE id(n) = 0
CALL {
    WITH n
    MATCH (n)<-[r]-()
    RETURN '<'+TYPE(r) AS type, COUNT(r) AS count
UNION ALL 
    WITH n
    MATCH (n)-[r]->()
    RETURN TYPE(r)+'>' AS type, COUNT(r) AS count
}
RETURN type, count

Result:

╒════════════╤═════╕
│type        │count│
╞════════════╪═════╡
│"<ACTED_IN" │5    │
├────────────┼─────┤
│"<PRODUCED" │1    │
├────────────┼─────┤
│"<DIRECTED" │2    │
└────────────┴─────┘

  2. Cypher + APOC query with apoc.node.relationship.types() and apoc.node.degree.[in|out]()
MATCH (n) WHERE id(n) = 0
WITH n, apoc.node.relationship.types(n) AS types
CALL {
    WITH n, types
    UNWIND types as type
    RETURN '<'+type AS type, apoc.node.degree.in(n, type) as count
UNION ALL 
    WITH n, types
    UNWIND types as type
    RETURN type+'>' AS type, apoc.node.degree.out(n, type) as count
}
RETURN type, count

Result:

╒════════════╤═════╕
│type        │count│
╞════════════╪═════╡
│"<ACTED_IN" │5    │
├────────────┼─────┤
│"<DIRECTED" │2    │
├────────────┼─────┤
│"<PRODUCED" │1    │
├────────────┼─────┤
│"ACTED_IN>" │0    │
├────────────┼─────┤
│"DIRECTED>" │0    │
├────────────┼─────┤
│"PRODUCED>" │0    │
└────────────┴─────┘

The second query also returns rows for directions in which a relationship type has a count of zero, but these can be ignored.
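
If those zero-count rows are unwanted, one option is to filter them out after the subquery. This is a minimal sketch that reuses the second query as-is and only adds a WITH ... WHERE step at the end:

```cypher
MATCH (n) WHERE id(n) = 0
WITH n, apoc.node.relationship.types(n) AS types
CALL {
    WITH n, types
    UNWIND types AS type
    RETURN '<'+type AS type, apoc.node.degree.in(n, type) AS count
UNION ALL
    WITH n, types
    UNWIND types AS type
    RETURN type+'>' AS type, apoc.node.degree.out(n, type) AS count
}
// Keep only the type/direction combinations that actually occur on n
WITH type, count
WHERE count > 0
RETURN type, count
```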

I can only profile the first (Cypher-only) query, because custom procedures like APOC can't be profiled.
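
For reference, profiling the first query only requires prefixing it with PROFILE. The query then runs as usual and the resulting plan reports rows and db hits per operator:

```cypher
// PROFILE executes the query and returns the execution plan with runtime statistics
PROFILE
MATCH (n) WHERE id(n) = 0
CALL {
    WITH n
    MATCH (n)<-[r]-()
    RETURN '<'+TYPE(r) AS type, COUNT(r) AS count
UNION ALL
    WITH n
    MATCH (n)-[r]->()
    RETURN TYPE(r)+'>' AS type, COUNT(r) AS count
}
RETURN type, count
```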


Solution

  • There is actually a faster approach that also fixes a potential problem in your current queries. If n has any self-relationships (relationships that both start and end at n), each of them would be counted twice: once as an inbound and once as an outbound relationship. In other words, the sum of the counts could be greater than the actual number of relationships.

    This query should be fast and also solve the self-relationship problem:

    MATCH (n)-[r]-() WHERE id(n) = 0
    RETURN
      CASE n WHEN ENDNODE(r) THEN '<' ELSE '' END +
      TYPE(r) +
      CASE n WHEN STARTNODE(r) THEN '>' ELSE '' END AS type,
      COUNT(*) AS count
    

    An inbound REL relationship is represented as <REL, an outbound one as REL>, and a self-relationship as <REL>, so the sum of the counts equals the actual number of relationships.
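
    To try this on a small graph, the sketch below creates hypothetical test data (the Demo label and the name property are assumptions made for illustration, not part of the question) in which node 'a' has one inbound, one outbound, and one self-relationship of type REL. The two statements are meant to be run one after the other. Going by the description above, the expectation would be one row each for <REL, REL> and <REL>.

    ```cypher
    // Hypothetical test data: 'a' gets one inbound, one outbound and one self-relationship
    CREATE (a:Demo {name: 'a'}), (b:Demo {name: 'b'})
    CREATE (b)-[:REL]->(a),
           (a)-[:REL]->(b),
           (a)-[:REL]->(a);

    // Same query as above, but the start node is looked up by the hypothetical name property
    MATCH (n:Demo {name: 'a'})-[r]-()
    RETURN
      CASE n WHEN ENDNODE(r) THEN '<' ELSE '' END +
      TYPE(r) +
      CASE n WHEN STARTNODE(r) THEN '>' ELSE '' END AS type,
      COUNT(*) AS count;
    ```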

    Including endnode labels

    Here is a slightly altered query that returns a count for each distinct combination of type and the labels of the node m at the other end of the relationship (a node can have multiple labels, so the labels are returned as a list):

    MATCH (n)-[r]-(m) WHERE id(n) = 0
    RETURN
      CASE n WHEN ENDNODE(r) THEN '<' ELSE '' END +
      TYPE(r) +
      CASE n WHEN STARTNODE(r) THEN '>' ELSE '' END AS type,
      LABELS(m) AS endNodeLabels,
      COUNT(*) AS count
    

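    Reading about how aggregating functions work will help you understand these two queries: the non-aggregated expressions in the RETURN clause (here type, and in the second query also endNodeLabels) act as the grouping keys for COUNT(*).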
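
    As a minimal illustration of that grouping behaviour, the sketch below drops the direction markers. TYPE(r) is then the only non-aggregated expression, so the result should contain one row per relationship type, with inbound and outbound relationships counted together:

    ```cypher
    // TYPE(r) is the only grouping key, so COUNT(*) is computed per relationship type
    MATCH (n)-[r]-() WHERE id(n) = 0
    RETURN TYPE(r) AS type, COUNT(*) AS count
    ```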