I want to count the relationships of each type attached to a given start node. I have written two candidate queries, but I don't know which one will be more efficient when the node has a large number of relationships.
count()
MATCH (n) WHERE id(n) = 0
CALL {
WITH n
MATCH (n)<-[r]-()
RETURN '<'+TYPE(r) AS type, COUNT(r) AS count
UNION ALL
WITH n
MATCH (n)-[r]->()
RETURN TYPE(r)+'>' AS type, COUNT(r) AS count
}
RETURN type, count
Result:
╒════════════╤═════╕
│type        │count│
╞════════════╪═════╡
│"<ACTED_IN" │5    │
├────────────┼─────┤
│"<PRODUCED" │1    │
├────────────┼─────┤
│"<DIRECTED" │2    │
└────────────┴─────┘
apoc.node.relationship.types() and apoc.node.degree.[in|out]()
MATCH (n) WHERE id(n) = 0
WITH n, apoc.node.relationship.types(n) AS types
CALL {
WITH n, types
UNWIND types as type
RETURN '<'+type AS type, apoc.node.degree.in(n, type) as count
UNION ALL
WITH n, types
UNWIND types as type
RETURN type+'>' AS type, apoc.node.degree.out(n, type) as count
}
RETURN type, count
Result:
╒════════════╤═════╕
│type        │count│
╞════════════╪═════╡
│"<ACTED_IN" │5    │
├────────────┼─────┤
│"<DIRECTED" │2    │
├────────────┼─────┤
│"<PRODUCED" │1    │
├────────────┼─────┤
│"ACTED_IN>" │0    │
├────────────┼─────┤
│"DIRECTED>" │0    │
├────────────┼─────┤
│"PRODUCED>" │0    │
└────────────┴─────┘
The second query also returns rows with a count of 0 for every direction in which a relationship type does not occur, but these rows can be ignored.
I could only profile the first, Cypher-only query, because custom procedures such as those from APOC can't be profiled.
There is actually a faster approach that also fixes a potential problem in your current queries. If n has any self-relationships (relationships that both start and end at n), then such relationships would be counted twice (as both inbound and outbound relationships). In other words, the sum of the counts could be greater than the actual number of relationships.
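To make the double counting concrete, here is a minimal Python sketch that models relationships in memory rather than in Neo4j. The `rels` data and the `directional_counts` helper are hypothetical and only imitate what the two UNION ALL branches do.

```python
# Hypothetical in-memory model: each relationship is (start_id, type, end_id).
rels = [
    (1, "ACTED_IN", 0),  # inbound to node 0
    (2, "DIRECTED", 0),  # inbound to node 0
    (0, "FOLLOWS", 0),   # self-relationship on node 0
]

def directional_counts(node_id, rels):
    """Count inbound and outbound relationships separately,
    the way the two UNION ALL branches do."""
    inbound = sum(1 for _start, _type, end in rels if end == node_id)
    outbound = sum(1 for start, _type, _end in rels if start == node_id)
    return inbound, outbound

inbound, outbound = directional_counts(0, rels)

# Number of distinct relationships that touch node 0 at all.
actual = sum(1 for start, _type, end in rels if 0 in (start, end))

print(inbound + outbound, actual)
```

In this toy data the self-relationship lands in both the inbound and the outbound tally, so the two directional counts add up to one more than the number of relationships that actually exist.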
This query should be fast and also solve the self-relationship problem:
MATCH (n)-[r]-() WHERE id(n) = 0
RETURN
CASE n WHEN ENDNODE(r) THEN '<' ELSE '' END +
TYPE(r) +
CASE n WHEN STARTNODE(r) THEN '>' ELSE '' END AS type,
COUNT(*) AS count
An inbound REL relationship would be represented as <REL, an outbound one as REL>, and a self-relationship as <REL>. And the sum of the counts would equal the actual number of relationships.
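The labelling scheme can be sketched outside the database as well. The following Python snippet is only an illustration: `direction_label` is a hypothetical helper that mirrors the two CASE expressions, and the sample relationships are made up.

```python
from collections import Counter

def direction_label(node_id, start_id, rel_type, end_id):
    """Mirror the two CASE expressions: '<' if the relationship ends at the
    node, '>' if it starts at the node, and both for a self-relationship."""
    prefix = "<" if end_id == node_id else ""
    suffix = ">" if start_id == node_id else ""
    return prefix + rel_type + suffix

# Hypothetical relationships touching node 0, as (start_id, type, end_id).
rels = [
    (1, "REL", 0),  # inbound
    (0, "REL", 2),  # outbound
    (0, "REL", 0),  # self-relationship
]

# Each relationship is visited once, so each contributes to exactly one label.
counts = Counter(direction_label(0, s, t, e) for s, t, e in rels)
print(counts)
```

Because every relationship is classified exactly once, the counts add up to the number of relationships, which is the property the answer relies on.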
Here is a slightly altered query that returns a count of each distinct combination of type and end node labels (a node can have multiple labels):
MATCH (n)-[r]-(m) WHERE id(n) = 0
RETURN
CASE n WHEN ENDNODE(r) THEN '<' ELSE '' END +
TYPE(r) +
CASE n WHEN STARTNODE(r) THEN '>' ELSE '' END AS type,
LABELS(m) AS endNodeLabels,
COUNT(*) AS count
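Reading about how aggregating functions work will help you understand these two queries: every non-aggregated expression in the RETURN clause acts as a grouping key, and COUNT(*) counts the rows in each group.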
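As a rough analogy for that implicit grouping, the Python sketch below counts rows per distinct key. The `rows` list is hypothetical sample data, and labels are modelled as tuples so they can serve as part of a dictionary key; it does not reproduce how Neo4j executes the query.

```python
from collections import Counter

# Hypothetical rows, one per matched relationship: (type label, labels of the other node).
rows = [
    ("<ACTED_IN", ("Person",)),
    ("<ACTED_IN", ("Person",)),
    ("<ACTED_IN", ("Person", "Director")),
    ("<DIRECTED", ("Person", "Director")),
]

# The non-aggregated columns together form the grouping key;
# the Counter value plays the role of COUNT(*).
grouped = Counter(rows)

for (rel_type, labels), count in grouped.items():
    print(rel_type, list(labels), count)
```

Adding a column such as the end node labels to the key splits a group into finer groups, which is why the second query can return several rows for the same relationship type.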