I have the following query:
PROFILE
MATCH (c1:Criterion)-[rc1t:CONTAINS ]->(c1t:Translation)
WHERE c1.id IN [5890, 1666, 1283, 4548, 25157, 966, 9320, 2344, 8202, 1707, 461, 7279, 14897, 4147, 4598, 2966, 17784, 77304, 762, 10490, 6012, 3230]
RETURN count(rc1t)
which returns only 5 relationships.
Cypher version: 5, planner: COST, runtime: SLOTTED. 172821 total db hits in 45 ms.
This is the plan:
I am surprised by how many DB hits the database makes for this relationship: (c1:Criterion)-[rc1t:CONTAINS ]->(c1t:Translation)
But I have
MATCH()-[rc1t:CONTAINS]-() RETURN count(rc1t)
3295092
relatioships...
Is there anything that can be done to improve the performance of this query?
Your query found 22 Criterion
nodes, and they have a total of 86,352 outgoing CONTAINS
relationships. But only 5 of the nodes on the other end of those relationships have the Translation
label.
To make this use case more performant, you could create CONTAINS_TRANSLATION
relationships between appropriate Criterion/Translation
pairs, and then change your query to the following. You can keep the existing CONTAINS
relationships between the same node pairs if that helps with your other uses cases.
Note that c1t
no longer needs to specify a label, which further speeds up the query.
MATCH (c1:Criterion)-[rc1t:CONTAINS_TRANSLATION]->(c1t)
WHERE c1.id IN [5890, 1666, 1283, 4548, 25157, 966, 9320, 2344, 8202, 1707, 461, 7279, 14897, 4147, 4598, 2966, 17784, 77304, 762, 10490, 6012, 3230]
RETURN count(rc1t)
If you want to avoid adding a new relationship type (and multiple relationships between some node pairs), you could try this instead:
Create a relationship index for a new dest
property on the CONTAINS
relationship. (Not all CONTAINS
relationships will need to have that property.)
CREATE INDEX CONTAINS_dest FOR ()-[c:CONTAINS]-() ON (c.dest)
Add a dest: 'Translation'
property to the CONTAINS
relationships that end in a Translation
node.
Hint in your query that you want the Cypher planner to generate a plan that uses the new relationship index instead of the index you already have on the Criterion
nodes:
MATCH (c1:Criterion)-[rc1t:CONTAINS]->(c1t)
USING INDEX rc1t:CONTAINS(dest)
WHERE
rc1t.dest = 'Translation' AND
c1.id IN [5890, 1666, 1283, 4548, 25157, 966, 9320, 2344, 8202, 1707, 461, 7279, 14897, 4147, 4598, 2966, 17784, 77304, 762, 10490, 6012, 3230]
RETURN count(rc1t)
Note, however, that this approach will only be faster if there are fewer than 86,352 CONTAINS
relationships with the desired dest
value. Also, if all CONTAINS
relationships start from a Criterion
node, you can speed up the query further by not bothering to filter by that label (i.e., use MATCH (c1)-[rc1t:CONTAINS_TRANSLATION]->(c1t)
).