neo4jcypher

Performant way to get a list of Nodes along with the latest related node?


I have the following relationship:

(Thread)-[:HAS_MESSAGE]->(Message)

A Thread can have anywhere from dozens to thousands of Messages.

Here is my query to find a list of Threads, and also return the latest Message from each one if exists.

MATCH (t:Thread)
OPTIONAL MATCH (t)-[:HAS_MESSAGE]->(m:Message)
WITH t, m
ORDER BY m.createdAt DESC
WITH t, COLLECT(m)[0] AS latestMessage
RETURN t, latestMessage

I'm struggling to understand if there's a way to rewrite this to avoid COLLECT(m)[0] and somehow use just the latest message?


Solution

  • The following query is more memory-efficient. It uses a CALL subquery to find the latest message for one thread at a time. This is unlike your query, which has to gather in memory all the messages for every thread.

    MATCH (t:Thread)
    CALL {
      WITH t
      MATCH (t)-[:HAS_MESSAGE]->(m:Message)
      RETURN m
      ORDER BY m.createdAt DESC
      LIMIT 1
    }
    RETURN t, m AS latestMessage