
Neo4j save query performance (GrapheneDB)


I have created a .NET application that uses a Neo4j graph database (with GrapheneDB as the provider). I am having performance issues when I save a new graph object. I am not keeping a history of the graph, so each time I save, I first delete the old graph, including its nodes and relationships, and then save the new one. I have not indexed my nodes yet. I don't think that is the problem, because loading several of these graphs at a time is very fast.

My save method steps through each branch and merges the nodes and relationships (I left the relationships out of each step below for brevity). Once the full query has been built, it is executed in one shot.

  1. merge the root node 37 and node 4
  2. merge type1 node 12-17 with 4
  3. merge type2 node 18-22 with 4
  4. merge 2 with 37
  5. merge 7-11 with 2
  6. merge 5 with 37 (creates relationships)
  7. merge 23-26 with 5
  8. merge 6 with 37 (creates relationships)
  9. merge 27-30 with 6

Nodes 2, 4, 5, and 6 can each have 100-200 leaf nodes. I have about 100 of these graphs in my database. A save can take the server 10-20 seconds in production and sometimes times out.


I have tried saving a different way, which takes longer but doesn't time out as frequently. I create the nodes first, in groups, with each node storing the root id 37. Each group is created in a separate execution. After the nodes are created, I create the relationships by selecting the child nodes and the root node. This splits the save into several smaller queries.

How can I improve the performance of this save? For comparison, loading 30 of these graphs takes 3-5 seconds. I should also note that the save became significantly slower as more data was added.


Solution

  • Since you delete all the nodes (and their relationships) beforehand, you should not be using MERGE at all. Without the relevant indexes, MERGE has to do a lot of scanning to determine whether each node already exists, which would also be consistent with the save slowing down as more data was added.

    Try using CREATE instead (as long as your CREATE clauses do not themselves create duplicates).
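As a rough illustration of the CREATE-only approach, here is a minimal sketch that builds a single Cypher query and its parameter map. It is written in Python only to keep it short and self-contained; the same query string and parameters could be sent from the .NET client. The label `GraphNode`, the relationship types `HAS_BRANCH` and `HAS_LEAF`, the `id` property, and the helper `build_create_params` are all hypothetical names, not taken from the question. The `$param` syntax assumes a reasonably recent Neo4j version (older versions use `{param}`), and node 4 with its two leaf types is left out for brevity.

```python
from typing import Any, Dict, List

# Hypothetical labels, relationship types, and property names;
# adjust them to the real data model.
# The query uses CREATE only, so nothing is looked up before writing.
CREATE_GRAPH_QUERY = """
CREATE (root:GraphNode {id: $rootId})
WITH root
UNWIND $branches AS branch
CREATE (root)-[:HAS_BRANCH]->(b:GraphNode {id: branch.id})
WITH b, branch
UNWIND branch.leafIds AS leafId
CREATE (b)-[:HAS_LEAF]->(:GraphNode {id: leafId})
"""


def build_create_params(root_id: int, branches: Dict[int, List[int]]) -> Dict[str, Any]:
    """Turn {branch_id: [leaf_ids]} into the parameter map the query expects."""
    return {
        "rootId": root_id,
        "branches": [
            {"id": branch_id, "leafIds": list(leaf_ids)}
            for branch_id, leaf_ids in branches.items()
        ],
    }


# Node numbers follow the steps in the question (root 37; branches 2, 5, 6).
params = build_create_params(
    37,
    {
        2: list(range(7, 12)),   # leaves 7-11
        5: list(range(23, 27)),  # leaves 23-26
        6: list(range(27, 31)),  # leaves 27-30
    },
)
```

Passing the ids as parameters rather than concatenating them into the query text keeps the query string identical from one save to the next, which should let the server reuse the query plan. This only avoids duplicates if the old graph really has been deleted first, as described in the question.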