neo4j neo4j-apoc neo4j-browser

Queries on 200 GB graph


I need a scalable solution to create a Geohash-connected graph.

I found Cypher for Apache Spark, a project that lets you use Cypher on Spark DataFrames to create a graph. However, it can only create immutable graphs by mapping the different DataFrames, so I couldn't get the graph that I need.

I can get the graph that I need if I run some additional Cypher queries in Neo4j Browser; however, my stored graph is about 200 GB.

So my question is: is it reasonable, and fast enough, to run queries on 200 GB of graph data using Neo4j Browser and APOC functions?


Solution

  • If you're asking whether Neo4j can handle databases of this size, the answer is yes. But you'll see different results depending on how your data is modeled and the kind of queries you want to run.

    Performance depends less on the total size of the graph than on the portion of the graph your queries touch and traverse. Graph-wide analytical queries must touch the entire graph and will be slow at this scale, while tightly defined queries that touch a small, local part of the graph will be quite quick.

    Anything you can do in your queries to constrain the portion of the graph you have to traverse or filter will improve query speed, so good modeling and use of indexes and constraints are key.
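
    As a rough illustration of that advice, here is a minimal sketch of an indexed, locally anchored query. The `Geohash` label, the `hash` property, and the `NEIGHBOR` relationship type are assumptions made for the example, not names taken from your data model, and the index syntax assumes Neo4j 4.x or later.

    ```cypher
    // Assumed model: (:Geohash {hash: '...'})-[:NEIGHBOR]-(:Geohash)

    // Index the lookup property so the starting node is found by an index seek
    // instead of a scan over every Geohash node.
    CREATE INDEX geohash_hash IF NOT EXISTS
    FOR (g:Geohash) ON (g.hash);

    // Anchor on one cell and bound the traversal depth, so only a small
    // neighbourhood of the 200 GB graph is touched.
    PROFILE
    MATCH (start:Geohash {hash: $hash})-[:NEIGHBOR*1..2]-(near:Geohash)
    RETURN DISTINCT near.hash
    LIMIT 100;
    ```

    Prefixing the query with `PROFILE` shows the execution plan and the database hits per operator, which is a quick way to check that the plan starts from an index seek rather than a label scan. In Neo4j Browser the `$hash` parameter can be set beforehand with `:param hash => '...'`.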