I'm using kbastani/spark-neo4j with docker-compose to compute the betweenness centrality of my graph.
My nodes and relationships are built like so:
(n1:Node {id:1})-[r:NEXT {count:100}]->(n2:Node {id:2})
So far I've ignored the following exception in the log (since I do not know how to tackle it):
mazerunner_1 | 16/11/29 08:27:51 INFO FileInputFormat: Total input paths to process : 1
mazerunner_1 | Exception in thread "main" org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Zero blocklocations for /neo4j/mazerunner/jobs/edgeList.txt. Name node is in safe mode.
mazerunner_1 | The reported blocks 0 needs additional 28 blocks to reach the threshold 0.9990 of total blocks 28.
mazerunner_1 | The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
mazerunner_1 | at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1678)
mazerunner_1 | at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
mazerunner_1 | at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
mazerunner_1 | at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:497)
mazerunner_1 | at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
mazerunner_1 | at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
mazerunner_1 | at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
mazerunner_1 | at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
mazerunner_1 | at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
mazerunner_1 | at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
mazerunner_1 | at java.security.AccessController.doPrivileged(Native Method)
mazerunner_1 | at javax.security.auth.Subject.doAs(Subject.java:415)
mazerunner_1 | at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
mazerunner_1 | at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
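A note on reading the safe-mode message in the exception above: its counters can be pulled out with a few lines of Python. This is only a sketch. The function name parse_safe_mode and the regular expression are illustrative and not part of Hadoop or Mazerunner, and the message text is copied from the log with the container prefixes stripped.

```python
import re

# Safe-mode text copied from the log above (container prefixes stripped).
SAFE_MODE_MESSAGE = (
    "The reported blocks 0 needs additional 28 blocks to reach the threshold "
    "0.9990 of total blocks 28. "
    "The number of live datanodes 0 has reached the minimum number 0."
)


def parse_safe_mode(message):
    """Extract the block and datanode counters from a safe-mode message."""
    pattern = re.compile(
        r"reported blocks (?P<reported>\d+) needs additional (?P<needed>\d+) blocks "
        r"to reach the threshold (?P<threshold>[\d.]+) of total blocks (?P<total>\d+)\."
        r".*?live datanodes (?P<live>\d+)",
        re.DOTALL,
    )
    match = pattern.search(message)
    if match is None:
        raise ValueError("not a recognised safe-mode message")
    return {
        "reported": int(match["reported"]),
        "needed": int(match["needed"]),
        "threshold": float(match["threshold"]),
        "total": int(match["total"]),
        "live_datanodes": int(match["live"]),
    }


status = parse_safe_mode(SAFE_MODE_MESSAGE)
fraction = status["reported"] / status["total"]
print(f"{fraction:.4f} of blocks reported, threshold is {status['threshold']:.4f}")
print(f"live datanodes: {status['live_datanodes']}")
```

In this log, none of the 28 blocks has been reported and no datanode is live, so the name node cannot reach the 0.9990 threshold and stays in safe mode until a datanode reports its blocks.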
I'm running my job from the Neo4j Browser like so:
:GET /service/mazerunner/analysis/betweenness_centrality/NEXT
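For reference, the same request could be sent from outside the browser. The sketch below assumes that Neo4j's HTTP endpoint is reachable at http://localhost:7474 (7474 is Neo4j's default HTTP port; the actual port mapping depends on the docker-compose setup) and that no authentication is required. The helper names analysis_url and start_analysis are made up for this example.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: Neo4j's HTTP endpoint is published here; adjust to your port mapping.
NEO4J_BASE_URL = "http://localhost:7474"


def analysis_url(analysis, relationship, base_url=NEO4J_BASE_URL):
    """Build the Mazerunner analysis URL used in the :GET command above."""
    return f"{base_url}/service/mazerunner/analysis/{quote(analysis)}/{quote(relationship)}"


def start_analysis(analysis, relationship, base_url=NEO4J_BASE_URL, timeout=30):
    """Send the GET request and return the HTTP status and response body."""
    url = analysis_url(analysis, relationship, base_url)
    with urlopen(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# Only builds and prints the URL; call start_analysis(...) to actually send it.
print(analysis_url("betweenness_centrality", "NEXT"))
```

Calling start_analysis("betweenness_centrality", "NEXT") would issue the request against the assumed address.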
After sending the request, I can see this in the Mazerunner log:
graphdb_1 | /var/lib/neo4j-community-2.2.3/..
graphdb_1 | [*] Waiting for messages. To exit press CTRL+C
graphdb_1 | 08:50:31.608 [qtp198725683-33] WARN o.a.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
graphdb_1 | Records exported: 20000
graphdb_1 | Records exported: 40000
graphdb_1 | Records exported: 60000
graphdb_1 | Records exported: 80000
graphdb_1 | Mazerunner Export Status: 100%
graphdb_1 | [x] Sent '{"path":"hdfs://hdfs:9000/neo4j/mazerunner/jobs/edgeList.txt","analysis":"betweenness_centrality","mode":"Unpartitioned"}'
And then nothing happens for a long while.
Q: How can I make it run?
In my log:
16/11/29 08:27:51 INFO FileInputFormat: Total input paths to process : 1
This line came from an old job that was still in the queue. I had to remove the hdfs image and re-run docker-compose up. That solved my issue.
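The reset can be sketched as a small script. It rests on assumptions that are not confirmed above: that the HDFS service in docker-compose.yml is named hdfs (inferred from the hdfs://hdfs:9000 URL in the log), and that removing the stopped hdfs container is enough to discard the stale state. The helper names reset_commands and reset_hdfs are made up for this example, and by default the script only prints the commands.

```python
import subprocess


def reset_commands(service="hdfs"):
    """Return the docker-compose commands for the reset, in order.

    Assumption: the HDFS service is called "hdfs" in docker-compose.yml.
    """
    return [
        ["docker-compose", "stop"],              # stop all running services
        ["docker-compose", "rm", "-f", service],  # remove the stopped HDFS container
        ["docker-compose", "up"],                # recreate and start (runs in the foreground)
    ]


def reset_hdfs(service="hdfs", dry_run=True):
    """Print each command, and execute it only when dry_run is False."""
    for command in reset_commands(service):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


reset_hdfs()  # dry run: prints the commands without executing them
```

Run it from the directory that contains docker-compose.yml, and pass dry_run=False to actually execute the commands.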