hadoop hadoop-yarn giraph

OutOfMemoryError while reading bytes from edges in YARN


I'm running a BFS algorithm on YARN, and I created a custom value type for the data stored on my vertices (the vertex data). After I did this, reading the edges started to fail.

I traced the error to the following lines of code:

I'm not sure why this started happening; before I used custom vertex data, the problem did not exist.

The full log is here (I'm testing directly from Eclipse, because testing on a pseudo-distributed cluster was far more difficult):

2015-08-20 01:52:21,103 INFO  [LocalJobRunner Map Task Executor #0] utils.ProgressableUtils (ProgressableUtils.java:waitFor(315)) - waitFor: Future result not ready yet java.util.concurrent.FutureTask@b2dd686
2015-08-20 01:52:21,103 INFO  [LocalJobRunner Map Task Executor #0] utils.ProgressableUtils (ProgressableUtils.java:waitFor(197)) - waitFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
2015-08-20 01:53:12,527 ERROR [LocalJobRunner Map Task Executor #0] graph.GraphMapper (GraphMapper.java:run(101)) - Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
    at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
    at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
    at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
    at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
    at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:202)
    at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
    ... 17 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.giraph.edge.ByteArrayEdges.readFields(ByteArrayEdges.java:193)
    at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:541)
    at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
    at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
    at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
    ... 4 more
2015-08-20 01:53:12,532 ERROR [LocalJobRunner Map Task Executor #0] worker.BspServiceWorker (BspServiceWorker.java:unregisterHealth(777)) - unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_local1113753160_0001/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/localhost_0 on superstep -1
2015-08-20 01:53:12,558 INFO  [Thread-13] mapred.LocalJobRunner (LocalJobRunner.java:runTasks(456)) - map task executor complete.
2015-08-20 01:53:12,562 WARN  [Thread-13] mapred.LocalJobRunner (LocalJobRunner.java:run(560)) - job_local1113753160_0001
java.lang.Exception: java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:104)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
    at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
    at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
    at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
    at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
    at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    ... 8 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:202)
    at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
    ... 17 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.giraph.edge.ByteArrayEdges.readFields(ByteArrayEdges.java:193)
    at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:541)
    at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
    at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
    at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
    ... 4 more

The command I use to run this from the terminal is:

$HADOOP_HOME/bin/yarn jar $GIRAPH_HOME/gaph-examples/target/giraph-examples-1.1.0-for-hadoop-2.4.0-jar-with-dependencies.jar algoritmos.masivos.BusquedaDeCaminosNavegacionalesWikiquotesMasivo lectura_de_grafo.BusquedaDeCaminosNavegacionalesWikiquote -vif pruebas.IdTextWithValueDoubleInputFormat -vip /user/hduser/input/wiki-graph-chiquito.txt -vof pruebas.IdTextWithValueTextOutputFormat -op /user/hduser/output/caminosNavegacionales -w 2 -yh 250

Maybe I should use an EdgeInputFormat?

Thanks for reading.


Solution

  • The underlying problem appears to be insufficient memory allocated to the map task container, which causes the Java heap space error.

    To fix this quickly, increase the memory available to the YARN map/reduce containers in the configuration.

    Allocate more memory through the following properties, which are normally set in mapred-site.xml:

    mapreduce.map.memory.mb
    mapreduce.reduce.memory.mb
    
    mapreduce.map.java.opts
    mapreduce.reduce.java.opts
    

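    [Note: each *.memory.mb value should be higher than the heap size (-Xmx) given in the corresponding *.java.opts property, because the container also has to hold the JVM's non-heap memory.]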
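
    As a rough illustration of the properties listed above, a configuration fragment might look like the sketch below. The numbers are assumptions chosen only for the example (a 2048 MB container with a heap of roughly 80% of it), not values taken from the question, and should be sized to the actual cluster. The mapreduce.* properties are normally read from mapred-site.xml.

    ```xml
    <!-- Illustrative values only: container size in MB, heap (-Xmx) kept below it -->
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx1638m</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx1638m</value>
    </property>
    ```

    Keep in mind that the log in the question shows LocalJobRunner, which runs map tasks inside the launching JVM. In that setup these container properties most likely have no effect, and the heap is governed by the -Xmx setting of the JVM started from Eclipse.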