I’m testing my single-node Giraph installation using the PageRankBenchmark example, as follows:
$HADOOP_HOME/bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/giraph-examples-1.1.0-for-hadoop-2.4.0-jar-with-dependencies.jar org.apache.giraph.benchmark.PageRankBenchmark -v -V 1000 -e 1 -s 5 -w 1
But after the mappers complete their job, the reducers don't start (map 100% reduce 0%, according to the console). Is this the expected behaviour for this algorithm?
I would expect that once a mapper has run, a reducer has to start, take the map output as its input, and finish the work (at least, the many other PageRank implementations on the internet always have a "Reducer"). But when I googled it, every PageRankBenchmark run posted by other people also ends with reduce at 0%.
So I don't know whether this is OK for PageRankBenchmark, and I’m hoping that someone here can help me ;)
I'm using Hadoop 2.4 with the hadoop_yarn profile (-Phadoop_yarn) and Giraph 1.1.0.
According to several other questions that I read, the cause of the "reduce stuck at 0%" problem can usually be found in the mapper logs, but I can't find anything there (I’m attaching them as well).
Here are my logs:
Cheers!
Giraph follows a map-only paradigm. In other words, each worker runs as a map task. All of the computation is performed inside the map tasks; the tasks are coordinated through ZooKeeper, and the messages between vertices are exchanged directly between the workers. This is unlike the traditional MapReduce paradigm, in which map outputs are transmitted to reducers. Therefore, there is no reduce task and no map output either, so "map 100% reduce 0%" is what you should expect to see on the console.
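To make the "no reduce step" point concrete, here is a minimal sketch in plain Python of a vertex-centric, superstep-based PageRank. It is not Giraph code and does not reproduce PageRankBenchmark exactly: the function name `pagerank_bsp`, the tiny ring graph, and the 0.85 damping factor are assumptions chosen for illustration. The point is only that each superstep reads incoming messages, updates the vertex value, and sends new messages, with a barrier in between, so nothing is ever handed to a reducer.

```python
def pagerank_bsp(out_edges, supersteps=5, damping=0.85):
    """Toy vertex-centric PageRank (illustrative sketch, not Giraph's API).

    out_edges maps every vertex id to the list of vertices it points to.
    """
    n = len(out_edges)
    values = {v: 1.0 / n for v in out_edges}   # initial rank of each vertex
    inbox = {v: [] for v in out_edges}         # messages delivered this superstep

    for step in range(supersteps + 1):
        outbox = {v: [] for v in out_edges}    # messages for the next superstep
        for v, targets in out_edges.items():
            if step >= 1:
                # Combine the messages received from the previous superstep.
                values[v] = (1 - damping) / n + damping * sum(inbox[v])
            if step < supersteps and targets:
                # Send an equal share of this vertex's rank along each out-edge.
                share = values[v] / len(targets)
                for t in targets:
                    outbox[t].append(share)
        # Barrier: messages sent in this superstep become visible in the next one.
        inbox = outbox
    return values


# Hypothetical 4-vertex ring, one out-edge per vertex (similar in spirit to -e 1).
graph = {0: [1], 1: [2], 2: [3], 3: [0]}
ranks = pagerank_bsp(graph, supersteps=5)
print(ranks)
```

In a real Giraph job the inner loop over vertices is spread across the workers (the map tasks) and the barrier is where the coordination happens, but the overall shape is the same: the whole iteration lives inside the map phase.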