giraph

Apache Giraph: Number of vertices processed by each partition


I am a newbie trying to understand how Giraph 1.2.0 works with Hadoop 1.2.1.

Is there any way to figure out the number of vertices processed by each mapper?


Solution

  • The call method of the org.apache.giraph.graph.ComputeCallable class is executed once per superstep. Inside this method, the computePartition function is called for each partition owned by this map task. So you can add an integer field to this class as a counter and, in computePartition, increment it each time the vertex's compute method is called. Finally, print the counter at the end of the call method. This way, each mapper prints the number of vertices it processed in each superstep.
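The counting pattern described above can be sketched in isolation. This is only an illustration, not Giraph's source: SketchVertex and CountingComputeCallable are hypothetical stand-ins for the real vertex and ComputeCallable classes, and the assumption that compute is skipped for halted vertices is modelled with a simple boolean flag. In the real class you would add the field and the increment to the existing call and computePartition methods instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class VertexCountSketch {

    /** Hypothetical stand-in for a Giraph vertex; only the halted flag matters here. */
    static final class SketchVertex {
        final long id;
        final boolean halted;

        SketchVertex(long id, boolean halted) {
            this.id = id;
            this.halted = halted;
        }
    }

    /** Hypothetical stand-in for ComputeCallable: call() runs once per superstep. */
    static final class CountingComputeCallable implements Callable<Long> {
        private final long superstep;
        private final List<List<SketchVertex>> partitions;
        private long verticesComputed; // the counter added to the class

        CountingComputeCallable(long superstep, List<List<SketchVertex>> partitions) {
            this.superstep = superstep;
            this.partitions = partitions;
        }

        @Override
        public Long call() {
            verticesComputed = 0;
            // Visit every partition owned by this task.
            for (List<SketchVertex> partition : partitions) {
                computePartition(partition);
            }
            // Print the counter at the end of call().
            System.out.println("superstep " + superstep
                + ": computed " + verticesComputed + " vertices");
            return verticesComputed;
        }

        private void computePartition(List<SketchVertex> partition) {
            for (SketchVertex vertex : partition) {
                if (!vertex.halted) {
                    // The real code would invoke the vertex's compute method here.
                    verticesComputed++;
                }
            }
        }
    }

    /** Builds two small partitions and returns the count for one superstep. */
    static long countForDemo() {
        List<SketchVertex> p0 = new ArrayList<>();
        p0.add(new SketchVertex(1, false));
        p0.add(new SketchVertex(2, true));
        List<SketchVertex> p1 = new ArrayList<>();
        p1.add(new SketchVertex(3, false));
        p1.add(new SketchVertex(4, false));
        p1.add(new SketchVertex(5, true));
        List<List<SketchVertex>> partitions = new ArrayList<>();
        partitions.add(p0);
        partitions.add(p1);
        return new CountingComputeCallable(0, partitions).call();
    }

    public static void main(String[] args) {
        countForDemo(); // prints "superstep 0: computed 3 vertices"
    }
}
```

Two caveats when carrying this over to a real job: the printed line should end up in the stdout log of each map task rather than on the client console, and if a worker runs more than one compute thread, each thread likely has its own callable instance, so the per-thread counts would need to be summed to get the total for that mapper.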