I am trying to run Stanford Network Analysis Platform (SNAP) graphs on Apache Giraph using Hadoop. The SNAP link is http://snap.stanford.edu/snap/
Currently I am trying to run the Facebook graph, which is in the simple edge list format: source_id destination_id, one edge per line. The link is http://snap.stanford.edu/data/egonets-Facebook.html
I cannot work out which input format Apache Giraph needs in order to run SimpleShortestPathsCompute, or any other Java program, on a graph in the simple edge list format.
I was able to run the SimpleShortestPathsCompute and PageRankComputation algorithms, which are in the examples folder of the Giraph package, on input files in the JSON format: [source_id, source_value, [[destination_id, edge_value], [destination_id, edge_value], ...]]
This is for anyone who has trouble running the example Java programs shipped in the Giraph jar on edge list input.
In my case I wrote a Java program that converts the input file from the simple edge list format to the JSON-based format.
The simple edge list format has one edge per line, in the form source_id destination_id.
The graph I was working on (the Facebook SNAP graph) is undirected, so an edge between two vertices (nodes) is written only once. For example, an edge between vertices 1 and 20 is written as 1 20, and the reverse entry 20 1 is omitted.
So first convert the graph into a format that contains both directions of every edge, since an undirected graph is equivalent to a directed graph with edges in both directions between every pair of connected vertices. Then write a program that converts this format into the JSON format, store the result in an output file, and run SimpleShortestPathsCompute, PageRank, and the other sample algorithms on that file.
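
The two steps above can be combined in one pass. The following is only a minimal sketch of such a converter, not the exact program I used. The class name EdgeListToGiraphJson, the default vertex value 0, and the default edge value 1 are assumptions made for illustration, since the edge list carries no values of its own. It also assumes the whole graph fits in memory and that lines starting with # are comments.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper class; the name is not part of Giraph.
final class EdgeListToGiraphJson {

    // Assumed defaults: the edge list has no vertex or edge values.
    static final String VERTEX_VALUE = "0";
    static final String EDGE_VALUE = "1";

    /** Converts "source_id destination_id" lines into one JSON line per vertex. */
    static List<String> convert(List<String> edgeLines) {
        // Sorted maps and sets keep the output deterministic.
        Map<Long, TreeSet<Long>> adjacency = new TreeMap<>();
        for (String line : edgeLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comment lines
            }
            String[] parts = trimmed.split("\\s+");
            long source = Long.parseLong(parts[0]);
            long destination = Long.parseLong(parts[1]);
            // Store the edge in both directions, so the undirected graph
            // becomes a directed graph with symmetric edges.
            adjacency.computeIfAbsent(source, k -> new TreeSet<>()).add(destination);
            adjacency.computeIfAbsent(destination, k -> new TreeSet<>()).add(source);
        }

        List<String> jsonLines = new ArrayList<>();
        for (Map.Entry<Long, TreeSet<Long>> entry : adjacency.entrySet()) {
            StringBuilder sb = new StringBuilder();
            sb.append('[').append(entry.getKey()).append(',').append(VERTEX_VALUE).append(",[");
            boolean first = true;
            for (long neighbour : entry.getValue()) {
                if (!first) {
                    sb.append(',');
                }
                sb.append('[').append(neighbour).append(',').append(EDGE_VALUE).append(']');
                first = false;
            }
            sb.append("]]");
            jsonLines.add(sb.toString());
        }
        return jsonLines;
    }

    // args[0] is the edge list file, args[1] is the JSON output file.
    public static void main(String[] args) throws IOException {
        List<String> input = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), convert(input), StandardCharsets.UTF_8);
    }
}
```

For the input lines 1 20 and 1 3, this produces one line each for vertices 1, 3 and 20, and the line for vertex 20 lists vertex 1 as a neighbour even though 20 1 never appears in the input. Compile it with javac and run it with the input and output paths as the two arguments, then upload the output file to HDFS as the Giraph input.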