Brand new to Apache Spark and I'm a little confused how to make updates to a value that sits outside of a .mapTriplets
iteration in GraphX. See below:
def mapTripletsMethod(edgeWeights: Graph[Int, Double], stationaryDistribution: Graph[Double, Double]) = {
val tempMatrix: SparseDoubleMatrix2D = graphToSparseMatrix(edgeWeights)
stationaryDistribution.mapTriplets{ e =>
val row = e.srcId.toInt
val column = e.dstId.toInt
var cellValue = -1 * tempMatrix.get(row, column) + e.dstAttr
tempMatrix.set(row, column, cellValue) // this doesn't do anything to tempMatrix
e
}
}
I'm guessing this is due to the design of an RDD
and there's no simple way to update the tempMatrix
value. When I run the above code the tempMatrix.set
method does nothing. It was rather difficult to try to follow the problem in the debugger.
Does anyone have an easy solution? Thank you!
I've made an update above to show that stationaryDistribution
is a graph RDD.
You could make tempMatrix be of type RDD[((Int,Int), Double)]
-- that is, each entry is a pair where the first element is in turn a (row,col)
pair. Then use the PairRDDFunctions class to combine that with ((row,col),weight) triplets generated by your mapTriplets
call. (So, don't think of it as updating the tempMatrix, but rather combining two RDDs to get a third.)
If you need to support stationary distribution graphs where there is more than one edge per vertex pair it gets a little tricky: you'll probably need to combine those edges in a reduction pass to create an RDD with one entry per pair, with a list of weights, and then apply all the weights to a given (row,col) pair at the same time. Otherwise it's very simple.
Notice that `PairRDDFunctions' on the one hand give you ways to combine multiple RDDs into one, or on the other hand to pull the values out into a Map on the master. Assuming that the distribution matrix is large enough to merit an RDD in the first place, I think you should do the whole thing on RDDs.
Another approach is to make the tempMatrix be a GraphRDD too, which may or may not make sense depending on what you're going to do with it next.