I am trying to calculate the dot product (scalar product) of two sparse vectors in Scala. The code I have written is doing everything that I want it to, except when multiplying the similar elements of the two vectors, it is not accounting for the 0 values.
I expect to get 72 as my answer as 3 and 18 are the only keys that are both non-zero and they evaluate to: (3 -> 21) + (18 -> 51) = 72
I used withDefaultValue(0) hoping it would "fill in" the unmentioned key/value pairs but I do not think this is the case, and I believe this is where my problem is coming from, in the very beginning. I think my question could also be "How to generate a Sparse Vector in Scala using the Standard Library".
If I enter the corresponding 0's and the two Maps (vectors) have the same number of key/value pairs, my code works properly.
```
val Sparse1 = Map(0 -> 4, 3 -> 7, 6 -> 11, 18 -> 17).withDefaultValue(0)
val Sparse2 = Map(1 -> 3, 3 -> 3, 11 -> 2,18 -> 3, 20 -> 6).withDefaultValue(0)
//println(Sparse2.toSeq)//to see what it is....0's missing
val SparseSum = (Sparse1.toSeq ++ Sparse2.toSeq).groupBy(_._1).mapValues(_.map(_._2).sum)
//println(SparseSum)
val productOfValues = ((Sparse1.toSeq ++ Sparse2.toSeq).groupBy(_._1).mapValues(_.map(_._2).reduce(_*_)))
//println(productOfValues)
var dotProduct = 0
for ((h,i) <- productOfValues) {
dotProduct += i
}
//println(dotProduct)
//If I specify some zero values, lets see what happens:
val Sparse3 = Map(0 -> 4, 1 -> 0, 3 -> 7, 6 -> 11, 11 -> 0, 18 -> 17, 20 -> 0).withDefaultValue(0)
val Sparse4 = Map(0 -> 0, 1 -> 3, 3 -> 3, 6 -> 0, 11 -> 2,18 -> 3, 20 -> 6).withDefaultValue(0)
val productOfValues2 = ((Sparse3.toSeq ++ Sparse4.toSeq).groupBy(_._1).mapValues(_.map(_._2).reduce(_*_)))
var dotProduct2 = 0
for ((l, m) <- productOfValues2) {
dotProduct2 += m
}
println(productOfValues2)
println(dotProduct2)//I get 72
```
I can create a Sparse Vector this way, and then update the values
import scala.collection.mutable.Map
val Sparse1 = Map[Int, Int]()
for (k <- 0 to 20) {
Sparse1 getOrElseUpdate (k, 0)
}
val Sparse2 = Map[Int, Int]()
for (k <- 0 to 20) {
Sparse2 getOrElseUpdate (k, 0)
}
But I'm wondering if there is a "better" way. More along the lines of what I tried and failed to do using "withDefaultValue(0)"
Since you are using sparse vectors, you can ignore all keys that are not on both vectors.
Thus, I would compute the intersection
between both keys sets and then perform a simple map-reduce to compute the dot product.
type SparseVector[T] = Map[Int, T]
/** Generic function for any type T that can be multiplied & summed. */
def sparseDotProduct[T: Numeric](v1: SparseVector[T], v2: SparseVector[T]): T = {
import Numeric.Implicits._
val commonIndexes = v1.keySet & v2.keySet
commonIndexes
.map(i => v1(i) * v2(i))
.foldLeft(implicitly[Numeric[T]].zero)(_ + _)
}
Then, you can use it like this:
// The withDefault(0) is optional now.
val sparse1 = Map(0 -> 4, 3 -> 7, 6 -> 11, 18 -> 17).withDefaultValue(0)
val sparse2 = Map(1 -> 3, 3 -> 3, 11 -> 2, 18 -> 3, 20 -> 6).withDefaultValue(0)
sparseDotProduct(sparse1, sparse2)
// res: Int = 72
type SparseVector[T] = Map[Int, T]
/** Generic function for any type T that can be multiplied & summed. */
def sparseDotProduct[T](v1: SparseVector[T], v2: SparseVector[T])(implicit N: Numeric[T]): T = {
val commonIndexes = v1.keySet & v2.keySet
commonIndexes
.map(i => N.times(v1(i), v2(i)))
.foldLeft(N.zero)((acc, element) => N.plus(acc, element))
}
One can modify the above method to work for any kind of vector, not just spare.
In this case, we would need the union
of the keys, and take into account cases where one key does not exist on the other.
type MyVector[T] = Map[Int, T]
/** Generic function for any type T that can be multiplied & summed. */
def dotProduct[T: Numeric](v1: MyVector[T], v2: MyVector[T]): T = {
import Numeric.Implicits._
val zero = implicitly[Numeric[T]].zero
val allIndexes = v1.keySet | v2.keySet
allIndexes.map { i =>
v1.getOrElse(
key = i,
default = zero
) * v2.getOrElse(
key = i,
default = zero
)
}.foldLeft(zero)(_ + _)
}