scala mapreduce information-retrieval

Group, map & reduce with two different reducer-operators


I have these tuples:

("T1",2,"x1"),
("T1",2,"x2"),
// … etc

I want to reduce them to ("T1", 4, List("x1", "x2")). How can I do this?

I tried something like .groupBy(_._1).map { case (key, list) => key -> list.map(_._2).reduce(_ + _) }, but this only sums the numbers and does not collect the strings into a list.


Solution

  • With groupMapReduce (available since Scala 2.13):

    val xs = List(
      ("T1",40,"x1"),
      ("T1",2,"x2"),
      ("T2",58,"x3")
    )
    
    println(xs.groupMapReduce(_._1)
      (e => (e._2, List(e._3)))
      ({ case ((x, y), (z, w)) => (x + z, y ++ w)})
    )
    

    With groupBy:

    val xs = List(
      ("T1",40,"x1"),
      ("T1",2,"x2"),
      ("T2",58,"x3")
    )
    println(xs.groupBy(_._1)
      .view
      .mapValues(ys => (ys.view.map(_._2).sum, ys.map(_._3)))
      .toMap
    )
    

    If you want to process each group's list in a single pass and avoid ++, you could try something like this:

    xs.groupBy(_._1)
      .view
      .mapValues(ys =>
         ys.foldRight((0, List.empty[String])){
           case ((_, n, x), (sum, acc)) => (n + sum, x :: acc)
         }
      )
      .toMap
    

    All three variants give

    Map(T2 -> (58,List(x3)), T1 -> (42,List(x1, x2)))
    

    Note that combining many lists with ++ can become very inefficient when the number of lists is large, because each ++ copies its left operand. Whether this is acceptable depends on your use case.
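
If the cost of ++ is a concern but you would still like to use groupMapReduce, one possible workaround is to prepend each new element and reverse the lists once at the end. The following is only a sketch: the names SumAndCollect and sumAndCollect are hypothetical, and it assumes that the reducer receives the accumulated value as its first argument (as the Scala 2.13 standard library implementation does), which matters only for the order of the collected strings.

```scala
object SumAndCollect {
  // Hypothetical helper; the (key, number, label) tuple shape follows the question.
  def sumAndCollect(xs: List[(String, Int, String)]): Map[String, (Int, List[String])] =
    xs.groupMapReduce(_._1)(e => (e._2, List(e._3))) {
        // Assumption: `acc` is the list accumulated so far and `next` holds one element.
        // Prepending the one-element list only copies `next`, whereas `acc ++ next` copies `acc`.
        case ((sum, acc), (n, next)) => (sum + n, next ::: acc)
      }
      .view
      .mapValues { case (sum, reversed) => (sum, reversed.reverse) } // restore input order
      .toMap

  def main(args: Array[String]): Unit = {
    val xs = List(("T1", 40, "x1"), ("T1", 2, "x2"), ("T2", 58, "x3"))
    println(sumAndCollect(xs))
  }
}
```

Each list is built by constant-time prepends and reversed exactly once, so the work per group stays linear in the group size.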