In Python I can group consecutive elements with the same key by using itertools.groupby:
>>> items = [(1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4)]
>>> import itertools
>>> list(key for key,it in itertools.groupby(items, lambda tup: tup[0]))
[1, 2, 3, 1]
Scala has groupBy as well, but it produces a different result: a map from each key to all the values in the collection with that key (not the consecutive runs with the same key):
scala> val items = List((1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4))
items: List[(Int, Int)] = List((1,2), (1,5), (1,3), (2,9), (3,7), (1,5), (1,4))
scala> items.groupBy {case (key, value) => key}
res0: scala.collection.immutable.Map[Int,List[(Int, Int)]] = Map(2 -> List((2,9)), 1 -> List((1,2), (1,5), (1,3), (1,5), (1,4)), 3 -> List((3,7)))
What is the most elegant way of achieving the same result as Python's itertools.groupby?
If you just want to throw out sequential duplicates, you can do something like this:
def unchain[A](items: Seq[A]) = if (items.isEmpty) items else {
  // Pair each element with its successor and keep the successor only when it differs.
  items.head +: (items zip items.drop(1)).collect{ case (l,r) if r != l => r }
}
That is, just compare the list to a version of itself shifted by one place, and only keep the items which differ from their predecessor. It's easy to add a (same: (A, A) => Boolean) parameter to the method and use !same(l, r) in place of r != l if you want custom behavior for what counts as the same (e.g. comparing by key only).
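As a rough sketch of that variant (the name unchainBy and the curried parameter list are my own choices for illustration, not anything from the standard library), comparing only the first tuple component reproduces the keys from the Python example:

```scala
// Hypothetical variant of unchain that takes a custom "same" test.
def unchainBy[A](items: Seq[A])(same: (A, A) => Boolean): Seq[A] =
  if (items.isEmpty) items
  else items.head +: (items zip items.drop(1)).collect {
    // Keep the right-hand element only when it starts a new run.
    case (l, r) if !same(l, r) => r
  }

val items = List((1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4))
// Treat two tuples as the same when their keys (first components) match.
val keys = unchainBy(items)(_._1 == _._1).map(_._1)
```

Putting same in its own parameter list lets the compiler infer A from items first, so the comparison can be written with placeholder syntax.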
If you want to keep the duplicates, you can use Scala's groupBy to get a very compact (but inefficient) solution:
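def groupSequential[A](items: Seq[A])(same: (A, A) => Boolean): Seq[Seq[A]] = {
  // Number each element by its run: the counter advances whenever neighbours differ.
  val ns = (items zip items.drop(1)).
    scanLeft(0){ (n,cc) => if (same(cc._1, cc._2)) n else n+1 }
  // Group by run number, restore run order, then drop the run numbers.
  (ns zip items).groupBy(_._1).toSeq.sortBy(_._1).map(_._2.map(_._2))
}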
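If the inefficiency matters, one possible alternative is a single pass that repeatedly splits off the leading run with span. This is only a sketch and not part of the approach above: the name groupRuns, the restriction to List, and the choice to return (key, run) pairs like Python's groupby are all assumptions made for illustration.

```scala
import scala.annotation.tailrec

// Hypothetical helper: splits a list into runs of consecutive elements sharing a key.
def groupRuns[A, K](items: List[A])(key: A => K): List[(K, List[A])] = {
  @tailrec
  def loop(rest: List[A], acc: List[(K, List[A])]): List[(K, List[A])] =
    rest match {
      case Nil => acc.reverse
      case head :: _ =>
        val k = key(head)
        // span returns the longest prefix whose keys equal k, plus the remainder.
        val (run, tail) = rest.span(a => key(a) == k)
        loop(tail, (k, run) :: acc)
    }
  loop(items, Nil)
}

val items = List((1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4))
val runs = groupRuns(items)(_._1)
// runs.map(_._1) yields the run keys; runs.map(_._2) yields the grouped elements.
```

Each element is visited by span exactly once, so the work is linear in the length of the list, and no sorting is needed because runs are emitted in their original order.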