Tags: scala, stream, iterator, accumulator, lazylist

Conditional concatenation of iterator elements - A Scala idiomatic solution


I have an Iterator of Strings and would like to concatenate each run of consecutive elements that match a predicate, together with the element that ends the run. For example, given Iterator("a", "b", "c break", "d break", "e") and a predicate of !line.endsWith("break") (so a line ending in "break" closes the current group), I would like to print out

(Group: 0): a-b-c break
(Group: 1): d break
(Group: 2): e

(without needing to hold in memory more than a single group at a time)

I know I can achieve this with an iterator like below, but there has to be a more "Scala" way of writing this, right?

import scala.collection.mutable.ListBuffer

object IteratingAndAccumulating extends App {
  class AccumulatingIterator(lines: Iterator[String]) extends Iterator[ListBuffer[String]] {
    override def hasNext: Boolean = lines.hasNext

    override def next(): ListBuffer[String] = getNextLine(lines, new ListBuffer[String])

    def getNextLine(lines: Iterator[String], accumulator: ListBuffer[String]): ListBuffer[String] = {
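      val line = lines.next() // next() is declared with parentheses; omitting them is deprecated in 2.13 and an error in Scala 3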
      accumulator += line
      if (line.endsWith("break") || !lines.hasNext) accumulator
      else getNextLine(lines, accumulator)
    }
  }

  new AccumulatingIterator(Iterator("a", "b", "c break", "d break", "e"))
    .map(_.mkString("-")).zipWithIndex.foreach{
    case (conc, i) =>
      println(s"(Group: $i): $conc")
  }
}

many thanks,

Fil


Solution

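  • Here is a simple solution if you don't mind loading the entire contents into memory at once (here it is the input Iterator[String], and predicate(x) is true when x ends a group, e.g. x.endsWith("break")):

      // Build the groups in reverse: the head of the accumulator is the group in progress.
      val lines: List[List[String]] = it.foldLeft(List(List.empty[String])) {
         case (head :: tail, x) if predicate(x) => Nil :: (x :: head) :: tail // x closes this group, so start a new one
         case (head :: tail, x) => (x :: head) :: tail
         case (Nil, x) => List(List(x)) // unreachable (the accumulator is never empty); keeps the match exhaustive
      }.dropWhile(_.isEmpty).map(_.reverse).reverse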
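
For reference, here is a self-contained sketch of the fold above applied to the sample input from the question. The names EagerGrouping, groupEagerly and endsGroup are made up for illustration, and the sketch assumes the predicate is true for the line that ends a group:

```scala
object EagerGrouping {
  // Hypothetical helper: `endsGroup` plays the role of `predicate` above.
  def groupEagerly(it: Iterator[String])(endsGroup: String => Boolean): List[List[String]] =
    it.foldLeft(List(List.empty[String])) {
      // x closes the group in progress, so push a fresh empty group in front of it
      case (head :: tail, x) if endsGroup(x) => Nil :: (x :: head) :: tail
      // otherwise prepend x to the group in progress
      case (head :: tail, x) => (x :: head) :: tail
      // unreachable, since the accumulator always holds at least one group
      case (Nil, x) => List(List(x))
    }.dropWhile(_.isEmpty) // drop the empty group left behind by a trailing "break" line
      .map(_.reverse)      // elements were prepended, so restore their order
      .reverse             // groups were prepended too

  def main(args: Array[String]): Unit = {
    val groups = groupEagerly(Iterator("a", "b", "c break", "d break", "e"))(_.endsWith("break"))
    groups.map(_.mkString("-")).zipWithIndex.foreach {
      case (conc, i) => println(s"(Group: $i): $conc")
    }
    // (Group: 0): a-b-c break
    // (Group: 1): d break
    // (Group: 2): e
  }
}
```

Because foldLeft only returns once the input iterator is exhausted, every group is in memory at the same time, which is the trade-off mentioned above.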
    

    If you would rather iterate through the strings and groups one-by-one, it gets a little bit more involved:

    // first "instrument" the iterator, by "demarcating" group boundaries with None:
    val instrumented: Iterator[Option[String]] = it.flatMap { 
      case x if predicate(x) => Seq(Some(x), None)
      case x => Seq(Some(x))
    }
    
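    // And now wrap it in another iterator that constructs the groups.
    // Note: every group reads from the same underlying iterator, so consume each
    // group fully before asking for the next one.
    val lines: Iterator[Iterator[String]] = Iterator.continually {
      instrumented.takeWhile(_.nonEmpty).flatten
    }.takeWhile(_.nonEmpty)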
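
To see the lazy version end to end, here is a minimal runnable sketch that wraps the two steps in one function and prints the groups from the question. The names LazyGrouping, groupLazily and endsGroup are hypothetical, and as before the predicate is assumed to be true for the line that ends a group:

```scala
object LazyGrouping {
  // Hypothetical helper: `endsGroup` plays the role of `predicate` above.
  def groupLazily(it: Iterator[String])(endsGroup: String => Boolean): Iterator[Iterator[String]] = {
    // Mark the end of each group with a None sentinel
    val instrumented: Iterator[Option[String]] = it.flatMap {
      case x if endsGroup(x) => Seq(Some(x), None)
      case x => Seq(Some(x))
    }
    // Each inner iterator reads up to (and swallows) the next None;
    // the outer iterator stops at the first group that turns out to be empty
    Iterator.continually {
      instrumented.takeWhile(_.nonEmpty).flatten
    }.takeWhile(_.nonEmpty)
  }

  def main(args: Array[String]): Unit = {
    groupLazily(Iterator("a", "b", "c break", "d break", "e"))(_.endsWith("break"))
      .map(_.mkString("-")) // drains each group before the next one is requested
      .zipWithIndex
      .foreach { case (conc, i) => println(s"(Group: $i): $conc") }
    // (Group: 0): a-b-c break
    // (Group: 1): d break
    // (Group: 2): e
  }
}
```

All groups share the one underlying iterator, so this only behaves as intended when each group is fully consumed (here by mkString) before the next group is requested. Calling toList on the outer iterator first, for example, would not give the expected groups. The sketch also relies on takeWhile discarding the element that fails its test, which is what the current standard library implementation does, although the Iterator documentation generally advises against reusing an iterator after calling such methods on it.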