I have an Iterator of Strings and would like to concatenate consecutive elements into groups, where each group ends at the first element that fails a predicate, e.g. for an Iterator of
Iterator("a", "b", "c break", "d break", "e")
and a predicate of
!line.endsWith("break")
I would like to print out
(Group: 0): a-b-c break
(Group: 1): d break
(Group: 2): e
(without needing to hold in memory more than a single group at a time)
I know I can achieve this with an iterator like below, but there has to be a more "Scala" way of writing this, right?
import scala.collection.mutable.ListBuffer

object IteratingAndAccumulating extends App {
  class AccumulatingIterator(lines: Iterator[String]) extends Iterator[ListBuffer[String]] {
    override def hasNext: Boolean = lines.hasNext
    override def next(): ListBuffer[String] = getNextLine(lines, new ListBuffer[String])
    def getNextLine(lines: Iterator[String], accumulator: ListBuffer[String]): ListBuffer[String] = {
      val line = lines.next()
      accumulator += line
      if (line.endsWith("break") || !lines.hasNext) accumulator
      else getNextLine(lines, accumulator)
    }
  }

  new AccumulatingIterator(Iterator("a", "b", "c break", "d break", "e"))
    .map(_.mkString("-")).zipWithIndex.foreach {
      case (conc, i) =>
        println(s"(Group: $i): $conc")
    }
}
many thanks,
Fil
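Here is a simple solution if you don't mind loading the entire contents into memory at once. Below, it is your source iterator, and predicate(x) returns true when x is the last line of a group, i.e. x.endsWith("break") in your example (the negation of the predicate in your question):
val lines: List[List[String]] = it.foldLeft(List(List.empty[String])) {
  // x closes the current group: add it, then start a new empty group
  case (head :: tail, x) if predicate(x) => Nil :: (x :: head) :: tail
  // otherwise just add x to the current group
  case (head :: tail, x) => (x :: head) :: tail
}.dropWhile(_.isEmpty).map(_.reverse).reverse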
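For what it's worth, here is the fold above wired up to the example from the question, as a rough sketch rather than anything definitive. The names FoldGroupingSketch and groupAll are made up for illustration, and predicate is assumed to be "the line ends with break". The extra (Nil, x) case is only there to keep the match exhaustive; the accumulator always starts with one group, so it should never be hit.

```scala
object FoldGroupingSketch {
  // Assumption: a line ending in "break" closes the current group.
  val predicate: String => Boolean = _.endsWith("break")

  // Hypothetical helper: eagerly builds all groups (everything ends up in memory).
  def groupAll(it: Iterator[String]): List[List[String]] =
    it.foldLeft(List(List.empty[String])) {
      // x closes the current group: add it, then open a new empty group
      case (head :: tail, x) if predicate(x) => Nil :: (x :: head) :: tail
      // x belongs to the current group
      case (head :: tail, x) => (x :: head) :: tail
      // not expected to happen, kept only so the match is exhaustive
      case (Nil, x) => List(List(x))
    }.dropWhile(_.isEmpty) // drop the empty group left over after a trailing "break"
      .map(_.reverse)      // elements were prepended, so restore their order
      .reverse             // groups were prepended too

  def main(args: Array[String]): Unit =
    groupAll(Iterator("a", "b", "c break", "d break", "e"))
      .map(_.mkString("-"))
      .zipWithIndex
      .foreach { case (conc, i) => println(s"(Group: $i): $conc") }
}
```

Running main should print the three "(Group: n)" lines from the question.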
If you would rather iterate through the strings and groups one by one, it gets a little more involved:
// First "instrument" the iterator by marking each group boundary with a None:
val instrumented: Iterator[Option[String]] = it.flatMap {
  case x if predicate(x) => Seq(Some(x), None)
  case x => Seq(Some(x))
}
// Now wrap it in another iterator that builds the groups lazily.
// All groups read from the same underlying iterator, so consume each group
// fully before asking for the next one.
val lines: Iterator[Iterator[String]] = Iterator.continually {
  instrumented.takeWhile(_.nonEmpty).flatten
}.takeWhile(_.nonEmpty)
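As a rough usage sketch, this is how the lazy version might be hooked up to the question's data. InstrumentedGroupingSketch and groups are illustrative names, and the predicate is again assumed to be "ends with break". One caveat I'm aware of: the Iterator scaladoc says an iterator should not be reused after calling a method such as takeWhile on it, so this leans on how takeWhile currently behaves (it consumes the None marker and stops) rather than on a documented guarantee.

```scala
object InstrumentedGroupingSketch {
  // Hypothetical helper wrapping the two steps above.
  // predicate(x) is assumed to be true for the last line of a group.
  def groups(it: Iterator[String], predicate: String => Boolean): Iterator[Iterator[String]] = {
    // Mark the end of every group with a None.
    val instrumented: Iterator[Option[String]] = it.flatMap {
      case x if predicate(x) => Seq(Some(x), None)
      case x => Seq(Some(x))
    }
    // Each inner iterator reads up to (and swallows) the next None marker.
    Iterator.continually {
      instrumented.takeWhile(_.nonEmpty).flatten
    }.takeWhile(_.nonEmpty)
  }

  def main(args: Array[String]): Unit =
    groups(Iterator("a", "b", "c break", "d break", "e"), _.endsWith("break"))
      .map(_.mkString("-")) // consumes each group completely before the next is requested
      .zipWithIndex
      .foreach { case (conc, i) => println(s"(Group: $i): $conc") }
}
```

Because map, zipWithIndex and foreach on an Iterator are all lazy and sequential, each group is turned into a String before the next one is pulled, so only one group's worth of data should be live at a time.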