scala collections functional-programming

Optimizing linear search in Scala

I implemented the following algorithm:

val acceptedTopics: Set[String] = ...
val arr: List[JValue] = ... // JValue is an algebraic trait extended by JString, JInt, and other types.
val topics = arr.collect { case JString(topic) => topic }
if (topics.exists(acceptedTopics.contains)) 1 else 0

The downside of this algorithm is that it creates an intermediate topics list and performs two linear passes: first over arr, then over topics. I've rewritten it to:

Make a single pass
Not create an additional list
Stop iterating as soon as a solution is found

Here is the algorithm:

arr.foreach {
    case JString(x) if acceptedTopics.contains(x) => return 1
    case _ =>
}

0

I think it's ugly, and my team doesn't like it. Some people dislike the fact that it uses the return keyword; according to them, it should never be used in Scala.

Is there some other performant way to implement this kind of search?

Solution

You can use collectFirst:

arr.collectFirst {
  case JString(x) if acceptedTopics.contains(x) => 1
}.getOrElse(0)

It uses a PartialFunction to:

find the first value that matches case JString(x) if acceptedTopics.contains(x)
turn it into 1

It returns an Option[Int], which you can pattern-match on, fold, or call getOrElse on.
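
To make the comparison concrete, here is a minimal, self-contained sketch. The JValue, JString and JInt definitions are hypothetical stand-ins for the JSON AST in the question, and the names TopicSearch, viaCollectFirst, viaExists and viaIterator are made up for illustration. All three variants make a single pass, build no intermediate list, and stop at the first match:

```scala
// Hypothetical stand-in for the JSON AST from the question;
// only the shapes needed for this example are modelled.
sealed trait JValue
final case class JString(s: String) extends JValue
final case class JInt(num: BigInt) extends JValue

object TopicSearch {
  // collectFirst stops at the first element the partial function accepts.
  def viaCollectFirst(arr: List[JValue], accepted: Set[String]): Int =
    arr.collectFirst { case JString(x) if accepted.contains(x) => 1 }.getOrElse(0)

  // exists with a total pattern match also short-circuits on the first hit.
  def viaExists(arr: List[JValue], accepted: Set[String]): Int = {
    val found = arr.exists {
      case JString(x) => accepted.contains(x)
      case _          => false
    }
    if (found) 1 else 0
  }

  // The original collect-then-exists shape, made lazy by going through an iterator,
  // so no intermediate list of topics is materialised.
  def viaIterator(arr: List[JValue], accepted: Set[String]): Int =
    if (arr.iterator.collect { case JString(t) => t }.exists(accepted.contains)) 1 else 0
}
```

Which variant reads best is a matter of taste; collectFirst is the most direct when the result is derived from the matched element, while exists says "I only need a yes/no answer" more plainly.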

The reason people dislike return in Scala is that you might be tempted to use it the way you did:

arr.foreach {
    case JString(x) if acceptedTopics.contains(x) => return 1
    case _ =>
}

foreach takes a closure/lambda, so a bytecode-level return would only return from the closure itself. Your intent, however, is to exit the whole enclosing method and return a value, which requires throwing a hidden exception, roughly like this:

// Illustration only: SomeSpecificException stands in for the real control-flow exception.
try {
  arr.foreach {
    case JString(x) if acceptedTopics.contains(x) => throw SomeSpecificException(1)
    case _ =>
  }
} catch {
  case SomeSpecificException(x) => x
}

This exception is stack-less (it does not capture a stack trace), so it is somewhat cheaper than a regular exception, but it still has to allocate an object. It also won't work as intended if you throw it inside, e.g., a Future:

val a = Future {
  if (cond) return 1 // <-- it doesn't do what you think it does
  
  2
}

Also, one needs to figure out at which level the exception should be caught:

arr.foreach { y => // here?
  arr.foreach { // <-- here?
    case JString(x) if acceptedTopics.contains(x) => return 1
    case _ =>
  }
}

In practice, the spec says it should be caught at the level of the closest enclosing def. On one hand that makes sense; on the other, it creates a lot of pitfalls.
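
One such pitfall can be sketched as follows. This assumes Scala 2.x semantics, where a return inside a lambda is implemented by throwing a control-flow exception (Scala 3 deprecates non-local returns and emits a warning for them). The names ReturnPitfall, firstAbove and firstAboveSafe are hypothetical and exist only for this illustration:

```scala
object ReturnPitfall {
  // The `return` inside the lambda is a non-local return: it throws a
  // control-flow exception that is meant to be caught by firstAbove itself.
  // The broad catch below intercepts it first, so the early exit is lost.
  def firstAbove(xs: List[Int], limit: Int): Int = {
    try {
      xs.foreach { x => if (x > limit) return x }
    } catch {
      case _: Throwable => () // also swallows the control-flow exception
    }
    -1
  }

  // The same search expressed with a built-in operator; no `return` involved.
  def firstAboveSafe(xs: List[Int], limit: Int): Int =
    xs.find(_ > limit).getOrElse(-1)
}
```

Under the stated assumption, firstAbove(List(1, 5, 9), 3) yields -1 rather than 5, while firstAboveSafe yields 5.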

Meanwhile, you can usually rely on the fact that everything in Scala is an expression: the value of a whole block is the value of its last expression, if-else is an expression as well (so there is no need for a separate ternary operator), and so on.
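
As a small sketch of that expression-oriented style, the early exit can also be written by hand without return. ExpressionStyle, containsAccepted and score are hypothetical names used only here, and the example works on plain strings to stay self-contained:

```scala
import scala.annotation.tailrec

object ExpressionStyle {
  // Each branch of the match is the value of the call, so "stopping early"
  // simply means choosing a branch that does not recurse.
  @tailrec
  def containsAccepted(remaining: List[String], accepted: Set[String]): Boolean =
    remaining match {
      case Nil                                  => false
      case head :: _ if accepted.contains(head) => true
      case _ :: tail                            => containsAccepted(tail, accepted)
    }

  def score(topics: List[String], accepted: Set[String]): Int = {
    val found = containsAccepted(topics, accepted)
    // if-else is an expression, and as the last expression it is the block's value.
    if (found) 1 else 0
  }
}
```

In everyday code the built-in exists or collectFirst would normally be preferred over hand-written recursion; the sketch only shows that no return is needed to exit early.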

So while there are some places where return makes sense (e.g. when porting some nasty Java algorithm 1-to-1), it has a lot of pitfalls, and it is easier to guarantee that the code does something sane by using the plethora of built-in operators.