scala concurrency future scalaj-http sttp

Is a synchronous HTTP request wrapped in a Future considered CPU-bound or IO-bound?


Consider the following two snippets, where the first wraps scalaj-http requests in a Future, whilst the second uses async-http-client:

Sync client wrapped with Future using global EC

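object SyncClientWithFuture {
  import scala.concurrent.{Await, Future}
  import scala.concurrent.duration.Duration.Inf

  // Assumed timing helper (not shown in the question): prints elapsed wall-clock time.
  def time[A](block: => A): A = {
    val start = System.nanoTime()
    val result = block
    println(s"Elapsed: ${(System.nanoTime() - start) / 1000000} ms")
    result
  }

  def main(args: Array[String]): Unit = {
    import scala.concurrent.ExecutionContext.Implicits.global
    import scalaj.http.Http
    val delay = "3000"
    val slowApi = s"http://slowwly.robertomurray.co.uk/delay/${delay}/url/https://www.google.co.uk"
    // One blocking request, then three more blocking requests, each wrapped in its own Future.
    val nestedF = Future(Http(slowApi).asString).flatMap { _ =>
      Future.sequence(List(
        Future(Http(slowApi).asString),
        Future(Http(slowApi).asString),
        Future(Http(slowApi).asString)
      ))
    }
    time { Await.result(nestedF, Inf) }
  }
}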

Async client using global EC

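object AsyncClient {
  import scala.concurrent.{Await, Future}
  import scala.concurrent.duration.Duration.Inf

  // Assumed timing helper (not shown in the question): prints elapsed wall-clock time.
  def time[A](block: => A): A = {
    val start = System.nanoTime()
    val result = block
    println(s"Elapsed: ${(System.nanoTime() - start) / 1000000} ms")
    result
  }

  def main(args: Array[String]): Unit = {
    import scala.concurrent.ExecutionContext.Implicits.global
    import sttp.client._
    import sttp.client.asynchttpclient.future.AsyncHttpClientFutureBackend
    implicit val sttpBackend = AsyncHttpClientFutureBackend()
    val delay = "3000"
    val slowApi = uri"http://slowwly.robertomurray.co.uk/delay/${delay}/url/https://www.google.co.uk"
    // send() returns a Future supplied by the backend; no request is wrapped in Future(...) here.
    val nestedF = basicRequest.get(slowApi).send().flatMap { _ =>
      Future.sequence(List(
        basicRequest.get(slowApi).send(),
        basicRequest.get(slowApi).send(),
        basicRequest.get(slowApi).send()
      ))
    }
    time { Await.result(nestedF, Inf) }
  }
}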

The snippets use scalaj-http and sttp with its async-http-client Future backend, respectively.

The former takes 12 seconds whilst the latter takes 6 seconds. The former seems to behave as if it were CPU-bound, but I do not see how that can be the case, since Future#sequence should execute the HTTP requests in parallel? Why does a synchronous client wrapped in a Future behave differently from a proper async client? Is it not the case that the async client does the same kind of thing, wrapping calls in Futures under the hood?


Solution

  • Future#sequence should execute the HTTP requests in parallel?

    First of all, Future#sequence doesn't execute anything. It just produces a future that completes when all of the futures passed to it have completed. Evaluation (execution) of a constructed future starts immediately if there is a free thread in the EC; otherwise the task is submitted to a queue. I am sure that in the first case your futures are being executed on a single thread.

    println(scala.concurrent.ExecutionContext.Implicits.global) -> parallelism = 6

    I don't know why this happens; it might be that the other 5 threads are always busy for some reason. You can experiment with an explicitly created EC with 5-10 threads.
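
    The experiment suggested above can be sketched as follows. This is a minimal illustration, not the asker's code: slowCall is a hypothetical stand-in for a synchronous HTTP request (it just sleeps), and the pool sizes are arbitrary. It also shows that each Future starts as soon as it is created, and that Future.sequence only combines the results.

    ```scala
    import java.util.concurrent.Executors
    import scala.concurrent.{Await, ExecutionContext, Future}
    import scala.concurrent.duration._

    object DedicatedPoolSketch {
      // Hypothetical stand-in for a synchronous HTTP call: blocks the calling thread.
      def slowCall(millis: Long): String = { Thread.sleep(millis); "done" }

      // Runs `n` blocking calls on a dedicated fixed pool of `threads` threads
      // and returns the elapsed wall-clock time in milliseconds.
      def elapsedMillis(n: Int, threads: Int, millis: Long): Long = {
        val pool = Executors.newFixedThreadPool(threads)
        implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
        try {
          val start = System.nanoTime()
          // Each Future is submitted to the pool as soon as it is created ...
          val futures = List.fill(n)(Future(slowCall(millis)))
          // ... Future.sequence only combines them into one Future[List[String]].
          Await.result(Future.sequence(futures), 30.seconds)
          (System.nanoTime() - start) / 1000000
        } finally pool.shutdown()
      }

      def main(args: Array[String]): Unit = {
        println(s"3 calls on 3 threads: ${elapsedMillis(3, 3, 500)} ms")
        println(s"3 calls on 1 thread:  ${elapsedMillis(3, 1, 500)} ms")
      }
    }
    ```

    With one thread the three calls queue up behind each other; with three threads they overlap. Comparing the two timings is a quick way to check whether the pool, rather than Future.sequence, is what serialises the requests.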

    The difference in the async case is that you don't create the future yourself. It is provided by the library, which internally doesn't block the thread. It starts the async process, "subscribes" for the result, and returns a future that completes when the result arrives.
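
    The "subscribe and return a future" idea can be sketched with a Promise. This is only an illustration of the pattern, not sttp's or async-http-client's actual code: fetchAsync is a hypothetical callback-style API, and a single timer thread stands in for the library's non-blocking I/O loop.

    ```scala
    import java.util.concurrent.{Executors, ThreadFactory, TimeUnit}
    import scala.concurrent.{Await, Future, Promise}
    import scala.concurrent.duration._

    object CallbackToFutureSketch {
      // One daemon timer thread plays the role of the I/O event loop.
      private val timer = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
        def newThread(r: Runnable): Thread = { val t = new Thread(r); t.setDaemon(true); t }
      })

      // Hypothetical callback API: invokes `onDone` after `delayMillis`
      // without any thread sleeping while the "request" is pending.
      def fetchAsync(delayMillis: Long)(onDone: String => Unit): Unit = {
        timer.schedule(new Runnable { def run(): Unit = onDone("response") },
          delayMillis, TimeUnit.MILLISECONDS)
        ()
      }

      // Adapts the callback API to a Future: the callback completes the Promise,
      // so no ExecutionContext thread is occupied while waiting.
      def fetch(delayMillis: Long): Future[String] = {
        val p = Promise[String]()
        fetchAsync(delayMillis) { result => p.success(result); () }
        p.future
      }

      // Starts `n` requests at once and returns the elapsed time in milliseconds.
      def elapsedMillis(n: Int, delayMillis: Long): Long = {
        import scala.concurrent.ExecutionContext.Implicits.global
        val start = System.nanoTime()
        Await.result(Future.sequence(List.fill(n)(fetch(delayMillis))), 30.seconds)
        (System.nanoTime() - start) / 1000000
      }

      def main(args: Array[String]): Unit =
        println(s"10 requests: ${elapsedMillis(10, 300)} ms")
    }
    ```

    Because no request holds a pool thread while it waits, the number of requests in flight is not limited by the size of the EC, which is the key difference from Future(blockingCall).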

    Actually, the async library could have its own EC internally, but I doubt it.

    Btw, Futures are not supposed to contain slow/IO/blocking evaluations unless they are wrapped in blocking. Otherwise, you can potentially block the main thread pool (EC) and your app will freeze completely.
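
    A minimal sketch of marking a blocking call, assuming the global EC and using scala.concurrent.blocking from the standard library. slowCall is again a hypothetical stand-in for a synchronous HTTP request.

    ```scala
    import scala.concurrent.{Await, Future, blocking}
    import scala.concurrent.duration._
    import scala.concurrent.ExecutionContext.Implicits.global

    object BlockingHintSketch {
      // Hypothetical stand-in for a synchronous HTTP call.
      def slowCall(millis: Long): String = { Thread.sleep(millis); "done" }

      // blocking { ... } tells the execution context that this task is about to block,
      // which lets the global EC compensate with extra threads if needed.
      def fetchAll(n: Int, millis: Long): Future[List[String]] =
        Future.sequence(List.fill(n)(Future(blocking(slowCall(millis)))))

      // Convenience wrapper that waits for all results.
      def run(n: Int, millis: Long): List[String] =
        Await.result(fetchAll(n, millis), 30.seconds)

      def main(args: Array[String]): Unit =
        println(run(4, 500))
    }
    ```

    Applied to the question's first snippet, this would mean writing Future(blocking(Http(slowApi).asString)). Whether that alone brings the timing down to 6 seconds depends on the Scala version and pool configuration, so it is worth measuring rather than assuming.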