Tags: java, multithreading, asynchronous, asynchttpclient, forkjoinpool

Using ForkJoinPool together with AsyncHttpClient - does it make sense?


My question is somewhat related to this question about ForkJoinPool and IO-oriented operations, but it is slightly more general (and the question I linked to didn't receive a definitive answer). In short: if I want to send many HTTP requests in parallel, and I am already using an asynchronous HTTP client (such as AsyncHttpClient), is there any point in also submitting the requests in parallel using a ForkJoinPool?

Initially, I thought doing so defeats the purpose of using an asynchronous HTTP client which already enables sending the requests in parallel. However, reading this related question about ForkJoinPool, which mentioned that ForkJoinPool might improve performance even "when all tasks are async and submitted to the pool rather than forked", made me doubt my understanding of how asynchronous operations work in Java, and how they should be performed. Is there still an advantage to using a ForkJoinPool in my situation, and if so, how come?

I've also read this question about how to send HTTP requests in parallel in Java; all of its answers mention either an ExecutorService or AsyncHttpClient, but none mentions both.


Solution

  • They're orthogonal concepts, which is why you don't see them mentioned in the same answers.

    The aim of AsyncHttpClient is (among other things) to replace the traditional thread-per-request model: all network communication is performed internally by Netty through Java's NIO in a non-blocking, asynchronous way, on a small, fixed set of event-loop threads. On top of that network layer, the library has worker threads that perform the application-level asynchronous handling visible to users of AsyncHttpClient, so it already has a (small) internal pool of threads.
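    To make the non-blocking model concrete, here is a minimal sketch of firing several requests from a single caller thread and collecting the results only at the end. It uses the JDK's built-in java.net.http.HttpClient (available since Java 11) rather than AsyncHttpClient itself, purely to stay dependency-free; AsyncHttpClient's API differs, but the asynchronous shape is the same. The URLs are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class AsyncRequests {
    // Fire all requests concurrently without blocking the caller; each future
    // resolves to "status=<code>" on success or "error" on failure.
    static List<String> fetchAll(HttpClient client, List<String> urls) {
        List<CompletableFuture<String>> futures = urls.stream()
                .map(u -> client.sendAsync(
                                HttpRequest.newBuilder(URI.create(u))
                                        .timeout(Duration.ofSeconds(5))
                                        .build(),
                                HttpResponse.BodyHandlers.ofString())
                        .handle((resp, err) -> err == null
                                ? "status=" + resp.statusCode()
                                : "error"))
                .collect(Collectors.toList());
        // Only now do we wait, once, for all completions.
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        // Placeholder URLs: .invalid never resolves, so these complete as "error".
        fetchAll(client, List.of("https://example.invalid/a",
                                 "https://example.invalid/b"))
                .forEach(System.out::println);
    }
}
```

    Note that no caller-side thread pool is involved: the sends return immediately, and the client's internal threads complete the futures.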

    ForkJoinPool strives to maximize CPU usage: it runs many threads (the common pool's parallelism defaults to CPU cores - 1; pools you create yourself can have more or fewer) and uses work stealing so that no thread sits idle. This is most efficient for small, recursive, CPU-bound tasks.
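    A minimal sketch of the recursive, CPU-bound workload ForkJoinPool is designed for: splitting an array sum in half until chunks are small, forking one half so idle workers can steal it. The class name and threshold are illustrative, not from any library.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] data;
    private final int lo, hi;

    SumTask(long[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {               // small enough: compute directly
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += data[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(data, lo, mid);
        SumTask right = new SumTask(data, mid, hi);
        left.fork();                              // queue left half; idle workers may steal it
        return right.compute() + left.join();     // compute right half here, then combine
    }

    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(parallelSum(data)); // 50005000
    }
}
```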

    The question you linked to discusses why work stealing is also efficient for non-recursive tasks. The word "async" threw you off there; it just refers to a regular task that you submit to the pool and that finishes asynchronously.
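    That kind of "async" task is nothing more than this sketch: an ordinary callable handed to submit(), which returns immediately while the task completes on a worker thread.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

public class PlainAsyncTask {
    public static void main(String[] args) {
        // A plain (non-recursive) task: submit() returns immediately,
        // and the task completes asynchronously on a pool worker.
        ForkJoinTask<Integer> task = ForkJoinPool.commonPool().submit(() -> 21 * 2);

        // ... the caller is free to do other work here ...

        System.out.println(task.join()); // blocks only here; prints 42
    }
}
```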

    So you can either do thread-per-request (i.e. blocking "old" I/O, java.io) with a thread pool, or you can use non-blocking NIO (New I/O, java.nio) driven by one or a few event-loop threads, without a request-level thread pool.
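    For contrast with the non-blocking model, here is a sketch of the thread-per-request alternative: each in-flight request parks one pool thread for its whole duration. The sleep is a stand-in for a blocking socket read, and the pool size and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadPerRequest {
    // Stand-in for a blocking HTTP call: the calling thread is parked
    // for the full duration of the "request".
    static String blockingFetch(String url) throws InterruptedException {
        Thread.sleep(50);
        return "response for " + url;
    }

    // Thread-per-request: each in-flight request occupies one pool thread,
    // so concurrency is capped by the pool size (4 here).
    static List<String> fetchAll(List<String> urls) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String u : urls) futures.add(pool.submit(() -> blockingFetch(u)));
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) results.add(f.get()); // blocks per request
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        fetchAll(List.of("a", "b", "c")).forEach(System.out::println);
    }
}
```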

    Combining them doesn't make sense: you would migrate from a thread-per-request model to a non-blocking model precisely to improve performance, so wrapping the non-blocking client in another thread pool only reintroduces the overhead you were trying to avoid.