multithreading kotlin netty ktor

Ktor (Netty) Configuration Values for Expected Load


When configuring Ktor embeddedServer with Netty, there are three separate thread pools:

connectionGroupSize = XX  // accept new connections and start call processing
workerGroupSize = YY      // process connections, parse messages, do engine's internal work
callGroupSize = ZZ        // process application calls

There is documentation about what they are (included in the comments above) but I couldn't find anything that explained what each of those phases actually means in the context of processing a request or how they relate to total capacity.

If I have a server that is mostly accepting GET requests and returning content from a database with very little computational work (not even compression), how should these be configured to handle, say, 400 requests per second with up to 30 concurrent ones?

(Assume that threads will not be able to operate as "coroutines", instead blocking for database lookups.)


Solution

  • To really understand how connectionGroupSize, workerGroupSize, and callGroupSize work in Ktor with Netty and how to configure them for your specific scenario, you should look into the threading model of Netty.

    I found this Medium article that discusses how threading works in Netty. The article references the book Netty in Action, which does a really good job explaining how EventLoops and Threading work (Chapter 7. EventLoop and threading model).

    Now to answer your question as to what connectionGroupSize, workerGroupSize and callGroupSize actually do:

    Netty organizes its threads into EventLoopGroups. Each EventLoopGroup holds several EventLoops, and each EventLoop is bound to a single thread. A connection is assigned to one EventLoop, so all of its I/O events are handled on the same thread. That avoids synchronization between the handlers of a connection and keeps context switching low.

    For server-side applications, Netty uses a ServerBootstrap object, which is initialized with two EventLoopGroup instances: one for accepting connections (sized by connectionGroupSize in Ktor) and another for handling the I/O of the accepted connections (sized by workerGroupSize in Ktor). As far as I can tell, Ktor adds a third group on top of these, sized by callGroupSize, on which the application's call handling runs, so that slow application code does not stall network I/O. The separation of these groups helps balance the load between accepting new connections, processing existing ones, and running your handlers.
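    The division of labour between accepting, network I/O, and application calls can be pictured with plain JDK executors. This is only an analogy under my own assumptions, not how Netty or Ktor are implemented (real EventLoops multiplex non-blocking sockets). The function name, pool sizes, and sleep time are made up for illustration. It shows that when handlers block, the size of the last pool caps how many calls run at once.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger

/**
 * Pushes `requests` simulated requests through three pools and returns the
 * highest number of handlers that were running at the same time.
 */
fun peakConcurrentCalls(requests: Int, callThreads: Int, blockMillis: Long = 20): Int {
    val acceptor = Executors.newSingleThreadExecutor()     // stands in for the connection group
    val workers = Executors.newFixedThreadPool(2)          // stands in for the worker group
    val calls = Executors.newFixedThreadPool(callThreads)  // stands in for the call group
    val inFlight = AtomicInteger(0)
    val peak = AtomicInteger(0)
    val done = CountDownLatch(requests)
    try {
        repeat(requests) {
            acceptor.execute {              // "accept" the connection
                workers.execute {           // "read and parse" the request
                    calls.execute {         // run the application handler
                        val now = inFlight.incrementAndGet()
                        peak.accumulateAndGet(now) { a, b -> maxOf(a, b) }
                        Thread.sleep(blockMillis)  // simulated blocking database lookup
                        inFlight.decrementAndGet()
                        done.countDown()
                    }
                }
            }
        }
        done.await(30, TimeUnit.SECONDS)
    } finally {
        acceptor.shutdown()
        workers.shutdown()
        calls.shutdown()
    }
    return peak.get()
}

fun main() {
    // With 4 call threads, no more than 4 handlers can ever run at once.
    println("peak concurrent handlers: ${peakConcurrentCalls(requests = 20, callThreads = 4)}")
}
```

    The accept and I/O stages hand work off almost immediately, which is why small pools are enough there; the blocking stage is the one that needs a thread per in-flight request.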

    As for this part of your question:

    If I have a server that is mostly accepting GET requests and returning content from a database with very little computational work (not even compression), how should these be configured to handle, say, 400 requests per second with up to 30 concurrent ones?
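    Before picking numbers, it helps to see what those figures imply. By Little's law (requests in flight = arrival rate × average time in the system), 400 requests per second with 30 in flight corresponds to an average request time of about 75 ms. The sketch below does that arithmetic; it treats 30 as the average number in flight, which is my assumption, and the function names and the headroom factor are mine as well.

```kotlin
/** Little's law rearranged: average latency = in-flight requests / arrival rate, in milliseconds. */
fun impliedLatencyMillis(concurrentRequests: Long, requestsPerSecond: Long): Long =
    concurrentRequests * 1000 / requestsPerSecond

/**
 * Threads kept busy on average when every request blocks one thread for
 * `latencyMillis`, scaled by a headroom percentage and rounded up.
 */
fun blockingThreadsNeeded(
    requestsPerSecond: Long,
    latencyMillis: Long,
    headroomPercent: Long = 100
): Long {
    val scaled = requestsPerSecond * latencyMillis * headroomPercent
    // Divide by 1000 (ms -> s) and by 100 (percent), rounding up.
    return (scaled + 100_000 - 1) / 100_000
}

fun main() {
    val latency = impliedLatencyMillis(concurrentRequests = 30, requestsPerSecond = 400)
    println("implied average latency: $latency ms")                               // 75 ms
    println("threads at that latency: ${blockingThreadsNeeded(400, latency)}")    // 30
    println("with 50% headroom: ${blockingThreadsNeeded(400, latency, 150)}")     // 45
}
```

    If the database turns out to be slower than 75 ms per lookup, the same formula tells you how many more blocking threads the same request rate will need.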

    connectionGroupSize: This group only accepts connections, and the overhead for that is relatively low, so you can keep this number small (2-4 would suffice; in Netty a single listening port is served by one EventLoop anyway).

    workerGroupSize: This group processes the connections and handles the network I/O. It should be large enough to keep up with the request rate, but because these threads do not block, a modest number goes a long way. Starting with a value like 8-16 might be a good approach.

    callGroupSize: This size should reflect the number of concurrent application-level operations, such as database queries. Since your handlers block on database access, every in-flight request occupies one call thread for the duration of the lookup, so this pool needs at least as many threads as your peak concurrency. The operations are not computationally intensive, so setting callGroupSize to a number like 30-50 should cover the 30 concurrent requests with some headroom and keep requests from queuing for a free thread.
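    Put together, a starting configuration could look like the sketch below. It assumes the Ktor 2.x form of embeddedServer, which takes port and configure together (other major versions arrange these parameters differently). The port, the route, and loadContent are hypothetical placeholders, and the three sizes are just values from the ranges above, to be tuned after measuring.

```kotlin
import io.ktor.server.application.*
import io.ktor.server.engine.*
import io.ktor.server.netty.*
import io.ktor.server.response.*
import io.ktor.server.routing.*

// Hypothetical stand-in for a blocking database lookup.
fun loadContent(id: String): String {
    Thread.sleep(50) // simulated query time, an assumption for illustration
    return "content for $id"
}

fun main() {
    embeddedServer(Netty, port = 8080, configure = {
        connectionGroupSize = 2   // accepting connections is cheap
        workerGroupSize = 8       // non-blocking network I/O
        callGroupSize = 32        // at least the expected peak of blocking calls
    }) {
        routing {
            get("/content/{id}") {
                val id = call.parameters["id"] ?: "unknown"
                // Blocks a call-group thread for the duration of the lookup.
                call.respondText(loadContent(id))
            }
        }
    }.start(wait = true)
}
```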

    If you have any further questions related to the subject, I don't think you're going to find anything detailed enough in the Ktor documentation. You're going to need to dive deeper into how Netty works under the hood.

    Good luck!