Tags: java, vert.x, reactor, high-load

Vert.x: simplest server with 1000 rps


Suppose you need to write a server that handles 1000 rps; the load may grow in the future. The server serves only one kind of request - getGender(name) - which accepts a name and returns Male/Female. Determining the gender is the simplest possible operation: a single index lookup, where the index is an in-memory data structure.

If I understand it right, you create a single ServerVerticle and run Runtime.getRuntime().availableProcessors() worker verticles to which it delegates the job (see code below).

Questions:

  1. Is this the best scheme for the task of 1000 rps?
  2. What happens at a request peak, when 15 workers are insufficient? Suppose one worker can handle 100 rps and you have 15 workers, but at peak time you receive 3000 rps.
    • Suppose the NetServer can handle 3000 rps, but the workers cannot keep up. Does Vert.x have a queue to hold the waiting requests? How do you set that up? If it does, what happens when a worker fails?
    • Suppose the NetServer can't handle 3000 rps - then just run a few instances of the server. No pitfalls, right?
  3. Is TCP the best choice for this task?
  4. Vert.x is a multi-reactor which, like Node, runs an event loop. ServerVerticle runs on the same thread as the event loop, right?
  5. If you have 16 cores and 1 core is dedicated to the event loop, Vert.x will run 15 GenderVerticles, right? No more threads?
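A quick back-of-envelope check of the numbers in question 2 (a sketch of my own; the class name and the uniform-arrival assumption are illustrative, not part of the question):

```java
public class PeakBacklog {
    public static void main(String[] args) {
        int workers = 15;
        int perWorkerRps = 100;                 // each worker handles 100 rps
        int capacity = workers * perWorkerRps;  // total service rate
        int peakRps = 3000;
        // Requests arriving faster than they can be served pile up somewhere:
        int backlogGrowthPerSecond = peakRps - capacity;
        System.out.println("capacity=" + capacity);
        System.out.println("backlogGrowthPerSecond=" + backlogGrowthPerSecond);
    }
}
```

So at 3000 rps the backlog grows by 1500 requests every second of the peak, which is why some queue - and a bound on it - matters for question 2.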

ServerVerticle.java

public class ServerVerticle extends AbstractVerticle {

    public static void main(String[] args) {
        Consumer<Vertx> runner = vertx -> vertx.deployVerticle("ServerVerticle", new DeploymentOptions());
        Vertx vertx = Vertx.vertx();
        runner.accept(vertx);
    }

    @Override
    public void start() throws Exception {
        NetServerOptions options = new NetServerOptions();
        NetServer server = vertx.createNetServer(options);
        server.connectHandler(socket -> {
            socket.handler(buffer -> {
                // eventBus() is a method, not a field; the reply payload is in res.result().body()
                vertx.eventBus().send("get.gender", buffer,
                        res -> socket.write(res.result().body().toString()));
            });
        });
        server.listen(1234, "localhost");

        //Deploy worker verticles
        DeploymentOptions deploymentOptions = new DeploymentOptions()
            .setInstances(Runtime.getRuntime().availableProcessors())
            .setWorker(true);
        vertx.deployVerticle("GenderVerticle", deploymentOptions);
    } 
}

GenderVerticle.java

public class GenderVerticle extends AbstractVerticle {

    @Override
    public void start() throws Exception {
        vertx.eventBus().consumer("get.gender", message -> {
            String gender = singleIndexLookup(message.body().toString());
            message.reply(gender);
        });
    }

    private String singleIndexLookup(String name) { ... }
}

Solution

  • There are several questions here and some misconceptions about Vert.x. Once you implement your code using verticles you do not need to implement your own main method: under the hood, the Vert.x launcher makes sure you can use the full capacity of your CPU, so you do not need to scale it yourself with:

    //Deploy worker verticles
    DeploymentOptions deploymentOptions = new DeploymentOptions()
      .setInstances(Runtime.getRuntime().availableProcessors())
    

    You should read the relevant section of the Vert.x documentation.

    Second, you refer to your GenderVerticle as a worker because it performs some operation for you. Note that in Vert.x, worker means that the verticle is executed on a dedicated thread pool, since the code in that verticle may perform blocking IO.

    Using worker mode introduces a performance penalty, since you lose the benefits of asynchronous IO and your requests need to queue for a thread from the pool.
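    To see what that queueing looks like, here is a plain java.util.concurrent sketch (standard Java, not Vert.x): two pool threads stand in for busy workers, the bounded queue holds three waiting requests, and the next request is rejected outright.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class WorkerPoolQueueDemo {
    public static void main(String[] args) throws Exception {
        CountDownLatch release = new CountDownLatch(1);
        // 2 "worker" threads, at most 3 waiting requests, reject the rest.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(3),
                new ThreadPoolExecutor.AbortPolicy());
        int rejected = 0;
        for (int i = 0; i < 6; i++) {
            try {
                // Each task blocks, simulating a worker that is stuck on a request.
                pool.execute(() -> {
                    try { release.await(); } catch (InterruptedException ignored) { }
                });
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        // 6 submitted = 2 running + 3 queued + 1 rejected
        System.out.println("rejected=" + rejected);
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

    The same trade-off applies whatever the pool: once the workers and the queue are full, you must either drop, wait, or add capacity.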

    Since your example explains that all your code does is an in-memory look-up, I assume it is CPU-bound rather than IO-bound, which means you should avoid deploying it as a worker.

    Going back to your example: you have one verticle that handles all the incoming traffic and a second one that processes it. For top performance you might want just one verticle, since there are fewer hops, but that solution does not scale horizontally - which is the reason for your question: how do I handle 3000 rps when one node can only do 1000 rps?

    Now you are already on the right path: you split network handling from business handling. That has a small penalty, but if you know that one node can process 1000 rps and you must handle at least 3000 rps, all you need to do is deploy the GenderVerticle on 3 extra machines.
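    For context, the Vert.x event bus delivers send() messages round robin among the consumers registered on an address. A minimal plain-Java model of that distribution (the Address class and node names are illustrative, not Vert.x API):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinDemo {

    /** Model of one event-bus address with several registered consumers. */
    static class Address {
        private final List<String> consumers;
        private final AtomicInteger next = new AtomicInteger();

        Address(List<String> consumers) { this.consumers = consumers; }

        /** Picks consumers in round-robin order, like vertx.eventBus().send(). */
        String send(String message) {
            return consumers.get(Math.floorMod(next.getAndIncrement(), consumers.size()));
        }
    }

    public static void main(String[] args) {
        // The original node plus 3 extra GenderVerticle machines = 4 consumers.
        Address getGender = new Address(List.of("node1", "node2", "node3", "node4"));
        Map<String, Integer> handled = new LinkedHashMap<>();
        for (int i = 0; i < 3000; i++) {          // one second of peak traffic
            handled.merge(getGender.send("Alice"), 1, Integer::sum);
        }
        System.out.println(handled);              // the peak is spread evenly
    }
}
```

    With 4 consumers, each node ends up handling 750 of the 3000 requests, which brings every node back under its 1000 rps budget.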

    Once you do this and enable clustering - which you do by adding a cluster manager dependency (e.g. Hazelcast):

    <dependency>
      <groupId>io.vertx</groupId>
      <artifactId>vertx-hazelcast</artifactId>
      <version>3.3.3</version>
    </dependency> 
    

    and by starting your application with the --cluster flag, you will have a cluster of 4 machines where the requests are load balanced in a round-robin fashion to each of the GenderVerticles.
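    For reference, the same thing can be done programmatically (a sketch against the Vert.x 3 API; it needs vertx-core and the cluster manager on the classpath, so it is not runnable standalone):

```java
import io.vertx.core.Vertx;
import io.vertx.core.VertxOptions;

public class ClusteredMain {
    public static void main(String[] args) {
        // Programmatic equivalent of starting with --cluster.
        Vertx.clusteredVertx(new VertxOptions(), res -> {
            if (res.succeeded()) {
                res.result().deployVerticle("GenderVerticle");
            } else {
                res.cause().printStackTrace();
            }
        });
    }
}
```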

    Since the network code is highly optimized by Netty, you will probably not need more than one server. If that is not the case, one option is to add a traffic load balancer in front of your servers and deploy another ServerVerticle on another machine in the cluster; the load balancer will then split the traffic between the 2 servers, each of which round-robins to the GenderVerticles.

    So I guess you start to see the pattern: once your monitoring tells you that CPU or network IO is maxed out, you add more machines to the cluster.