apache-kafka performance-testing latency throughput

Apache Kafka Producer Throughput and Latency


I set up a Kafka cluster on my local machine and tested producers at different target throughputs to see what happens to the latency.

I ran these tests with the Kafka producer performance test tool (kafka-producer-perf-test): https://docs.cloudera.com/runtime/7.2.10/kafka-managing/topics/kafka-manage-cli-perf-test.html

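I ran several tests, changing the target throughput for the Kafka producer each time.

Test 1: target throughput of 2 records/sec
Test 2: target throughput of 200 records/sec
Test 3: target throughput of 2,000 records/sec
Test 4: target throughput of 20,000 records/sec
Test 5: target throughput of 200,000 records/sec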
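
For reference, here is a minimal sketch of how the five runs above could be launched. It only builds and prints the command lines. The script name kafka-producer-perf-test.sh is the one shipped with Apache Kafka (Cloudera installs it without the .sh suffix). The topic name, record size, run duration, and broker address are assumptions for illustration, not the values used in the tests.

```python
import shlex

# Target throughputs from the five tests (records per second).
TARGETS = [2, 200, 2_000, 20_000, 200_000]


def build_perf_test_cmd(throughput, duration_s=60, topic="perf-test",
                        record_size=1_000, bootstrap="localhost:9092"):
    """Build one kafka-producer-perf-test command line as a list of arguments.

    duration_s, topic, record_size and bootstrap are hypothetical defaults.
    The record count is scaled so that each run lasts roughly duration_s
    seconds when the producer keeps up with the target.
    """
    num_records = max(1, throughput * duration_s)
    return [
        "kafka-producer-perf-test.sh",
        "--topic", topic,
        "--num-records", str(num_records),
        "--record-size", str(record_size),  # bytes per record
        "--throughput", str(throughput),    # upper bound in records/sec
        "--producer-props", f"bootstrap.servers={bootstrap}",
    ]


for target in TARGETS:
    # Print the command instead of running it, so the sketch needs no broker.
    print(shlex.join(build_perf_test_cmd(target)))
```

The --throughput option throttles the producer to approximately that many records per second, so it acts as a ceiling and not a guarantee. Each printed line can be pasted into a shell on a machine that has the Kafka tools installed.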

Throughput for Kafka Producer

From my perspective, throughput is the number of messages that arrive in a given amount of time.

For all tests, the target throughput is equal to the records sent per second, except for Test 5, where only 22k records/sec were sent. Does this mean that my producer cannot handle this throughput?

I am trying to understand the meaning of this.

I ran a lot of tests.


Solution

    1. I don't see a big difference between Test 4 and Test 5, which means that either you have reached the maximum throughput for the given hardware configuration or you need to properly tune Kafka for high loads.
    2. Running the load generator and the application under test on the same machine is not the best idea, because they compete for the same CPU, memory, disk, and network resources. Also, using a dedicated load testing tool like Apache JMeter can give you better control over the workload model and reporting.
    3. Running performance tests against a scaled-down environment won't tell you the full story, and you won't be able to extrapolate the results, especially for complex applications like Kafka. You need to run your tests against a production or production-like environment to get accurate metrics.
    4. I would recommend increasing the load gradually. This way you will be able to correlate the increasing load with the increasing throughput and determine the saturation point and the bottleneck more precisely.
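
The saturation check from points 1 and 4 can be sketched in a few lines. This is a rough illustration, not part of any Kafka tooling: the (target, achieved) pairs are the figures reported in the question, while the 95% tolerance and the function names are assumptions chosen for the example.

```python
# (target, achieved) in records/sec, as reported in the question:
# Tests 1-4 reached their target, Test 5 reached 22k instead of 200k.
RESULTS = [
    (2, 2),
    (200, 200),
    (2_000, 2_000),
    (20_000, 20_000),
    (200_000, 22_000),
]


def find_saturation(results, tolerance=0.95):
    """Return the first (target, achieved) pair that misses its target.

    A run counts as saturated when achieved < tolerance * target.
    The 0.95 tolerance is an arbitrary assumption. Returns None if
    every run kept up with its target.
    """
    for target, achieved in sorted(results):
        if achieved < tolerance * target:
            return target, achieved
    return None


def ramp_targets(low, high, steps):
    """Return `steps` evenly spaced targets strictly between low and high."""
    width = (high - low) / (steps + 1)
    return [round(low + width * i) for i in range(1, steps + 1)]


saturated = find_saturation(RESULTS)
print("first saturated run:", saturated)

# Next runs to try: smaller increments between the last target that was
# met (20,000) and the first one that was missed (200,000).
print("next targets:", ramp_targets(20_000, 200_000, 5))
```

With the numbers from the question, the first run that misses its target is Test 5, so the ceiling for this setup lies somewhere between 20,000 and 200,000 records/sec, and the achieved 22k suggests it is close to the lower end. Re-running with intermediate targets narrows that range.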