apache-kafka confluent-platform

Kafka max.request.size VS compression.type


I am testing whether compression reduces the size of messages sent to a topic. My sample topic is configured with 'max.message.bytes=1024000' (~1 MB), and I set the producer's 'max.request.size' to the same value. I then tried to send a string of 1573015 bytes (~1.5 MB), which throws the error below, as expected.

org.apache.kafka.common.errors.RecordTooLargeException: The message is 1573015 bytes when 
serialized which is larger than 1048576, which is the value of the max.request.size configuration.

Next, since I want the producer to be responsible for compression, I set the producer property 'compression.type' to 'zstd' (I also tried gzip), but the producer throws the same error. I expected compression to reduce the message to <1 MB on the producer before it is sent.

I observed the same behaviour with 'compression.type' set at the topic level, at the producer level, or at both. (I would like to avoid setting it at the broker level, though, since I want it to take effect only for a specific topic or the producers of that topic.)

I want to understand whether compression.type actually reduces the size of the message sent from the producer to the Kafka broker, with the broker then unpacking it, checking the size of the uncompressed message, and throwing this error. Or is there a configuration error on the producer that prevents compression from happening in the first place?

I would much appreciate it if anyone could shed some light on how max.request.size interacts with compression.type.

Using a standalone program, I verified that the sample message I used for this test can be compressed to <1 MB with both gzip and zstd. The Kafka version I used for this test is Confluent Kafka Platform 8.0, running as a single-node cluster locally on Ubuntu under WSL.

Compression property reference: https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html#compression-type


Solution

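  • The message is never sent to the broker, because your producer validates its size first. That check runs on the serialized, uncompressed record (note the "when serialized" in the error), before any compression is applied, so compression.type cannot bring the message under max.request.size.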

    The broker has a completely separate setting, message.max.bytes: "Sets the maximum size for a message that can be accepted by the broker. The default is 1 MB." Exceeding either limit raises RecordTooLargeException, but your error message mentions max.request.size, which points to the producer configuration.

    The documentation clearly states that the producer is responsible for compression, but the broker can recompress if the topic's compression.type differs from the producer's compression.type.

    https://www.confluent.io/blog/apache-kafka-message-compression/#configuring-compression-type
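
    The ordering described above can be illustrated with a small JDK-only sketch. It does not use the Kafka client; the class and method names are hypothetical, the 1048576 limit is taken from the error message in the question, and the payload is a stand-in for the real message. It also ignores the small per-record overhead the real producer adds. The point is only that a size check on the serialized bytes rejects the record even though the same bytes would fit once gzipped.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only: mimics "check size first, compress later".
class RequestSizeCheckSketch {

    // Stand-in for the producer's max.request.size (value from the error message).
    static final int MAX_REQUEST_SIZE = 1_048_576;

    // Builds a highly compressible payload of the requested size (assumed sample data).
    static byte[] samplePayload(int size) {
        byte[] payload = new byte[size];
        Arrays.fill(payload, (byte) 'a');
        return payload;
    }

    // Gzips the given bytes in memory and returns the compressed form.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Simplified version of the producer-side check: it looks at the
    // serialized (uncompressed) size, not at the compressed size.
    static boolean passesProducerSizeCheck(byte[] serialized) {
        return serialized.length <= MAX_REQUEST_SIZE;
    }

    public static void main(String[] args) {
        byte[] serialized = samplePayload(1_573_015);
        byte[] compressed = gzip(serialized);

        System.out.println("serialized bytes:  " + serialized.length);
        System.out.println("gzip bytes:        " + compressed.length);
        System.out.println("passes size check: " + passesProducerSizeCheck(serialized));
        System.out.println("would fit if compressed first: "
                + (compressed.length <= MAX_REQUEST_SIZE));
    }
}
```

    If this matches what you are seeing, the usual options are to raise max.request.size on the producer (keeping the topic's max.message.bytes as the limit on what the broker accepts) or to compress the payload yourself before handing it to the producer.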