message-queueapache-kafkaretentionstream-compaction

How to test whether log compaction is working or not in Kafka?


I have made the changes in server.properties file in Kafka 0.8.1.1 i.e. added log.cleaner.enable=true and also enabled cleanup.policy=compact while creating the topic. Now when I am testing it, I pushed the following messages to the topic with following (Key, Message).

Now I pushed the 4th message with a same key as an earlier input, but changed the message. Here log compaction should come into picture. And using Kafka tool, I can see all the 4 offsets in the topic. How can I know whether log compaction is working or not? Should the earlier message be deleted, or the log compaction is working fine as the new message has been pushed. Does it have to do anything with the log.retention.hours or topic.log.retention.hours or log.retention.size configurations? What is the role of these configs in log compaction. P.S. - I have thoroughly gone through the Apache Documentation, but still it is not clear.


Solution

  • Actually, the log compaction is visible only when the number of logs reaches to a very high count eg 1 million. So, if you have that much data, it's good. Otherwise, using configuration changes, you can reduce this limit to say 100 messages, and then you can see that out of the messages with the same keys, only the latest message will be there, the previous one will be deleted. It is better to use log compaction if you have full snapshot of your data everytime, otherwise you may loose the previous logs with the same associated key, which might be useful.