apache-kafka

Figure out if all messages were consumed in Kafka


I'm new to Kafka. I'm sending a set of messages to a Kafka topic with multiple partitions and consuming them. Once all related messages have been consumed, I want to send a notification.

For example:

Create a topic with 3 partitions

for 1 to n:
   sendItemToKafkaTopic(item-unique-key) 
end

then consume it:

processItem (1 to n) on a scalable system and, once the n-th item has been processed, send a notification that the job is completed.

Note: the producer side should work concurrently, so there won't be a single bulk operation.

Any suggestions? BTW, a Kafka-based solution is not required. If you have other options, please share.


Solution

  • Found an amazing post about this: https://medium.com/@debyroth340/identify-job-completion-in-multi-phase-kafka-consumers-33ee8a974963

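    1. Complete the producer steps.
    2. Take a snapshot of the topic's end offsets (one per partition) and store it as a record.
    3. Let the consumers keep processing and periodically take a snapshot of the consumer group's committed offsets.
    4. Once the committed offset of every partition has reached the recorded end offset (committed >= recorded), notify it as completed.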

    To take the snapshot, you can use the Kafka AdminClient.

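    // Needs java.util.HashMap/Map, org.apache.kafka.common.TopicPartition, org.apache.kafka.clients.admin.OffsetSpec.
    // Snapshot the end offset of every partition (3 in the example), not just partition 0.
    Map<TopicPartition, OffsetSpec> topicPartitionOffsets = new HashMap<>();
    for (int partition = 0; partition < 3; partition++) {
        topicPartitionOffsets.put(new TopicPartition(topicName, partition), OffsetSpec.latest());
    }
    // listOffsets is asynchronous; all().get() blocks until the offsets are available.
    var offsets = adminClient.listOffsets(topicPartitionOffsets).all().get();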
    

    To find the consumer groups, see "Apache Kafka get list of consumers on a specific topic". After finding the consumer groups, you can fetch their committed offsets with adminClient.listConsumerGroupOffsets(groupId).
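
The comparison in step 4 above could be sketched roughly like this. It is only a minimal sketch in plain Java: `OffsetCompletion` and `isComplete` are hypothetical names, not part of the Kafka API, and the two maps (partition number to offset) are assumed to have been filled from the AdminClient results, i.e. the end-offset snapshot taken after producing and the group's committed offsets. It relies on the fact that a committed offset points at the next record to read, so a partition is fully consumed once the committed offset equals the recorded end offset.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of the Kafka client library.
final class OffsetCompletion {
    private OffsetCompletion() {}

    /**
     * Returns true when, for every partition in the end-offset snapshot,
     * the consumer group's committed offset has reached the snapshot value.
     * Keys are partition numbers, values are offsets.
     */
    static boolean isComplete(Map<Integer, Long> endOffsetSnapshot,
                              Map<Integer, Long> committedOffsets) {
        for (Map.Entry<Integer, Long> entry : endOffsetSnapshot.entrySet()) {
            long end = entry.getValue();
            if (end == 0L) {
                // Nothing was ever written to this partition, so there is nothing to wait for.
                continue;
            }
            Long committed = committedOffsets.get(entry.getKey());
            // No commit yet, or the group is still behind the snapshot.
            if (committed == null || committed < end) {
                return false;
            }
        }
        return true;
    }
}
```

A scheduler or the last step of the consumer could call `isComplete` periodically and send the notification the first time it returns true. If several consumer groups read the topic, the check would presumably need to hold for each group.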