python, apache-kafka, confluent-kafka-python, librdkafka

Confluent Kafka poll, when does a message get committed


I have a Python application that has autocommit=True and uses poll() to get messages at an interval of 1 second. I was reading the documentation, and it mentions that the client fetches messages in a background thread and queues them so that the main thread can pick them up afterwards. I am confused about what happens if I have multiple messages queued and my consumer crashes. Would the messages queued by the background thread already have been committed, and hence be lost?


Solution

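  • As mentioned in the docs, when auto-commit is enabled, the offsets of messages already returned by poll() get committed once every auto.commit.interval.ms. Messages that the background thread has fetched but that poll() has not yet handed to your application are not committed, so after a crash they are fetched again rather than lost. The risk runs the other way: a message that poll() has returned can be committed before your code has finished processing it.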

    If you are concerned about missing data, you should always disable auto-commits, in any Kafka client, and handle commits on your own after you know you've actually processed those records.
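As a rough illustration of the manual-commit approach described above, here is a minimal sketch using confluent-kafka-python. The broker address, group id, and the helper names `build_consumer` and `consume_and_commit` are placeholders I made up for this example, not part of the library. It assumes at-least-once processing is acceptable, so a message may be processed twice if the process dies between handling it and committing it.

```python
def build_consumer(bootstrap_servers="localhost:9092", group_id="example-group"):
    """Create a consumer with auto-commit disabled (address and group are placeholders)."""
    # Imported here so the loop below can be exercised without the library installed.
    from confluent_kafka import Consumer

    return Consumer({
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        "enable.auto.commit": False,      # we commit ourselves, after processing
        "auto.offset.reset": "earliest",
    })


def consume_and_commit(consumer, process, max_messages):
    """Poll, process, then commit each message. Returns the number committed."""
    committed = 0
    while committed < max_messages:
        msg = consumer.poll(1.0)          # wait up to 1 second for a message
        if msg is None:
            continue                      # nothing arrived within the timeout
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        process(msg.value())              # if this raises, the offset is never committed
        # Synchronous commit of this message's offset + 1, only after processing succeeded.
        consumer.commit(message=msg, asynchronous=False)
        committed += 1
    return committed
```

Usage would look roughly like `consumer = build_consumer()`, then `consumer.subscribe(["my-topic"])` (a hypothetical topic name), then `consume_and_commit(consumer, handle_record, 100)`, and finally `consumer.close()`. Committing synchronously after every message is the simplest option but also the slowest; committing once per batch is a common trade-off.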