apache-kafka kafka-consumer-api spring-kafka rebalancing

Consumer 'group_name' group is rebalancing forever


I am using Kafka 2.11-1.0.1 (Kafka 1.0.1 built for Scala 2.11). The application contains consumers with concurrency=5 for topic 'X', which has 5 partitions.

When the application is restarted and a message is published to topic 'X' before partition assignment, the 5 consumers of topic 'X' find the group coordinator and send a join group request to it. They are expected to get a response from the group coordinator, but no response is received.

I have checked the Kafka server logs, but I could not find any relevant entries, even at the DEBUG log level.

When I run the describe command for the consumer group, I observe the following:

  1. The consumer group is rebalancing.
  2. The old consumers are listed with some lag.
  3. New consumers appear with random names. As time goes on, the number of new consumers keeps increasing.

New messages are published to topic 'X', but the consumers do not receive them.

heartbeat.interval.ms and session.timeout.ms are left at their defaults.
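
For reference, the two consumer settings mentioned above should have the following defaults in Kafka 1.0.x, as far as I know. They are shown here as plain consumer properties; how they are actually passed to the consumers depends on the application, so treat the layout as an assumption.

```properties
# Consumer defaults in Kafka 1.0.x (not set explicitly by the application)
heartbeat.interval.ms=3000
session.timeout.ms=10000
```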

This problem occurs if a message is published to topic 'X' before partitions have been assigned to its consumers.

My question is: why does the rebalance never complete, so that the new consumers can start consuming the newly produced messages?


Solution

  • The application has the consumers below in its consumer group

    What happens on application restart if one of the topics already has a published message

    Root Cause:

    The Group Coordinator (GC) did not wait for all consumers to initialize after the application restart, so an unnecessary first rebalance happened, and consumerA1 fetched the message from the partition and started processing it.

    Solution: To avoid this unnecessary initial rebalance, Kafka provides a broker configuration that makes the Group Coordinator wait for more consumers to join a new consumer group before it performs the first rebalance (see the Kafka documentation):

    group.initial.rebalance.delay.ms

    I checked my Kafka server.properties; it was set to 0. I tried again with the default, i.e. 3 seconds. The initial rebalance was avoided: the GC waited 3 seconds on application restart, and in that time all the other consumers initialized. All consumers sent their join group requests, and once the GC had received requests from all of them, it responded without any delay. Rebalancing proceeded and completed successfully.
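
    As a minimal sketch of the change described above, assuming the setting lives in the broker's server.properties: 3000 ms is the Kafka default, and the broker has to be restarted for the new value to take effect.

    ```properties
    # Time the group coordinator waits for more consumers to join a new group
    # before performing the first rebalance. Was 0 here; 3000 ms is the default.
    group.initial.rebalance.delay.ms=3000
    ```

    A larger value gives slow-starting consumers more time to join before the first rebalance, at the cost of delaying the first partition assignment by the same amount.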