apache-kafka kafka-consumer-api

What is the difference in Kafka between a Consumer Group Coordinator and a Consumer Group Leader?


I see references to Kafka Consumer Group Coordinators and Consumer Group Leaders...

  1. What is the difference?

  2. What is the benefit from separating group management into two different sets of responsibilities?


Solution

  • 1. What is the difference?

    The consumer group coordinator is one of the brokers, while the group leader is one of the consumers in a consumer group.

    The group coordinator is the broker that receives heartbeats from all consumers of a consumer group (in older clients, heartbeats were sent as part of polling for messages). Every consumer group has a group coordinator. If a consumer stops sending heartbeats, the coordinator considers it dead and triggers a rebalance.
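To make the heartbeat rule concrete, here is a toy model of the coordinator's liveness tracking. It is a sketch only, not broker code: the class `CoordinatorHeartbeatSketch` and its methods are hypothetical names, and the timeout plays the role of the consumer's `session.timeout.ms` setting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model of a group coordinator's liveness tracking.
class CoordinatorHeartbeatSketch {
    private final long sessionTimeoutMs;
    // memberId -> timestamp (ms) of the last heartbeat received
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    CoordinatorHeartbeatSketch(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    // Called whenever a heartbeat from a member arrives.
    void heartbeat(String memberId, long nowMs) {
        lastHeartbeat.put(memberId, nowMs);
    }

    // Drops members whose last heartbeat is older than the session timeout.
    // Returns true if membership changed, i.e. a rebalance would be triggered.
    boolean expireDeadMembers(long nowMs) {
        return lastHeartbeat.values().removeIf(ts -> nowMs - ts > sessionTimeoutMs);
    }

    Set<String> members() {
        return new TreeSet<>(lastHeartbeat.keySet());
    }

    public static void main(String[] args) {
        CoordinatorHeartbeatSketch coordinator = new CoordinatorHeartbeatSketch(30_000);
        coordinator.heartbeat("consumer-a", 0);
        coordinator.heartbeat("consumer-b", 0);
        coordinator.heartbeat("consumer-a", 20_000); // consumer-b goes silent
        boolean rebalance = coordinator.expireDeadMembers(31_000);
        System.out.println("rebalance needed: " + rebalance);
        System.out.println("live members: " + coordinator.members());
    }
}
```

In this run `consumer-b` sends no heartbeat for more than 30 seconds, so the model drops it and reports that a rebalance is needed.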

    2. What is the benefit from separating group management into two different sets of responsibilities?

    Short answer

    It allows more flexible and extensible assignment policies without restarting the broker.

    Long answer

    The key point of this separation is that the group leader is responsible for computing the assignments for the whole group.

    This means that the assignment strategy can be configured on the consumer side (see the partition.assignment.strategy consumer configuration parameter).

    If partition assignment were handled by the group coordinator, it would be impossible to configure a custom assignment strategy without restarting the broker.
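As an illustration of the client-side setting, the following sketch builds consumer properties that select an assignor. The broker address and group name are placeholders, and `AssignmentStrategyConfigSketch` is a hypothetical helper; `org.apache.kafka.clients.consumer.RoundRobinAssignor` is one of the assignors shipped with the Java client. The properties would normally be passed to a `KafkaConsumer`, which is left out here so the example needs only the JDK.

```java
import java.util.Properties;

// Hypothetical helper: builds consumer properties with a chosen assignor.
class AssignmentStrategyConfigSketch {
    static Properties build(String assignorClassName) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("group.id", "example-group");           // placeholder group name
        // Chosen by the consumer, not the broker: changing it only
        // requires redeploying the consumers.
        props.setProperty("partition.assignment.strategy", assignorClassName);
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("org.apache.kafka.clients.consumer.RoundRobinAssignor");
        System.out.println(props.getProperty("partition.assignment.strategy"));
    }
}
```

Switching to a different or custom assignor is then a matter of changing this one value in the consumers' configuration.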

    For more details see Kafka Client-side Assignment Proposal.

    Quotes from documentation

    From "Kafka: The Definitive Guide" [Narkhede, Shapira & Palino, 2017]:

    When a consumer wants to join a consumer group, it sends a JoinGroup request to the group coordinator. The first consumer to join the group becomes the group leader. The leader receives a list of all consumers in the group from the group coordinator (this will include all consumers that sent a heartbeat recently and are therefore considered alive) and it is responsible for assigning a subset of partitions to each consumer. It uses an implementation of the PartitionAssignor interface to decide which partitions should be handled by which consumer.

    [...] After deciding on the partition assignment, the consumer leader sends the list of assignments to the GroupCoordinator which sends this information to all the consumers. Each consumer only sees his own assignment - the leader is the only client process that has the full list of consumers in the group and their assignments. This process repeats every time a rebalance happens.
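The flow described in the quote can be sketched as a small in-memory simulation. This is a simplified model under stated assumptions, not the real protocol: `GroupProtocolSketch` and its methods are hypothetical, a single topic with numbered partitions is assumed, and the leader uses a plain round-robin rule in place of a real assignor implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of the join / assign / distribute steps.
class GroupProtocolSketch {
    // Coordinator side: the first member to join becomes the leader.
    static String electLeader(List<String> membersInJoinOrder) {
        return membersInJoinOrder.get(0);
    }

    // Leader side: compute the assignment for the whole group (round-robin).
    static Map<String, List<Integer>> assignRoundRobin(List<String> members, int partitionCount) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String member : members) {
            assignment.put(member, new ArrayList<>());
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = members.get(partition % members.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    // Coordinator side: each member is handed only its own slice.
    static List<Integer> assignmentFor(Map<String, List<Integer>> full, String memberId) {
        return Collections.unmodifiableList(full.getOrDefault(memberId, Collections.emptyList()));
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("consumer-a", "consumer-b", "consumer-c");
        String leader = electLeader(members);
        Map<String, List<Integer>> full = assignRoundRobin(members, 7);
        for (String member : members) {
            String role = member.equals(leader) ? " (leader)" : "";
            System.out.println(member + role + " -> " + assignmentFor(full, member));
        }
    }
}
```

With three members and seven partitions, `consumer-a` is the leader and receives partitions 0, 3 and 6, `consumer-b` receives 1 and 4, and `consumer-c` receives 2 and 5. Only the leader ever holds the full map; the other members see just the list returned for their own id.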