
How many Kafka controllers are there in a cluster and what is the purpose of a controller?


The Kafka controller in a Kafka cluster is in charge of managing partition leaders and replication.

If there are 100 brokers in a Kafka cluster, is the controller just one Kafka broker? So out of the 100 brokers, is the controller the leader?

How would you know which broker is the controller?

Is the management of the Kafka Controller critical to Kafka system management?


Solution

  • Within a Kafka cluster, a single broker serves as the active controller, which is responsible for managing the state of partitions and replicas. So in your case, in a cluster with 100 brokers, exactly one of them acts as the controller at any given time.

    More details regarding the responsibilities of a cluster controller can be found here.

    To find out which broker is the controller of a cluster, first connect to ZooKeeper with the ZooKeeper CLI:

    ./bin/zkCli.sh -server localhost:2181 
    

    and then read the /controller znode:

    [zk: localhost:2181(CONNECTED) 0] get /controller
    

    The output should look like the following. The brokerid field (100 here) is the ID of the broker currently acting as the controller:

    {"version":1,"brokerid":100,"timestamp":"1506423376977"}
    cZxid = 0x191
    ctime = Tue Sep 26 12:56:16 CEST 2017
    mZxid = 0x191
    mtime = Tue Sep 26 12:56:16 CEST 2017
    pZxid = 0x191
    cversion = 0
    dataVersion = 0
    aclVersion = 0
    ephemeralOwner = 0x15ebdd241840002
    dataLength = 56
    numChildren = 0
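
If you want to pull the controller ID out of that output in a script, a minimal sketch could look like the one below. It only parses text shaped like the sample above and does not talk to ZooKeeper; the function name `parse_controller_znode` is a hypothetical helper, and the code assumes the JSON payload is on the first line and that the timestamp is in milliseconds since the Unix epoch, as the sample suggests.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_controller_znode(raw: str) -> dict:
    """Extract the controller's broker id and election time from `get /controller` output.

    Hypothetical helper: assumes the JSON payload is the first line of `raw`
    and that any znode stat lines (cZxid, ctime, ...) follow it.
    """
    first_line = raw.strip().splitlines()[0]
    payload = json.loads(first_line)
    # The timestamp is stored as a string; assumed to be milliseconds since the epoch.
    elected_at = EPOCH + timedelta(milliseconds=int(payload["timestamp"]))
    return {"broker_id": payload["brokerid"], "elected_at": elected_at}


sample = """{"version":1,"brokerid":100,"timestamp":"1506423376977"}
cZxid = 0x191
ctime = Tue Sep 26 12:56:16 CEST 2017
"""

info = parse_controller_znode(sample)
print(info["broker_id"])               # 100
print(info["elected_at"].isoformat())  # 2017-09-26T10:56:16.977000+00:00
```

The printed time is in UTC, which lines up with the ctime of 12:56:16 CEST in the stat output.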
    

    ZooKeeper stores the state of a Kafka cluster. It is used to elect the controller, both when the cluster first starts and when the current controller crashes. The controller, in turn, is responsible for telling other replicas to become partition leaders when the broker leading a partition fails or crashes.
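
To make the election and failover behaviour more concrete, here is a toy, in-memory sketch. It is not Kafka or ZooKeeper code: `ToyZooKeeper`, `elect` and `fail_over` are hypothetical names, the loop order stands in for the real race between brokers, and picking the first surviving replica is a simplification of how a new leader is actually chosen. The only behaviour it tries to mirror is that /controller is an ephemeral znode (note the non-zero ephemeralOwner in the output above), so it disappears when its owner's session ends and another broker can then claim it.

```python
class ToyZooKeeper:
    """In-memory stand-in for the /controller ephemeral znode (not a real client)."""

    def __init__(self):
        self.controller = None  # broker id that currently owns /controller

    def try_create_controller(self, broker_id):
        # Only the first broker to create the znode succeeds.
        if self.controller is None:
            self.controller = broker_id
            return True
        return False

    def session_expired(self, broker_id):
        # An ephemeral znode is removed when its owner's session ends.
        if self.controller == broker_id:
            self.controller = None


def elect(zk, live_brokers):
    """Every live broker tries to claim /controller; the first attempt wins."""
    for broker_id in live_brokers:
        if zk.try_create_controller(broker_id):
            break
    return zk.controller


def fail_over(leaders, replicas, failed):
    """Simplified controller duty: move leadership away from a failed broker."""
    for partition, leader in leaders.items():
        if leader == failed:
            survivors = [b for b in replicas[partition] if b != failed]
            leaders[partition] = survivors[0] if survivors else None
    return leaders


zk = ToyZooKeeper()
brokers = [100, 101, 102]
leaders = {"orders-0": 100, "orders-1": 101}
replicas = {"orders-0": [100, 101, 102], "orders-1": [101, 102, 100]}

first = elect(zk, brokers)       # broker 100 becomes the controller

# Broker 100 crashes: its session expires and /controller disappears.
brokers.remove(first)
zk.session_expired(first)

second = elect(zk, brokers)      # a surviving broker takes over as controller
leaders = fail_over(leaders, replicas, failed=first)

print(first, second)             # 100 101
print(leaders)                   # {'orders-0': 101, 'orders-1': 101}
```

There is still only one controller at every step, which is the point of the single /controller znode.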