We are trying to implement Kafka HA using a Kafka cluster. While doing R&D, we found that the minimum recommended number of nodes for both ZooKeeper and the Kafka brokers is 3.
We understand why ZooKeeper needs at least 3 nodes: leader election requires a majority of the ensemble, i.e. at least (n+1)/2 nodes for odd n, to be up and running.
But it's not clear why a minimum of 3 Kafka brokers is required. Why can't we implement HA with 2 Kafka brokers and 3 ZooKeeper nodes?
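ZooKeeper needs at least 3 nodes because of its quorum requirement: the ensemble only works while a majority of its nodes is up. The number of nodes should be odd because an even number adds no extra fault tolerance. E.g. an ensemble of 8 nodes tolerates 3 failures, exactly like an ensemble of 7, so it can be downsized to 7. A large number of ZooKeeper nodes isn't good either, because every write has to go through the consensus protocol (ZAB, which is similar to Paxos), and more voting nodes make writes slower.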
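To make the quorum arithmetic concrete, here is a minimal sketch in Python. The function names quorum_size and tolerated_failures are made up for this illustration and are not part of any ZooKeeper API; the only assumption is that a quorum is a strict majority of the voting nodes.

```python
def quorum_size(n: int) -> int:
    """Smallest strict majority of an ensemble with n voting nodes."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Number of nodes that can fail while a majority is still up."""
    return n - quorum_size(n)


if __name__ == "__main__":
    # Compare ensemble sizes side by side: going from an odd size
    # to the next even size does not raise the tolerated failures.
    for n in range(1, 10):
        print(f"nodes={n} quorum={quorum_size(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

Under this model, 7 and 8 nodes both tolerate 3 failures, and 2 nodes tolerate none, which is why 3 is the smallest ensemble that survives a node loss.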
For the Kafka cluster, I personally think 2 brokers are okay, but 3 brokers are better. The reason is how the ISR (in-sync replicas) is maintained.
Let's say your Kafka cluster has 2 brokers. To maintain high availability and data consistency, we set the replication factor to 2, so that with both brokers healthy the ISR also has 2 members. The interesting part is the min-ISR setting (min.insync.replicas). If you set min-ISR to 1 and the leader fails, you may not have any remaining in-sync replica that holds the latest data. If you set min-ISR to 2, then when either the leader or the follower fails, producers using acks=all can no longer write until the failed broker rejoins the ISR (consumers can still read what was already committed).
If our Kafka cluster has 3 brokers, we set the replication factor to 3 and min-ISR to 2. With this configuration, we can afford to lose 1 replica (either the leader or a follower) and keep working. For instance, if we lose the leader, there is at least one in-sync follower that can take over. If we lose one of the followers, the leader and the remaining follower still keep the ISR at 2, which satisfies min-ISR.
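The two-broker and three-broker cases above can be compared with a small Python model. This is only a sketch under simplifying assumptions, not Kafka code: each broker holds one replica of the partition, every live replica is in sync, and producers use acks=all. The function names are hypothetical.

```python
def live_replicas(replication_factor: int, failed_brokers: int) -> int:
    """Replicas still alive (assumption: one replica per broker)."""
    return max(replication_factor - failed_brokers, 0)


def accepts_acks_all_writes(replication_factor: int, min_isr: int,
                            failed_brokers: int) -> bool:
    """True if the in-sync replica count still meets min-ISR.

    Assumption: every live replica is in sync, so ISR size == live replicas.
    """
    return live_replicas(replication_factor, failed_brokers) >= min_isr


def guaranteed_extra_copies(min_isr: int) -> int:
    """Copies of an acknowledged write guaranteed besides the leader's."""
    return min_isr - 1


if __name__ == "__main__":
    # (replication factor, min-ISR) pairs discussed in the answer.
    for rf, min_isr in [(2, 1), (2, 2), (3, 2)]:
        writable = accepts_acks_all_writes(rf, min_isr, failed_brokers=1)
        print(f"rf={rf} min_isr={min_isr} "
              f"writable_after_one_failure={writable} "
              f"guaranteed_extra_copies={guaranteed_extra_copies(min_isr)}")
```

In this model only the 3-broker setup with min-ISR 2 both stays writable after one broker failure and guarantees that an acknowledged write exists on a second replica.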