apache-kafka cloudera-cdh

How to scale single node Kafka to multiple node cluster?


I am going to install Kafka for company messaging. The plan is to first install Kafka on a single large machine and scale out to a cluster of 4-5 machines later if needed.

I have little experience with Kafka. Is it possible to scale out just by changing parameters in the broker configuration and installing ZooKeeper on the newly joined machines?

Or, roughly, what is the easiest way to do this? More specifically, I am using Cloudera Kafka in CDH.


Solution

  • To scale Kafka, first add more partitions to your topics if needed, using kafka-topics.sh. Then reassign partitions to your new brokers using kafka-reassign-partitions.sh.

    The reassignment tool replicates and moves your data to the new brokers automatically. You can run it for a whole topic or for a selected set of partitions.

    The complete documentation is here. Just take a look at section 6.
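As a rough sketch of the reassignment step, the small Python script below builds the JSON plan that kafka-reassign-partitions.sh reads through its --reassignment-json-file option. The topic name "events", the broker ids 1 to 4, and the round-robin placement are assumptions chosen for illustration, not something prescribed by Kafka or CDH.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor=1):
    """Spread the partitions of one topic round-robin over broker_ids.

    Returns a dict in the format read by kafka-reassign-partitions.sh:
    {"version": 1, "partitions": [{"topic", "partition", "replicas"}, ...]}
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    partitions = []
    for p in range(num_partitions):
        # The first replica in the list is the preferred leader for the partition.
        replicas = [
            broker_ids[(p + r) % len(broker_ids)]
            for r in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical values: topic "events" with 4 partitions, brokers 1-4, 2 replicas each.
plan = build_reassignment("events", 4, [1, 2, 3, 4], replication_factor=2)

# Save this output as e.g. reassign.json and pass it to the reassignment tool.
print(json.dumps(plan, indent=2))
```

With the output saved as reassign.json, the plan could then be applied with something like `kafka-reassign-partitions.sh --zookeeper zk-host:2181 --reassignment-json-file reassign.json --execute`, and checked afterwards with `--verify` in place of `--execute`. Here zk-host is a placeholder, and newer Kafka releases take `--bootstrap-server` instead of `--zookeeper`, so check the options of the version shipped with your CDH release. The tool also has a `--generate` mode that proposes a plan for you, which may be simpler than writing one by hand.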