apache-kafka, kafka-consumer-api, scalability

Number of instances of a microservice exceeding the number of partitions in a topic


I am trying to learn Kafka with microservices (Spring Boot). I came across a video lecture which says that the number of partitions in a topic is static and can't be changed dynamically. That means that if I assign 3 partitions to a topic and have 3 instances of a microservice (all in the same consumer group) consuming from it, then when the load increases and a 4th instance is created in the same consumer group, that instance has to sit idle, as there is no partition left for it to read from. And if I instead add that instance to a different consumer group, messages from the topic are guaranteed to be read multiple times by the same microservice (although by a different instance).
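To make the idle-consumer scenario concrete, here is a small, self-contained sketch of how a range-style assignor distributes 3 partitions across 4 group members. This is a simplified simulation for illustration, not the actual Kafka client code, and the member names `c1`..`c4` are made up:

```java
import java.util.*;

public class RangeAssignorDemo {
    // Simplified range assignment: sort members, divide partitions evenly,
    // and give any remainder to the first members in sort order.
    static Map<String, List<Integer>> assign(int partitions, List<String> members) {
        List<String> sorted = new ArrayList<>(members);
        Collections.sort(sorted);
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int n = sorted.size();
        int perMember = partitions / n;
        int extra = partitions % n;
        int next = 0;
        for (int i = 0; i < n; i++) {
            int count = perMember + (i < extra ? 1 : 0);
            List<Integer> assigned = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                assigned.add(next++);
            }
            result.put(sorted.get(i), assigned);
        }
        return result;
    }

    public static void main(String[] args) {
        // 3 partitions, 4 consumers in the same group: c4 gets nothing.
        System.out.println(assign(3, List.of("c1", "c2", "c3", "c4")));
        // → {c1=[0], c2=[1], c3=[2], c4=[]}
    }
}
```

With 3 partitions and 4 members, the fourth member ends up with an empty assignment, which is exactly the idle instance described above.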

This puts a hard limit on the number of instances I can scale to. If there were a provision to increase the number of partitions in the topic whenever the instance count crosses the partition count, that would help, but it would bring its own problems, like rebalancing the messages (payload) across the partitions (including the newly created ones), which seems too volatile and not very stable.

Please tell me where I am going wrong in my understanding, and what the best practices are.


Solution

  • That summary is correct; the size of the consumer group (threads or processes/instances) is limited by the partition count.

    Partitions can actually be added to an existing topic, but increasing the count would impact the key-to-partition mapping (and therefore the ordering) of existing data, not just trigger a group rebalance; and the partition count can never be reduced.
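    For reference, the partition count of an existing topic can be increased (never decreased) with the standard `kafka-topics.sh` tool; the topic name `orders` and the broker address below are placeholders:

    ```shell
    # Raise the partition count of topic "orders" to 6.
    # Caveat: keyed records may now hash to different partitions than before,
    # so per-key ordering is not preserved across the change.
    kafka-topics.sh --bootstrap-server localhost:9092 \
      --alter --topic orders --partitions 6
    ```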

    Idle consumers act as warm standbys, but are otherwise wasted resources, and there's no way around that without abandoning consumer groups altogether and manually handling partition assignment and offset commits yourself. Even then, you wouldn't want more than one consumer handling the same partition unless each one works a different segment of the offset range.
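    That last idea, splitting one partition's backlog across workers by offset range, is just arithmetic, so it can be sketched without a broker. This is a hypothetical helper for illustration, not a Kafka API:

    ```java
    import java.util.*;

    public class OffsetRangeSplitter {
        // Splits the offset range [start, end) into `workers` contiguous
        // segments. Each long[] holds {segmentStart, segmentEnd}; earlier
        // segments absorb the remainder when the split is uneven.
        static List<long[]> split(long start, long end, int workers) {
            List<long[]> segments = new ArrayList<>();
            long total = end - start;
            long base = total / workers;
            long extra = total % workers;
            long cursor = start;
            for (int i = 0; i < workers; i++) {
                long size = base + (i < extra ? 1 : 0);
                segments.add(new long[] {cursor, cursor + size});
                cursor += size;
            }
            return segments;
        }

        public static void main(String[] args) {
            // 100 offsets split across 3 workers: segment sizes 34, 33, 33.
            for (long[] seg : split(0, 100, 3)) {
                System.out.println(seg[0] + " .. " + seg[1]);
            }
        }
    }
    ```

    Each worker would then seek to its segment start and stop at its segment end; the coordination and failure handling that consumer groups normally provide becomes your problem, which is why this is rarely worth doing.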