spring-boot apache-kafka concurrency microservices kafka-partition

Spring Boot Kafka: multiple microservice instances, concurrency and partitions


I have a question about publishing and reading messages in Kafka in a microservice architecture where several instances of the same microservice write and read. My main concern is that the microservices that publish and consume are configured with autoscaling, with a default of 1 instance.

I have an entity, let's call it "Event", that is stored in the database, and each entity has its own ID there. When certain commands are executed on a specific entity (say, entityID = ajsha87), a message must be published that will be read by a consumer. If the messages for the same entity are written to different partitions and consumed at the same time (a concurrency issue), I will have a lot of problems.

My question is whether I can choose, based on the entityID, the partition to which all events of that entity are published. I don't care which partition an entity with a different ID goes to, but messages for the same entity must always be published to the same partition, so that a consumer never reads message (2) before message (1), which was published earlier. Is there a mechanism for this, or do I have to pick a random partition each time I save an entity and store in the database the partition ID to which its messages will be published?

The same applies to consumers. Only one consumer should read a partition at a time; otherwise, consumer 1 could read message (1) from partition (1) related to entity (ID=78198), another consumer could then read message (2) from partition (1) related to the same entity, and message (2) could be processed before message (1).

Is there a mechanism to subscribe each instance to only one partition, in line with the microservice autoscaling?

Another option would be to dynamically assign a partition to each new publisher instance, but I don't know how to configure that dynamically so that each microservice instance gets a different partition ID.

I am using Spring Boot, by the way.

Thanks for your answers and recommendations, and sorry if my English is not good enough.


Solution

  • If you use a hash partitioner in the producer config (this is the default partitioner in many libraries) and use the same key for the same entity (say, entityID = ajsha87), Kafka sends all messages with the same key to the same partition, as long as the number of partitions in the topic does not change.

    If you are using a consumer group, one consumer instance takes responsibility for a partition, and all messages published to that partition are consumed by that instance only. The owning instance can change when a rebalance happens during upscaling, but at any given time the messages in a partition are still read by a single consumer instance of the group.
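
To make the two points above concrete, here is a minimal plain-Java sketch of the idea. It is not Kafka's actual code. The Java client's default partitioner uses a murmur2 hash of the serialized key, while this sketch swaps in a simple polynomial hash. The real partition assignment is negotiated by the consumer group protocol, while this sketch uses a naive round-robin. The class name, method names, instance names and the partition count of 6 are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: NOT Kafka's real partitioner or group coordinator.
class KeyedPartitioningSketch {

    // Producer side: a deterministic hash of the key, modulo the partition count.
    // Assumption: a simple polynomial hash stands in for Kafka's murmur2.
    static int partitionFor(String key, int numPartitions) {
        int hash = 0;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash = 31 * hash + b;
        }
        // Mask the sign bit so the result is never negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    // Consumer side: every partition gets exactly one owner within the group.
    // Assumption: naive round-robin stands in for Kafka's assignment strategies.
    static Map<Integer, String> assign(int numPartitions, List<String> consumers) {
        Map<Integer, String> owner = new TreeMap<>();
        for (int p = 0; p < numPartitions; p++) {
            owner.put(p, consumers.get(p % consumers.size()));
        }
        return owner;
    }

    public static void main(String[] args) {
        int partitions = 6; // hypothetical partition count

        // Two events for the same entity resolve to the same partition.
        System.out.println("ajsha87 -> partition " + partitionFor("ajsha87", partitions));
        System.out.println("ajsha87 -> partition " + partitionFor("ajsha87", partitions));

        // Scaling from one to two instances redistributes partitions,
        // but each partition still has a single owner.
        System.out.println(assign(partitions, List.of("instance-1")));
        System.out.println(assign(partitions, List.of("instance-1", "instance-2")));
    }
}
```

In Spring Kafka, the key is normally supplied through the keyed overload `KafkaTemplate.send(topic, key, value)`, passing the entityID as the key. `@KafkaListener` instances that share the same `groupId` form one consumer group. So you should not need to store a partition ID in the database or pin instances to partitions by hand.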