One application (producer) is publishing messages and these messages are being consumed by another application (with multiple consumers). Producer sends data with field country and we will have multiple consumers in our application, each consumer will subscribe to specific country.
From what I have been reading so far, we can have 2 approaches to filter message:
My question is should we choose option 1 or 2? We are expecting to receive hundreds of messages every few seconds.
In my experience typically the first approach is used.
The second option is problematic. What if you add a new country? You will need to add a partition to the topic, which is possible but not straightforward. You will also need to change the logic on the producer and conusmer side. If consumers are just subscribed to the topic, then in case of failure partitions will be automatically assigned to the alive consumers inside the consumer group. In your case you will need to handle the failures with the programming logic.
Another approach is to have a topic per country.
One more approach is to publish all the data into one topic and then distribute data to other topics(each per consumer) with Kafka Streams application. If the requirements change then you change the implementation of Kafka Streams app.