I'm designing an event-driven distributed system.
One of the event types we need to distribute requires (1) low latency and (2) high availability.
Durability of the messages and consistency between replicas are not that important for this event type.
Reading the Kafka documentation, it seems that all in-sync replicas for a partition must have applied a message to their log before consumers can read it from any replica.
Is my understanding correct? If so, is there a way around it?
Yes, that is the default behavior. Depending on the configuration, however, consumers can read data that has not yet been written to every replica.
As per the book,
Data is only available to consumers after it has been committed to Kafka, meaning it was written to all in-sync replicas.
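If you have configured min.insync.replicas=1, the leader alone is enough to form the in-sync set. When the followers fall behind and drop out of that set, Kafka does not wait for them to catch up: it commits the message once the leader has written it and serves the data to consumers.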
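As a rough illustration of the commit rule quoted above, here is a toy model of a single partition. It is not Kafka code: the `Partition` class and the replica names are made up for this sketch, and it assumes only that consumers can read up to the smallest log end offset among the in-sync replicas (the high watermark).

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy model of one partition (hypothetical, not a Kafka API)."""
    log_end: dict = field(default_factory=dict)  # replica id -> next offset to be written
    isr: set = field(default_factory=set)        # in-sync replica ids, leader included

    def high_watermark(self) -> int:
        # Consumers may read offsets strictly below this value.
        return min(self.log_end[r] for r in self.isr)


p = Partition(
    log_end={"leader": 5, "f1": 5, "f2": 3},
    isr={"leader", "f1", "f2"},
)
hw_full_isr = p.high_watermark()     # held back by the slowest in-sync follower

p.isr = {"leader"}                   # followers have dropped out of the in-sync set
hw_leader_only = p.high_watermark()  # the leader's own log end is now sufficient
```

In this model the readable range grows from offset 3 to offset 5 as soon as the lagging followers leave the in-sync set, which is the situation that min.insync.replicas=1 allows the leader to keep operating in.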
The recommended value for min.insync.replicas depends on the type of application. If you can tolerate losing data, it can be 1; if the data is critical, configure it to be greater than 1.
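The trade-off can be sketched with two hypothetical settings profiles. The key names `acks` and `min.insync.replicas` are real Kafka settings, but the profile names, the values chosen, and the `write_accepted` helper are assumptions made for illustration. The helper models one rule in simplified form: min.insync.replicas is only enforced for producers that use acks=all.

```python
# Hypothetical profiles; in a real deployment "acks" is a producer setting
# and "min.insync.replicas" is a topic or broker setting.
low_latency = {"acks": "1", "min.insync.replicas": 1}
critical = {"acks": "all", "min.insync.replicas": 2}


def write_accepted(settings: dict, isr_size: int) -> bool:
    """Return True if a produce request would be accepted in this simplified model."""
    # With acks=0 or acks=1 the size of the in-sync set is not checked.
    if settings["acks"] != "all":
        return True
    # With acks=all the write is rejected when too few replicas are in sync.
    return isr_size >= settings["min.insync.replicas"]
```

With only the leader in sync (`isr_size=1`), the low-latency profile keeps accepting writes while the critical profile rejects them, trading availability for durability.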
There are two things you should think about: