apache-kafka, bigdata, cap-theorem

Where is Apache Kafka placed in the PACELC theorem?


I am starting to learn about Apache Kafka. This article (https://engineering.linkedin.com/kafka/intra-cluster-replication-apache-kafka) states that Kafka is a CA system within the CAP theorem, so it focuses on consistency between replicas and also on overall availability.

I recently heard about an extension of the CAP theorem called PACELC (https://en.wikipedia.org/wiki/PACELC_theorem). This theorem can be visualized like this:

[Diagram: the PACELC decision tree: if a Partition (P) occurs, trade Availability (A) against Consistency (C); Else (E), trade Latency (L) against Consistency (C)]

My question is how Apache Kafka could be described in PACELC terms. I would think that Kafka focuses on consistency when a partition occurs, but what about when no partition occurs? Is the focus on low latency or strong consistency?

Thanks!


Solution

  • This would depend on your configuration.

    Kafka is backed by CP ZooKeeper for operations that require strong consistency, such as controller election (which decides on partition leaders), broker registration, dynamic configs, ACLs, etc.
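    For example, a quick way to see the result of the controller election is to read the /controller znode (a sketch, assuming a Kafka distribution whose zookeeper-shell accepts the command as an argument and a ZooKeeper at the made-up address zookeeper:2181):

    # Shows which broker id currently holds the controller role
    zookeeper-shell zookeeper:2181 get /controller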
    As for the data you send to Kafka, the guarantees are configurable at the producer level, on a per-topic basis, and/or by changing the broker defaults.

    Out of the box, with the default config (min.insync.replicas=1, default.replication.factor=1), you get an AP system (at-most-once).

    If you want to achieve CP, you may set min.insync.replicas=2 and a topic replication factor of 3 - then producing a message with acks=all will guarantee a CP setup (at-least-once), but (as expected) it will block when not enough replicas (<2) are available for a particular topic/partition pair (see the design_ha and producer config docs).
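    A minimal sketch of such a CP-leaning setup, assuming the standard Kafka CLI tools are on the path and a broker is reachable at the made-up address kafka:9092 (the topic name is also made up):

    # Create a topic with 3 replicas that requires at least 2 in-sync replicas per write
    kafka-topics --create --topic orders --partitions 3 --replication-factor 3 \
      --config min.insync.replicas=2 --bootstrap-server kafka:9092

    # Produce with acks=all: a write is acknowledged only once the in-sync replicas have it
    kafka-console-producer --topic orders --bootstrap-server kafka:9092 \
      --producer-property acks=all

    With this setup, whenever fewer than 2 replicas are in sync, acks=all produce requests are rejected with NotEnoughReplicas (and retried) instead of silently dropping data.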

    The Kafka pipeline can be further tuned in the exactly-once direction.
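    A hedged sketch of the producer properties involved (the transactional.id value is made up, and the consuming side would additionally need isolation.level=read_committed for the end-to-end effect):

    # Producer settings pushing towards exactly-once semantics
    acks=all
    enable.idempotence=true
    transactional.id=my-transactional-producer-1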

    CAP and PACELC
    In terms of PACELC, some latency-improving decisions have already been made in the defaults. For example, Kafka by default does not fsync each message to disk - it writes to the page cache and lets the OS deal with flushing. The defaults prefer to use replication for durability. This is configurable as well - see the flush.messages and flush.ms broker/topic configurations.
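    For instance, to trade latency for stronger on-disk durability on a single topic, one could force an fsync after every message (the topic name and broker address are placeholders):

    # Flush to disk after each message on topic 'test' (hurts latency, helps single-node durability)
    kafka-configs --alter --entity-type topics --entity-name test \
      --add-config flush.messages=1 --bootstrap-server kafka:9092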

    Due to the generic nature of the messages it receives (it is just a byte stream), it cannot do any post-partition merging or use CRDT tricks to guarantee availability during a partition and eventually restore consistency.

    I don't see how you could give up consistency for latency during normal operation in Kafka's generic byte-stream case. You might give up strong consistency (linearizability) and try to have 'more consistency' (covering a few more failure scenarios, or reducing the size of data loss), but this is effectively tuning an AP system for higher consistency rather than tuning a CP system for lower latency.

    You might see the AP/CP trade-offs and configurations presented as at-least-once vs at-most-once vs exactly-once.

    Testing
    In order to understand how these parameters affect latency, I think the best way is to test your setup with different params. The following command will generate 1,000,000 records of 100 bytes each (about 100 MB of data):

    kafka-producer-perf-test --topic test --num-records 1000000 --record-size 100 --throughput 10000000 --producer-props bootstrap.servers=kafka:9092 acks=all
    

    Then try different producer params (they are passed via --producer-props, as in the sketch after this list):

    acks=1  
    acks=all  
    acks=1 batch.size=1000000 linger.ms=1000  
    acks=all batch.size=1000000 linger.ms=1000  
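
    For example, one of the combinations above plugged into the same perf-test command (the broker address is a placeholder):

    kafka-producer-perf-test --topic test --num-records 1000000 --record-size 100 \
      --throughput 10000000 --producer-props bootstrap.servers=kafka:9092 \
      acks=all batch.size=1000000 linger.ms=1000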
    

    It's easy to start a cluster and start/stop/kill nodes to test some failure scenarios, e.g. with Docker Compose.
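    A minimal sketch, assuming a docker-compose file that defines three broker services named kafka1, kafka2 and kafka3 (the service names are made up):

    docker-compose up -d                  # start the cluster
    docker-compose stop kafka2            # simulate a clean broker failure while the perf test runs
    docker-compose kill kafka3            # simulate a hard crash (no clean shutdown)
    docker-compose start kafka2 kafka3    # bring the brokers back and watch them rejoin the ISR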

    Links and references
    You might check the (unfortunately outdated, but still relevant to the topic) Jepsen test and its follow-up, just to add some context on how this has evolved over time.

    I highly encourage checking some papers, which will give a bit more perspective:
    A Critique of the CAP Theorem - Martin Kleppmann
    CAP Twelve Years Later: How the "Rules" Have Changed - Eric Brewer