apache-kafkaapache-kafka-streamsspring-cloud-streamreactor-kafka

KStream<T> vs Flux<Message<T>>


Looking projects that use Spring Cloud Stream Kafka binding (functional style) I've met that two styles are used - KStreams and reactive pipelines Flux<Message<T>>. They somewhat overlapping as an idea but just partially. What are the characteristics of Flux in Kafka world? Is this another kind of "KStream", kind of "durable reactive stream"? Does it usage guarantee no lag or at least minimal because of backpressure?


Solution

  • Spring Cloud Stream currently provides three different binders specific to Apache Kafka.

    1. The regular message channel-based Kafka binder (spring-cloud-stream-binder-kafka)
    2. Kafka Streams binder (spring-cloud-stream-binder-kafka-streams)
    3. spring-cloud-stream-binder-kafka-reactive (only available in 4.0.x version of Spring Cloud Stream).

    The first one uses Spring's messaging channel as a conduit for message transport. For example, if you have the function, Function<String, String> uppercase(), then you should use this binder. The binder consumes data from a Kafka topic and provides it to a message channel. One processing is done, the binder publishes to another output message channel which will in turn be produced to a Kafka topic.

    The second binder above is strictly used for Kafka Streams. When using this binder, the messages are directly bound to Kafka Streams types such as KStream, KTable etc. This binder should be used if you want to write stream processing apps using Spring Cloud Stream and Kafka Streams.

    The third reactive binder above is for reactive streams based applications in Kafka. For functions like Function<Flux<String>, Flux<String>> uppercase(), this reactive binder should be used. This gives proper reactive capabilities behind the scenes such as backpressure etc.

    To answer your original question regarding the difference between Kafka Streams and reactive, I am afraid that is a broader topic as they tackle different application spaces. Kafka Streams library is used for writing stateful (and stateless) real-time streaming applications. In contrast, reactive streams are used to achieve backpressure and use the programming model available through the implementation (such as Flux). Both Kafka Streams and Flux provide DSL capabilities, but the comparison stops there, and they diverge from there as they target entirely different application use cases.