apache-kafka apache-kafka-streams stream-processing

What are stream processing and Kafka Streams in layman's terms?


To understand what Kafka Streams is, I first need to know what stream processing is. When I read about them online I can't grasp the overall picture, because every article leads to a never-ending tree of links to new concepts.
Can anyone explain what stream processing is with a simple real-world example?
And how does it relate to Kafka Streams and the producer/consumer architecture?

Thank you.


Solution

  • Stream Processing

    Stream Processing is based on the fundamental concept of unbounded streams of events (in contrast to static sets of bounded data as we typically find in relational databases).

    Taking that unbounded stream of events, we often want to do something with it. An unbounded stream of events could be temperature readings from a sensor, network data from a router, orders from an e-commerce system, and so on.


    Let's imagine we want to take such an unbounded stream of events. Perhaps it's a stream of manufacturing events from a factory, one event for each 'widget' being manufactured.

    We want to filter that stream based on a characteristic of the 'widget': if it's red, we route it to another stream. That second stream might be used for reporting, or for driving another application that only needs to respond to red-widget events.
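    As a rough illustration of the filtering idea, here is a minimal sketch in plain Java with no Kafka involved. The `Widget` class, the `routeRed` method, and the colour values are all made up for this example; the point is only that events are handled one at a time as they arrive, without ever needing the full data set.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class WidgetFilterSketch {

    // Hypothetical event type: one event per manufactured widget.
    static final class Widget {
        final int id;
        final String colour;

        Widget(int id, String colour) {
            this.id = id;
            this.colour = colour;
        }

        @Override
        public String toString() {
            return "Widget(" + id + ", " + colour + ")";
        }
    }

    // Reads events one by one and forwards only the red ones.
    // The iterator could in principle never end; each event is
    // processed on its own, so no complete data set is required.
    static void routeRed(Iterator<Widget> events, Consumer<Widget> redStream) {
        while (events.hasNext()) {
            Widget widget = events.next();
            if ("red".equals(widget.colour)) {
                redStream.accept(widget);
            }
        }
    }

    public static void main(String[] args) {
        // A short, finite stand-in for the unbounded stream so the demo terminates.
        List<Widget> sample = List.of(
            new Widget(1, "red"),
            new Widget(2, "blue"),
            new Widget(3, "green"),
            new Widget(4, "red"));

        // The "red widgets" stream is just a list here.
        List<Widget> redWidgets = new ArrayList<>();
        routeRed(sample.iterator(), redWidgets::add);
        System.out.println(redWidgets);
    }
}
```

    In a real system the iterator would be replaced by something that keeps delivering events (a sensor feed, a message queue, a Kafka topic), and the consumer would write to another stream rather than a list.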

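    This, in a rather crude nutshell, is stream processing. Stream processing is used to do things like filtering a stream, transforming its events, aggregating them (for example, counting events per time window), and joining one stream with another.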

    As you mentioned, there are a large number of articles about this; without wanting to give you yet another link to follow, I would recommend this one.

  • Kafka Streams

    Kafka Streams is a stream processing library, provided as part of Apache Kafka. You use it in your Java applications to do stream processing.

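    In the context of the above example, the widget events would live in a Kafka topic, and a Kafka Streams application would read that topic, filter it, and write the red-widget events to a second topic.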

    Kafka Streams is built on top of the Kafka producer/consumer API, and abstracts away some of the low-level complexities. You can learn more about it in the documentation.
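    To make this concrete, here is a minimal sketch of what the red-widget filter might look like as a Kafka Streams application. The topic names `widgets` and `widgets-red`, the application id, the broker address `localhost:9092`, and the assumption that each record's value is simply the colour as a string are all made up for illustration; a real application would more likely carry a structured payload such as JSON or Avro.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class RedWidgetsApp {

    // Describes the processing steps: read a topic, filter, write a topic.
    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Assumed input topic; key and value are both plain strings.
        KStream<String, String> widgets =
            builder.stream("widgets", Consumed.with(Serdes.String(), Serdes.String()));

        widgets
            // Assumption: the record value is the widget's colour.
            .filter((key, colour) -> "red".equals(colour))
            // Assumed output topic for red widgets only.
            .to("widgets-red", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical application id and broker address.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "red-widgets-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        KafkaStreams streams = new KafkaStreams(buildTopology(), props);
        // Close the application cleanly when the JVM shuts down.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        streams.start();
    }
}
```

    Under the hood the library consumes from `widgets` and produces to `widgets-red` for you, which is the producer/consumer plumbing you would otherwise write by hand. Running this requires the `kafka-streams` dependency on the classpath and a reachable Kafka broker.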