java stream amazon-kinesis amazon-kcl

Process multiple Kinesis streams within single Java process


I would like to process multiple Kinesis streams using KCL within the same Java process.

The idea is simple: make a new KCL instance for each stream and then run the workers concurrently.

My question is whether, in this case, all KCL instances would share the same thread pool, and whether this approach is a good or common practice for stream processing.

Thank you


Solution

  • Sure, you can do this: spin up multiple KCL Worker instances, each pointing at a different stream and each with its own configuration. Each Worker instance should manage its own ShardConsumer threads independently of the other Workers, so by default they should not end up sharing a thread pool.

    However, a more common and recommended practice is to run each Worker in its own process. This provides more compartmentalization, which improves:

    1. failure cases: one failure cannot take down all Workers
    2. deployments/updates: more control over how many Workers are down for an update at once
    3. hardware management: one Worker per process is easier to distribute across multiple small hosts, especially as your processing requirements grow
    4. development complexity: while KCL supports multiple Workers in one process, it is much easier to develop each Worker as its own process
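
If you do go with the single-process option from the start of this answer, the launch logic could look roughly like the sketch below. It uses only the Java standard library: the class name `MultiStreamRunner`, the method `runAll`, and the stream names are hypothetical, and the lambdas are stand-ins for real KCL workers. In KCL 1.x the `Worker` class implements `Runnable`, so a configured Worker should be usable in place of each stand-in, but that wiring is an assumption and is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class MultiStreamRunner {

    /**
     * Runs every worker on its own launcher thread and blocks until all of them return.
     * Returns the names of the streams whose worker ended with an exception.
     * Note: a real KCL worker's run() normally blocks until it is shut down.
     */
    public static Set<String> runAll(Map<String, Runnable> workersByStream)
            throws InterruptedException {
        // One launcher thread per worker, so no worker waits for another to finish.
        ExecutorService launcher =
                Executors.newFixedThreadPool(Math.max(1, workersByStream.size()));

        Map<String, Future<?>> futures = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> entry : workersByStream.entrySet()) {
            futures.put(entry.getKey(), launcher.submit(entry.getValue()));
        }
        // Accept no new tasks; workers that were already submitted keep running.
        launcher.shutdown();

        Set<String> failed = new TreeSet<>();
        for (Map.Entry<String, Future<?>> entry : futures.entrySet()) {
            try {
                entry.getValue().get(); // wait for this worker to return
            } catch (ExecutionException e) {
                // One worker failing does not stop the others, but they still share the JVM.
                failed.add(entry.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) throws InterruptedException {
        Set<String> processed = ConcurrentHashMap.newKeySet();

        // Hypothetical stand-ins: replace each Runnable with a configured KCL worker.
        Map<String, Runnable> workers = new LinkedHashMap<>();
        workers.put("orders-stream", () -> processed.add("orders-stream"));
        workers.put("clicks-stream", () -> {
            throw new IllegalStateException("simulated worker failure");
        });

        Set<String> failed = runAll(workers);
        System.out.println("processed=" + processed + " failed=" + failed);
    }
}
```

The sketch isolates exceptions per worker, but an `OutOfMemoryError` or a JVM crash would still affect every stream at once, which is the main argument for the one-Worker-per-process layout recommended above.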