[SOLVED] Prevent key-based repartitioning in Kafka Streams

I have a slightly unusual use case: our applications do not use Kafka's default partitioning. Instead, we have a custom partitioning strategy that uses a specific field within a compound key to choose the partition. This field is generally the customerId, so that all records for a single customer land in a single partition. The key also contains the other IDs that make the message unique, so that compaction still works.

e.g.

topic-1-key

{
  orderId,
  customerId
}

topic-2-key

{
  addressId,
  customerId
}
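To make the scheme concrete, here is a minimal, library-free sketch of partitioning on the customerId alone. The class names, field types, and the hash function are assumptions for illustration; in particular, this is not the murmur2-based hash that Kafka's default partitioner uses, and a real setup would plug the same idea into a producer-side partitioner.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CustomerPartitioning {

    // Hypothetical compound key for topic-1: unique per order, partitioned per customer.
    static final class OrderKey {
        final String orderId;
        final String customerId;

        OrderKey(String orderId, String customerId) {
            this.orderId = orderId;
            this.customerId = customerId;
        }
    }

    // Hypothetical compound key for topic-2: unique per address, partitioned per customer.
    static final class AddressKey {
        final String addressId;
        final String customerId;

        AddressKey(String addressId, String customerId) {
            this.addressId = addressId;
            this.customerId = customerId;
        }
    }

    // Only the customerId is hashed, so orderId and addressId never influence
    // the partition. The hash here is a simple stand-in (an assumption).
    static int partitionFor(String customerId, int numPartitions) {
        int hash = Arrays.hashCode(customerId.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        OrderKey order = new OrderKey("order-1", "customer-42");
        AddressKey address = new AddressKey("address-7", "customer-42");

        int orderPartition = partitionFor(order.customerId, 12);
        int addressPartition = partitionFor(address.customerId, 12);

        // Same customer, same partition count: both records are co-located.
        System.out.println(orderPartition == addressPartition); // prints "true"
    }
}
```

As long as both topics have the same number of partitions and use the same function, records for one customer end up in the same partition number on both topics, which is the co-partitioning a join relies on.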

I want to join these two records. To do this with the DSL, my only option is to rekey both records to the customerId and then do the join. However, when I do this, Kafka Streams detects that key-changing operations have occurred and automatically creates repartition topics for me. Is there any way to override this behaviour whilst using the DSL?

I'm aware I could do this manually using the Processor API and state stores, but I wondered whether there's a way to do it with the DSL, or whether it's simply not an option.
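For reference, the manual approach amounts to a lookup join against local state, which works without repartitioning only because both topics are already co-partitioned by customerId. The following is a rough, library-free sketch of that logic: a HashMap stands in for a Kafka Streams state store, and the class and method names are hypothetical rather than part of any Kafka API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class ManualCustomerJoin {

    // Stand-in for a per-task state store: latest address seen per customerId.
    private final Map<String, String> addressByCustomer = new HashMap<>();

    // Called for each record read from the address topic.
    void onAddress(String customerId, String address) {
        addressByCustomer.put(customerId, address);
    }

    // Called for each record read from the order topic. Returns the joined
    // value if an address for that customer has already been seen.
    Optional<String> onOrder(String customerId, String orderId) {
        String address = addressByCustomer.get(customerId);
        if (address == null) {
            return Optional.empty();
        }
        return Optional.of(orderId + " -> " + address);
    }
}
```

Because a given customer's orders and addresses are read by the same task, the lookup never needs data from another partition.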

Solution

It's not possible right now, i.e., up to Apache Kafka 3.6.

Work is already in progress on a new operator, markAsPartitioned(), to close this gap. KIP-759 has been accepted and will most likely ship with the 3.7.0 release.