apache-kafka apache-flink flink-streaming apache-kafka-mirrormaker

Kafka Migration with MM2 and Flink: How to Handle Offset Changes and Savepoints?


I'm currently migrating a Kafka cluster using MirrorMaker 2 (MM2). My plan is to switch my Flink Kafka connector to the new cluster once the migration is complete.

Ideally, I want Flink to resume consumption from where it left off. However, I'm concerned about the offset changes after migration. Since the topic-partition-offset-states information stored in the savepoint will be outdated, how should I handle this situation?

Are there any recommended solutions or best practices for managing Flink savepoints and offset changes during a Kafka migration with MM2? Any advice or insights would be greatly appreciated!

It seems like I need an external program that can:

1. Load the topic-partition-offset-states information from the Flink savepoint file.
2. Read the corresponding MM2 topics to translate the old offsets to the new ones in the migrated cluster.
3. Modify the savepoint file with the translated offsets.
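
As a rough illustration of the translation step above, the sketch below maps old-cluster offsets to new-cluster offsets using (upstream, downstream) offset pairs. It is only a sketch under stated assumptions: the pairs are assumed to have already been read from MM2's offset-sync data into memory, the names `translate_offset` and `translate_savepoint_offsets` are hypothetical, and the lookup is deliberately conservative (it rewinds to the nearest earlier sync point, so some records may be re-read but none are skipped). It does not read or write a Flink savepoint, and it is not MM2's own translation algorithm.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory shapes (assumptions, not a Flink or MM2 API):
# a topic-partition is (topic, partition); a sync point is
# (upstream_offset, downstream_offset), sorted by upstream_offset.
TopicPartition = Tuple[str, int]
OffsetSync = Tuple[int, int]


def translate_offset(syncs: List[OffsetSync], upstream_offset: int) -> Optional[int]:
    """Return the downstream offset of the latest sync point whose upstream
    offset is <= upstream_offset, or None if there is no such sync point."""
    upstream_points = [upstream for upstream, _ in syncs]
    idx = bisect_right(upstream_points, upstream_offset) - 1
    if idx < 0:
        return None
    return syncs[idx][1]


def translate_savepoint_offsets(
    savepoint_offsets: Dict[TopicPartition, int],
    syncs_by_partition: Dict[TopicPartition, List[OffsetSync]],
) -> Dict[TopicPartition, int]:
    """Translate every offset taken from the savepoint; fail loudly rather
    than guess when a partition has no usable sync point."""
    translated = {}
    for tp, offset in savepoint_offsets.items():
        new_offset = translate_offset(syncs_by_partition.get(tp, []), offset)
        if new_offset is None:
            raise ValueError(f"no sync point at or before offset {offset} for {tp}")
        translated[tp] = new_offset
    return translated


if __name__ == "__main__":
    # Made-up example data for a single partition.
    syncs = {("orders", 0): [(0, 0), (100, 90), (200, 185)]}
    print(translate_savepoint_offsets({("orders", 0): 150}, syncs))
```

Because the lookup rewinds to an earlier sync point, the job would resume with at-least-once behaviour relative to the old cluster, so downstream consumers would need to tolerate duplicates.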

Is this a viable approach? What potential risks or challenges should I be aware of if I go down this path?


Solution

  • I think you should follow exactly the same process that is defined for upgrading your Flink installation to the latest connector version, documented at https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#upgrading-to-the-latest-connector-version. In other words: