apache-kafkagoogle-cloud-dataflowapache-beamapache-beam-kafkaio

How to manually commit kafka offset after FileIO in apache beam?


I have a FileIO writing a Pcollection<GenericRecord> to files and returns WriteFilesResult<DestinationT>.

I would like to create a DoFn after writing files to commit the offset of written records to kafka but since my offsets are stored in my GenericRecords I can no longer access them in the output of FileIO.

What is the best way to solve this ?


Solution

  • For anyone interested, here is how I did: