flink-statefun

Flink StatefulFunction reacting to a checkpoint?


My stateful function writes data to a database, but the function is nondeterministic, so restoring from a checkpoint can leave the database with inconsistent data. My idea is to 'buffer' the data in Flink's persistent state and write it to the database only once a checkpoint has completed. I guess I could roughly achieve this with context.sendAfter(Duration duration, Address address, Object input), setting the duration higher than the checkpointing interval.

Is there a better solution available that would enable the function to react to checkpoints explicitly?


Solution

  • It's planned that Stateful Functions 2.2 will support Flink datastreams as ingresses and egresses (see https://github.com/apache/flink-statefun/pull/133), which should then allow you to use a Flink sink connector that meets your needs. If Flink doesn't already include a suitable sink, you could implement one based on the generic two-phase commit sink (which participates in the checkpointing process).

    Another option would be to somehow make it possible for stateful functions to be aware of checkpointing, but so far that hasn't been done (or even discussed, as far as I know).
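
    To make the two-phase commit idea concrete, here is a minimal plain-Java simulation with no Flink dependency. It is only a sketch of the pattern, not Flink's or StateFun's API: the class CheckpointBufferedWriter and its methods write, snapshot, checkpointComplete and restore are hypothetical names, and the "database" is just an in-memory list. In a real Flink sink built on TwoPhaseCommitSinkFunction, the corresponding hooks would roughly be invoke, preCommit, commit and abort.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical simulation: records reach the "database" only after their checkpoint completes. */
class CheckpointBufferedWriter {
    // Records received since the last checkpoint barrier (the open "transaction").
    private List<String> current = new ArrayList<>();
    // Batches that were snapshotted with a checkpoint but not yet committed, keyed by checkpoint id.
    private final TreeMap<Long, List<String>> pending = new TreeMap<>();
    // In-memory stand-in for the external database.
    private final List<String> database = new ArrayList<>();

    /** Buffer a record instead of writing it out immediately. */
    void write(String record) {
        current.add(record);
    }

    /** Pre-commit: the checkpoint is being taken, so close the current batch and start a new one. */
    void snapshot(long checkpointId) {
        pending.put(checkpointId, current);
        current = new ArrayList<>();
    }

    /** Commit: the checkpoint has completed, so flush every batch up to and including it. */
    void checkpointComplete(long checkpointId) {
        Map<Long, List<String>> done = pending.headMap(checkpointId, true);
        for (List<String> batch : done.values()) {
            database.addAll(batch);
        }
        done.clear(); // clearing the view removes the flushed batches from pending
    }

    /** Recovery: drop everything newer than the restored checkpoint, then commit what it covers. */
    void restore(long restoredCheckpointId) {
        current = new ArrayList<>();                          // abort the open batch
        pending.tailMap(restoredCheckpointId, false).clear(); // abort batches of later checkpoints
        checkpointComplete(restoredCheckpointId);             // finish commits the checkpoint covers
    }

    /** Copy of what has actually been written to the simulated database. */
    List<String> database() {
        return new ArrayList<>(database);
    }

    public static void main(String[] args) {
        CheckpointBufferedWriter writer = new CheckpointBufferedWriter();
        writer.write("a");
        writer.write("b");
        writer.snapshot(1L);
        writer.write("c");               // belongs to the next checkpoint
        writer.checkpointComplete(1L);   // only "a" and "b" are flushed
        writer.restore(1L);              // failure: "c" is discarded and will be recomputed
        System.out.println(writer.database());
    }
}
```

    The point of the design is that nothing nondeterministic reaches the database unless the checkpoint that produced it is one the job can never roll back past. Unlike the sendAfter approach, this does not depend on guessing a delay longer than the checkpoint interval.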