apache-flink flink-streaming checkpointing

What would happen if I configured a local file system for Flink checkpointing?


I watched the video "Managing State in Apache Flink" by Tzu-Li (Gordon) Tai. In that video, the data is stored on a distributed file system.

I'm wondering what would happen if I configured a local file system for Flink checkpointing.

For example:

env.setStateBackend(new RocksDBStateBackend(getString("file:///tmp/checkpoints"), true));

I assume that every node of the Flink cluster will keep its own data. Would it work well?


Solution

  • I assume that every node of the Flink cluster will keep its own data.

    That is correct.

    Would it work well?

    With a local file system and distributed nodes, you may be able to checkpoint just fine. Even that is not certain: the directory may be created by the JobManager, in which case the TaskManager instances could fail because the directory does not exist on their machines. However, you would not be able to restore, because the JobManager reads the checkpoint data and distributes it to the operators as needed.

    Strictly speaking, it does not matter to Flink whether the file system is local or distributed. What matters is that the JobManager can see all of the checkpoint data at restore time. If you are running everything on the same machine, a local file system would work just fine.

    I think in principle you could even have all nodes write locally and then use a manual synchronization process to move the data somewhere the JobManager can see it during an attempted restore. However, that is certainly not a recommended approach.
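
The visibility requirement described above can be illustrated with a small simulation that uses only the Java standard library, not Flink itself. It is a rough sketch under stated assumptions: each "node" is modeled as a separate temporary directory standing in for one machine's local disk, and the class name `CheckpointVisibilitySketch`, the helpers `writeState` and `visibleState`, and the `.state` file names are all hypothetical. Flink's real checkpoint layout and restore logic are more involved.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

public class CheckpointVisibilitySketch {

    // Hypothetical model: every node root is a separate directory that stands
    // in for one machine's local disk. The relative path mirrors /tmp/checkpoints.
    static final String CHECKPOINT_DIR = "tmp/checkpoints";

    // A simulated TaskManager writes one operator's state file under its own root.
    static void writeState(Path nodeRoot, String operator) throws IOException {
        Path dir = nodeRoot.resolve(CHECKPOINT_DIR);
        Files.createDirectories(dir);
        Files.write(dir.resolve(operator + ".state"),
                ("state of " + operator).getBytes(StandardCharsets.UTF_8));
    }

    // A simulated JobManager can only list files under the root it can see.
    static List<String> visibleState(Path nodeRoot) throws IOException {
        Path dir = nodeRoot.resolve(CHECKPOINT_DIR);
        List<String> names = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return names; // the directory was never created on this node
        }
        try (Stream<Path> files = Files.list(dir)) {
            files.forEach(p -> names.add(p.getFileName().toString()));
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Case 1: three machines, each with its own local disk.
        Path jobManager = Files.createTempDirectory("jm");
        Path taskManager1 = Files.createTempDirectory("tm1");
        Path taskManager2 = Files.createTempDirectory("tm2");
        writeState(taskManager1, "source");
        writeState(taskManager2, "window");
        System.out.println("separate disks: " + visibleState(jobManager));

        // Case 2: everything on one machine, or one shared file system.
        Path shared = Files.createTempDirectory("shared");
        writeState(shared, "source");
        writeState(shared, "window");
        System.out.println("shared storage: " + visibleState(shared));
    }
}
```

In the first case the simulated JobManager finds no state files, because each writer used its own disk. In the second case it finds both, which corresponds to the single-machine or shared-file-system setup.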