
Does an uninitialized ValueState occupy memory in checkpoints in Flink?


I'm using a ValueState with TTL and I want to understand the difference (if any) in the checkpointed state size/memory between two scenarios:

First scenario

I create/obtain the ValueState but never call update(...) on it:

ValueStateDescriptor<MyState> myStateDescriptor =
    new ValueStateDescriptor<>("MyState", MyState.class);
myStateDescriptor.enableTimeToLive(ttl);
this.myValueState = getRuntimeContext().getState(myStateDescriptor);

Second scenario

I create the ValueState and update it once (e.g., myValueState.update(someValue)).

Questions:

Environment notes: Amazon Managed Service for Apache Flink, running Flink 1.18 with the RocksDB state backend.


Solution

  • In the first scenario, nothing is stored in the state backend, whichever state backend you use, so the state adds nothing to the checkpoint.

    In the second scenario, the stored value stays in the state backend indefinitely unless it is cleared explicitly or removed by the state TTL mechanism, which your descriptor enables.

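    With RocksDB, you'll want to be sure to use incremental checkpointing, which is off by default. In Flink 1.18 the option is state.backend.incremental (renamed to execution.checkpointing.incremental in Flink 1.19), and it defaults to false.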
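
    As a rough illustration of the incremental checkpointing advice, here is a minimal configuration sketch. It assumes a self-managed Flink 1.18 cluster where you can edit the Flink configuration file; a managed service may set or restrict these options for you, so check what your environment exposes. The key names are the 1.18 ones; later releases use execution.checkpointing.incremental for the second option.

```yaml
# Sketch only: assumes Flink 1.18 and direct access to the Flink configuration.
# Use RocksDB as the state backend (legacy key, still accepted in 1.18).
state.backend: rocksdb
# Upload only the RocksDB files that changed since the previous checkpoint.
# Defaults to false.
state.backend.incremental: true
```

    With this enabled, each checkpoint uploads only the changes since the last one, so keys whose ValueState was never updated contribute nothing, and keys that were updated once are not re-uploaded in full at every checkpoint.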