I need to count daily orders per user. Records can arrive quite late, so I think I can use a MapState[String,Long], where the key is the date and the value is the order count for that date. But as time goes by, each user's map gains one more entry per day, and the state could eventually grow too large. Since records are never more than a day late, I only need to keep two days of data. In that case, I need to remove the earliest date when the size of the MapState[String,Long] reaches 3. But I found that Flink's MapState has no size method.
I know I can use an iterator to achieve this, and that is exactly what I did. But since java.util.Map has a size method, why doesn't Flink's MapState have one?
See https://issues.apache.org/jira/browse/FLINK-5917, which explains the reason. size() has been removed from MapState since then.
Here's the ticket's description at the time of writing:
I'm proposing to remove size() because it is a prohibitively expensive operation and users might not be aware of it. Instead of size() users can use an iterator over all mappings to determine the size, when doing this they will be aware of the fact that it is a costly operation.
Right now, size() is only costly on the RocksDB state backend but I think with future developments on the in-memory state backend it might also become an expensive operation there.
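For the two-day retention described in the question, the iterator approach the ticket recommends might look roughly like the sketch below. It rests on several assumptions: `DateCounts` is a hypothetical interface that mirrors only the `get`, `put`, `remove`, and `keys` methods of Flink's `MapState`; `InMemoryDateCounts` is a `HashMap`-backed stand-in so the example runs without Flink on the classpath; and dates are ISO `yyyy-MM-dd` strings, so sorting them as strings gives chronological order. In a real job the same logic would presumably sit inside a keyed function and operate on the actual `MapState`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DailyCountPruning {

    // Hypothetical interface: mirrors the subset of Flink's MapState used here.
    interface DateCounts {
        Long get(String date) throws Exception;
        void put(String date, Long count) throws Exception;
        void remove(String date) throws Exception;
        Iterable<String> keys() throws Exception;
    }

    // Stand-in backed by a HashMap so the sketch runs without Flink.
    static final class InMemoryDateCounts implements DateCounts {
        private final Map<String, Long> map = new HashMap<>();
        public Long get(String date) { return map.get(date); }
        public void put(String date, Long count) { map.put(date, count); }
        public void remove(String date) { map.remove(date); }
        public Iterable<String> keys() { return map.keySet(); }
    }

    // Walks every key once (the explicit, visibly costly replacement for size())
    // and returns the dates in ascending order.
    static List<String> sortedDates(DateCounts state) throws Exception {
        List<String> dates = new ArrayList<>();
        for (String date : state.keys()) {
            dates.add(date);
        }
        Collections.sort(dates); // assumes yyyy-MM-dd keys
        return dates;
    }

    // Increments the count for `date`, then drops the earliest dates until
    // at most `maxDays` remain. Keys are copied first, so nothing is removed
    // while the state is being iterated.
    static void addOrder(DateCounts state, String date, int maxDays) throws Exception {
        Long current = state.get(date);
        state.put(date, current == null ? 1L : current + 1L);

        List<String> dates = sortedDates(state);
        for (int i = 0; i < dates.size() - maxDays; i++) {
            state.remove(dates.get(i));
        }
    }

    public static void main(String[] args) throws Exception {
        DateCounts state = new InMemoryDateCounts();
        addOrder(state, "2017-03-01", 2);
        addOrder(state, "2017-03-02", 2);
        addOrder(state, "2017-03-01", 2); // late record for the previous day
        addOrder(state, "2017-03-03", 2); // third date: evicts 2017-03-01
        System.out.println(sortedDates(state)); // prints [2017-03-02, 2017-03-03]
    }
}
```

Because the map never holds more than three dates per user, the full iteration stays cheap here even on RocksDB; the cost the ticket warns about only matters for maps with many entries.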