java, apache-kafka, apache-kafka-streams, rocksdb-java

To close or not to close RocksDB Cache and WriteBufferManager in a Kafka Streams app


I am currently playing around with a custom RocksDB configuration in my Streams app by implementing the RocksDBConfigSetter interface. I see conflicting documentation around closing the Cache and WriteBufferManager instances.

Right now, I see that the Javadoc and one of the documentation pages suggest that we need to close all instances that extend RocksObject (both Cache and WriteBufferManager extend this class) in the overridden RocksDBConfigSetter#close() method.

However, the memory management documentation page suggests that we create these as static instances and do not close the Cache and WriteBufferManager in the overridden RocksDBConfigSetter#close() method.

I am not sure which to follow here. I would appreciate it if anyone could help me understand which documentation is correct and what the preferred way is if we want to limit memory usage by passing in a custom RocksDB configuration.

Is it OK not to close these instances if we declare them as static?


Solution

  • Both pieces of documentation are correct.

    In the first documentation you mention, the cache is a field of the config setter object. If you do not close the cache in close(), it will leak off-heap memory from the moment Kafka Streams closes the corresponding RocksDB state store until the JVM exits.

    In the second documentation you mention, the cache and the write buffer manager are static. If you closed them in close(), the first RocksDB state store that Kafka Streams closes would close both of them, and all other RocksDB state stores would most likely crash because their cache and write buffer manager had been closed.

    You would need to close a static cache and a static write buffer manager when the class is unloaded by the class loader, for which we do not have a callback. I think unloading happens when the JVM exits, so no off-heap memory is leaked before the JVM exits, and afterwards the off-heap memory is freed anyway.

    Regarding your question about limiting the memory usage of RocksDB: the answer depends on what you want to limit. Do you want to limit the memory used by a single RocksDB instance, or the memory used by all RocksDB instances within one Kafka Streams client? For the former, use the example in the first documentation; for the latter, use the example in the second documentation. Sketches of both variants are shown below this list.
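
To make the difference concrete, here is a minimal sketch of the second variant: a static cache and a static write buffer manager shared by all RocksDB state stores of the Kafka Streams client, and therefore not closed in close(). The class name and size constants are placeholders I chose for illustration; the calls follow the RocksDBConfigSetter and RocksDB Java APIs used in the memory management documentation.

```java
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Cache;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;
import org.rocksdb.WriteBufferManager;

public class BoundedMemoryConfigSetter implements RocksDBConfigSetter {

    // Placeholder sizes; tune them for your workload.
    private static final long TOTAL_OFF_HEAP_BYTES = 512 * 1024 * 1024L;
    private static final long TOTAL_MEMTABLE_BYTES = 128 * 1024 * 1024L;

    // One cache and one write buffer manager shared by every RocksDB state store.
    private static final Cache CACHE = new LRUCache(TOTAL_OFF_HEAP_BYTES);
    private static final WriteBufferManager WRITE_BUFFER_MANAGER =
            new WriteBufferManager(TOTAL_MEMTABLE_BYTES, CACHE);

    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        final BlockBasedTableConfig tableConfig =
                (BlockBasedTableConfig) options.tableFormatConfig();
        tableConfig.setBlockCache(CACHE);
        tableConfig.setCacheIndexAndFilterBlocks(true);
        options.setWriteBufferManager(WRITE_BUFFER_MANAGER);
        options.setTableFormatConfig(tableConfig);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Intentionally empty: the static cache and write buffer manager are
        // shared, so closing them here would break the still-open stores.
    }
}
```

And, under the same assumptions (placeholder class name and cache size), a sketch of the first variant, where the cache is an instance field that bounds a single RocksDB instance and therefore must be closed in close():

```java
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Cache;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;

public class PerStoreCacheConfigSetter implements RocksDBConfigSetter {

    // One cache per config setter instance, i.e. per state store.
    private Cache cache;

    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        cache = new LRUCache(64 * 1024 * 1024L);  // placeholder: 64 MiB per store
        final BlockBasedTableConfig tableConfig =
                (BlockBasedTableConfig) options.tableFormatConfig();
        tableConfig.setBlockCache(cache);
        options.setTableFormatConfig(tableConfig);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // The cache belongs only to this store, so it must be closed here
        // to free its off-heap memory once the store is closed.
        cache.close();
    }
}
```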