memory-management cassandra heap-memory bloom-filter

Does Cassandra use heap memory to store Bloom filters, and how much space do they consume for 100GB of data?


I have learned that Cassandra uses Bloom filters for performance, and that it keeps the filter data in physical memory.

1) Where does Cassandra store these filters? (In heap memory?)

2) How much memory do these filters consume?


Solution

  • When running, the Bloom filters must be held in memory, since their whole purpose is to avoid disk IO.

    However, each filter is saved to disk with the other files that make up each SSTable - see http://wiki.apache.org/cassandra/ArchitectureSSTable

    The filters are typically a very small fraction of the data size, though the actual ratio seems to vary quite a bit. On the test node I have handy here, the biggest filter I can find is 3.3MB, for a 1GB data file. For another 1.3GB data file, however, the filter is only 93KB.

    If you are running Cassandra, you can check the size of your filters yourself by looking in the data directory for files named *-Filter.db
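
To put a number on this for your own node, the check above can be scripted. The following is a minimal sketch, not an official Cassandra tool: it walks a data directory, adds up the sizes of the *-Filter.db files, and compares them with the *-Data.db files. The function names are made up for illustration, and the sketch assumes the on-disk filter size is a reasonable stand-in for the in-memory size.

```python
import os


def sstable_component_bytes(data_dir, suffix):
    """Sum the sizes of all files under data_dir whose name ends with suffix."""
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.endswith(suffix):
                total += os.path.getsize(os.path.join(root, name))
    return total


def filter_report(data_dir):
    """Return (filter_bytes, data_bytes, filter-to-data ratio) for data_dir.

    The ratio is 0.0 when no data files are found.
    """
    filter_bytes = sstable_component_bytes(data_dir, "-Filter.db")
    data_bytes = sstable_component_bytes(data_dir, "-Data.db")
    ratio = filter_bytes / data_bytes if data_bytes else 0.0
    return filter_bytes, data_bytes, ratio
```

Call it as `filter_report("/var/lib/cassandra/data")`, replacing the path with your own data directory (that path is an assumption about your installation). It counts every matching file it finds, including any copies under snapshot or backup subdirectories. As a rough guide only, scaling the two ratios observed above linearly to 100GB of data gives somewhere between about 7MB and about 330MB of filters. The real figure depends on the number of keys and the configured false-positive chance rather than on the data size itself.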