
Cassandra Nodes have no space to perform compaction (garbagecollect), can we safely delete SSTables and repair?


I'm working with a Cassandra cluster of 10 nodes. Unfortunately, on these nodes we are using 2 separate keyspaces: 2 tables in the first one, and 1 table in the second.

The problem seems to arise from one of the tables in the first keyspace. We have a replication factor of 10, so I'd assume a copy of the whole table is available on each node, yet 4 of the nodes have significantly higher disk usage than the other 6, around 30-40% more.
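For reference, this is how the per-table disk usage can be compared across the nodes (keyspace1.table1 and <node-address> are placeholders for our actual names):

    # Run against each node and compare the "Space used" lines
    nodetool -h <node-address> tablestats keyspace1.table1 | grep "Space used"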

I've tried running "nodetool -h localhost garbagecollect *" on one of the problematic nodes, but the process doesn't even start, failing with: "Not enough space for compaction, estimated sstables = 1, expected write size = 61055287773". I did manage to run it on one of the nodes that has more available disk space.
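The same operation scoped to just the problem table (keyspace1 and table1 stand in for our actual names) would look like this, and I'd expect it to hit the same error on the low-space nodes:

    # Scoped garbagecollect for a single table; on the 4 problematic nodes
    # I'd expect the same "Not enough space for compaction" failure, since the
    # estimated write size (~61 GB) comes from this one table
    nodetool -h localhost garbagecollect keyspace1 table1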

My guess is that at some point the garbagecollect process couldn't be initiated due to the above issue, and from that point onwards no expired rows were deleted at all. The graphs of storage usage over the previous months seem to suggest something similar. Also, compactionhistory has no data about any operations completed on this table.
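For completeness, this is the compaction history check (table1 again stands in for the actual table name):

    # Returns no rows for the problem table
    nodetool -h localhost compactionhistory | grep table1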

So what I'm considering as a potential fix is deleting the SSTables for this table on the 4 problematic nodes, given that the table is no longer being written to, and then running "nodetool -h localhost repair --full" on each of them, so that they copy the data for this table from the remaining 6 nodes (recreating the SSTables in the process). Is this a reasonable, or at all achievable, option? Should I flush the commitlog and memtable of this table in advance, before deleting the SSTables?
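To make the plan concrete, the per-node sequence I have in mind looks roughly like this, assuming a systemd-managed install and the default data directory (keyspace1, table1 and the <table-uuid> suffix are placeholders; whether step 1 is actually needed is part of my question):

    # 1. Flush the memtable so nothing for this table lives only in the commitlog
    nodetool -h localhost flush keyspace1 table1

    # 2. Stop the node and delete the SSTable files for this table only
    sudo systemctl stop cassandra
    rm -r /var/lib/cassandra/data/keyspace1/table1-<table-uuid>/*

    # 3. Start the node again and run a full repair for just this table,
    #    so the data is streamed back from the remaining replicas
    sudo systemctl start cassandra
    nodetool -h localhost repair --full keyspace1 table1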

Unfortunately, the second keyspace on these nodes is very important to us, and it has a smaller replication factor of 5, so a copy is not available on every node. I'm worried that deleting the SSTables of the no-longer-used table and repairing it may cause issues for the important table in the other keyspace.


Solution

  • In general, if compaction stops due to lack of disk space and you run out of disk space entirely, you can safely bring a node offline, delete the data, and then run a full repair once it is back online, provided you're sure you have a consistent copy of the data elsewhere in the cluster.

    Just be sure you have a good copy of the data on the nodes that will remain online. If you are unsure, you could always start by removing only the keyspace data that has an RF of 10, then try to get compaction caught up on the RF 5 keyspace before repairing the RF 10 keyspace. If you go that route, be sure you read with a consistency level higher than ONE/LOCAL_ONE and fix a single node at a time to ensure you're reading consistent data; a rough sketch of that ordering is shown below.
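    A minimal sketch of that ordering on a single node, assuming keyspace1 is the RF 10 keyspace and keyspace2 is the RF 5 one (both names are placeholders for yours):

        # After removing the RF 10 keyspace's SSTables on this node (one node at a
        # time, with clients reading at QUORUM or higher rather than ONE/LOCAL_ONE):

        # 1. Let compaction catch up on the RF 5 keyspace first, now that disk
        #    space has been freed on this node
        nodetool -h localhost garbagecollect keyspace2

        # 2. Then run a full repair of the RF 10 keyspace, pulling the data back
        #    from the replicas that still have it
        nodetool -h localhost repair --full keyspace1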