cassandra cassandra-3.0

Huge delete on Cassandra to reclaim disk space


I have a 3-node Cassandra v3.0.15 cluster with replication factor 3. My disk size is 2 TB and it is already 95% full. Almost 1.5 TB of the data is stale and must be deleted, retaining only the last 2 years of data. With the limited free disk space I have, how can I delete the data and reclaim disk space?

Other details:

  1. There are 3 tables: 2 use LeveledCompactionStrategy and 1 uses DateTieredCompactionStrategy (with max_sstable_age_days = 365).
  2. All 3 tables have gc_grace_seconds = 10 days.

I've considered several approaches, but I hit a roadblock in each of them:

  1. Set gc_grace_seconds=0 and write a Python script that deletes the records in batches, with a pause between batches. (i) This could create a lot of tombstones and fill up the remaining disk space. (ii) Setting gc_grace_seconds=0 doesn't ensure that compaction will be triggered. (iii) One table uses DateTieredCompactionStrategy with max_sstable_age_days = 365, so SSTables older than 365 days (the ones we care about) will not be compacted.
  2. Turn off two nodes in the cluster, identify the old SSTable files by date, delete them manually, and then resync the other two nodes. I'm not sure whether this is safe or whether it will lead to table corruption.
  3. Create a new table, copy only the eligible data into it, and then update the table details in the application. There isn't enough disk space to do this.

Please suggest a safe approach that can be implemented.


Solution

  • If you do have files for user tables which were created more than 2 years ago, the option of taking the nodes offline and manually removing those files will work. Once the nodes are back online and a repair is run, a percentage of the data will be reinstated, since the files removed on each node will not overlap perfectly.

    You would need to do this on all the nodes before running the repair; otherwise the repair will reinstate all of the data from the nodes that still hold it.

    It's not clear from the original post, though, whether the 1.5 TB you mention has already been identified as sitting in files that are more than 2 years old.

    Backing up the files you are going to delete to external / network storage, prior to deleting them, will at least give you a route back, in that you can take the nodes down again and add the files back.
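
The file-selection and backup steps above could be sketched roughly as follows. This is only an illustration under some assumptions, not a tested procedure: it assumes the node is already stopped, that SSTable component files follow the Cassandra 3.0 naming pattern `<version>-<generation>-<format>-<Component>` (for example `mc-1-big-Data.db`), and that file modification time is an acceptable proxy for "created more than 2 years ago". The function names `find_old_sstables` and `backup_and_remove` are hypothetical, as are any paths passed to them. Modification time only tells you when a file was written, not how old the rows inside it are, so check the candidates before removing anything.

```python
import shutil
import time
from collections import defaultdict
from pathlib import Path

TWO_YEARS_SECONDS = 2 * 365 * 24 * 3600


def find_old_sstables(table_dir, cutoff_epoch):
    """Return {sstable_prefix: [component files]} for SSTables in table_dir
    whose component files were ALL last modified before cutoff_epoch."""
    groups = defaultdict(list)
    for path in Path(table_dir).iterdir():
        # Assumed naming: <version>-<generation>-<format>-<Component>.<ext>
        if path.is_file() and path.name.count("-") >= 3:
            prefix = path.name.rsplit("-", 1)[0]  # e.g. "mc-1-big"
            groups[prefix].append(path)
    return {
        prefix: sorted(files)
        for prefix, files in groups.items()
        if all(f.stat().st_mtime < cutoff_epoch for f in files)
    }


def backup_and_remove(old_sstables, backup_dir, dry_run=True):
    """Copy every component file to backup_dir, then delete the original.
    With dry_run=True nothing is touched; the planned (src, dest) pairs
    are returned so they can be reviewed first."""
    backup_dir = Path(backup_dir)
    plan = []
    for _prefix, files in sorted(old_sstables.items()):
        for src in files:
            dest = backup_dir / src.name
            if not dry_run:
                backup_dir.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dest)  # back up first (keeps the mtime)
                src.unlink()             # remove only after the copy succeeds
            plan.append((src, dest))
    return plan
```

A run would compute the cutoff as `time.time() - TWO_YEARS_SECONDS`, call `find_old_sstables` on one table directory, review the output of `backup_and_remove(..., dry_run=True)`, and only then repeat with `dry_run=False`. Use a separate backup directory per table and per node, since SSTable file names repeat across tables. All components of an SSTable are kept or removed together, because leaving a partial set behind would be unsafe.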