cassandra tombstone

Avoid zombie data in Cassandra


Recently I faced an issue in a customer setup with a 3-node cluster, where one node went down and came back online only after 12 days. In our scenario, gc_grace_seconds for most of the tables is set to 1 day, and there are a lot of tables.

When the down node came back up, stale data from it was replicated to the other nodes, leading to zombie data on all three nodes.

The only solution I could think of was to clean the node before letting it rejoin the cluster and then run a repair, which should prevent zombie data from appearing. Is there any other way to avoid this issue without having to clean the node?


Solution

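  • You should never bring a node back online if it has been down for longer than the shortest gc_grace_seconds of any table. Once GC grace has elapsed, the other replicas may already have purged their tombstones, so the deleted data still held by the returning node looks live and gets replicated back.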

    This is a challenge in environments where GC grace is set to a very low value. In these situations, the procedure is to completely rebuild the node as if it had never been part of the cluster:

    1. Completely wipe all contents of data/, commitlog/ and saved_caches/.
    2. Remove the node's IP from its seeds list if it is listed as a seed node.
    3. Replace the node with itself using the replace_address flag.
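
To decide whether a node can simply be restarted or has to go through the rebuild steps above, the rule boils down to comparing the downtime against every table's gc_grace_seconds. The following is only a minimal sketch of that comparison, not part of any Cassandra tooling: the function name `safe_to_rejoin` and the table names are hypothetical, and it assumes you have already collected each table's gc_grace_seconds yourself (in recent Cassandra versions this should be readable from `system_schema.tables`).

```python
from datetime import timedelta


def safe_to_rejoin(downtime, gc_grace_by_table):
    """Hypothetical helper: return (ok, risky_tables).

    downtime          -- how long the node was offline, as a timedelta
    gc_grace_by_table -- mapping of table name -> gc_grace_seconds
    A table is "risky" when the node was down for longer than that
    table's GC grace, i.e. its tombstones may already be purged elsewhere.
    """
    risky = sorted(
        table
        for table, gc_grace in gc_grace_by_table.items()
        if downtime > timedelta(seconds=gc_grace)
    )
    return (not risky, risky)


if __name__ == "__main__":
    # Figures from the question: 12 days of downtime, GC grace of 1 day.
    # Table names are made up; 864000 s (10 days) is Cassandra's default.
    tables = {"ks.events": 86400, "ks.users": 864000}
    ok, risky = safe_to_rejoin(timedelta(days=12), tables)
    print("safe to restart as-is:", ok)
    print("tables past GC grace:", risky)
```

If the check reports even one table past its GC grace, treat the node as unsafe to restart and follow the wipe-and-replace procedure instead.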

    Cheers!