cassandra database-administration

Restored snapshot, seeing zero rows in the tables after restart or repair


I’m encountering an issue during snapshot restoration in Cassandra.

Scenario:

  1. Backup taken from: 3-node cluster (A)
  2. Restored on: 3-node cluster (B)

After the snapshot was restored, the data size on the new cluster (B) was around 10 GB (the same as the backup size). However, when I either restart the Cassandra services or run nodetool repair, the data size drops drastically (e.g., from 10 GB to 500 MB), and I’m seeing zero rows in the tables.

Has anyone experienced this before or have insights into why this might be happening? Any suggestions would be appreciated!

I tried the following steps:

  1. Create snapshot backup.
  2. Shutdown Cassandra.
  3. Recreate keyspace and tables.
  4. Copy snapshots to relevant data directories.
  5. Start Cassandra.
  6. nodetool repair

Solution

  • Your description is incomplete, but I suspect that the destination cluster B does not have a configuration identical to that of the source cluster A.

    When restoring using the "refresh method" (either with nodetool import or nodetool refresh), the destination node must have the same token range(s) as the corresponding source node. Otherwise, the partitions in the SSTables you restored will not necessarily belong to the destination node.
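    As a rough illustration of that requirement, the sketch below compares the tokens of each source node with those of its destination counterpart. It assumes you have already collected the tokens yourself (for example from nodetool ring output); the function name, the node labels, and the token values are all hypothetical.

```python
def token_mismatches(source_tokens, dest_tokens):
    """Compare tokens per paired node.

    Both arguments map a pairing label (e.g. "node1") to an iterable of
    tokens. Returns a dict listing, for every pair that differs, the tokens
    missing from the destination and the extra tokens it has.
    """
    report = {}
    for node, src in source_tokens.items():
        src_set = set(src)
        dst_set = set(dest_tokens.get(node, ()))
        if src_set != dst_set:
            report[node] = {
                "missing": sorted(src_set - dst_set),
                "extra": sorted(dst_set - src_set),
            }
    return report


# Hypothetical tokens: node1 matches, node2 does not.
cluster_a = {"node1": [-100, 40], "node2": [0, 90]}
cluster_b = {"node1": [40, -100], "node2": [0, 75]}
mismatches = token_mismatches(cluster_a, cluster_b)
```

    An empty result would suggest the paired nodes own the same ranges; any entry means copying snapshot files to that node is likely to leave it with partitions it does not own.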

    Only the partitions that the node owns will be kept. The partitions it doesn't own will be dropped.
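    To make the ownership rule concrete, here is a deliberately simplified model: one token per node, no replication, and small made-up integers instead of real partitioner tokens. A node with token T is taken to own the range from the previous node's token (exclusive) up to T (inclusive), wrapping around the ring. The names owner and kept_after_restore are hypothetical helpers, not Cassandra APIs.

```python
from bisect import bisect_left


def owner(ring, token):
    """Return the node owning `token` in `ring` (a dict of token -> node).

    The owner is the node with the smallest token >= `token`; tokens above
    the highest node token wrap around to the first node.
    """
    tokens = sorted(ring)
    i = bisect_left(tokens, token)
    return ring[tokens[i % len(tokens)]]


def kept_after_restore(ring, node, sstable_tokens):
    """Return the partition tokens from the copied SSTables that `node` owns."""
    return [t for t in sstable_tokens if owner(ring, t) == node]


# Hypothetical rings: cluster B was built with different tokens than A.
ring_a = {-100: "A1", 0: "A2", 100: "A3"}
ring_b = {-50: "B1", 50: "B2", 150: "B3"}

# Partition tokens found in A1's snapshot (all owned by A1 in ring_a).
a1_partitions = [-180, -120, 120, 170]

kept_on_b1 = kept_after_restore(ring_b, "B1", a1_partitions)
```

    In this toy example the partition with token 120 belongs to B3 in the new ring, so it would not be kept on B1. With real token assignments the mismatch can cover most of the data.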

    I've previously documented how to clone data to another cluster using the refresh method on DBA Stack Exchange. But if the clusters don't have an identical configuration, you'll need to clone the snapshots to the other cluster using sstableloader, which streams each partition to the nodes that own it instead of relying on the files already being on the right node. Cheers!
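    For the sstableloader route, a minimal sketch of assembling the command line is shown below. It only builds the argument list and does not run anything. The -d option takes a comma-separated list of initial hosts in the destination cluster, and the final argument is a directory whose last two path components are the keyspace and table names. The helper name, host addresses, and path are hypothetical placeholders.

```python
def build_sstableloader_cmd(dest_hosts, table_dir):
    """Build an sstableloader command as an argument list.

    dest_hosts: contact points in the destination cluster.
    table_dir:  directory holding the snapshot's SSTable files, laid out as
                .../<keyspace>/<table>.
    """
    if not dest_hosts:
        raise ValueError("at least one destination host is required")
    return ["sstableloader", "-d", ",".join(dest_hosts), table_dir]


# Hypothetical hosts and staging path; substitute your own values.
cmd = build_sstableloader_cmd(
    ["10.0.0.1", "10.0.0.2"],
    "/tmp/restore/my_keyspace/my_table",
)
```

    The resulting list could be handed to subprocess.run(cmd, check=True) once per table, from a machine that can reach the destination cluster.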