What is the best (industry-standard) approach to changing the rowkey design of a table that already contains approx. 1.5 million rows, each with one cell holding JSON, and is live in production? Multiple systems access the table, so we would prefer to end up with the same table name at the end of the process. We have about 40 capacity nodes, and cells are usually under 1 MB. The estimated table size is under 10 GB (probably even under 2 GB).
At this moment we are thinking of
The con of this approach is that the conversion will take ages, we would pay the toll of first attempting a read with a new key that doesn't exist yet, and rollback is complicated.
The con is that the only way to rename a table is to clone a snapshot. The impact on compaction, and therefore on the performance of the table, is unknown.
The con is downtime and the risk of triggering a compaction that would hinder performance for a long time.
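To make the "toll" in the first con concrete, here is a minimal sketch of a read path that tries the new key first and falls back to the old one. Everything in it is an assumption for illustration: the table is an in-memory dict rather than a real HBase client, `new_key_for` is a hypothetical key mapping (it just reverses the bytes), and `read_with_fallback` is a made-up helper name.

```python
from typing import Dict, Optional

# Stand-in for the table: an in-memory dict, NOT a real HBase client.
Table = Dict[bytes, bytes]


def new_key_for(old_key: bytes) -> bytes:
    """Hypothetical mapping from old rowkey to new rowkey (here: reversed bytes)."""
    return old_key[::-1]


def read_with_fallback(table: Table, old_key: bytes, stats: Dict[str, int]) -> Optional[bytes]:
    """Try the new key first; on a miss, read the old key and copy the row over."""
    new_key = new_key_for(old_key)
    stats["reads"] += 1
    value = table.get(new_key)
    if value is not None:
        return value
    # Miss on the new key: this second lookup is the extra cost paid
    # for every row that has not been converted yet.
    stats["reads"] += 1
    value = table.get(old_key)
    if value is not None:
        # Lazy copy under the new key; the old row is left in place,
        # so a rollback only has to ignore the new rows.
        table[new_key] = value
    return value


table = {b"user42": b'{"name": "a"}'}
stats = {"reads": 0}
read_with_fallback(table, b"user42", stats)  # two lookups, row gets copied
read_with_fallback(table, b"user42", stats)  # one lookup from now on
```

In this sketch every not-yet-converted row costs two lookups on its first read and one afterwards, which is why the conversion is slow to finish for rarely read rows unless a background job copies them as well.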
Can you think of a better approach? Or which one would you suggest?
A lot depends on the size of your cluster and how heavily it is accessed in real time. But here are some things to keep in mind: