apache cassandra nodetool

Cassandra - Data not replicating across all nodes


I'm running the same query on each of the three nodes. One node returns ten rows, while the other two nodes return only two rows each.

The replication factor is set to 3:

 keyspace_name | durable_writes | replication
---------------+----------------+-------------------------------------------------------------------------------------
 table name    |           True | {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}

Nodetool Netstats:

nodetool netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 16519
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name                    Active   Pending      Completed   Dropped
Large messages                  n/a         1             13         4
Small messages                  n/a         0         320422         4
Gossip messages                 n/a         0       12972040       470

Nodetool repair has been run on all of the nodes.


Solution

  • Based on your comment, the issue could be prevented by using a consistency level of QUORUM or higher. With a replication factor of 3, QUORUM requires 2 of the 3 replicas to respond, so a read issued at QUORUM overlaps with a write issued at QUORUM on at least one replica. Keep in mind that raising the consistency level can affect both performance and resiliency. For instance, a consistency level of ALL always returns accurate data, but if one instance in the cluster has a problem, queries will fail because the consistency level cannot be satisfied. The best consistency level depends on your use case and your SLAs.

    How often do you run repairs (nodetool repair) on your cluster? Repairs address the root cause of the nodes returning different data.
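
    To make the trade-off concrete, here is a minimal sketch of the arithmetic behind consistency levels. It assumes a single-datacenter keyspace using SimpleStrategy, as in the question. The function names are hypothetical helpers written for illustration; they are not part of any Cassandra driver or tool.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must acknowledge a request at the given level.

    Only a few common levels are modelled; this is an illustration,
    not a full list of Cassandra consistency levels.
    """
    required = {
        "ONE": 1,
        "TWO": 2,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def reads_see_latest_write(read_level: str, write_level: str, rf: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    r = replicas_required(read_level, rf)
    w = replicas_required(write_level, rf)
    return r + w > rf


def tolerated_failures(level: str, rf: int) -> int:
    """How many replicas can be down while the level can still be satisfied."""
    return rf - replicas_required(level, rf)


if __name__ == "__main__":
    rf = 3  # replication factor from the question
    for level in ("ONE", "QUORUM", "ALL"):
        print(
            level,
            "overlap with same-level writes:",
            reads_see_latest_write(level, level, rf),
            "| replicas that may be down:",
            tolerated_failures(level, rf),
        )
```

    With a replication factor of 3, reading and writing at ONE gives no overlap guarantee, which matches the symptom of nodes returning different row counts. QUORUM on both sides guarantees overlap and still tolerates one node being down, while ALL tolerates none. In cqlsh, the level for the current session can be changed with the CONSISTENCY command, for example CONSISTENCY QUORUM.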