distributed-system

Contradictions in replication in the Dynamo paper


This is the paper in question: https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf.

From Section 4.3 (Replication):

Each key, k, is assigned to a coordinator node (described in the previous section). The coordinator is in charge of the replication of the data items that fall within its range. In addition to locally storing each key within its range, the coordinator replicates these keys at the N-1 clockwise successor nodes in the ring

This indicates that, in the consistent hashing ring, only one node is the coordinator for a given key range. It writes locally and fans out updates to the other N-1 nodes in the preference list.
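To make that reading concrete, here is a minimal sketch of how a preference list could be derived from the ring. It is only an illustration, not the paper's implementation: the names `ring_position` and `preference_list` are made up, virtual nodes are ignored, and each node is placed on the ring by hashing its name.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a node name or key to a ring position (the hash choice is illustrative)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


def preference_list(key: str, nodes: list[str], n: int) -> list[str]:
    """Return the first node clockwise from the key, then its n-1 clockwise successors."""
    ring = sorted(nodes, key=ring_position)
    positions = [ring_position(node) for node in ring]
    # First node whose position is >= the key's position, wrapping around the ring.
    start = bisect.bisect_left(positions, ring_position(key)) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(min(n, len(ring)))]


nodes = ["A", "B", "C", "D", "E"]  # hypothetical node names
replicas = preference_list("foo", nodes, n=3)
```

Under the Section 4.3 reading, `replicas[0]` would be the coordinator for "foo" and `replicas[1:]` would be the N-1 successors it replicates to.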

From section 4.5

A node handling a read or write operation is known as the coordinator. Typically, this is the first among the top N nodes in the preference list. If the requests are received through a load balancer, requests to access a key may be routed to any random node in the ring. In this scenario, the node that receives the request will not coordinate it if the node is not in the top N of the requested key’s preference list. Instead, that node will forward the request to the first among the top N nodes in the preference list

From this paragraph, I infer that any of the top N nodes in the preference list can be the coordinator, and my reading of the vector clocks section seems to suggest the same.

This seems to directly contradict Section 4.3 of the paper.

Is my understanding correct that the coordinator can be any of the top N nodes in the preference list for a key range? Assuming healthy nodes and no network partitions, does only the first node in the preference list act as the coordinator, or can the other nodes in the preference list also act as the coordinator?

In healthy scenarios, could two writes of key "foo" with value "bar" go to two separate coordinator nodes, X and Y, or do they both go to the same node, i.e. the first in the preference list for the key "foo"?


Solution

  • To act on the data, a node must be in the preference list for that key range. But yes, any node can receive the request and simply forward it to a node that is in the preference list for that range.

    And to be clear, this is Dynamo, not DynamoDB; the two are often confused.
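
Read literally, the routing rule in the quoted Section 4.5 passage could be sketched as follows. This is an assumption-laden illustration rather than Dynamo's actual code: `choose_coordinator` is a hypothetical name, the preference list is passed in as a plain list, and all nodes are assumed healthy.

```python
def choose_coordinator(receiver: str, preference_list: list[str], n: int) -> str:
    """Pick the node that coordinates a request first received by `receiver`."""
    top_n = preference_list[:n]
    if receiver in top_n:
        # The receiver is one of the top N nodes for the key, so it coordinates.
        return receiver
    # Otherwise the request is forwarded to the first of the top N nodes.
    return top_n[0]


prefs = ["B", "C", "D"]  # hypothetical preference list for key "foo"
via_c = choose_coordinator("C", prefs, n=3)  # C is in the top N, so C coordinates
via_e = choose_coordinator("E", prefs, n=3)  # E is not, so the request goes to B
```

Under this reading, two writes to "foo" arriving at different top-N nodes through a load balancer could be coordinated by different nodes, even when everything is healthy.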