I'd like to keep track of only the recent data and also employ the help of Vector clocks in resolving issues so I can easily discard data via L-W-W rule.(last write wins) Say we have 3 nodes:
- Node1
- Node2
- Node3
Then we would use Vector clocks to keep track of causality and concurrency on each events/changes. We represent Vector clocks initially with
{Node1:0, Node2:0, Node3:0}.
For instance Node1 gets 5 local changes it would mean we increment its clock by 5 increments that would result into
{Node1: 5, Node2:0, Node3:0}.
This would be normally okay right?
Then what if at the same time Node2 updates its local and also incremented its clock resulting into
{Node1:0, Node2:1, Node3:0}.
At some point Node1 sends an event to Node3 passing the updates and piggybacking its vectorclock. So Node3 which has a VC of {Node1:0, Node2:0, Node3:0}
would easily just merge the data and clock as there are no changes on it yet.
The problem I'm thinking about how to deal with is what would happen if Node2 sends an event to update into Node3 passing it's own VC and updates.
What would happen to the data and the clocks. How do I apply Last Write wins here when the first one that gets written to Node3 which was from Node1 would basically appear as the later write as it have a greater VC value on its own clock.
Node3's clock before merging: {Node1: 5, Node2: 0 , Node3: 1}
Node2's messagevc that Node3 received: {Node1:0, Node2:1, Node3:0}
How do I handle resolving data on concurrent VCs?
This is a good question. You're running into this issue because you are using counters in your vector clocks, and you are not synchronizing the counters across nodes. You have a couple of options: