marklogic cap-theorem database nosql

How can MarkLogic have both consistency and availability?


The CAP theorem seems logical to me. I understand that:

If I want consistency in a distributed system, I have to wait for every transaction to reach all nodes. The cost of ACID is the time it takes to replicate the data across the network.

But how can MarkLogic have both: ACID and a distributed system without lag?
So is it possible to have BASE and ACID properties in the same database?
And is the CAP theorem wrong?


Solution

  • Availability in the CAP theorem is about the hosts on either side of the partition, not about the system as a whole.

    In the CAP theorem you are "Available" if all hosts on either side of a network partition can continue to accept both read and update transactions. Most of our customers don't care whether every host remains available in the face of a network partition. They care that the database as a whole remains available during a network partition. So if the cluster has replicated or shared data so that there is enough data on both sides of the partition to continue to serve queries, and is smart enough to know which side of the partition should remain available and which should gracefully bow out, then the database can remain available in the face of a network partition, even if not every host does. That's what MarkLogic does within a cluster.
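To make the "which side stays up" idea concrete, here is a minimal sketch in Python. It assumes a simple majority rule plus a check that the surviving side holds a copy of every shard. Both the rule and the names (`side_stays_available`, `replicas`, the host and shard names) are illustrative assumptions, not MarkLogic's actual implementation or API.

```python
def side_stays_available(side_hosts, all_hosts, replicas):
    """Decide whether one side of a network partition may keep serving.

    side_hosts: set of host names reachable on this side of the partition
    all_hosts:  set of every host name in the cluster
    replicas:   dict mapping each shard to the set of hosts holding a copy

    Assumed rule (hypothetical): the side must hold a strict majority of
    hosts AND at least one copy of every shard.
    """
    has_majority = len(side_hosts) > len(all_hosts) / 2
    has_all_data = all(holders & side_hosts for holders in replicas.values())
    return has_majority and has_all_data


# Hypothetical three-host cluster, each shard stored on two hosts.
all_hosts = {"h1", "h2", "h3"}
replicas = {
    "shard-a": {"h1", "h2"},
    "shard-b": {"h2", "h3"},
    "shard-c": {"h1", "h3"},
}

# The network splits into {h1, h2} and {h3}.
majority_side = side_stays_available({"h1", "h2"}, all_hosts, replicas)
minority_side = side_stays_available({"h3"}, all_hosts, replicas)
```

Under this rule the two-host side keeps accepting reads and updates while the lone host bows out, so at most one side ever accepts updates. The database stays available even though one host does not.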

    Between clusters, MarkLogic has many options for how close to absolutely consistent you want to be. We use asynchronous replication to move data between clusters, so if there is a network partition between clusters, the data may not be consistent between those clusters. You can control the lag limit to tune how far apart they are allowed to drift, and if you need absolute consistency between clusters, we have ways of achieving that as well.
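The trade-off behind a lag limit can be sketched roughly as follows. This is a toy model, not MarkLogic code: the class name, its methods, and the choice to refuse new writes once the limit is exceeded are all assumptions made for illustration.

```python
from collections import deque


class AsyncReplicatedPrimary:
    """Toy primary that ships writes to a replica asynchronously.

    Assumption (hypothetical behavior): once the oldest unshipped write
    is older than lag_limit_seconds, new writes are refused until the
    replica has caught up.
    """

    def __init__(self, lag_limit_seconds):
        self.lag_limit = lag_limit_seconds
        self.pending = deque()  # (commit_time, key, value) not yet on the replica
        self.local = {}         # data on the primary cluster
        self.replica = {}       # data on the remote cluster

    def current_lag(self, now):
        # Age of the oldest write the replica has not received yet.
        return now - self.pending[0][0] if self.pending else 0.0

    def write(self, key, value, now):
        if self.current_lag(now) > self.lag_limit:
            return False  # replica is too far behind: trade availability for consistency
        self.local[key] = value
        self.pending.append((now, key, value))
        return True

    def ship(self):
        # Called when the link between clusters works: drain the queue in order.
        while self.pending:
            _, key, value = self.pending.popleft()
            self.replica[key] = value


primary = AsyncReplicatedPrimary(lag_limit_seconds=5.0)
first = primary.write("a", 1, now=0.0)     # accepted, replica now lags
second = primary.write("b", 2, now=3.0)    # lag is 3 s, still within the limit
blocked = primary.write("c", 3, now=10.0)  # lag is 10 s, over the limit
primary.ship()                             # partition heals, replica catches up
retried = primary.write("c", 3, now=10.0)  # accepted again
```

A small lag limit keeps the clusters close to consistent at the price of refusing writes sooner during a partition. A large one keeps the primary writable longer but lets the replica fall further behind.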

    Bottom line is that:

    Hope that helps.