ios nosql couchdb cloudant touchdb

Cost of continuous replications vs one-shot replications (using TouchDB and Cloudant)


We have an app that uses Cloudant as a remote server. In our previous experience, however, Cloudant is not completely compatible with TouchDB's continuous replications, so for now our alternative is to manually trigger one-shot replications at a fixed frequency. We would like to know whether that approach is going to cost us more money than continuous replications, since continuous replications use longpoll and don't need to query the server often. In other words, does each one-shot pull replication from Cloudant cost us a GET request?

Thank you, Paul


Solution

  • I think the issue you refer to is [1]. Cloudant's replication is 100% compatible with CouchDB. In this instance, TouchDB's logs indicate the iOS network stack passed on incomplete JSON to TouchDB. It's not clear who was to blame in this case for the replication failure.

    [1] https://github.com/couchbaselabs/TouchDB-iOS/issues/241

    For the cost question, a one-shot pull replication will result in a GET to the _changes feed each time it happens, plus the other requests required to replicate. This _changes request will be counted as a light HTTP request against your Cloudant account.
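As a rough illustration of the request in question, here is a minimal sketch of the `_changes` URL a one-shot pull would GET, next to the longpoll variant used by continuous replication. The `feed` and `since` query parameters are part of the CouchDB `_changes` API; the account URL, database name, and the `changes_url` helper are hypothetical and are not TouchDB code.

```python
from urllib.parse import quote, urlencode


def changes_url(base_url, db_name, since, feed="normal"):
    """Build a _changes feed URL (hypothetical helper, for illustration only)."""
    # feed="normal" returns immediately with everything after `since`;
    # feed="longpoll" holds the connection open until a change arrives.
    query = urlencode({"feed": feed, "since": since})
    return f"{base_url.rstrip('/')}/{quote(db_name, safe='')}/_changes?{query}"


# Placeholder account and database names.
one_shot = changes_url("https://example.cloudant.com", "mydb", since="0")
longpoll = changes_url("https://example.cloudant.com", "mydb", since="0", feed="longpoll")

print(one_shot)
print(longpoll)
```

Either way it is one GET per call to `_changes`; the difference is only in how long the server holds the connection before answering.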

    However, whether this works out as more or fewer requests overall depends on the number of changes coming down from the remote server.

    It's also important to remember that the number of _changes calls is very small relative to the number of other calls involved (e.g., fetching the content of the changed documents, particularly if there are many attachments).

    While this question is specific to TouchDB, and I mention specific behaviours of that codebase, this answer deals with the requests involved in replication between any two systems speaking the CouchDB replication protocol[2].

    [2] http://www.dataprotocols.org/en/latest/couchdb_replication.html

    Let's take a contrived example: one update per 10-second window to the source database for the replication, where a TouchDB database is the target. Let's compare a one-shot replication every 5 minutes with a continuous replication. For simplicity of call-counting, let's also take attachments out of the picture. We'll also assume the device has a constant network connection.

    For the continuous case, every 10s TouchDB will receive an update in the _changes feed. This causes the longpoll connection to close. TouchDB then runs through the changes, requesting the updates from the source database, which means one or more GET requests to the remote server. While this is happening, TouchDB has to open another longpoll request to _changes. So in a five-minute period, you'd end up with perhaps 30 calls to _changes, plus all the calls to get documents and record checkpoints.

    Compare this with a one-shot replication every five minutes. You'd receive notification of the 30 updates in one _changes feed call. TouchDB implements an optimisation[3] whereby it will call _all_docs to fetch updated documents in bulk for generation-1 revisions (revision IDs beginning with 1-), so you might end up with a single call to get all 30 documents. This batching is not possible in the continuous case, as each _changes response there carries a single change. Then you have the checkpoint documents to record. At best this is fewer than 5 HTTP calls; at worst it is about a third of the continuous case, as you've avoided the extra _changes requests.

    [3] https://github.com/couchbaselabs/TouchDB-iOS/wiki/Replication-Algorithm#performance
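The call counts in the contrived example can be tallied with a small back-of-the-envelope model. This is a sketch under stated assumptions, not a measurement of TouchDB: it assumes each batch of changes costs one `_changes` call and one checkpoint call (real replicators may use more), and that documents are fetched either one GET each or in a single bulk call. The function and parameter names are made up for illustration.

```python
def requests_per_window(updates, batches, bulk_fetch, checkpoint_calls_per_batch=1):
    """Estimate HTTP requests in one window (simplified, assumed model)."""
    changes_calls = batches  # one _changes call per batch of changes
    # Bulk fetch: one call per batch; otherwise one GET per updated document.
    doc_calls = batches if bulk_fetch else updates
    checkpoint_calls = batches * checkpoint_calls_per_batch  # assumed: 1 per batch
    return changes_calls + doc_calls + checkpoint_calls


updates = 300 // 10  # one update every 10 s over a 5-minute window

# Continuous: every update arrives as its own batch of one change.
continuous = requests_per_window(updates, batches=updates, bulk_fetch=False)

# One-shot every 5 minutes: a single batch of 30 changes.
one_shot_best = requests_per_window(updates, batches=1, bulk_fetch=True)
one_shot_worst = requests_per_window(updates, batches=1, bulk_fetch=False)

print(continuous, one_shot_best, one_shot_worst)  # 90 3 32
```

Under these assumptions the one-shot case lands between 3 and 32 requests against 90 for continuous, which is in line with the "fewer than 5" and "about a third" estimates above.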

    It comes down to the frequency of updates you expect to the source database. One-shot replication is likely to provide a smoother price curve as you're in better control of the number of requests you make.
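To see how the comparison shifts with update frequency, here is a rough hourly estimate under the same kind of simplified model. The assumptions are loud ones: updates are spread evenly, an idle longpoll connection costs nothing extra (in practice it may time out and reopen), each processed batch costs one checkpoint call, and a poll that finds no changes costs only its `_changes` call. All names are hypothetical.

```python
def hourly_continuous(updates_per_hour, calls_per_update=3):
    """Assumed: each update costs a _changes reopen, a doc GET and a checkpoint."""
    return updates_per_hour * calls_per_update


def hourly_one_shot(updates_per_hour, poll_minutes, bulk_fetch=True):
    """Assumed: every poll costs one _changes call, even when nothing changed."""
    polls = 60 // poll_minutes
    # Polls that actually find at least one change (updates spread evenly).
    active_polls = min(polls, updates_per_hour)
    doc_calls = active_polls if bulk_fetch else updates_per_hour
    checkpoint_calls = active_polls
    return polls + doc_calls + checkpoint_calls


for rate in (0, 1, 360):  # updates per hour; 360 is one every 10 s
    print(rate, hourly_continuous(rate), hourly_one_shot(rate, poll_minutes=5))
```

In this model a busy source (360 updates per hour) favours the five-minute poll, 36 requests against 1080, while a nearly idle source favours continuous replication, since the poll still pays for 12 `_changes` calls an hour whether or not anything changed.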

    A further question is how often connections will drop because of network disconnects, which happen regularly with mobile devices. TouchDB's continuous replications will fire back up each time the user comes online (if added via the _replicator database). This is a further source of unpredictable costs.

    However, the benefits of more immediate visibility of changes may well be worth the uncertainty.