javascriptnode.jscouchdbcouchdb-nano

Is there a bug with couch-nano changesReader batching, or have I missed something?


This concerns the changesReader API of the couchdb-nano library.

Expected Behavior

I would expect the following code to wait 10 seconds and then return a batch of messages from the last position in the changes feed. So if I updated the DB 5 times within that 10 second timeout I would expect to see: "a batch of 5 changes has arrived" after 10 seconds.

messages.changesReader
  .start({ timeout: 10000 })
  .on('batch', (batch) => {
    console.log('a batch of', batch.length, 'changes has arrived');
  }).on('error', (e) => {
    console.error('error', e);
  });

Current Behavior

What is happening is that no batching is taking place. Every time I update the database within that timeout I immediately get the log "a batch of 1 changes has arrived".

Have I missed something obvious or misunderstood how nano's batching works?


Solution

  • There is no problem with nano. The super short answer is to use get but even then your interpretation of timeout (which seems much more like a pause or accumulate) is flawed.

    Consider the documentation for start

    (start) to listen to changes indefinitely by repeated "long poll" requests. This mode continues to poll for changes forever.

    OK that's clear, start polls "indefinitely". The timeout documentation states

    The number of milliseconds a changes feed request waits for data

    So timeout does not apply when using start. Yes, the documentation could be less puzzling.

    When does timeout apply? Here's the get method documentation

    (get) to listen to changes until the end of the changes feed is reached, by repeated "long poll" requests. Once a response with zero changes is received, the 'end' event will indicate the end of the changes and polling will stop.

    In other words if X milliseconds transpire without a change event, get polling concludes with the end event (maybe spool as well - I didn't test).

    Consider the node script below, which demonstrates the behavior with get. The end event will occur 10 seconds after the last batch event fires.

    Here's an example output

    $ node nano.js
    Processing start 3:38:00 PM
    a batch of 1 changes has arrived at 3:38:01 PM
    a batch of 2 changes has arrived at 3:38:01 PM
    a batch of 4 changes has arrived at 3:38:05 PM
    a batch of 1 changes has arrived at 3:38:05 PM
    a batch of 1 changes has arrived at 3:38:08 PM
    Processing ended at 3:38:18 PM
    

    Note the script ended 10 seconds (the timeout) after the last event.

    If one were to replace get with start, the end event never fires.

    const nano = require("nano")("http://uid:pwd@127.0.0.1:5984");
    const db = nano.db.use("my_scratch_db");
    const nowstr = () => new Date().toLocaleTimeString();
    
    console.log(`Processing start ${nowstr()}`);
    
    (async () => {
        db.changesReader
            .get({ timeout: 10000 })
            .on("batch", (b) => {
                console.log("a batch of", b.length, `changes has arrived at ${nowstr()}`);
            })
            .on("end", (e) => {
                console.log(`Processing ${nowstr()}`);
            })
            .on("error", (e) => {
                console.error("error", (e || "no reason given").toString());
            });
    })();
    
    setTimeout(() => {
        // insert some docs, one at a time
        const docs = [{ a: 1 }];
        db.bulk({ docs });
        db.bulk({ docs });
        db.bulk({ docs });
    }, 1000);
    
    setTimeout(() => {
        // insert a batch of docs
        const docs = [{ a: 3 }, { a: 4 }, { a: 5 }, { a: 6 }, { a: 7 }];
        db.bulk({ docs });
    }, 5000);
    
    setTimeout(() => {
        // insert one last doc
        const docs = [{ a: 1 }];
        db.bulk({ docs });
    }, 8000);
    

    Update

    I thought it important to add that the feed responses can vary depending on the disposition of the server and client. For example, a subsequent run the above script produced compact batch responses:

    Processing start 4:54:34 PM
    a batch of 3 changes has arrived at 4:54:35 PM
    a batch of 5 changes has arrived at 4:54:39 PM
    a batch of 1 changes has arrived at 4:54:42 PM
    Processing ended at 4:54:52 PM