
What are the advantages of using a stream over `fetch()`?


I was trying to download a file using the OneDrive JS SDK, so I've used the code from Microsoft:

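// Download a file from OneDrive
const fs = require('fs'); // Node's filesystem module
// `client` is assumed to be an already-authenticated SDK client created elsewhere
client
    .api('/me/drive/root/children/Book.xlsx/content')
    .getStream((err, downloadStream) => {
        if (err) {
            console.error(err);
            return;
        }
        const writeStream = fs.createWriteStream('../Book1.xlsx');
        // pipe() returns the destination, so each stream needs its own error handler
        downloadStream.on('error', console.error);
        downloadStream.pipe(writeStream).on('error', console.error);
    });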

Because I want it to work in a browser too (not just in Node), I first tried a stream library for browsers but couldn't get anything working. Eventually, I got it working with just the REST API and fetch() (the SDK is a wrapper over the REST API).

A simple fetch(url) did the job. So I'm wondering, why did MS go through the trouble of all the stream code above when a single line would do the job?

In particular, is the performance of streams somehow better than that of fetch()? For example, would fetch() freeze the app when downloading large files while streams wouldn't? Are there any other differences?


Solution

  • Streams are more efficient, in more than one way.

    You can perform processing as-you-go.

    For example, if you need to process a series of data that lives in a remote location, a stream lets you process the data as it flows in, so the processing and the download run in parallel.

    This is much more efficient than waiting for the whole download to finish and only then processing everything in one go.
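
    As a rough illustration of processing as you go, here is a minimal Node.js sketch. The `countLines` function and the `fakeDownload` source are made up for this example; a real download stream could be passed in the same way. Only a running total is kept, never the whole payload.

```javascript
const { Readable } = require('node:stream');

// Counts newline characters while chunks arrive, keeping only a running total.
async function countLines(readable) {
  let lines = 0;
  for await (const chunk of readable) {
    // Each chunk is handled as soon as it is available; nothing is accumulated.
    for (const ch of chunk.toString()) {
      if (ch === '\n') lines += 1;
    }
  }
  return lines;
}

// Stand-in for a download: three chunks emitted one after another.
const fakeDownload = Readable.from(['id,name\n1,', 'Ada\n2,Gr', 'ace\n']);

countLines(fakeDownload).then((n) => console.log(`lines: ${n}`));
```

    Note that a chunk boundary can fall in the middle of a record (as with `2,Gr` / `ace` here), which is part of why stream code takes more care than working on one complete buffer.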

    Streams consume much less memory.

    If you download a 1 GB file without using streams, you consume roughly 1 GB of memory: the whole response is stored temporarily somewhere, e.g. in a variable, and only then do you start reading from that variable to save it to a file. In other words, you store all your data in a buffer before you start processing it.

    In contrast, a stream writes to the file as the content arrives. Imagine a stream of water flowing into a jug.

    AFAIK this is the main reason that data downloads are usually handled with Streams.
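
    The memory point can be sketched in Node.js as below. It is worth knowing that fetch() does not force buffering by itself: `response.body` is a readable stream, and the full-size buffer only appears once you call `response.blob()`, `response.arrayBuffer()` or `response.text()`. The helper names (`saveStream`, `downloadTo`, `fakeChunks`) are hypothetical, and `downloadTo` assumes Node 18+ where fetch() is a global; it is shown but not called.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { Readable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Writes each chunk to disk as it arrives; pipeline() also handles
// backpressure and destroys both streams if either one fails.
async function saveStream(source, destPath) {
  await pipeline(source, fs.createWriteStream(destPath));
  return fs.statSync(destPath).size;
}

// Hypothetical helper (not part of any SDK): stream a fetch() body to a file.
// Assumes Node 18+, where response.body is a web ReadableStream.
async function downloadTo(url, destPath) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return saveStream(Readable.fromWeb(response.body), destPath);
}

// Local demo with a generated source instead of a real download.
function* fakeChunks(count, size) {
  for (let i = 0; i < count; i += 1) yield Buffer.alloc(size, 0x61);
}

const demoPath = path.join(os.tmpdir(), 'stream-demo.bin');
saveStream(Readable.from(fakeChunks(4, 1024)), demoPath)
  .then((bytes) => console.log(`wrote ${bytes} bytes`));
```

    At any moment only about one chunk is held in memory, regardless of how large the file is. In a browser, the same idea would go through `response.body.getReader()` rather than Node's `stream` module.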


    That being said, apart from file downloads and real-time stuff, it rarely makes sense to use streams over the usual request/response scheme.

    Stream handling is generally more complex to implement and reason about.