node.js, amazon-web-services, amazon-s3, node-streams

Pipe a stream to s3.upload()


I'm currently making use of a node.js plugin called s3-upload-stream to stream very large files to Amazon S3. It uses the multipart API and for the most part it works very well.

However, this module is showing its age and I've already had to make modifications to it (the author has deprecated it as well). Today I ran into another issue with Amazon, and I would really like to take the author's recommendation and start using the official aws-sdk to accomplish my uploads.

BUT.

The official SDK does not seem to support piping to s3.upload(). The nature of s3.upload() is that you have to pass the readable stream in as an argument (the Body parameter), so there is no writable stream to pipe to.

I have more than 120 user code modules that do various file processing, and they are agnostic to the final destination of their output. The engine hands them a pipeable writable output stream, and they pipe to it. I cannot hand them an AWS.S3 object and ask them to call upload() on it without adding code to all the modules. I used s3-upload-stream because it supported piping.

Is there a way to make aws-sdk s3.upload() something I can pipe the stream to?


Solution

  • Wrap the S3 upload() function with a Node.js stream.PassThrough stream: pass the PassThrough to upload() as the Body, and return it so the caller can pipe into it.

    Here's an example:

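    var stream = require('stream');

    // BUCKET and KEY are placeholders for the destination bucket and object key.
    inputStream
      .pipe(uploadFromStream(s3));

    function uploadFromStream(s3) {
      // Whatever is written to the PassThrough is read by s3.upload() as the Body.
      var pass = new stream.PassThrough();

      var params = {Bucket: BUCKET, Key: KEY, Body: pass};
      s3.upload(params, function(err, data) {
        // Called once the upload has finished or failed.
        console.log(err, data);
      });

      return pass;
    }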
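
The wrapper above only logs the result, so the caller cannot tell when the upload has finished or whether it failed. A minimal sketch of one possible variation follows: it returns the writable stream together with a promise that settles when upload() calls back. The helper name createUploadStream, its bucket and key parameters, and the { writeStream, done } return shape are assumptions made for illustration; they are not part of aws-sdk. The only thing it relies on is that s3.upload(params, callback) accepts a stream as Body.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper (not part of aws-sdk): returns a stream to pipe into
// and a promise that settles when s3.upload() invokes its callback.
// `s3` is assumed to be an AWS.S3 instance, or anything with the same
// upload(params, callback) shape.
function createUploadStream(s3, bucket, key) {
  const pass = new PassThrough();

  const done = new Promise((resolve, reject) => {
    // upload() starts reading from `pass` as data is written to it.
    s3.upload({ Bucket: bucket, Key: key, Body: pass }, (err, data) => {
      if (err) {
        reject(err);
      } else {
        resolve(data);
      }
    });
  });

  return { writeStream: pass, done };
}
```

The engine could then hand writeStream to a module as its output stream (inputStream.pipe(writeStream)) and wait on done itself, so the modules still only see a writable stream and need no changes.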