Tags: amazon-web-services, amazon-s3, uppy

Uppy Companion doesn't work for > 5GB files with Multipart S3 uploads


Our app allows our clients to upload large files. Files are stored on AWS S3; we use Uppy for the upload and dockerize it to run under a Kubernetes deployment where we can scale up the number of instances.

It works well, but we noticed that all uploads over 5GB fail. I know Uppy has a plugin for AWS multipart uploads, but even with it installed during the container image build, the result is the same.

Here's our Dockerfile. Has anyone ever succeeded in uploading > 5GB files to S3 via Uppy? Is there anything we're missing?

FROM node:alpine AS companion
RUN yarn global add @uppy/companion@3.0.1
RUN yarn global add @uppy/aws-s3-multipart
ARG UPPY_COMPANION_DOMAIN=[...redacted..]
ARG UPPY_AWS_BUCKET=[...redacted..]


ENV COMPANION_SECRET=[...redacted..]
ENV COMPANION_PREAUTH_SECRET=[...redacted..]
ENV COMPANION_DOMAIN=${UPPY_COMPANION_DOMAIN}
ENV COMPANION_PROTOCOL="https"
ENV COMPANION_DATADIR="COMPANION_DATA"
# ENV COMPANION_HIDE_WELCOME="true"
# ENV COMPANION_HIDE_METRICS="true"
ENV COMPANION_CLIENT_ORIGINS=[...redacted..]
ENV COMPANION_AWS_KEY=[...redacted..]
ENV COMPANION_AWS_SECRET=[...redacted..]
ENV COMPANION_AWS_BUCKET=${UPPY_AWS_BUCKET}
ENV COMPANION_AWS_REGION="us-east-2"
ENV COMPANION_AWS_USE_ACCELERATE_ENDPOINT="true"
ENV COMPANION_AWS_EXPIRES="3600"
ENV COMPANION_AWS_ACL="public-read"
# We don't need to store data for just S3 uploads, but Uppy throws unless this dir exists.
RUN mkdir COMPANION_DATA

CMD ["companion"]

EXPOSE 3020

EDIT:

I made sure I had:

uppy.use(AwsS3Multipart, {
  limit: 5,
  companionUrl: '<our uppy url>',
})

And it still doesn't work. I see all the chunks of the 9GB file being sent in the network tab, but as soon as it hits 100%, Uppy throws a "cannot post" error (to our S3 URL) and that's it: failure.

Has anyone ever encountered this? The upload goes fine until 100%, then the last chunk gets an HTTP 413 ("Payload Too Large") error, making the entire upload fail.


Thanks!


Solution

  • Here are some code samples from my repository that show the flow of using the busboy package to get upload data into an S3 bucket: an Express handler receives the request, busboy parses the multipart stream into a buffer, and the AWS SDK v3 S3 client uploads that buffer. I'm also adding reference links so you can see the details of the packages I'm using:

    https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/clients/client-s3/index.html

    https://www.npmjs.com/package/busboy

    import { Request, Response } from 'express';
    import Busboy from 'busboy'; // busboy@0.x API (constructor + five-argument 'file' event)
    import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

    // Minimal response shape used by the helpers below.
    interface ResponseDto {
      status: boolean;
      message: string;
      data?: any;
      error?: string;
    }

    const s3Client = new S3Client({ region: 'us-east-2' }); // match your bucket's region

    export const uploadStreamFile = async (req: Request, res: Response) => {
      const busboy = new Busboy({ headers: req.headers });
      const streamResponse = await busboyStream(busboy, req);
      const uploadResponse = await s3FileUpload(streamResponse.data.buffer);
      return res.send(uploadResponse);
    };
    
    const busboyStream = async (busboy: any, req: Request): Promise<any> => {
      return new Promise((resolve, reject) => {
        try {
          const fileData: any[] = [];
          let fileBuffer: Buffer;
          let metaData: any = {};

          busboy.on('file', (fieldName: any, file: any, fileName: any, encoding: any, mimetype: any) => {
            // File is missing in the request
            if (!fileName) {
              reject("File not found!");
              return;
            }
            metaData = { fileName, encoding, mimetype };

            let totalBytes: number = 0;
            file.on('data', (chunk: any) => {
              fileData.push(chunk);
              // Logging only; TODO remove once the project is live
              totalBytes += chunk.length;
              console.log('File [' + fieldName + '] got ' + chunk.length + ' bytes');
            });

            file.on('error', (err: any) => {
              reject(err);
            });

            file.on('end', () => {
              // The whole file is now held in memory as a single buffer
              fileBuffer = Buffer.concat(fileData);
            });
          });

          // File parsing went well
          busboy.on('finish', () => {
            const responseData: ResponseDto = {
              status: true,
              message: "File parsing done",
              data: {
                buffer: fileBuffer,
                metaData
              }
            };
            resolve(responseData);
            console.log('Done parsing data! File ready for upload');
          });

          req.pipe(busboy);
        } catch (error) {
          reject(error);
        }
      });
    }
    
    const s3FileUpload = async (fileData: any): Promise<ResponseDto> => {
      try {
        const params: any = {
          Bucket: <BUCKET_NAME>,
          Key: <path>,
          Body: fileData,
          ContentType: <content_type>,
          ServerSideEncryption: "AES256",
        };
        const command = new PutObjectCommand(params);
        const uploadResponse: any = await s3Client.send(command);
        return { status: true, message: "File uploaded successfully", data: uploadResponse };
      } catch (error: any) {
        return { status: false, message: "File upload failed, please contact tech support!", error: error.message };
      }
    }
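
    For completeness, a minimal sketch of how the handler above might be mounted in an Express app. The /upload route path is my assumption, not part of the original code; port 3020 just mirrors the EXPOSE in the Dockerfile above:

    import express from 'express';

    const app = express();

    // No body-parsing middleware on this route: busboy consumes the raw
    // multipart/form-data request stream itself.
    app.post('/upload', uploadStreamFile);

    app.listen(3020, () => console.log('Upload service listening on port 3020'));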
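
    One caveat that matters for the original > 5GB question: S3 caps a single PutObject call at 5GB per object, and buffering a 9GB file in memory is likely to exhaust the container anyway. Below is a sketch of a streaming alternative using the Upload helper from @aws-sdk/lib-storage, which performs a multipart upload under the hood (objects up to 5TB); the bucket name is a placeholder as above, and the region is taken from the Dockerfile:

    import { S3Client } from '@aws-sdk/client-s3';
    import { Upload } from '@aws-sdk/lib-storage';
    import { Readable } from 'stream';

    const client = new S3Client({ region: 'us-east-2' });

    // Streams the body to S3 as a multipart upload, so objects beyond the
    // 5GB single-PUT cap can be uploaded without buffering the whole file.
    const s3StreamUpload = async (body: Readable, key: string) => {
      const upload = new Upload({
        client,
        params: {
          Bucket: '<BUCKET_NAME>', // placeholder
          Key: key,
          Body: body,
          ServerSideEncryption: 'AES256',
        },
        queueSize: 4,               // concurrent part uploads
        partSize: 50 * 1024 * 1024, // 50MB parts (minimum is 5MB)
      });
      return upload.done();
    };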