Tags: javascript, node.js, axios, busboy

Upload byte array from axios to Node server


Background

The JavaScript library for Microsoft Office add-ins lets you read the raw content of a DOCX file through the getFileAsync() API, which returns the file in slices of up to 4 MB each. You keep requesting slices, sliding-window style, until you have read the entire content. I need to upload these slices to the server and then join them back together to recreate the original DOCX file.
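To make the goal concrete, here is a minimal in-memory sketch of the split-and-rejoin idea. The names splitIntoSlices and joinSlices are hypothetical helpers for illustration only; they are not part of office.js, axios, or the upload middleware, and the real slices come from the Office API rather than from an existing array.

```typescript
// Hypothetical helper: cut a byte array into fixed-size slices (the last one may be shorter).
function splitIntoSlices(bytes: Uint8Array, sliceSize: number): Uint8Array[] {
  if (sliceSize <= 0) {
    throw new RangeError("sliceSize must be positive");
  }
  const slices: Uint8Array[] = [];
  for (let start = 0; start < bytes.length; start += sliceSize) {
    // subarray clamps the end index, so the final slice is simply shorter.
    slices.push(bytes.subarray(start, start + sliceSize));
  }
  return slices;
}

// Hypothetical helper: join slices, in order, back into one contiguous byte array.
function joinSlices(slices: Uint8Array[]): Uint8Array {
  const total = slices.reduce((sum, s) => sum + s.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const s of slices) {
    out.set(s, offset);
    offset += s.length;
  }
  return out;
}

// Usage: the rebuilt array should have exactly as many bytes as the source.
const source = new Uint8Array([80, 75, 3, 4, 20, 0, 6, 0, 8, 0]);
const rebuilt = joinSlices(splitIntoSlices(source, 4));
console.log(rebuilt.length === source.length);
```

The key property is that the rebuilt file must have exactly the same byte length as the original, which is what makes an oversized chunk on the server an immediate red flag.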

My attempt

I'm using axios on the client side and the busboy-based express-chunked-file-upload middleware on my Node server. As I read the slices recursively, I get a raw array of bytes for each one, which I convert to a Blob and append to a FormData object before posting it to the Node server. The whole thing works, and the slice arrives on the server. However, the chunk written to disk on the server is much larger than the blob I uploaded, typically about three times the size, so the server is clearly not writing the bytes I sent.

My suspicion is that this may have to do with stream encoding, but the node middleware does not expose any options to set encoding.

Here is the current state of the code:

Client-side

public sendActiveDocument(uploadAs: string, sliceSize: number): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    Office.context.document.getFileAsync(Office.FileType.Compressed,
      { sliceSize: sliceSize },

      async (result) => {
        if (result.status == Office.AsyncResultStatus.Succeeded) {

          // Get the File object from the result.
          const myFile = result.value;
          const state = {
            file: myFile,
            filename: uploadAs,
            counter: 0,
            sliceCount: myFile.sliceCount,
            chunkSize: sliceSize
          } as getFileState;

          console.log("Getting file of " + myFile.size + " bytes");
          const hash = makeId(12)
          // Pass the callback itself; resolve(true) would run immediately and ignore getSlice's result.
          this.getSlice(state, hash).then(resolve)
        } else {
          resolve(false)
        }
      })
  })
}

private async getSlice(state: getFileState, fileHash: string): Promise<boolean> {
  const result = await this.getSliceAsyncPromise(state.file, state.counter)

  if (result.status == Office.AsyncResultStatus.Succeeded) {

    const data = result.value.data;

    if (data) { 
      const formData = new FormData();
      formData.append("file", new Blob([data]), state.filename);

      const boundary = makeId(12);

      const start = state.counter * state.chunkSize
      const end = (state.counter + 1) * state.chunkSize
      const total = state.file.size

      return await Axios.post('/upload', formData, {
        headers: {
          "Content-Type": `multipart/form-data; boundary=${boundary}`,
          "file-chunk-id": fileHash,
          "file-chunk-size": state.chunkSize,
          "Content-Range": 'bytes ' + start + '-' + end + '/' + total,
        },
      }).then(async res => {
        if (res.status === 200) {
          state.counter++;

          if (state.counter < state.sliceCount) {
            return await this.getSlice(state, fileHash);
          }
          else {
            this.closeFile(state);
            return true
          }
        }
        else {
          return false
        }
      }).catch(err => {
        console.log(err)
        this.closeFile(state)
        return false
      })
    } else {
      return false
    }
  }
  else {
    console.log(result.status);
    return false
  }
}

private getSliceAsyncPromise(file: Office.File, sliceNumber: number): Promise<Office.AsyncResult<Office.Slice>> {
  return new Promise(function (resolve) {
    file.getSliceAsync(sliceNumber, result => resolve(result))
  })
}

Server-side

This code comes straight from the npm package (express-chunked-file-upload), so I'm not supposed to change anything in it, but here it is for reference:

makeMiddleware = () => {
    return (req, res, next) => {
        const busboy = new Busboy({ headers: req.headers });
        busboy.on('file', (fieldName, file, filename, _0, _1) => {

            if (this.fileField !== fieldName) {  // Current field is not handled.
                return next();
            }

            const chunkSize = req.headers[this.chunkSizeHeader] || 500000;  // Default: 500Kb.
            const chunkId = req.headers[this.chunkIdHeader] || 'unique-file-id';  // If not specified, will reuse same chunk id.
            // NOTE: Using the same chunk id for multiple file uploads in parallel will corrupt the result.

            const contentRangeHeader = req.headers['content-range'];
            let contentRange;

            const errorMessage = util.format(
                'Invalid Content-Range header: %s', contentRangeHeader
            );

            try {
                contentRange = parse(contentRangeHeader);
            } catch (err) {
                return next(new Error(errorMessage));
            }

            if (!contentRange) {
                return next(new Error(errorMessage));
            }

            const part = contentRange.start / chunkSize;
            const partFilename = util.format('%i.part', part);

            const tmpDir = util.format('/tmp/%s', chunkId);
            this._makeSureDirExists(tmpDir);

            const partPath = path.join(tmpDir, partFilename);

            const writableStream = fs.createWriteStream(partPath);
            file.pipe(writableStream);

            file.on('end', () => {
                req.filePart = part;
                if (this._isLastPart(contentRange)) {
                    req.isLastPart = true;
                    this._buildOriginalFile(chunkId, chunkSize, contentRange, filename).then(() => {
                        next();
                    }).catch(_ => {
                        const errorMessage = 'Failed merging parts.';
                        next(new Error(errorMessage));
                    });
                } else {
                    req.isLastPart = false;
                    next();
                }
            });
        });

        req.pipe(busboy);
    };
}

Update

It looks like I have at least found the problem: busboy appears to be writing my array of bytes to the output file as text. When I upload the byte array [80,75,3,4,20,0,6,0,8,0,0,0,33,0,44,25], the file on disk contains 80,75,3,4,20,0,6,0,8,0,0,0,33,0,44,25 as text. Now I need to figure out how to force it to write the data as a binary stream.


Solution

Figured it out. In case it helps anyone: there was no problem with busboy, office.js, or axios. I just had to convert the incoming chunk of data to a Uint8Array before creating a Blob from it. So instead of:

    formData.append("file", new Blob([data]), state.filename);

do this:

    const blob = new Blob([new Uint8Array(data)]);
    formData.append("file", blob, state.filename);

And it worked like a charm.
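For anyone wondering why the conversion matters: the Blob constructor only treats Blobs, ArrayBuffers, and typed array views as binary, and turns anything else into a string. A plain array of numbers therefore becomes its comma-separated text form, which also explains the roughly threefold size increase. The sketch below reproduces this outside of Office; blobFromRaw and blobFromBytes are hypothetical names used only for illustration, and the cast exists purely to get the buggy variant past the TypeScript type checker.

```typescript
// Hypothetical helper reproducing the bug: a plain number[] is not binary data,
// so the Blob constructor stringifies it (e.g. "80,75,3,4").
function blobFromRaw(data: number[]): Blob {
  return new Blob([data as unknown as string]);
}

// Hypothetical helper with the fix: a Uint8Array is copied into the Blob byte for byte.
function blobFromBytes(data: number[]): Blob {
  return new Blob([new Uint8Array(data)]);
}

// The same byte values that appear in the update above.
const bytes = [80, 75, 3, 4, 20, 0, 6, 0, 8, 0, 0, 0, 33, 0, 44, 25];

// The first size matches the length of the comma-separated text,
// the second matches the number of bytes.
console.log(blobFromRaw(bytes).size, bytes.join(",").length);
console.log(blobFromBytes(bytes).size, bytes.length);
```

Comparing blob.size with the expected slice size on the client before uploading is a quick way to catch this kind of mismatch early.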