javascript amazon-s3 gzip pako

Browser JavaScript: Compress JSON to gzip and upload to S3 presigned URL


Any advice would be appreciated. I've got a JSON variable in my web application that I'd like to gzip and upload to S3 through a presigned URL.

I'm able to upload JSON successfully, but I fail to gzip the JSON and then upload it.

The three ways I've tried to build the gzipped JSON are:

// example json
const someJson = { testOne: 'a', testTwo: 'b' };

// Attempt one
const stringUtf16 = JSON.stringify(someJson);
const resultAsBinString = pako.gzip(stringUtf16);

// Attempt two
const stringUtf16 = JSON.stringify(someJson);
const resultAsBinString = pako.gzip(stringUtf16, { to: 'string' });

// Attempt three
const stringUtf16ThatWeNeedInUtf8 = JSON.stringify(someJson);
const stringUtf8 = unescape(encodeURIComponent(stringUtf16ThatWeNeedInUtf8));
const resultAsBinString = pako.gzip(stringUtf8);

For each attempt, I uploaded resultAsBinString through Angular's HTTP client, with the headers Content-Type: 'application/x-gzip' and Content-Encoding: 'gzip'.

But when the file is downloaded from S3 afterwards (if the upload succeeds at all; it often fails with a network error), trying to unzip it with gzip or gunzip in the terminal fails with the error message 'not in gzip format'.
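
One possible explanation, offered as an assumption rather than something confirmed here, is that the compressed bytes are altered on their way to S3: a binary string sent as a text body is re-encoded as UTF-8, and a bare Uint8Array may be JSON-serialized by an HTTP client that does not treat it as binary. The sketch below uses a hypothetical helper, looksLikeGzip, to check for the gzip magic bytes 0x1f 0x8b and shows how either conversion would break them.

```javascript
// Hypothetical helper: true when the bytes start with the gzip magic number 0x1f 0x8b.
function looksLikeGzip(bytes) {
  return bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b;
}

// The first two bytes of any gzip stream, as raw bytes.
const header = new Uint8Array([0x1f, 0x8b]);

// Case 1: the same bytes held in a "binary string" and sent as text.
// Text is encoded as UTF-8, so the character 0x8b becomes two bytes (0xc2 0x8b).
const asBinaryString = String.fromCharCode(...header);
const sentAsText = new TextEncoder().encode(asBinaryString);

// Case 2: a typed array passed through JSON.stringify becomes an index-keyed object.
const sentAsJson = new TextEncoder().encode(JSON.stringify(header));

console.log(looksLikeGzip(header));     // true
console.log(looksLikeGzip(sentAsText)); // false
console.log(looksLikeGzip(sentAsJson)); // false
```

Running the same check on the first two bytes of the file downloaded from S3 is a quick way to tell whether the body was changed before it was stored.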

Sources I've tried to follow:

https://github.com/nodeca/pako/issues/55
https://github.com/nodeca/pako/blob/master/examples/browser.html


Solution

  • The following process worked for me:

    Generate the presigned URL with Content-Type: 'application/json'. The provided filename should end in .gz. Inspecting the returned presigned URL should confirm that the content type is application/json.

    Because I'm certain my JSON contains no strings that would break the conversion to UTF-8, I then do the following (code in Angular, but it conveys the structure):

    const headers = new HttpHeaders({
        'Content-Type': 'application/json',
        'Content-Encoding': 'gzip'
    });  //1
    const httpOptions = {
        headers: headers
    };
    const str = JSON.stringify(geoJson); //2
    const utf8Data = unescape(encodeURIComponent(str)); //3
    const geoJsonGz = pako.gzip(utf8Data); //4
    const gzippedBlob = new Blob([geoJsonGz]); //5
    upload = this.httpClient.put(presignedUploadUrl, gzippedBlob, httpOptions); //6
    

    Steps followed in the code:

    1. Set the Content-Type header to application/json and the Content-Encoding header to gzip.
    2. Stringify the JSON.
    3. Convert the string to UTF-8.
    4. Gzip the string.
    5. Create a Blob (file) from the gzipped data.
    6. Upload the Blob to the presigned URL.
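
For readers not using Angular or pako, the same six steps can be sketched with platform APIs only. This is a minimal sketch under stated assumptions, not the method used above: it swaps in the standard CompressionStream for pako.gzip and fetch for Angular's HttpClient, and it assumes a runtime that provides CompressionStream, Blob, Response and fetch as globals (recent browsers, Node 18+). The names gzipJson, putGzippedJson and fetchImpl are hypothetical.

```javascript
// Hypothetical helper covering steps 2-5, using CompressionStream instead of pako.
async function gzipJson(value) {
  const json = JSON.stringify(value);
  // A Blob built from a string stores it as UTF-8, so no manual conversion is needed.
  const gzipStream = new Blob([json]).stream().pipeThrough(new CompressionStream('gzip'));
  return new Response(gzipStream).blob();
}

// Hypothetical helper covering steps 1 and 6. `fetchImpl` is injectable so the
// function can be exercised without a network; by default it is the global fetch.
async function putGzippedJson(presignedUploadUrl, value, fetchImpl = fetch) {
  const body = await gzipJson(value);
  const response = await fetchImpl(presignedUploadUrl, {
    method: 'PUT',
    headers: {
      'Content-Type': 'application/json', // same value the presigned URL was generated with
      'Content-Encoding': 'gzip',
    },
    body, // a Blob body is sent as raw bytes
  });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response;
}
```

Passing a Blob (or ArrayBuffer) as the body, rather than a string or a bare typed array, is the design point here: it leaves no room for the client to re-encode the compressed bytes.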

    You can then download the gzipped file from S3 (it should automatically be unzipped by the browser) and open it to verify that it contains the same results.
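
The verification can also be scripted. The sketch below assumes the downloaded bytes are already in memory as a Uint8Array and that DecompressionStream, Blob and Response are available as globals. Because a browser may already have unzipped the download while other tools may not, it gunzips only when the gzip magic bytes are present. The names readMaybeGzippedJson and matchesOriginal are hypothetical.

```javascript
// Hypothetical helper: gunzip the bytes if they start with the gzip magic number
// (0x1f 0x8b), then parse the resulting text as JSON.
async function readMaybeGzippedJson(bytes) {
  const isGzip = bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b;
  let stream = new Blob([bytes]).stream();
  if (isGzip) {
    stream = stream.pipeThrough(new DecompressionStream('gzip'));
  }
  return JSON.parse(await new Response(stream).text());
}

// Hypothetical helper: compare the downloaded content with the original object.
// Comparing JSON text is adequate here because parsing preserves the key order
// of the uploaded document.
async function matchesOriginal(downloadedBytes, original) {
  const parsed = await readMaybeGzippedJson(downloadedBytes);
  return JSON.stringify(parsed) === JSON.stringify(original);
}
```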