java encoding zip bytearrayoutputstream charset

ByteArrayOutputStream - zip entry names with special characters come out garbled


I am generating a zip file using ByteArrayOutputStream and ZipOutputStream. The file names passed to ZipEntry are fine and correctly encoded. The problem appears after calling ByteArrayOutputStream.toByteArray(): the zip file itself is created correctly, but the entry names show up as if they were encoded in "cp866". The controller's return type is ResponseEntity<byte[]>.

Below is part of the code.

    byteArrayOutputStream = new ByteArrayOutputStream();
    zipOutputStream = new ZipOutputStream(byteArrayOutputStream);
    zipOutputStream.putNextEntry(new ZipEntry("ação.pdf")); //Here it is all OK
    zipOutputStream.write(inputStream.readAllBytes());
    zipOutputStream.closeEntry();

And the code that returns the response from the RestController looks something like this:

    HttpHeaders headers = new HttpHeaders();
    contentDisposition =
        ContentDisposition.builder("attachment").filename("download.zip").build();

    headers.setContentDisposition(contentDisposition);
    headers.setContentLength(byteArrayOutputStream.size());
    headers.setContentType(MediaType.parseMediaType("application/zip; charset-utf-8"));
    return ResponseEntity.ok().headers(headers).body(byteArrayOutputStream.toByteArray());

In this scenario, a file named "ação.pdf" comes out as "a├з├гo.pdf", and if I specify the ISO-8859-1 charset for the ZipEntry, it comes out as "aчуo.pdf".

What I have tried:

None of the options I tried worked.


Solution

  • Solution: Use the right ZipOutputStream constructor, the one that takes both an OutputStream and a Charset.

    Sidenote: You have a typo. This:

    headers.setContentType(MediaType.parseMediaType("application/zip; charset-utf-8"));

    is incorrect. The parameter syntax is charset=utf-8; note the equals sign. However, that isn't correct either. A ZIP file does not have a charset encoding, because it is binary data. JPG files don't have one either, for the same reason. The correct MIME type is just application/zip, with nothing more.

    Full explanation

    You cannot convey the charset of the entry names in your zip file through MIME headers. It is recorded inside the zip file itself.
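    To illustrate where that information lives, here is a minimal sketch (the class and method names are hypothetical, not from the question). It writes a one-entry zip in memory and reads the general purpose bit flag from the first local file header. In the ZIP format, bit 11 of that flag marks entry names as UTF-8. The sketch assumes the JDK's ZipOutputStream sets the bit when it is given UTF-8, and that the first local header starts at byte 0 of the output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
final class ZipUtf8FlagDemo {

    // Bit 11 of the general purpose bit flag: entry names and comments are UTF-8.
    static final int UTF8_NAME_FLAG = 0x0800;

    // Writes a zip with a single empty entry, encoding the entry name as UTF-8.
    static byte[] zipWithEntry(String entryName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer, StandardCharsets.UTF_8)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // The flag is a 16-bit little-endian value at offset 6 of the local file header
    // (after the 4-byte signature and the 2-byte "version needed" field).
    static int firstEntryFlags(byte[] zipBytes) {
        return (zipBytes[6] & 0xFF) | ((zipBytes[7] & 0xFF) << 8);
    }

    static boolean namesMarkedUtf8(byte[] zipBytes) {
        return (firstEntryFlags(zipBytes) & UTF8_NAME_FLAG) != 0;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of the compiler's encoding.
        byte[] zipBytes = zipWithEntry("a\u00e7\u00e3o.pdf");
        System.out.println("UTF-8 flag set: " + namesMarkedUtf8(zipBytes));
    }
}
```

    A reader that ignores this flag and falls back to a legacy code page would produce the kind of garbling shown in the question: the UTF-8 bytes of "ação" decoded as cp866 read "a├з├гo".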

    So, encode it in the zip file itself. It's easy to do!

    new ZipOutputStream(byteArrayOutputStream, StandardCharsets.UTF_8);
    

    That's the only update you need to make.
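    As a self-contained check of the constructor above, the following sketch (class and method names are hypothetical) builds a zip in memory with an explicit UTF-8 charset and reads the entry name back with ZipInputStream. It also closes the ZipOutputStream before calling toByteArray(), which the snippet in the question does not show; that ordering is an assumption about the full code, but it is needed for the archive to be complete.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
final class ZipNameDemo {

    // Builds an in-memory zip with one entry; the entry name is encoded as UTF-8.
    static byte[] buildZip(String entryName, byte[] content) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer, StandardCharsets.UTF_8)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(content);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Take the bytes only after the stream is closed, so the central directory is included.
        return buffer.toByteArray();
    }

    // Reads the name of the first entry back, decoding names as UTF-8.
    static String firstEntryName(byte[] zipBytes) {
        try (ZipInputStream zip = new ZipInputStream(
                new ByteArrayInputStream(zipBytes), StandardCharsets.UTF_8)) {
            ZipEntry entry = zip.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of the compiler's encoding.
        String name = "a\u00e7\u00e3o.pdf";
        byte[] zipBytes = buildZip(name, new byte[] {1, 2, 3});
        System.out.println("Name round-trips: " + name.equals(firstEntryName(zipBytes)));
    }
}
```

    In a controller like the one in the question, the array returned by buildZip would be what goes into the ResponseEntity body, with the content type set to plain application/zip.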