java, amazon-s3, compression, inputstream, bzip

How do I decompress the contents of a .bz file and convert them to plain English text?


I have a .bz file in an S3 bucket.

Using S3Object, I'm able to read the file:

    S3Object object = s3.getObject(bucket, path);
    S3ObjectInputStream inputStream = object.getObjectContent();

Now I want to decompress this content.

I tried reading it with the code below, but it still gives me unreadable binary data instead of English text.

    String text = new BufferedReader(
      new InputStreamReader(inputStream, StandardCharsets.UTF_8))
        .lines()
        .collect(Collectors.joining("\n"));

How do I get the uncompressed text here?


Solution

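  • The stream returned by S3 still contains bzip2-compressed bytes, so decoding it directly as UTF-8 produces garbage. The JDK has no built-in bzip2 support, so you could use a library to decompress it first.

    For example, Apache Commons Compress:

    // Needs the org.apache.commons:commons-compress dependency.
    import org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream;
    import java.nio.charset.StandardCharsets;

    S3Object object = s3.getObject(bucket, path);
    String plaintext;

    // try-with-resources closes both streams when done.
    try (S3ObjectInputStream inputStream = object.getObjectContent();
         BZip2CompressorInputStream bzInputStream = new BZip2CompressorInputStream(inputStream)) {
        // readAllBytes() is Java 9+ and holds the whole file in memory.
        plaintext = new String(bzInputStream.readAllBytes(), StandardCharsets.UTF_8);
    }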
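If decompression fails or the output still looks wrong, it may help to confirm that the object really is bzip2 data before handing it to a decompressor. A bzip2 stream starts with the ASCII bytes `BZh` followed by a digit from `1` to `9`. The following is a minimal sketch of such a check using only the JDK; the class name `Bzip2Check` and the method `looksLikeBzip2` are hypothetical helpers made up for illustration, not part of the AWS SDK or Commons Compress.

```java
import java.io.BufferedInputStream;
import java.io.IOException;

// Hypothetical helper for illustration only.
final class Bzip2Check {

    private Bzip2Check() {
    }

    // Returns true if the stream starts with the bzip2 signature:
    // 'B', 'Z', 'h', then a block-size digit '1'..'9'.
    // The stream is rewound afterwards, so the same BufferedInputStream
    // can still be passed to a decompressor.
    static boolean looksLikeBzip2(BufferedInputStream in) throws IOException {
        in.mark(4);                              // remember the current position
        byte[] header = new byte[4];
        int read = in.readNBytes(header, 0, 4);  // Java 9+
        in.reset();                              // rewind to the marked position
        return read == 4
                && header[0] == 'B'
                && header[1] == 'Z'
                && header[2] == 'h'
                && header[3] >= '1'
                && header[3] <= '9';
    }
}
```

To use it with the code above, one option is to wrap `object.getObjectContent()` in a `BufferedInputStream`, call `Bzip2Check.looksLikeBzip2(...)` on it, and then pass that same buffered stream to `BZip2CompressorInputStream`. If the check returns false, the file is probably not bzip2 (for example, it may be gzip or already plain text).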