java http jsoup brotli

How to read a brotli compressed string?


I'm receiving a Brotli-compressed JSON response from a website, and I want to decompress and read it.

When I wrap the response's input stream directly, I can read it properly:

new BufferedReader(new InputStreamReader(new BrotliInputStream(response.getEntity().getContent())));

However, when I first save the response in a String and then read from that String:

BufferedReader rd = new BufferedReader(new InputStreamReader(new BrotliInputStream(IOUtils.toInputStream(responseAsString, "UTF-8"))));
StringBuilder result = new StringBuilder();
String line = "";
while ((line = rd.readLine()) != null) {
    result.append(line);
}
System.out.println(result);

I get the following exception:

Exception in thread "main" java.io.IOException: Brotli stream decoding failed
    at org.brotli.dec.BrotliInputStream.read(BrotliInputStream.java:167)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at com.brotli.test.BrotliStringTest.main(BrotliStringTest.java:113)
Caused by: org.brotli.dec.BrotliRuntimeException: Unused space
    at org.brotli.dec.Decode.readHuffmanCodeLengths(Decode.java:226)
    at org.brotli.dec.Decode.readHuffmanCode(Decode.java:296)
    at org.brotli.dec.HuffmanTreeGroup.decode(HuffmanTreeGroup.java:53)
    at org.brotli.dec.Decode.readMetablockHuffmanCodesAndContextMaps(Decode.java:528)
    at org.brotli.dec.Decode.decompress(Decode.java:621)
    at org.brotli.dec.BrotliInputStream.read(BrotliInputStream.java:161)
    ... 8 more

Edit 1:

I tried using Jsoup and found that its HttpConnection class supports only gzipped streams and has no support for BrotliInputStream. Any pointers on this?


Solution

  • I resolved it like this:

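    import org.apache.http.Header;
    import org.brotli.dec.BrotliInputStream;
    
    // The Content-Encoding header may be absent, so guard against null before comparing
    Header encoding = response.getLastHeader("Content-Encoding");
    if (encoding != null && "br".equalsIgnoreCase(encoding.getValue())) { // Brotli-compressed stream
        rd = new BufferedReader(new InputStreamReader(new BrotliInputStream(response.getEntity().getContent())));
    }
    else { // not Brotli: read the stream as-is
        rd = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
    }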
    

    This worked for both Brotli and non-Brotli (gzip, etc.) streams.
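
A likely reason the String-based attempt in the question fails is that compressed data is binary: decoding it into a String replaces byte sequences that are not valid in the charset, so the bytes handed to BrotliInputStream are no longer the ones the server sent. The sketch below illustrates only that round-trip loss, using the standard library; the class name, method name, and sample bytes are hypothetical stand-ins and are not real Brotli output.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryRoundTrip {

    // Decodes raw bytes as UTF-8 text and encodes the text back to bytes,
    // which is what happens when a compressed body is stored in a String
    // and later turned into an InputStream again.
    static byte[] roundTripThroughString(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for compressed data: several of these bytes
        // do not form valid UTF-8 sequences.
        byte[] raw = {(byte) 0x8B, (byte) 0xFF, 0x41, (byte) 0xC3, 0x28};
        byte[] viaString = roundTripThroughString(raw);

        // Invalid sequences were replaced during decoding, so the arrays differ.
        System.out.println("unchanged after String round trip: "
                + Arrays.equals(raw, viaString));
    }
}
```

If the body has to be stored before decompression, keeping it as a byte[] (for example via EntityUtils.toByteArray(response.getEntity()) in Apache HttpClient) and wrapping it in a ByteArrayInputStream should avoid this loss, since no charset decoding is involved.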