java http gzipinputstream http-range

Can't get Java's GZIPInputStream to read "gzip" response from server when using "Range" header


I've got a bit of code I've been using for a while to fetch data from a web server. A few months ago I added compression support, which seems to work well for "regular" HTTP responses where the whole document is contained in the response. It does not seem to work when I use a Range header, though.

Here is the code doing the real work:

    InputStream in = null;

    int bufferSize = 4096;

    int responseCode = conn.getResponseCode();

    boolean error = 5 == responseCode / 100
        || 4 == responseCode / 100;

    int bytesRead = 0;

    try
    {
        if(error)
            in = conn.getErrorStream();
        else
            in = conn.getInputStream();

        // Buffer the input
        in = new BufferedInputStream(in);

        // Handle compressed responses
        String encoding = conn.getHeaderField("Content-Encoding");
        if("gzip".equalsIgnoreCase(encoding))
            in = new GZIPInputStream(in);
        else if("deflate".equalsIgnoreCase(encoding))
            // nowrap=true: expects raw DEFLATE data with no zlib header
            in = new InflaterInputStream(in, new Inflater(true));

        int n;
        byte[] buffer = new byte[bufferSize];

        // Now, just write out all the bytes
        while(-1 != (n = in.read(buffer)))
        {
            bytesRead += n;
            out.write(buffer, 0, n);
        }
    }
    catch (IOException ioe)
    {
        System.err.println("Got IOException after reading " + bytesRead + " bytes");
        throw ioe;
    }
    finally
    {
        if(null != in) try { in.close(); }
        catch (IOException ioe)
        {
            System.err.println("Could not close InputStream");
            ioe.printStackTrace();
        }
    }

Hitting a URL with the header Accept-Encoding: gzip,deflate,identity works just great: I can see that the data is returned by the server in compressed format, and the above code decompresses it nicely.
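For reference, a minimal sketch of how request headers like these can be set on an HttpURLConnection. The class name, the helper method, and the example.invalid URL are hypothetical placeholders, not the code from my application; only setRequestProperty and getRequestProperty are the real API.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class RangeRequestSketch {
    // Opens (but does not yet connect) a connection carrying the request headers.
    // rangeSpec may be null to request the whole document.
    static HttpURLConnection open(String url, String rangeSpec) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Accept-Encoding", "gzip,deflate,identity");
        if (rangeSpec != null) {
            conn.setRequestProperty("Range", rangeSpec);
        }
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URL; no network traffic happens until the connection is used.
        HttpURLConnection conn = open("http://example.invalid/index.html", "bytes=0-50");
        System.out.println(conn.getRequestProperty("Range"));
    }
}
```

The headers are only sent once the connection is actually used, for example by calling getResponseCode().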

If I then add a Range: bytes=0-50 header, I get the following exception:

    Got IOException after reading 0 bytes
    Exception in thread "main" java.io.EOFException: Unexpected end of ZLIB input stream
        at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
        at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
        at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:116)
        at java.io.FilterInputStream.read(FilterInputStream.java:107)
        at [my code]([my code]:511)

Line 511 in my code is the line containing the in.read() call. The response includes the following headers:

    Content-Type: text/html
    Content-Encoding: gzip
    Content-Range: bytes 0-50/751
    Content-Length: 51

I have verified that, if I don't attempt to decompress the response, I actually get 51 bytes in the response, so it's not a server failure (at least as far as I can tell). My server (Apache httpd) does not support "deflate", so I can't test another compression scheme (at least not right now).

I've also tried to request much more data (like 700 bytes of the total 751 bytes in the target resource) and I get the same kind of error.
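To see locally, without any server involved, how GZIPInputStream behaves when it is handed only a prefix of a gzip stream, something like the following sketch can be used. The class and method names are hypothetical, and the 751 random bytes merely stand in for a resource of that size. This is not a claim about what the server actually sent; it only shows that a gzip stream cut short produces an EOFException when read to the end.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class TruncatedGzipSketch {
    // Compresses data into a complete gzip stream held in memory.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.toByteArray();
    }

    // Returns true if decoding hits an EOFException before the stream ends cleanly.
    static boolean failsWithEof(byte[] gzipBytes) throws IOException {
        byte[] buffer = new byte[4096];
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipBytes))) {
            while (in.read(buffer) != -1) {
                // Discard the decompressed bytes; only the outcome matters here.
            }
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Random bytes barely compress, so the gzip output is longer than 51 bytes.
        byte[] data = new byte[751];
        new Random(42).nextBytes(data);
        byte[] full = gzip(data);

        System.out.println("complete stream fails: " + failsWithEof(full));
        System.out.println("51-byte prefix fails: " + failsWithEof(Arrays.copyOf(full, 51)));
    }
}
```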

Is there something I'm missing?

Update: Sorry, I forgot to mention that I'm hitting Apache/2.2.22 on Linux. There aren't any server bugs I'm aware of. I'll have a bit of trouble verifying the compressed bytes that I retrieve from the server, as the "gzip" Content-Encoding is quite bare... e.g. I believe I can't just use "gunzip" on the command-line to decompress those bytes. I'll give it a try, though.
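One quick sanity check on the raw (undecoded) response bytes is the two-byte magic number, 0x1f 0x8b, that the gzip format places at the start of a stream. A minimal sketch follows; the class and method names are made up for illustration, and the check only looks at the first two bytes, so it says nothing about whether the rest of the data is valid or complete.

```java
class GzipMagicSketch {
    // True if the buffer starts with the gzip magic number 0x1f 0x8b.
    static boolean looksLikeGzip(byte[] raw) {
        return raw != null
            && raw.length >= 2
            && (raw[0] & 0xff) == 0x1f
            && (raw[1] & 0xff) == 0x8b;
    }

    public static void main(String[] args) {
        byte[] gzipLike = { 0x1f, (byte) 0x8b, 0x08 };
        byte[] plain = "<html>".getBytes();
        System.out.println(looksLikeGzip(gzipLike));
        System.out.println(looksLikeGzip(plain));
    }
}
```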


Solution

  • Sigh. Switching to another server (which happens to be running Apache/2.2.25) shows that my code does in fact work. The original target server appears to be affected by AWS's current outage in the US-EAST availability zone. I'm going to chalk this up to network errors and close this question. Thanks to those who offered suggestions.