Tags: compression, zlib, deflate, inflate, miniz

What guarantees does zlib's inflate/deflate make about avail_in and avail_out?


What guarantees does zlib give on the state of avail_in and avail_out after a call to inflate? I am seeing peculiar behaviour with miniz and want to make sure it is not a misunderstanding of the zlib API on my part. Effectively, after calling inflate, avail_in is non-zero and avail_out is also non-zero, so some input looks like it is not getting processed. More details below.

I have been using miniz to inflate/deflate a file I stream to/from disk. My inflate/deflate loop is identical to the zlib sample in zpipe.c, including using MZ_NO_FLUSH.

This loop has almost always worked, but today I inflated a stream deflated earlier and got an MZ_DATA_ERROR consistently. After adding the proper header though, gzip was able to inflate it fine and my data was intact.

The source of my issues came down to what would be the last call to mz_inflate. I include the typical inflate loop here:

/* decompress until deflate stream ends or end of file */
do {
    strm.avail_in = fread(in, 1, CHUNK, source);
    if (ferror(source)) {
        (void)inflateEnd(&strm);
        return Z_ERRNO;
    }
    if (strm.avail_in == 0)
        break;
    strm.next_in = in;

    /* run inflate() on input until output buffer not full */
    do {
        strm.avail_out = CHUNK;
        strm.next_out = out;
        ret = inflate(&strm, Z_NO_FLUSH);
        assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
        switch (ret) {
        case Z_NEED_DICT:
            ret = Z_DATA_ERROR;     /* and fall through */
        case Z_DATA_ERROR:
        case Z_MEM_ERROR:
            (void)inflateEnd(&strm);
            return ret;
        }
        have = CHUNK - strm.avail_out;
        if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
            (void)inflateEnd(&strm);
            return Z_ERRNO;
        }
    } while (strm.avail_out == 0);

    /* done when inflate() says it's done */
} while (ret != Z_STREAM_END);

The inner do loop repeats until the current chunk has been fully processed, which it detects by avail_out having room left over. However, on the last chunk of this particular stream, inflate did not return an error; instead it reduced avail_in to some non-zero number and reduced avail_out to some (other) non-zero number. Because avail_out is non-zero, the inner loop exits and the outer loop reads more data into next_in and avail_in, even though the non-zero avail_in shows that not all of the previous input was consumed. This clobbers whatever was left in next_in and avail_in, and inflate fails on the next call.

My workaround was to change the inner loop's termination condition from

strm.avail_out == 0

to

strm.avail_out == 0 || strm.avail_in > 0

but I have no idea if this is correct. I feel this may be a bug in miniz but am not sure. I would have thought that if avail_in indicates there is still data to be processed, then avail_out must be zero.
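To make the workaround concrete, here is a minimal sketch of the modified termination test as a standalone predicate. The name keep_inflating and the SKETCH_* constants are hypothetical; the constants only mirror the values of Z_OK and Z_STREAM_END from zlib.h so the snippet compiles without the library. It also adds one guard that the one-line change above does not have: once inflate reports the end of the stream, leftover input is treated as trailing data instead of a reason to call inflate again. Whether that guard is wanted depends on what follows the stream in the file, so treat it as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirrored from zlib.h so this sketch builds without the library. */
enum { SKETCH_OK = 0, SKETCH_STREAM_END = 1 };

/* Hypothetical helper: should the inner loop call inflate() again before the
   outer loop overwrites next_in/avail_in with a fresh fread()? */
bool keep_inflating(int ret, unsigned avail_in, unsigned avail_out)
{
    if (ret == SKETCH_STREAM_END)
        return false;   /* stream is finished; do not spin on trailing bytes */

    /* Original zpipe.c test (output buffer was filled) plus the workaround
       (some of the current input chunk is still unconsumed). */
    return avail_out == 0 || avail_in > 0;
}
```

In the loop from the question this would replace the condition of the inner while, for example `while (keep_inflating(ret, strm.avail_in, strm.avail_out));`.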

In case it is relevant: the input buffer size I am using is 512KB and the output buffer is 2MB.


Solution

  • If inflate() returns Z_OK or Z_BUF_ERROR, and avail_out is not zero, then avail_in is zero.

    Can you provide the compressed data in question?
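The rule stated in this answer can be turned into a check that runs right after each inflate call, which may help to pin down the exact call where miniz departs from it. This is only a sketch: inflate_rule_holds and the RULE_* constants are hypothetical names, and the constants mirror the values of Z_OK and Z_BUF_ERROR from zlib.h so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirrored from zlib.h so this sketch builds without the library. */
enum { RULE_OK = 0, RULE_BUF_ERROR = -5 };

/* Hypothetical check of the rule above: after inflate() returns Z_OK or
   Z_BUF_ERROR with output space left over, all input must have been used. */
bool inflate_rule_holds(int ret, unsigned avail_in, unsigned avail_out)
{
    if (ret != RULE_OK && ret != RULE_BUF_ERROR)
        return true;    /* the rule says nothing about other return codes */
    if (avail_out == 0)
        return true;    /* output buffer is full, so leftover input is fine */
    return avail_in == 0;
}
```

With the numbers described in the question (no error returned, avail_in non-zero, avail_out non-zero) the check returns false, which is the situation the asker suspects is a miniz bug.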