compression gzip deflate information-theory libz

What's the most that GZIP or DEFLATE can increase a file size?


It's well known that GZIP or DEFLATE (or any compression mechanism) can sometimes increase file size. Is there a maximum (either a percentage or a constant) by which a file can grow? What is it?

If a file is X bytes, and I'm going to gzip it, and I need to budget for file space in advance - what's the worst case scenario?

UPDATE: There are two overheads: GZIP adds a header, typically 18 bytes but essentially arbitrarily long. What about DEFLATE? That can expand content by a multiplicative factor, which I don't know. Does anyone know what it is?


Solution

  • gzip will add a header and trailer of at least 18 bytes. The header can also contain a path name, which will add that many bytes plus a trailing zero.

    The deflate implementation in gzip has the option to store 16383 bytes per block, with an overhead of five bytes per block. It will always choose to do so if the alternative would take more bytes. So the maximum number of deflate bytes for n input bytes, not counting the gzip header and trailer, is:

    n + 5 * (floor(n / 16383) + 1)
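For budgeting disk space, the arithmetic in this answer can be sketched in Python. This is only an illustration: the function names `max_deflate_size` and `max_gzip_size` are hypothetical, the 16383-byte block size and 5-byte overhead are the figures quoted above for gzip, and other deflate implementations or settings may use different block sizes. It also counts only the minimum 18-byte header and trailer plus an optional stored file name, not any other optional header fields.

```python
STORED_BLOCK_SIZE = 16383   # bytes per stored block, as quoted in the answer
STORED_BLOCK_OVERHEAD = 5   # overhead bytes per stored block
GZIP_WRAPPER_SIZE = 18      # minimum gzip header plus trailer


def max_deflate_size(n: int) -> int:
    """Worst-case deflate output for n input bytes, per the formula above."""
    return n + STORED_BLOCK_OVERHEAD * (n // STORED_BLOCK_SIZE + 1)


def max_gzip_size(n: int, name_len: int = 0) -> int:
    """Worst-case gzip size: deflate bound plus header, trailer, and name.

    name_len is the length in bytes of the path name stored in the header
    (0 means no name is stored). A stored name costs one extra byte for
    its trailing zero.
    """
    name_bytes = name_len + 1 if name_len > 0 else 0
    return max_deflate_size(n) + GZIP_WRAPPER_SIZE + name_bytes


if __name__ == "__main__":
    for size in (0, 1000, 16383, 1_000_000):
        print(size, max_deflate_size(size), max_gzip_size(size, name_len=8))
```

For large files the relative overhead approaches 5/16383, roughly 0.03%, so the constant header and trailer dominate only for very small inputs.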