git blob deflate

How to DEFLATE with a command line tool to extract a git object?


I'm looking for a command line wrapper for the DEFLATE algorithm.

I have a file (a git blob) that is compressed using DEFLATE, and I want to uncompress it. The gzip command does not seem to have an option to use the DEFLATE algorithm directly instead of the gzip format.

Ideally I'm looking for a standard Unix/Linux tool that can do this.

edit: This is the output I get when trying to use gzip for my problem:

$ cat .git/objects/c0/fb67ab3fda7909000da003f4b2ce50a53f43e7 | gunzip

gzip: stdin: not in gzip format

Solution

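  • UPDATE: Mark Adler noted that git blobs are not raw DEFLATE streams, but zlib streams, i.e. DEFLATE data wrapped in a small header and an Adler-32 checksum trailer, which is why gunzip rejects them. These can be unpacked by the pigz tool, which comes pre-packaged in several Linux distributions: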

    $ cat foo.txt 
    file foo.txt!
    
    $ git ls-files -s foo.txt
    100644 7a79fc625cac65001fb127f468847ab93b5f8b19 0   foo.txt
    
    $ pigz -d < .git/objects/7a/79fc625cac65001fb127f468847ab93b5f8b19 
    blob 14file foo.txt!
    

    Edit by kriegaex: Git Bash for Windows users will notice that pigz is unavailable by default. Precompiled 32-bit and 64-bit versions are available for download. I tried the 64-bit version and it works nicely. For example, you can copy pigz.exe directly to c:\Program Files\Git\usr\bin to put it on the path.

    Edit by mjaggard: Homebrew and MacPorts both have pigz available, so you can install it with brew install pigz or sudo port install pigz. (If you do not have Homebrew yet, you can install it by following the instructions on its website.)
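
If installing pigz is not an option, the same zlib stream can be unpacked with Python's standard zlib module. The following is a minimal sketch and not part of the original answer: the helper name read_loose_object is made up for illustration, and the object bytes are built in memory to mirror the foo.txt example rather than read from .git/objects. It also splits off the "blob 14" header, which is separated from the file contents by a NUL byte that the terminal output above does not show.

```python
import zlib


def read_loose_object(raw: bytes):
    """Inflate a zlib-compressed git loose object and split header from payload.

    Hypothetical helper for illustration; returns (object type, payload bytes).
    """
    data = zlib.decompress(raw)
    # Layout of the inflated data: b"<type> <size>\x00<payload>"
    header, _, payload = data.partition(b"\x00")
    obj_type, size = header.split(b" ")
    if int(size) != len(payload):
        raise ValueError("size in header does not match payload length")
    return obj_type.decode("ascii"), payload


# Stand-in for the object file: rebuild the bytes for the foo.txt example
# in memory (assumption: the file content ends with a newline, 14 bytes total).
content = b"file foo.txt!\n"
stored = zlib.compress(b"blob %d\x00" % len(content) + content)

obj_type, payload = read_loose_object(stored)
print(obj_type, payload)
```

For a real repository, pass the bytes of the object file instead, e.g. open(".git/objects/7a/79fc625cac65001fb127f468847ab93b5f8b19", "rb").read().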


    My original answer, kept for historical reasons:

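    If I understand the hint in the Wikipedia article mentioned by Marc van Kempen correctly, you can use puff.c (shipped in zlib's contrib/puff directory) directly. Note that puff only handles raw DEFLATE data, without the zlib or gzip wrapper.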

    This is a small example:

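    #include <assert.h>
    #include <string.h>
    #include "puff.h"
    
    int main( void ) {
        /* Raw DEFLATE stream (no zlib or gzip wrapper) that encodes "asdf". */
        const unsigned char source[] = { 0x4B, 0x2C, 0x4E, 0x49, 0x03, 0x00 };
        unsigned long sourcelen = sizeof source;
        unsigned char dest[ 5 ];
        unsigned long destlen = 4;    /* leave room for the terminating NUL */
    
        /* Call puff() outside assert() so it still runs when NDEBUG is defined. */
        int ret = puff( dest, &destlen, source, &sourcelen );
        assert( ret == 0 );
        assert( destlen == 4 );
        dest[ destlen ] = '\0';
        assert( strcmp( ( const char * ) dest, "asdf" ) == 0 );
        return 0;
    }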
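
To illustrate the difference between a raw DEFLATE stream and a zlib stream, here is a small Python sketch using the standard zlib module (assuming Python 3.6 or later for the wbits keyword). It inflates the same six bytes as the puff example in raw mode, then wraps them by hand in a zlib header and an Adler-32 trailer. The header bytes 0x78 0x9c are one common valid choice, picked here as an assumption.

```python
import zlib

# Same raw DEFLATE bytes as in the puff example.
raw = bytes([0x4B, 0x2C, 0x4E, 0x49, 0x03, 0x00])

# Negative wbits tells zlib to expect raw DEFLATE (no header, no checksum).
plain = zlib.decompress(raw, wbits=-15)

# zlib framing: 2-byte header + raw DEFLATE + big-endian Adler-32 of the
# uncompressed data. 0x78 0x9c is an assumed, commonly seen header.
framed = b"\x78\x9c" + raw + zlib.adler32(plain).to_bytes(4, "big")

print(plain)                             # b'asdf'
print(zlib.decompress(framed) == plain)  # True
```

The default zlib.decompress(framed) call accepts the wrapped form, while the raw form needs wbits=-15. That mirrors the situation on the command line, where a tool must be told (or must detect) which wrapper surrounds the DEFLATE data.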