c embedded lzma

How does the LZMA decompression algorithm work?


I am stuck trying to understand how the LZMA decompression algorithm works, more precisely this function:

SRes LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
        const Byte *src, SizeT *srcLen,
        ELzmaFinishMode finishMode, ELzmaStatus *status);

It is called in an infinite loop with fixed sizes for *destLen and *srcLen, and the two sizes are equal. Logically, destLen (uncompressed data) should be larger than srcLen (compressed data) by some predefined ratio. I don't understand how it works with equal sizes. Or can it accept any sizes because it keeps a temporary buffer where it stores data and then processes it with its own method?


Solution

  • This function is indeed difficult to use. It is built for incremental decompression: each call consumes a chunk of compressed input and produces a chunk of decompressed output, and the decoder keeps its state between calls.

    Simplified code for decompressing a file and writing the result to another file looks like this:

    
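    file_in = open_file(...);
    file_out = open_file(...);
    while (!eof(file_in)) {
       /* read the next chunk of compressed data */
       len_in = read(file_in, buf_in, 1000);
       /* decompress it; len_out is the number of bytes produced */
       len_out = decompress(buf_in, len_in, buf_out, 2000);
       /* write the decompressed bytes before reusing buf_out */
       write(file_out, buf_out, len_out);
    }
    close(file_in);
    close(file_out);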
    

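    The relevant parameters of LzmaDec_DecodeToBuf are dest and destLen (the output buffer and, on input, its capacity; on return, the number of bytes actually written) and src and srcLen (the input buffer and, on input, the number of bytes available; on return, the number of bytes actually consumed). Because both lengths are updated on return, the two buffers can have any sizes, including equal ones.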

    srcLen can be tricky. In most cases, all data passed to the function is processed. But if the decompressed data does not fit into the destination buffer, only part of the input is processed, and on return *srcLen holds the number of bytes actually consumed. The decompressed data then needs to be written to the file to free up the destination buffer, and the function needs to be called again with the remaining input. This has been omitted from the simplified code.

    Another omission is finalization. At the end of the input file, LzmaDec_DecodeToBuf might need to be called one last time with a different finish mode (LZMA_FINISH_END instead of LZMA_FINISH_ANY).
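
    The two omissions can be sketched with a self-contained toy. Everything in it is an assumption made for illustration: toy_decode is a hypothetical run-length decoder, not part of the LZMA SDK, and it only imitates the in/out length convention of LzmaDec_DecodeToBuf (capacity and available bytes going in, bytes written and bytes consumed coming out). decode_all, ToyDec and CHUNK are likewise made-up names. The input and output buffers deliberately have the same size, as in the question. With the real decoder you would also pass a finish mode and inspect the returned status rather than rely on the simple "output buffer not full" test used here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in decoder (NOT the LZMA SDK). Input is a sequence of
   (count, value) byte pairs; state survives between calls, like CLzmaDec. */
typedef struct {
    size_t remaining;      /* copies of 'value' still to be emitted */
    unsigned char value;
    unsigned char count;
    int have_count;        /* a count byte was read, its value byte not yet */
} ToyDec;

/* In:  *destLen = capacity of dest, *srcLen = bytes available in src.
   Out: *destLen = bytes written,    *srcLen = bytes consumed. */
static void toy_decode(ToyDec *d, unsigned char *dest, size_t *destLen,
                       const unsigned char *src, size_t *srcLen)
{
    size_t in = 0, out = 0;
    for (;;) {
        while (d->remaining > 0 && out < *destLen) {
            dest[out++] = d->value;
            d->remaining--;
        }
        if (d->remaining > 0) break;   /* dest is full, input may be left over */
        if (in == *srcLen) break;      /* all input consumed */
        if (!d->have_count) {
            d->count = src[in++];
            d->have_count = 1;
        } else {
            d->value = src[in++];
            d->remaining = d->count;
            d->have_count = 0;
        }
    }
    *srcLen = in;
    *destLen = out;
}

enum { CHUNK = 4 };  /* same size for the input chunk and the output buffer */

/* Decodes 'in' chunk by chunk into 'out'. Returns the number of bytes
   produced, or (size_t)-1 if 'out' is too small. */
static size_t decode_all(const unsigned char *in, size_t in_size,
                         unsigned char *out, size_t out_cap)
{
    ToyDec dec = {0};
    unsigned char buf_out[CHUNK];
    size_t total = 0, in_pos = 0;

    while (in_pos < in_size) {
        size_t chunk = in_size - in_pos < CHUNK ? in_size - in_pos : CHUNK;
        size_t off = 0;                /* bytes of this chunk consumed so far */
        for (;;) {
            size_t src_len = chunk - off;
            size_t dest_len = sizeof buf_out;
            toy_decode(&dec, buf_out, &dest_len, in + in_pos + off, &src_len);
            assert(dest_len <= sizeof buf_out && src_len <= chunk - off);
            off += src_len;
            /* "Write" the output right away to free up buf_out. */
            if (total + dest_len > out_cap) return (size_t)-1;
            memcpy(out + total, buf_out, dest_len);
            total += dest_len;
            /* Stop only when the chunk is used up AND the decoder did not
               fill buf_out; a full buffer may mean output is still pending. */
            if (off == chunk && dest_len < sizeof buf_out) break;
        }
        in_pos += chunk;
    }
    return total;
}
```

    The inner loop is the part missing from the simplified code: the same chunk is offered again, minus the bytes already consumed, until the decoder has taken all of it and has stopped filling the output buffer. That is why equal buffer sizes work. A small output buffer just means more calls per input chunk.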