I'm using liblzma (from XZ Utils) to decompress a binary file produced by the Apache Commons Compress library. I made sure the Java program outputs XZ data, and I've tried decompressing it with 7zip, which reports its format as lzma2:23 crc64.
This is the decompressor function:
    void decompress_content(const uint8_t *compressed_data, int compressed_length)
    {
        lzma_stream strm = LZMA_STREAM_INIT;
        lzma_ret ret = lzma_stream_decoder(&strm, UINT32_MAX, LZMA_CONCATENATED);
        if (ret != LZMA_OK)
            throw std::runtime_error("Failed to initialize XZ decoder.");
        std::vector<uint8_t> decompressed_data;
        uint8_t outbuf[65536];
        strm.next_in = compressed_data;
        strm.avail_in = compressed_length;
        strm.next_out = outbuf;
        strm.avail_out = sizeof(outbuf);
        try
        {
            while (true)
            {
                ret = lzma_code(&strm, LZMA_RUN);
                // Check if output data was produced
                if (strm.avail_out < sizeof(outbuf))
                {
                    size_t write_size = sizeof(outbuf) - strm.avail_out;
                    decompressed_data.insert(decompressed_data.end(), outbuf, outbuf + write_size);
                    strm.next_out = outbuf;
                    strm.avail_out = sizeof(outbuf);
                }
                if (ret == LZMA_STREAM_END)
                {
                    break;
                }
                if (ret == LZMA_OK)
                {
                    continue;
                }
                if (ret == LZMA_BUF_ERROR && strm.avail_in == 0)
                {
                    // No more input data, and decoder cannot make progress
                    lzma_end(&strm);
                    throw std::runtime_error("Compressed data is truncated or corrupted.");
                }
                // Other errors
                lzma_end(&strm);
                handle_lzma_error(ret);
            }
        }
        catch (...)
        {
            lzma_end(&strm);
            throw;
        }
        lzma_end(&strm);
        parse_entries(decompressed_data);
    }
Running it fails with the error "Compressed data is truncated or corrupted.", which means lzma_code() returned LZMA_BUF_ERROR.
XZ binary, if needed:
    unsigned char rawData[80] = {
        0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00, 0x00, 0x04, 0xE6, 0xD6, 0xB4, 0x46,
        0x02, 0x00, 0x21, 0x01, 0x16, 0x00, 0x00, 0x00, 0x74, 0x2F, 0xE5, 0xA3,
        0x01, 0x00, 0x17, 0x68, 0x65, 0x6C, 0x6C, 0x6F, 0x5F, 0x77, 0x6F, 0x72,
        0x6C, 0x64, 0x2E, 0x74, 0x78, 0x74, 0x00, 0x00, 0x00, 0x00, 0x04, 0x74,
        0x65, 0x73, 0x74, 0x00, 0x78, 0x83, 0x76, 0x08, 0x82, 0x49, 0x3C, 0x9C,
        0x00, 0x01, 0x30, 0x18, 0x8E, 0x1B, 0xAC, 0xEC, 0x1F, 0xB6, 0xF3, 0x7D,
        0x01, 0x00, 0x00, 0x00, 0x00, 0x04, 0x59, 0x5A
    };
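The container framing of a dump like this can be sanity-checked without liblzma. The following is only a sketch: xz_check_name is a hypothetical helper, it looks at the fixed magic bytes and the check ID in the stream flags as laid out in the .xz file format, it does not verify any CRC, and it assumes a single stream with no trailing padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper, not part of liblzma. Returns the name of the integrity
// check announced in the Stream Header, or "" if the buffer does not start and
// end with the .xz magic bytes.
std::string xz_check_name(const uint8_t *data, size_t size)
{
    assert(data != nullptr || size == 0);

    static const uint8_t header_magic[6] = {0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00};
    static const uint8_t footer_magic[2] = {0x59, 0x5A}; // ASCII "YZ"

    // Stream Header and Stream Footer are 12 bytes each.
    if (size < 24)
        return "";
    if (std::memcmp(data, header_magic, sizeof(header_magic)) != 0)
        return "";
    if (std::memcmp(data + size - 2, footer_magic, sizeof(footer_magic)) != 0)
        return "";

    // Stream Flags follow the magic: one reserved zero byte, then the check ID
    // in the low nibble of the second byte.
    if (data[6] != 0x00 || (data[7] & 0xF0) != 0)
        return "";

    switch (data[7]) {
    case 0x00: return "none";
    case 0x01: return "crc32";
    case 0x04: return "crc64";
    case 0x0A: return "sha256";
    default:   return "unknown";
    }
}
```

For the dump above, xz_check_name(rawData, sizeof(rawData)) returns "crc64", which agrees with what 7zip reports, so the framing of the file itself looks intact.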
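You should stop when lzma_code() returns LZMA_OK and there is no input left (strm.avail_in == 0). Because the decoder was created with LZMA_CONCATENATED, lzma_code() called with LZMA_RUN does not return LZMA_STREAM_END at the end of your data; it keeps waiting for a possible next stream. Also, LZMA_BUF_ERROR can be returned even when the preceding calls succeeded, so a call that makes no progress does not by itself mean the data is corrupted.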
    // ...
    if (ret == LZMA_OK && strm.avail_in == 0) {
        break;
    } else if (ret == LZMA_OK) {
        continue;
    }
    if (ret == LZMA_BUF_ERROR)
    {
        // decoder cannot make progress
        lzma_end(&strm);
        throw std::runtime_error("Compressed data is truncated or corrupted.");
    }
    // ...
You can see the loop working by setting the output buffer size to 8.
The liblzma documentation says the following about LZMA_BUF_ERROR:
This error is not fatal. Coding can be continued normally by providing more input and/or more output space, if possible.
Typically the first call to lzma_code() that can do no progress returns LZMA_OK instead of LZMA_BUF_ERROR. Only the second consecutive call doing no progress will return LZMA_BUF_ERROR. This is intentional.