gocompressionfile-formatlzw

Why using unix-compress and go compress/lzw produce different files, not readable by the other decoder?


I compressed a file in a terminal with compress file.txt and got (as expected) file.txt.Z

When I pass that file to ioutil.ReadFile in Go,

buf0, err := ioutil.ReadFile("file.txt.Z")

I get the error (the line above is 116):

finder_test.go:116: lzw: invalid code

I found that Go would accept the file if I compress it using the compress/lzw package, I just used code from a website that does that. I only modified the line

outputFile, err := os.Create("file.txt.lzw")

I changed the .lzw to .Z. then used the resulting file.txt.Z in the Go code at the top, and it worked fine, no error.

Note: file.txt is 16.0 kB, unix-compressed file.txt.Z is 7.8 kB, and go-compressed file.txt.Z is 8.2 kB

Now, I was trying to understand why this happened. So, I tried to run

uncompress.real file.txt.Z

and it did not work. I got

file.txt.Z: not in compressed format

I need to use a compressor (preferably unix-compress) to compress files using lzw-compression then use the same compressed files on two different algorithms, one written in C and the other in Go, because I intend to compare the performance of the two algorithms. The C program will only accept the files compressed with unix-compress and the Go program will only accept the files compressed with Go's compress/lzw.

Can someone explain why that happened? Why are the two .Z files not equivalent? How can I overcome this?

Note: I am working on Ubuntu installed in VirtualBox on a Mac.


Solution

  • A .Z file does not only contain LZW compressed data, there is also a 3-bytes header that the Go LZW code does not generate because it is meant to compress data, not generate a Z file.