
How do you compress image data for LZW encoding for .GIF files?


I am having trouble understanding how to compress image data under the GIF89a specification. Say, for example, I am trying to make a 3x2 .GIF. Let me construct a sample color table and walk through an example of what I think is correct.

Color code | Color
------------------
0          | Brown
1          | Red
2          | Green
3          | Black

The image I want to create is this.

3x2 pixels (6 pixels total)
----------
Br Br Br
Br R  Br

Compressing these six pixels with LZW, this is the final code table I get.

Code table
----------
# | code
0 | 0
1 | 1
2 | 2
3 | 3
4 | clear
5 | eoi // end of information
6 | 0 0
7 | 0 0 0
8 | 0 1
9 | 1 0
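
The table above, and the codes emitted while building it, can be reproduced with a short Python sketch. The function name `lzw_codes` and its return shape are hypothetical, chosen only for illustration; the sketch deliberately ignores code widths and only tracks which table entries get added and which codes get emitted.

```python
def lzw_codes(pixels, min_code_size):
    """Return (codes, table) for GIF-style LZW, ignoring code widths."""
    clear = 1 << min_code_size      # 4 when min_code_size is 2
    eoi = clear + 1                 # 5, end of information
    # One entry per color index; clear and eoi occupy the next two slots.
    table = {(i,): i for i in range(clear)}
    next_code = eoi + 1             # first free slot, 6

    codes = [clear]                 # the stream starts with a clear code
    buffer = ()
    for pixel in pixels:
        candidate = buffer + (pixel,)
        if candidate in table:
            buffer = candidate      # keep extending the current run
        else:
            codes.append(table[buffer])
            table[candidate] = next_code
            next_code += 1
            buffer = (pixel,)
    if buffer:
        codes.append(table[buffer])
    codes.append(eoi)
    return codes, table


pixels = [0, 0, 0, 0, 1, 0]         # Br Br Br / Br R Br
codes, table = lzw_codes(pixels, 2)
new_entries = {code: seq for seq, code in table.items() if code >= 6}
print(codes)                        # [4, 0, 6, 0, 1, 0, 5]
print(new_entries)                  # {6: (0, 0), 7: (0, 0, 0), 8: (0, 1), 9: (1, 0)}
```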

This gives me the code stream 4 0 6 0 1 0 5. Because I added the code 0 0 0 to the table, and its code value is 7, I increased my code size from 3 to 4 bits for subsequent codes. So, here are the codes of my image data in binary (from my code table).

100  - 4
000  - 0
110  - 6
0000 - 0
0001 - 1
0000 - 0
0101 - 5

I end up encoding my image data as

10000100 - 132
00100001 - 33
10100000 - 160
00000000 - 0
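
The step from the list of codes to these bytes can be sketched as follows. GIF packs codes least significant bit first, so each new code is placed above the bits already waiting in the accumulator. The helper name `pack_codes` is hypothetical, and the widths passed in are simply the ones listed above, whether or not they are the right ones.

```python
def pack_codes(codes_with_widths):
    """Pack (code, width) pairs into bytes, least significant bit first."""
    out = bytearray()
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    for code, width in codes_with_widths:
        acc |= code << nbits         # new code goes above the pending bits
        nbits += width
        while nbits >= 8:
            out.append(acc & 0xFF)   # flush the lowest full byte
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)       # final partial byte, zero padded
    return bytes(out)


# Codes and bit widths exactly as listed above.
listed = [(4, 3), (0, 3), (6, 3), (0, 4), (1, 4), (0, 4), (5, 4)]
packed = pack_codes(listed)
print(list(packed))                  # [132, 33, 160, 0]
```

The fourth byte exists only because a single leftover bit of the last code spills past the third byte.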

Which ends up looking like this in my final .gif file (I've put brackets around the values that correspond to the image data)

47 49 46 38 39 61 03 00 02 00 f1 00 00 b9 7a 56    
ff 00 00 00 ff 00 00 00 00 21 ff 0b 4e 45 54 53 
43 41 50 45 32 2e 30 03 01 ff ff 00 21 f9 04 04 
64 00 00 00 2c 00 00 00 00 03 00 02 00 00 [02 04     
84 21 a0 00 00] 3b 

// Explanation
02 - Minimum LZW code size
04 - Data sub-block of 4 bytes
84 - 132 in decimal
21 - 33 in decimal
a0 - 160 in decimal
00 - 0 in decimal
00 - Termination byte

My image looks something like this (why is there green in here instead of red?). I blew the image up since 3x2 pixels is a bit hard to read.

[Image: what the .GIF looks like]

Is there something fundamental that I am missing? I appreciate your time to look at this with me.


Solution

  • Found the error: it lies in when the code size is incremented while compressing the image data with LZW.

    When you build the code table while compressing image data with LZW, you need to increment the code size when you've added a code equal to 2^(code size). So, instead of incrementing the code size by one after adding 7 | 0 0 0 (as shown in the table above), I needed to increment it after adding 8 | 0 1 (because 8 = 2^3, and 3 is the current code size).

    This is how the image data changes when the code size is incremented as described:

    100  - 4
    000  - 0
    110  - 6
    000  - 0
    0001 - 1
    0000 - 0
    0101 - 5
    

    And this is how the resulting image data bytes have changed:

    10000100 - 132
    00010001 - 17
    01010000 - 80
    

    Here is the full .gif data again after applying the fix, with brackets around the image data to show what has changed. This is the same .gif file as above.

    47 49 46 38 39 61 03 00 02 00 f1 00 00 b9 7a 56 
    ff 00 00 00 ff 00 00 00 00 21 ff 0b 4e 45 54 53
    43 41 50 45 32 2e 30 03 01 ff ff 00 21 f9 04 04
    64 00 00 00 2c 00 00 00 00 03 00 02 00 00 [02 03
    84 11 50 00] 3b 
    
    // Explanation
    02 - Minimum LZW code size
    03 - Data sub-block of 3 bytes
    84 - 132 in decimal
    11 - 17 in decimal
    50 - 80 in decimal
    00 - Termination byte
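
The fix can be checked end to end with a minimal Python sketch that applies the widening rule from the solution, packs the codes least significant bit first, and wraps the result in a data sub-block. The names `lzw_encode`, `pack_codes` and `image_data` are hypothetical. The sketch assumes a tiny image: it does not handle the 12-bit code size limit or table resets that a real encoder needs for larger inputs.

```python
def lzw_encode(pixels, min_code_size):
    """Return (code, width) pairs, widening as described in the fix."""
    clear = 1 << min_code_size
    eoi = clear + 1
    table = {(i,): i for i in range(clear)}
    next_code = eoi + 1
    width = min_code_size + 1        # initial code size, 3 bits here
    out = [(clear, width)]
    buffer = ()
    for pixel in pixels:
        candidate = buffer + (pixel,)
        if candidate in table:
            buffer = candidate
            continue
        out.append((table[buffer], width))
        table[candidate] = next_code
        if next_code == 1 << width:  # the code just added equals 2^(code size)
            width += 1
        next_code += 1
        buffer = (pixel,)
    if buffer:
        out.append((table[buffer], width))
    out.append((eoi, width))
    return out


def pack_codes(codes_with_widths):
    """Pack (code, width) pairs into bytes, least significant bit first."""
    out, acc, nbits = bytearray(), 0, 0
    for code, width in codes_with_widths:
        acc |= code << nbits
        nbits += width
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)       # final partial byte, zero padded
    return bytes(out)


def image_data(pixels, min_code_size):
    """Minimum code size byte, sub-blocks of at most 255 bytes, terminator."""
    packed = pack_codes(lzw_encode(pixels, min_code_size))
    blocks = bytearray([min_code_size])
    for i in range(0, len(packed), 255):
        chunk = packed[i:i + 255]
        blocks.append(len(chunk))
        blocks += chunk
    blocks.append(0)                 # block terminator
    return bytes(blocks)


pairs = lzw_encode([0, 0, 0, 0, 1, 0], 2)
data = image_data([0, 0, 0, 0, 1, 0], 2)
print(pairs)          # [(4, 3), (0, 3), (6, 3), (0, 3), (1, 4), (0, 4), (5, 4)]
print(data.hex(" "))  # 02 03 84 11 50 00
```

The printed bytes match the bracketed image data in the corrected file above.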