algorithmcompressioninformation-theorylzw

Difficulty understanding decompression with LZW algorithm


I compressed the following message: "ababcbababaaaaaaa" using LZW compression algorithm.

With a=1;b=2;c=3 i get the following message : "1 2 4 3 5 8 1 10 11 1", which matches the result my professor got in our exercise notes.

However, when i try decompressing the message, i get the following sequence of events:

Current: a  ; Next: b  ; Output: a  ; AddToDict: ab=4;

Current: b  ; Next: a  ; Output: b  ; AddToDict: ba=5;

Current: ab ; Next: c  ; Output: ab ; AddToDict: abc=6;

Current: c  ; Next: ba ; Output: c  ; AddToDict: cb=7;

Current: ba ; Next: 8? ; Output: ?  ; AddToDict: ?;

As you can see, my issue is that i dont have 8(which is supposed to be "bab") in the dictionary yet.

What am i doing wrong?

The complete dictionary gained from compression is

(1=a;2=b;3=c;4=ab;5=ba;6=abc;7=cb;8=bab;9=baba;10=aa;11=aaa;12=aaaa)


Solution

  • You should be careful! The first column is the "current sequence", but the second column is the "next character" (not the "next sequence"). Hence, you have a mistake when writing "next: ba". By the way, up to the the last row is correct:

    Current: ba  ; Next: b   ; Output: ba   ; AddToDict: bab = 8;   // here is 5
    Current: bab ; Next: a   ; Output: bab  ; AddToDict: baba = 9;  // here you can find 8
    Current: a   ; Next: a   ; Output: a    ; AddToDict: aa = 10;   // here is 1 
    Current: aa  ; Next: a   ; Output: aa   ; AddToDict: aaa = 11;  // here is 10
    Current: aaa ; Next: a   ; Output: aaa  ; AddToDict: aaaa = 12; // here is 11
    Current: a   ; Next: #   ; Output: a    ; #                     // here is 1