I compressed the following message: "ababcbababaaaaaaa" using LZW compression algorithm.
With a=1;b=2;c=3 i get the following message : "1 2 4 3 5 8 1 10 11 1", which matches the result my professor got in our exercise notes.
However, when i try decompressing the message, i get the following sequence of events:
Current: a ; Next: b ; Output: a ; AddToDict: ab=4;
Current: b ; Next: a ; Output: b ; AddToDict: ba=5;
Current: ab ; Next: c ; Output: ab ; AddToDict: abc=6;
Current: c ; Next: ba ; Output: c ; AddToDict: cb=7;
Current: ba ; Next: 8? ; Output: ? ; AddToDict: ?;
As you can see, my issue is that i dont have 8(which is supposed to be "bab") in the dictionary yet.
What am i doing wrong?
The complete dictionary gained from compression is
(1=a;2=b;3=c;4=ab;5=ba;6=abc;7=cb;8=bab;9=baba;10=aa;11=aaa;12=aaaa)
You should be careful! The first column is the "current sequence", but the second column is the "next character" (not the "next sequence"). Hence, you have a mistake when writing "next: ba". By the way, up to the the last row is correct:
Current: ba ; Next: b ; Output: ba ; AddToDict: bab = 8; // here is 5
Current: bab ; Next: a ; Output: bab ; AddToDict: baba = 9; // here you can find 8
Current: a ; Next: a ; Output: a ; AddToDict: aa = 10; // here is 1
Current: aa ; Next: a ; Output: aa ; AddToDict: aaa = 11; // here is 10
Current: aaa ; Next: a ; Output: aaa ; AddToDict: aaaa = 12; // here is 11
Current: a ; Next: # ; Output: a ; # // here is 1