algorithmbinarycompressionamiga

Decoding a file compressed with an obsolete language


I'm trying to decompress a data file that was originally compressed with an extension for AMOS Pro, the old Amiga BASIC language, that shipped with the AMOS Pro compiler. I've still got the programming language and have access to the compressor and decompressor, but I'm trying to decompress the files using C. I ultimately want to be able to view these files on modern hardware without having to resort to using an Amiga emulator first.

However, there's no documentation as to how the compressor worked, so I'm trying to reverse-engineer it solely from watching its behaviour. Here's what I've got so far.

This is a raw file (ASCII):

AABCDEFGHIJKLMNOPQRSTUVWXYZAABCDEFGHIJKLMNOPQRSTUVWXYZAABCDEFGHIJKLMNOPQRSTUVWXYZ

Here's the compressed version (hex):

D802C6B5
05048584
4544C5C4
2524A5A4
6564E5E4
15149594
5554D5D4
3534B591
00000007
AD763363
00000051

Testing with various files has given me to a few insights:

The first 4 byte seems to represent a sequence length. In the above example, the value 0xD8 is 11011000 in binary; mirror it (bits are in reverse) and you'll get 00011011, which is 0x1B in hex or 27 in decimal. That matches the sequence length.

However, I'm not making any more progress. Does this look like a standard compression algorithm? What do I try next?


Solution

  • As you've posted here, the compression function is called "squash", a function part of AMOS Pro.

    As such, my advice would be to try one of the following lines of attack:

    Read the source code

    The source code for AMOS Pro is apparently in the public domain now and can be found here:

    http://www.pianetaamiga.it/downloads/AMOSPro_Sources.zip

    It consists of 68000 assembly code and quite a few compiled object files.

    The unsquash function can be found in the file +header.s on line 1061 and onwards. It is not documented, except for its entry register values, which is good at least. It doesn't appear to be a very large function so this might be worth a shot.

    You will need to have, or obtain/learn, rudimentary 68000 machine code. It does not appear to call out to system libraries or anything and only seem to operate directly on memory, which would suggest this is actually doable (ie. understanding the code). Still, I've never written or read 68000 code in my life so what do I know.

    Contact the author of AMOS Pro

    The author of AMOS Pro is François Lionet, as is evident by the User Guide, he founded Clickteam in the mid-90s to make game- and multimedia-making software. He still seems to be situated in that company and according to forum posts from others looking into AMOS Pro he seems to be willing to answer email. Sadly I don't know his email but the Clickteam website above should give you a starting point.