retro-computing mac-classic

What is the structure of an MPW tool's main symbol?


This question is about Mac OS Classic, which has been obsolete for several years now. I hope someone still knows something about it!

I've been building a PEF executable parser for the past few weeks and I've plugged a PowerPC interpreter into it. With a good dose of wizardry, I would expect to be able to run (to some extent) some Mac OS 9 programs under Mac OS X. In fact, I'm now ready to begin testing with small applications.

To help me with that, I have installed an old version of Mac OS inside SheepShaver and downloaded the (now free) MPW Tools [1], and I built a "hello world" MPW tool (just your classic puts("Hello World!") C program, except compiled for Mac OS 9).

When built, this generates a program with a code section and a data section. I expected that I would be able to just jump to the main symbol of the executable (as specified in the header of the loader section), but I hit a big surprise: the compiler placed the main symbol inside the data section.

Obviously, there's no executable code in the data section.

Going back to the Mac OS Runtime Architectures document (published in 1997, surprisingly still up on Apple's website), I found out that this is totally legal:

Using the Main Symbol as a Data Structure

As mentioned before, the main symbol does not have to point to a routine, but can point to a block of data instead. You can use this fact to good effect with plug-ins, where the block of data referenced by the main symbol can contain essential information about the plug-in. Using the main symbol in this fashion has several advantages:

  • The Code Fragment Manager returns the address of the main symbol when you programmatically prepare a fragment, so you do not need to call FindSymbol.
  • You do not have to reserve and document the specific name of an export for your plug-in.

However, not having a specific symbol name means that the plug-in’s purpose is not quite as obvious. A plug-in can store its name, icon, or information about its symbols in the main symbol data structure. Storing symbolic information in this fashion eliminates the need for multiple FindSymbol calls.

My conclusion, therefore, is that MPW tools run as plugins inside the MPW shell, and that the executable's main symbol points to some data structure that should tell it how to start.

But that still doesn't help me figure out what's in that data structure, and just looking at its hex dump has not been very instructive (I have an idea where the compiler put the __start address for this particular program, but that's definitely not enough to make a generic MPW shell "replacement"). And obviously, most valuable information sources on this topic seem to have disappeared with Mac OS 9 in 2004.

So, what is the format of the data structure pointed to by the main symbol of an MPW tool?

1. Apparently, Apple very recently pulled the plug on the FTP server that I got the MPW Tools from, so it is probably not available anymore (though a Google search for "MPW_GM.img.bin" does find some alternatives).


Solution

  • As it turns out, it's not too complicated. That "data structure" is simply a transition vector.

    I didn't realize it right away because of bugs in my implementation of the relocation virtual machine that made these two pointers look like garbage.

    Transition vectors are structures that contain (in this order) an entry point (4 bytes) and a "table of contents" offset (4 bytes). This offset should be loaded into register r2 before executing the code pointed to by the entry point.

    (The Mac OS Classic runtime only uses the first 8 bytes of a transition vector, but they can technically be of any size. The address of the transition vector is always passed in r12 so the callee may access any additional information it would need.)
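
To make the layout described above concrete, here is a minimal sketch in C of how an interpreter might enter a fragment through its transition vector. Everything named here is hypothetical and not taken from any real loader: PPCState, read_be32 and enter_via_transition_vector are illustrative names, and the sketch assumes a flat emulated memory buffer in which addresses are plain byte offsets and relocations have already been applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interpreter state: only what this sketch needs. */
typedef struct {
    uint32_t pc;       /* program counter */
    uint32_t gpr[32];  /* general-purpose registers r0..r31 */
} PPCState;

/* PEF data is big-endian, so decode byte by byte
   instead of relying on the host's byte order. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Set up the CPU to start executing at the routine described by the
   transition vector at tv_addr. Assumption: mem is the whole emulated
   address space and tv_addr is an offset into it.
   Returns 0 on success, -1 if the vector does not fit in memory. */
static int enter_via_transition_vector(PPCState *cpu, const uint8_t *mem,
                                       size_t mem_size, uint32_t tv_addr)
{
    assert(cpu != NULL && mem != NULL);
    if (mem_size < 8 || tv_addr > mem_size - 8)
        return -1;

    cpu->pc      = read_be32(mem + tv_addr);      /* first word: entry point */
    cpu->gpr[2]  = read_be32(mem + tv_addr + 4);  /* second word: value for r2 */
    cpu->gpr[12] = tv_addr;                       /* r12: address of the vector */
    return 0;
}
```

A caller would resolve the main symbol to an address in the (relocated) data section, pass that address as tv_addr, and then start the interpreter loop at cpu->pc. Setting up the stack pointer and any arguments the MPW shell would normally supply is outside the scope of this sketch.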