elf portable-executable mach-o

A “universal” binary?


Is there a way to combine ELF, Mach-O and PE compiled code in the general case, so there could be one “binary” file for all modern systems? (Windows, macOS, Linux)

I remember stories of old game developers who hand-arranged the bytes of games on floppy disks so they worked on both Amstrad and Amiga devices. I’m wondering if the same could be done for, say, a trivial Go application that has been compiled for each of the modern operating systems (and potentially Intel & ARM).

I know Mach-O executables for Intel & ARM can be combined into “universal” binaries, which allows the one executable to work on both architectures (by placing both executables in one file), and I can see PE and ELF both support “fat binaries”. Could something similar be possible cross-platform?


Solution

  • It depends on whether you're willing to loosen your definition of "binary" or not.

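    If you want a binary that can be loaded directly by the respective kernels, with no shell or other kind of interpreter present, then this is not possible. PE, ELF and Mach-O each have strict requirements for the first few bytes of the file, and those requirements are mutually exclusive: PE must start with "MZ", ELF must start with "\x7fELF", and Mach-O must start with one of its magic numbers, such as 0xfeedface, 0xfeedfacf or, for fat binaries, 0xcafebabe.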

    You should be able to create a polyglot that is both a fat Mach-O and a Java class file, since both start with 0xcafebabe (I briefly looked into this some time ago, and I think the Java class file format should let you encode as much of the Mach-O header as needed in unused constant pool entries).
    You can also make any of the above a polyglot with a dmg file, since dmg files keep their "header" at the end of the file (this is how "pkgdmg" works: some files shipped by Apple are valid pkg/xar files as well as valid dmgs).
    But you cannot stuff any two of PE, ELF and Mach-O into the same binary file header.
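To make the conflict concrete, here is a minimal Python sketch that classifies a file by its leading bytes. The helper names `sniff` and `dual_read` are made up for illustration, and the magic table covers only the common values, so treat it as a sketch rather than a complete detector. The second helper shows why fat Mach-O and Java class files can overlap: the same eight bytes parse as either header.

```python
import struct

# Thin Mach-O magic numbers as they can appear on disk.
THIN_MACHO = {
    bytes.fromhex("feedface"), bytes.fromhex("cefaedfe"),  # 32-bit, either byte order
    bytes.fromhex("feedfacf"), bytes.fromhex("cffaedfe"),  # 64-bit, either byte order
}
CAFEBABE = bytes.fromhex("cafebabe")  # shared by fat Mach-O and Java class files


def sniff(head: bytes) -> str:
    """Guess the container format from the leading bytes of a file."""
    if head[:2] == b"MZ":
        return "PE"
    if head[:4] == b"\x7fELF":
        return "ELF"
    if head[:4] in THIN_MACHO:
        return "Mach-O"
    if head[:4] == CAFEBABE:
        return "fat Mach-O or Java class"
    return "unknown"


def dual_read(head: bytes) -> dict:
    """Read the same 8 bytes as a fat header and as a class-file header."""
    _, nfat_arch = struct.unpack(">II", head[:8])      # fat header: magic, nfat_arch
    _, minor, major = struct.unpack(">IHH", head[:8])  # class file: magic, minor, major
    return {"nfat_arch": nfat_arch, "class_version": (major, minor)}


if __name__ == "__main__":
    for sample in (b"MZ\x90\x00", b"\x7fELF\x02\x01\x01\x00", bytes.fromhex("cffaedfe")):
        print(sample[:4], "->", sniff(sample))
    shared = bytes.fromhex("cafebabe00000002")
    print(sniff(shared), dual_read(shared))
```

Since every check looks at offset 0, a single file can satisfy at most one of the PE, ELF and thin Mach-O branches.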

    But if you're willing to make use of a shell, then this is very doable. And it has been done, by Justine Tunney: Actually Portable Executable. Quoting the beginning of that blog post:

    One day, while studying old code, I found out that it's possible to encode Windows Portable Executable files as a UNIX Sixth Edition shell script, due to the fact that the Thompson Shell didn't use a shebang line. Once I realized it's possible to create a synthesis of the binary formats being used by Unix, Windows, and MacOS, I couldn't resist the temptation of making it a reality, since it means that high-performance native code can be almost as pain-free as web apps. Here's how it works:

    MZqFpD='
    BIOS BOOT SECTOR'
    exec 7<> $(command -v $0)
    printf '\177ELF...LINKER-ENCODED-FREEBSD-HEADER' >&7
    exec "$0" "$@"
    exec qemu-x86_64 "$0" "$@"
    exit 1
    REAL MODE...
    ELF SEGMENTS...
    OPENBSD NOTE...
    NETBSD NOTE...
    MACHO HEADERS...
    CODE AND DATA...
    ZIP DIRECTORY...
    

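    Such a file will run natively on Windows, because it begins with the "MZ" that PE requires. On other OSes it will launch as a shell script: the shell reads the leading MZqFpD='…' as a variable assignment, and the rest of the script can then detect the host environment and run the correct embedded binary (in the sketch above, by writing an ELF header over the start of the file and re-executing it).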
    But because it is a shell script (without a shebang line), it will only work when launched from a shell, not when invoked directly with a syscall such as execve(), which rejects it with ENOEXEC. A shebang line would avoid that, but then you have merely transformed the problem into making a polyglot of a (ba)sh script and something else.
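The difference between the two launch paths can be seen with a toy file. This is a hedged sketch, not the APE loader: `SCRIPT` is a made-up stand-in that starts with "MZ", has no shebang, and is still a valid sh script, and `try_both_ways` is a hypothetical helper. It assumes a Unix-like system with `sh` on the PATH, a temp directory that allows execution, and no binfmt handler registered for "MZ" files (such as Wine).

```python
import errno
import os
import subprocess
import tempfile

# Hypothetical miniature of the trick: the first two lines form one shell
# variable assignment, so the file both starts with "MZ" and parses as sh.
SCRIPT = b"MZqFpD='\nstand-in for the DOS/PE stub'\necho hello\n"


def try_both_ways(script: bytes = SCRIPT):
    """Return (errno of a direct exec or None, stdout when launched via sh)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "polyglot")
        with open(path, "wb") as f:
            f.write(script)
        os.chmod(path, 0o755)

        # Direct launch: subprocess calls execve() on the path, with no shell.
        try:
            subprocess.run([path], capture_output=True, check=False)
            direct_errno = None
        except OSError as exc:
            direct_errno = exc.errno

        # Launch from a shell: sh sees the same exec failure, then falls
        # back to interpreting the file as a script itself.
        via_shell = subprocess.run(
            ["sh", "-c", '"$1"', "sh", path],
            capture_output=True, text=True, check=False,
        )
        return direct_errno, via_shell.stdout.strip()


if __name__ == "__main__":
    direct, shell_out = try_both_ways()
    print("direct exec errno:", errno.errorcode.get(direct, direct))
    print("via sh:", shell_out)
```

The direct attempt is expected to fail with ENOEXEC, while the shell attempt prints the script's output; that fallback is what a shebang-less polyglot relies on.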