I'm working on a project to implement a basic nm using memory mapping (mmap). I have been able to parse 64-bit binaries using the code:
void handle_64(char *ptr)
{
    int ncmds;
    struct mach_header_64 *header;
    struct load_command *lc;
    struct symtab_command *sym;
    int i;

    i = 0;
    header = (struct mach_header_64 *)ptr;
    ncmds = header->ncmds;
    lc = (void *)ptr + sizeof(*header);
    while (i < ncmds)
    {
        if (lc->cmd == LC_SYMTAB)
        {
            sym = (struct symtab_command *)lc;
            build_list (sym->nsyms, sym->symoff, sym->stroff, ptr);
            break;
        }
        lc = (void *) lc + lc->cmdsize;
        i++;
    }
}
According to this link the only difference between a Mach-O and a fat binary is the fat_header struct above it, but simply skipping over it with
    lc = (void *)ptr + sizeof(struct fat_header) + sizeof(struct mach_header_64);
doesn't get me to the load_command area (segfault). How do I access the load commands of a fat/universal binary?
I'm working on a 64-bit Mac running macOS High Sierra. Thank you.
You've got multiple problems, all involving struct fat_header.
Considering all of that, you need to parse the fat header (and not just ignore it) if you want any hope of getting useful results.
Now, fat_header is defined as follows:
struct fat_header {
uint32_t magic; /* FAT_MAGIC or FAT_MAGIC_64 */
uint32_t nfat_arch; /* number of structs that follow */
};
Firstly, the magic value that I usually see for fat binaries is FAT_CIGAM rather than FAT_MAGIC, despite the comment stating otherwise. Take care, though: this means that the integers in the fat header are big endian rather than little endian, so on a little-endian machine they must be byte-swapped before use. Secondly, the comment indicates that certain structs follow this header, namely:
struct fat_arch {
cpu_type_t cputype; /* cpu specifier (int) */
cpu_subtype_t cpusubtype; /* machine specifier (int) */
uint32_t offset; /* file offset to this object file */
uint32_t size; /* size of this object file */
uint32_t align; /* alignment as a power of 2 */
};
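The byte-order caveat above can be made explicit in code. The following is only a minimal sketch: the names is_fat, swap32, and the SKETCH_ constants are hypothetical, and the two constants simply mirror the values of FAT_MAGIC and FAT_CIGAM so that the snippet compiles without the macOS headers. It only covers the 32-bit fat header, not FAT_MAGIC_64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of FAT_MAGIC / FAT_CIGAM (assumption: used here only so the
 * sketch builds off macOS; on macOS, include <mach-o/fat.h> instead). */
#define SKETCH_FAT_MAGIC 0xcafebabeu
#define SKETCH_FAT_CIGAM 0xbebafecau /* the same four bytes read in the opposite byte order */

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Returns 1 if buf starts with a fat header, 0 otherwise.
 * On success, *needs_swap says whether header fields must be byte-swapped
 * before use on this host. */
static int is_fat(const void *buf, size_t len, int *needs_swap)
{
    uint32_t magic;

    assert(buf != NULL && needs_swap != NULL);
    if (len < sizeof(magic))
        return 0;                         /* too short to hold a magic value */
    memcpy(&magic, buf, sizeof(magic));   /* host-order load, alignment-safe */
    if (magic == SKETCH_FAT_MAGIC) {
        *needs_swap = 0;                  /* host is big endian */
        return 1;
    }
    if (magic == SKETCH_FAT_CIGAM) {
        *needs_swap = 1;                  /* host is little endian */
        return 1;
    }
    return 0;
}
```

Every later field (nfat_arch, and each field of the structs that follow) would then go through swap32 whenever needs_swap is set.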
The fat_arch entries follow the fat header the same way load commands follow a "thin" Mach-O header. fat_arch.offset is the offset from the very beginning of the file. With that in mind, it's quite simple to print all slices of a fat Mach-O:
#include <stdio.h>
#include <stdint.h>
#include <mach-o/fat.h>
/* Reverse the byte order of a 32-bit value; the cast avoids shifting a signed int. */
#define SWAP32(x) ((((uint32_t)(x) & 0xff000000) >> 24) | (((uint32_t)(x) & 0xff0000) >> 8) | (((uint32_t)(x) & 0xff00) << 8) | (((uint32_t)(x) & 0xff) << 24))
void print_fat_header(void *buf)
{
    struct fat_header *hdr = buf;
    /* On a little-endian host, the big-endian magic reads as FAT_CIGAM. */
    if(hdr->magic != FAT_CIGAM)
    {
        fprintf(stderr, "bad magic: %08x\n", hdr->magic);
        return;
    }
    /* The fat_arch array starts directly after the fat header. */
    struct fat_arch *archs = (struct fat_arch*)(hdr + 1);
    uint32_t num = SWAP32(hdr->nfat_arch);
    for(size_t i = 0; i < num; ++i)
    {
        const char *name = "unknown";
        switch(SWAP32(archs[i].cputype))
        {
            case CPU_TYPE_I386: name = "i386"; break;
            case CPU_TYPE_X86_64: name = "x86_64"; break;
            case CPU_TYPE_ARM: name = "arm"; break;
            case CPU_TYPE_ARM64: name = "arm64"; break;
        }
        /* offset is relative to the start of the file, not to this entry. */
        uint32_t off = SWAP32(archs[i].offset);
        /* First four bytes of the slice: the magic of the embedded Mach-O. */
        uint32_t magic = *(uint32_t*)((uintptr_t)buf + off);
        printf("%08x-%08x: %-8s (magic %8x)\n", off, off + SWAP32(archs[i].size), name, magic);
    }
}
Note that the above function is incomplete, as it does not know the length of buf and thus cannot and does not check any accessed memory against it. In a serious implementation, you should make sure to never read outside the buffer you're given. The fact that your code segfaulted also hints at it not doing enough data sanitisation.
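As a rough illustration of the bounds checking described above, here is one possible sketch of a slice lookup that takes the buffer length and refuses to read past it. The names find_fat_slice, read_be32, and the SKETCH_ constants are hypothetical; the constants mirror FAT_MAGIC, CPU_TYPE_X86_64, and the sizes of the two structs shown earlier so that the code also compiles without the macOS headers. It reads fields byte by byte as big endian, so it does not depend on the host's byte order, and it handles only the 32-bit fat_arch layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of values from <mach-o/fat.h> and <mach/machine.h>
 * (assumption: redefined only so the sketch builds off macOS). */
#define SKETCH_FAT_MAGIC       0xcafebabeu /* FAT_MAGIC as stored on disk */
#define SKETCH_CPU_TYPE_X86_64 0x01000007u /* CPU_TYPE_X86 | CPU_ARCH_ABI64 */
#define SKETCH_FAT_HEADER_SIZE 8u          /* sizeof(struct fat_header) */
#define SKETCH_FAT_ARCH_SIZE   20u         /* sizeof(struct fat_arch) */

/* Read a big-endian 32-bit value from four bytes. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Look up the slice for cputype in a fat file of len bytes.
 * Returns 0 and fills *off_out / *size_out on success; returns -1 if the
 * buffer is not a fat file, is malformed, or has no matching slice. */
static int find_fat_slice(const void *buf, size_t len, uint32_t cputype,
                          uint32_t *off_out, uint32_t *size_out)
{
    const uint8_t *p = buf;

    assert(buf != NULL && off_out != NULL && size_out != NULL);
    if (len < SKETCH_FAT_HEADER_SIZE || read_be32(p) != SKETCH_FAT_MAGIC)
        return -1;

    uint32_t num = read_be32(p + 4); /* nfat_arch */
    /* The whole fat_arch table must fit; dividing avoids overflow. */
    if (num > (len - SKETCH_FAT_HEADER_SIZE) / SKETCH_FAT_ARCH_SIZE)
        return -1;

    for (uint32_t i = 0; i < num; ++i) {
        const uint8_t *arch =
            p + SKETCH_FAT_HEADER_SIZE + (size_t)i * SKETCH_FAT_ARCH_SIZE;
        if (read_be32(arch) != cputype) /* cputype is the first field */
            continue;
        uint32_t off = read_be32(arch + 8);   /* fat_arch.offset */
        uint32_t size = read_be32(arch + 12); /* fat_arch.size */
        /* The slice must lie entirely inside the buffer. */
        if (off > len || size > len - off)
            return -1;
        *off_out = off;
        *size_out = size;
        return 0;
    }
    return -1;
}
```

With an offset obtained this way, the handle_64 function from the question could be called on ptr + off, ideally after also checking that the slice is large enough to hold a mach_header_64 and that its magic is the expected one. The load command walk inside handle_64 would need the same kind of length checks.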