c elf coredump

Access core file memory image programmatically


What is the (correct) way to access the memory image of a process from the corresponding ELF core dump file, so that I can examine specific addresses, say 0x12345678?

Bear in mind that gdb cannot be used; I need a pure C approach. Library usage, except for libelf, is discouraged.


Solution

  • What is the (correct) way to access the memory image of a process from the corresponding ELF core dump file?

    It's not entirely trivial. Also, the specific address may not even be in the core to begin with.

    Let's consider an example:

    // t.c

    #include <stdio.h>
    #include <stdlib.h>
    
    int main() {
        int i = 42;
        printf("&i = %p\n", (void *)&i);  /* %p expects a void * argument */
        abort();
    }
    

    Compile it, enable core dumps, and run it:

    gcc -g t.c && ulimit -c unlimited && ./a.out
    &i = 0x7fffdfb20e1c
    Aborted (core dumped)
    

    Let's look at the core:

    readelf -l core.19477
    
    Elf file type is CORE (Core file)
    Entry point 0x0
    There are 18 program headers, starting at offset 64
    
    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      NOTE           0x0000000000000430 0x0000000000000000 0x0000000000000000
                     0x000000000000084c 0x0000000000000000         0
      LOAD           0x0000000000001000 0x0000000000400000 0x0000000000000000
                     0x0000000000001000 0x0000000000001000  R E    1000
      LOAD           0x0000000000002000 0x0000000000600000 0x0000000000000000
                     0x0000000000001000 0x0000000000001000  R      1000
      LOAD           0x0000000000003000 0x0000000000601000 0x0000000000000000
                     0x0000000000001000 0x0000000000001000  RW     1000
      LOAD           0x0000000000004000 0x00007f85c6abf000 0x0000000000000000
                     0x0000000000001000 0x00000000001bb000  R E    1000
      LOAD           0x0000000000005000 0x00007f85c6c7a000 0x0000000000000000
                     0x0000000000000000 0x00000000001ff000         1000
      LOAD           0x0000000000005000 0x00007f85c6e79000 0x0000000000000000
                     0x0000000000004000 0x0000000000004000  R      1000
      LOAD           0x0000000000009000 0x00007f85c6e7d000 0x0000000000000000
                     0x0000000000002000 0x0000000000002000  RW     1000
      LOAD           0x000000000000b000 0x00007f85c6e7f000 0x0000000000000000
                     0x0000000000005000 0x0000000000005000  RW     1000
      LOAD           0x0000000000010000 0x00007f85c6e84000 0x0000000000000000
                     0x0000000000001000 0x0000000000023000  R E    1000
      LOAD           0x0000000000011000 0x00007f85c7084000 0x0000000000000000
                     0x0000000000003000 0x0000000000003000  RW     1000
      LOAD           0x0000000000014000 0x00007f85c70a3000 0x0000000000000000
                     0x0000000000003000 0x0000000000003000  RW     1000
      LOAD           0x0000000000017000 0x00007f85c70a6000 0x0000000000000000
                     0x0000000000001000 0x0000000000001000  R      1000
      LOAD           0x0000000000018000 0x00007f85c70a7000 0x0000000000000000
                     0x0000000000001000 0x0000000000001000  RW     1000
      LOAD           0x0000000000019000 0x00007f85c70a8000 0x0000000000000000
                     0x0000000000001000 0x0000000000001000  RW     1000
      LOAD           0x000000000001a000 0x00007fffdfb00000 0x0000000000000000
                     0x0000000000022000 0x0000000000022000  RW     1000
      LOAD           0x000000000003c000 0x00007fffdfbfc000 0x0000000000000000
                     0x0000000000002000 0x0000000000002000  R E    1000
      LOAD           0x000000000003e000 0xffffffffff600000 0x0000000000000000
                     0x0000000000001000 0x0000000000001000  R E    1000
    

    As you can see, the core contains a NOTE segment, followed by a few LOAD segments.

    The NOTE segment contains a few notes (each introduced by an Elf64_Nhdr) describing the registers at the time of the crash, and other things. It's quite interesting on its own (use readelf -n to examine it), but irrelevant for your specific question here.

    One of the LOAD segments "covers" the address we are interested in: 0x7fffdfb20e1c, this one:

      LOAD           0x000000000001a000 0x00007fffdfb00000 0x0000000000000000
                     0x0000000000022000 0x0000000000022000  RW     1000
    

    Note that it is writable (as one would expect for the stack, where the local variable i lives), and

    0x7fffdfb00000 < 0x7fffdfb20e1c < 0x7fffdfb22000 (0x7fffdfb00000+0x22000)
    

    So &i is located inside that LOAD segment, at offset

    0x7fffdfb20e1c - 0x7fffdfb00000 == 0x20e1c
    

    The segment itself is located at file offset 0x1a000, which tells us that the value we seek is at file offset 0x1a000 + 0x20e1c == 0x3ae1c.
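
    The same arithmetic can be sketched in C. The helper name vaddr_to_file_offset and its parameters are made up for illustration; they simply mirror the VirtAddr, FileSiz and Offset columns of the readelf -l output above. It also rejects addresses that lie beyond FileSiz, since those bytes are not stored in the core.

```c
#include <stdint.h>

/* Hypothetical helper: translate a virtual address into a core-file offset,
 * given one LOAD segment's VirtAddr, FileSiz and Offset (as printed by
 * readelf -l). Returns 0 on success, or -1 if the address is not backed by
 * file data in this segment. */
int vaddr_to_file_offset(uint64_t addr, uint64_t seg_vaddr,
                         uint64_t seg_filesz, uint64_t seg_offset,
                         uint64_t *file_off)
{
    /* The address must fall inside [seg_vaddr, seg_vaddr + seg_filesz). */
    if (addr < seg_vaddr || addr - seg_vaddr >= seg_filesz)
        return -1;

    /* Offset within the segment, shifted by where the segment starts
     * in the file. */
    *file_off = seg_offset + (addr - seg_vaddr);
    return 0;
}
```

    Called with addr = 0x7fffdfb20e1c, seg_vaddr = 0x7fffdfb00000, seg_filesz = 0x22000 and seg_offset = 0x1a000, it stores 0x3ae1c in *file_off, matching the calculation above.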

    Indeed, we find value 42 at that offset in the core:

    hexdump -s 0x3ae1c -n 4  -e  '4/1 "%02X "' core.19477
    2A 00 00 00
    

    So how can you do that programmatically?

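    Pretty simple: read the Elf64_Ehdr from the beginning of the core. Its e_phoff and e_phnum fields tell you the offset and number of Elf64_Phdrs. Iterate over them until you find a PT_LOAD entry that "covers" your address, i.e. p_vaddr <= address < p_vaddr + p_filesz. Compare against p_filesz rather than p_memsz: only p_filesz bytes of a segment are actually stored in the core, which is one reason an address may be missing from it. Now compute the file offset as p_offset + (address - p_vaddr), as I've done above, lseek(2) to it and read(2) your data.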