memory gdb massif

Examining a virtual memory block reported by pmap


I am seeing a leak in my program. It is not caught by "valgrind memcheck" (I confirmed this with the summary report: it was nowhere near the top usage I can see). I got something closer to my actual memory usage with "valgrind --tool=massif --pages-as-heap=yes". However, it does not report a full traceback for the portion that calls mmap and allocates big chunks of memory, and I can't examine the allocations either, because I can collect massif output only after the program is killed. Another thing I tried was to inspect the memory blocks taking up a lot of RSS. However, I don't know how to look at the contents of a memory block reported by pmap. Putting that addr on gdb didn't help. I heard some address randomization is used by gdb. Can someone help me get the symbol that corresponds to the memory location reported by pmap output?


Solution

  • "Putting that addr on gdb didn't help."

    I don't know what you mean by "putting that addr on gdb", but doing that correctly will help.

    "I heard some address randomization is used by gdb."

    You heard wrong: GDB doesn't do any randomization by itself, and by default it disables the randomization that the OS performs, to make debugging easier and more reproducible.

    "Can someone help me get the symbol that corresponds to the memory location reported by pmap output?"

    You are confused: heap-allocated memory doesn't have any symbols, by definition.

    OK, so let's work through an example of using GDB to examine memory that is visible in pmap. Let's start by compiling this program, which builds a linked list of 1 million nodes with a string in each:

    #include <stdlib.h>
    #include <stdio.h>
    
    typedef struct Node { struct Node *next; char payload[64]; } Node;
    
    int main()
    {
      int j;   
      Node *head = NULL;
    
      for (j = 0; j < 1000000; j++) {
        Node *n = malloc(sizeof(*n));
        n->next = head;
        sprintf(n->payload, "string %d", j);
        head = n;
      }
      return 0;
    }
    
    gcc -Wall -g -std=c99 t.c && gdb -q ./a.out
    
    (gdb) b 17
    Breakpoint 1 at 0x4005e3: file t.c, line 17.
    (gdb) r
    Starting program: /tmp/a.out
    
    Breakpoint 1, main () at t.c:17
    17    return 0;
    

    Now we can examine the program with pmap:

    (gdb) info prog
    Using the running image of child process 23785.
    Program stopped at 0x4005e3.
    It stopped at breakpoint 1.
    Type "info stack" or "info registers" for more information.
    (gdb) shell pmap 23785
    23785:   /tmp/a.out
    0000000000400000      4K r-x-- a.out
    0000000000600000      4K r---- a.out
    0000000000601000      4K rw--- a.out
    0000000000602000  78144K rw---   [ anon ]
    00007ffff7a11000   1784K r-x-- libc-2.19.so
    00007ffff7bcf000   2048K ----- libc-2.19.so
    00007ffff7dcf000     16K r---- libc-2.19.so
    00007ffff7dd3000      8K rw--- libc-2.19.so
    00007ffff7dd5000     20K rw---   [ anon ]
    00007ffff7dda000    140K r-x-- ld-2.19.so
    00007ffff7fd1000     12K rw---   [ anon ]
    00007ffff7ff6000      8K rw---   [ anon ]
    00007ffff7ff8000      8K r----   [ anon ]
    00007ffff7ffa000      8K r-x--   [ anon ]
    00007ffff7ffc000      4K r---- ld-2.19.so
    00007ffff7ffd000      4K rw--- ld-2.19.so
    00007ffff7ffe000      4K rw---   [ anon ]
    00007ffffffde000    132K rw---   [ stack ]
    ffffffffff600000      4K r-x--   [ anon ]
     total            82356K
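
    On Linux, pmap gets these rows from /proc/<pid>/maps, so the same lookup can be scripted when you want to pick out large regions without reading the table by eye. A minimal sketch in C, assuming the usual "start-end perms ..." layout of a maps line; `Mapping`, `parse_maps_line` and `mapping_kib` are hypothetical names introduced here, not part of pmap or GDB:

```c
#include <stdio.h>

/* One mapping from /proc/<pid>/maps: the range [start, end) plus permissions. */
typedef struct {
    unsigned long long start;
    unsigned long long end;
    char perms[5];   /* e.g. "rw-p" */
} Mapping;

/* Hypothetical helper: parse the leading "start-end perms" fields of one
   maps line. Returns 1 on success, 0 if the line does not match. */
int parse_maps_line(const char *line, Mapping *m)
{
    return sscanf(line, "%llx-%llx %4s", &m->start, &m->end, m->perms) == 3;
}

/* Size of a mapping in KiB, the unit pmap prints. */
unsigned long long mapping_kib(const Mapping *m)
{
    return (m->end - m->start) / 1024;
}
```

    For the 78144K row above, the matching maps line would start with something like "00602000-05252000 rw-p" (reconstructed from the pmap row, not captured output); `mapping_kib` on that range gives 78144.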
    

    It seems pretty clear that the 78144K (about 76 MiB) anon region starting at 0x602000 must be where most of our data is: it accounts for nearly all of the 82356K total. (You can also verify this by stepping a few times through the loop and watching the region grow.)
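
    A quick size check supports this. The sketch below assumes 64-bit glibc malloc, which the answer does not name: each allocation carries an 8-byte size field and is rounded up to a multiple of 16 bytes. The constants and the helpers `chunk_size` and `heap_kib` are illustrative names, not a glibc API:

```c
#include <stddef.h>

/* Assumed constants for 64-bit glibc malloc (not taken from the answer). */
#define SIZE_SZ    ((size_t)8)    /* bytes used by the chunk size field */
#define ALIGN_MASK ((size_t)15)   /* chunks are rounded to 16 bytes */
#define MIN_CHUNK  ((size_t)32)   /* smallest chunk handed out */

/* Bytes of heap consumed by one malloc(request) under those assumptions. */
size_t chunk_size(size_t request)
{
    size_t padded = (request + SIZE_SZ + ALIGN_MASK) & ~ALIGN_MASK;
    return padded < MIN_CHUNK ? MIN_CHUNK : padded;
}

/* Heap consumed by `count` allocations of the same size, in KiB. */
size_t heap_kib(size_t request, size_t count)
{
    return chunk_size(request) * count / 1024;
}
```

    With an 8-byte pointer, sizeof(Node) is 72, so `chunk_size(72)` is 80 and `heap_kib(72, 1000000)` is 78125, within a few pages of the 78144K that pmap reports.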

    How can we look at this data? Like so:

    (gdb) x/30gx 0x602000
    0x602000:   0x0000000000000000  0x0000000000000051
    0x602010:   0x0000000000000000  0x3020676e69727473
    0x602020:   0x0000000000000000  0x0000000000000000
    0x602030:   0x0000000000000000  0x0000000000000000
    0x602040:   0x0000000000000000  0x0000000000000000
    0x602050:   0x0000000000000000  0x0000000000000051
    0x602060:   0x0000000000602010  0x3120676e69727473
    0x602070:   0x0000000000000000  0x0000000000000000
    0x602080:   0x0000000000000000  0x0000000000000000
    0x602090:   0x0000000000000000  0x0000000000000000
    0x6020a0:   0x0000000000000000  0x0000000000000051
    0x6020b0:   0x0000000000602060  0x3220676e69727473
    0x6020c0:   0x0000000000000000  0x0000000000000000
    0x6020d0:   0x0000000000000000  0x0000000000000000
    0x6020e0:   0x0000000000000000  0x0000000000000000
    

    You can immediately see ASCII strings at 0x602018, 0x602068 and 0x6020b8: bytes such as 0x73, 0x74, 0x72 are printable characters.

    You can examine these strings like so:

    (gdb) x/s 0x602018
    0x602018:   "string 0"
    (gdb) x/s 0x602068
    0x602068:   "string 1"
    (gdb) x/s 0x6020b8
    0x6020b8:   "string 2"
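
    To see why a word like 0x3020676e69727473 is a string: x/gx prints each 8-byte word as one integer, and on a little-endian machine such as x86-64 the lowest byte of that integer comes first in memory, so the text reads right to left in the hex. A small sketch of the decoding; `decode_word_le` is a hypothetical helper, not a GDB function:

```c
#include <stdint.h>

/* Copy the 8 bytes of a 64-bit word into out in memory order for a
   little-endian machine (lowest byte first) and NUL-terminate.
   out must have room for 9 chars. */
void decode_word_le(uint64_t word, char out[9])
{
    for (int i = 0; i < 8; i++)
        out[i] = (char)((word >> (8 * i)) & 0xff);
    out[8] = '\0';
}
```

    Decoding 0x3020676e69727473 this way yields "string 0", the same text that x/s shows at 0x602018.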
    

    You can also see that 0x602060 holds a pointer to 0x602010, and 0x6020b0 holds a pointer to 0x602060.

    This suggests that there is a Node at 0x602060 and another at 0x6020b0, each pointing to the node allocated before it. You can confirm this guess:

    (gdb) p *(Node*)0x602060
    $1 = {next = 0x602010, payload = "string 1", '\000' <repeats 55 times>}
    (gdb) p *(Node*)0x6020b0
    $2 = {next = 0x602060, payload = "string 2", '\000' <repeats 55 times>}
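
    The same reasoning can be automated. The 0x51 words in the dump look like malloc chunk headers: on 64-bit glibc (an assumption about the allocator, which the answer does not name) the word just before each user pointer holds the chunk size, with the low three bits used as flags, so 0x51 means a 0x50-byte chunk. A minimal sketch that steps from chunk to chunk through a copy of the region; `walk_chunks` is a hypothetical helper, not a GDB or glibc API:

```c
#include <stddef.h>
#include <stdint.h>

/* Low three bits of a chunk's size word are flags
   (assumption: 64-bit glibc malloc layout). */
#define CHUNK_FLAG_MASK ((uint64_t)0x7)

/*
 * Walk a copy of a heap region, given as 64-bit words, and store the byte
 * offset of each chunk's user data in offsets[]. Returns the number of
 * chunks found; at most max_offsets offsets are stored.
 */
size_t walk_chunks(const uint64_t *words, size_t nwords,
                   size_t *offsets, size_t max_offsets)
{
    size_t pos = 0;     /* word index of the current chunk header */
    size_t found = 0;

    while (pos + 1 < nwords) {
        uint64_t size = words[pos + 1] & ~CHUNK_FLAG_MASK;
        if (size < 32 || size % 16 != 0)
            break;      /* not a plausible chunk: stop instead of guessing */
        if (found < max_offsets)
            offsets[found] = (pos + 2) * sizeof(uint64_t);
        found++;
        pos += (size_t)(size / sizeof(uint64_t));
    }
    return found;
}
```

    Given the 30 words shown by x/30gx above, it finds three chunks with user data at offsets 0x10, 0x60 and 0xb0, that is, the Node addresses 0x602010, 0x602060 and 0x6020b0.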
    

    And that's all there is to it.