I have a program that is unexpectedly using a large amount of heap (about 3GB). I ran it through valgrind memcheck which reported no leaks, claiming that all the heap memory is still reachable.
So I rebuilt all my libraries with debug options and ran the program through Valgrind's massif tool. I am using Valgrind-3.8.1, which I downloaded and built on my box today. The command line was:
valgrind --tool=massif myprog
Valgrind produced no errors or warnings. The resulting output file is reporting all the allocated memory, but all the stack traces for the large allocations fail to identify the function names or code locations, e.g.:
97.34% (2,595,141,447B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->88.67% (2,363,882,948B) 0xCA6ACC0: ???
| ->88.67% (2,363,882,948B) 0xC7D7A71: ???
| ->88.67% (2,363,882,948B) 0xC7D705E: ???
| ->88.67% (2,363,882,948B) 0xA6ACB65: ???
| ->59.01% (1,573,247,120B) 0xA6AC9BA: ???
| | ->59.01% (1,573,247,120B) 0x9410C08: ???
| | ->59.01% (1,573,247,120B) 0x94123E2: ???
| | ->59.01% (1,573,247,120B) 0x940B3E9: ???
| | ->59.01% (1,573,247,120B) 0x9428BC0: ???
| | ->59.01% (1,573,247,120B) 0x98B0564: ???
| | ->59.01% (1,573,247,120B) 0x9AF0DA0: ???
| | ->59.01% (1,573,247,120B) 0x9AF09BE: ???
| | ->59.01% (1,573,247,120B) 0x9AF0E6C: ???
| | ->59.01% (1,573,247,120B) 0x4CE6438: run_S (Thread.cpp:98)
| | ->59.01% (1,573,247,120B) 0x3A9A40683B: start_thread (in /lib64/libpthread-2.5.so)
| | ->59.01% (1,573,247,120B) 0x3A994D503B: clone (in /lib64/libc-2.5.so)
I am a bit stuck now. I wondered if the libraries I have built actually did not have debug enabled - but when I run my code in gdb it does appear to have all the debug info. Also, there are a few other (much smaller) memory allocation results in the massif output that identify a function name and location from my code.
Do these results indicate stack traces in system or external libraries? Is that why there is no info? Can anyone suggest how I can track these allocations down?
I think the answer is RTFM. See section 4.2 of the Valgrind FAQ:
Also, for leak reports involving shared objects, if the shared object is unloaded before the program terminates, Valgrind will discard the debug information and the error message will be full of ??? entries. The workaround here is to avoid calling dlclose on these shared objects.
My code does indeed explicitly unload its shared libs before exit, which would explain why only the frames inside those libraries show up as ???. I am rebuilding with the library unloads suppressed - hope for a better result :)
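For anyone hitting the same thing, here is a minimal sketch of one way to make the unloads optional instead of removing them outright. It assumes the libraries are closed with dlclose through a single wrapper. The environment variable name MYPROG_KEEP_LIBS_LOADED and the helper names keep_libs_loaded and unload_library are made up for this example; they are not part of Valgrind or of my actual code.

```cpp
#include <cstdlib>   // std::getenv
#include <dlfcn.h>   // dlclose

// Hypothetical switch for this sketch; it is not a Valgrind option.
static const char* const kKeepLibsEnv = "MYPROG_KEEP_LIBS_LOADED";

// True when the variable is set, non-empty, and does not start with '0'.
bool keep_libs_loaded() {
    const char* value = std::getenv(kKeepLibsEnv);
    return value != nullptr && value[0] != '\0' && value[0] != '0';
}

// Calls close_fn(handle) unless unloading is suppressed.
// Returns 0 when the unload is skipped, otherwise whatever close_fn returns
// (dlclose returns 0 on success). close_fn is a parameter so the policy can
// be exercised without loading a real library.
int unload_library(void* handle, int (*close_fn)(void*)) {
    if (handle == nullptr || keep_libs_loaded()) {
        return 0;  // leave the object mapped so its symbols stay resolvable
    }
    return close_fn(handle);
}

// Convenience overload for normal use: forwards to dlclose.
inline int unload_library(void* handle) {
    return unload_library(handle, &dlclose);
}
```

With something like this in place, each dlclose(handle) call becomes unload_library(handle), and the profiling run is started with the variable set, e.g. MYPROG_KEEP_LIBS_LOADED=1 valgrind --tool=massif myprog. Valgrind's own valgrind.h header also provides a RUNNING_ON_VALGRIND macro that could probably replace the environment variable check, at the cost of a build-time dependency on the Valgrind headers.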