I'm working on a C++ project (a ray tracer) and I've been going through the codebase trying to optimize it. I'm not a C++ expert, and I'm sure I've made lots of beginner mistakes, especially around accidental copies (see below). I've been trying to find the hot areas of my code, but I notice that the top lines of the callgrind_annotate output are mostly internal memory calls:
--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
195,515,649 PROGRAM TOTALS
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
14,526,540 src/matrix.cpp:matrix::operator*(tuple const&) const [./rt.debug.exe]
10,603,101 ???:_platform_memmove$VARIANT$Haswell [/usr/lib/system/libsystem_platform.dylib]
7,481,047 ???:tiny_free_no_lock [/usr/lib/system/libsystem_malloc.dylib]
7,290,995 ???:tiny_free_list_add_ptr [/usr/lib/system/libsystem_malloc.dylib]
6,463,127 ???:szone_malloc_should_clear [/usr/lib/system/libsystem_malloc.dylib]
6,411,076 ???:szone_size [/usr/lib/system/libsystem_malloc.dylib]
6,268,455 src/tuple.cpp:tuple::tuple(double const&, double const&, double const&, double const&) [./rt.debug.exe]
4,726,152 src/scene_objects/scene_object.cpp:scene_object::intersect(ray) [./rt.debug.exe]
4,705,650 ???:tiny_malloc_from_free_list [/usr/lib/system/libsystem_malloc.dylib]
4,415,582 ???:free [/usr/lib/system/libsystem_malloc.dylib]
...
Clearly there's some extra memory allocation occurring that shouldn't be, and I could remove it if I knew how to track it down.
So, how can I better track what is causing those extra memory allocations/deallocations?
Note: I'm already compiling with -g and -O3.
Valgrind provides several ways to track down the code that performs a lot of allocations and deallocations.
Among others, the valgrind memcheck and massif tools provide a way to record the complete set of calls to allocation and deallocation functions. The recorded data can then be visualized with tools such as kcachegrind.
For example, with memcheck, you can do:
valgrind --tool=memcheck --xtree-memory=full <your_program> <your_args>
By default, this produces a file named xtmemory.kcg.<pid> (where <pid> is the process ID), which you can visualise with kcachegrind.
See https://www.valgrind.org/docs/manual/manual-core.html#manual-core.xtree for more information.
The dhat tool can also help: it lets you examine which memory is allocated but is rarely used or is not used for long.
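Once one of these tools points at a suspicious call site, a small experiment can help confirm that an accidental copy is responsible. What follows is only a minimal sketch and is not part of valgrind or of the code in the question: heap_tuple, sum_by_value, sum_by_ref and the two count_* helpers are hypothetical names, and it assumes a tuple type that keeps its components on the heap (here through std::vector). It replaces the global operator new to count allocations, then compares passing the object by value with passing it by const reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Bumped by every call to the replaced global operator new below.
static std::size_t g_allocations = 0;

// Replacement allocation functions: count, then defer to malloc/free.
void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical stand-in for a tuple class that stores its data on the heap.
struct heap_tuple {
    std::vector<double> v;
    heap_tuple(double x, double y, double z, double w) : v{x, y, z, w} {}
};

// Taking the parameter by value copies the vector, which allocates.
double sum_by_value(heap_tuple t) { return t.v[0] + t.v[1] + t.v[2] + t.v[3]; }

// Taking it by const reference does not copy anything.
double sum_by_ref(const heap_tuple& t) { return t.v[0] + t.v[1] + t.v[2] + t.v[3]; }

// Returns how many allocations `calls` by-value calls trigger.
std::size_t count_by_value(int calls) {
    assert(calls >= 0);
    heap_tuple t(1.0, 2.0, 3.0, 4.0);
    // Calling through a volatile pointer keeps the optimizer from inlining
    // the call and eliding the copy, so the count reflects real calls.
    double (*volatile fn)(heap_tuple) = sum_by_value;
    volatile double sink = 0.0;
    const std::size_t before = g_allocations;
    for (int i = 0; i < calls; ++i) sink = sink + fn(t);
    return g_allocations - before;
}

// Returns how many allocations `calls` by-reference calls trigger.
std::size_t count_by_ref(int calls) {
    assert(calls >= 0);
    heap_tuple t(1.0, 2.0, 3.0, 4.0);
    double (*volatile fn)(const heap_tuple&) = sum_by_ref;
    volatile double sink = 0.0;
    const std::size_t before = g_allocations;
    for (int i = 0; i < calls; ++i) sink = sink + fn(t);
    return g_allocations - before;
}
```

Calling count_by_value(n) and count_by_ref(n) from a test program shows the difference: the by-value version allocates once per call (the vector copy), while the by-reference version allocates nothing inside the loop. A signature such as scene_object::intersect(ray) in the profile above takes its argument by value, so it may be worth checking whether ray owns heap memory.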