I'd like to be able to see how "expensive" each thread in my application is using callgrind. I profiled with the --separate-thread=yes
option which gives you a callgrind file for the whole app and then one per-thread.
This is useful for viewing the profile of any given thread, but what I really want is just a sorted list of CPU time from each thread so I can see which threads are the the biggest hogs.
Valgrind/Callgrind doesn't allow this behaviour. Neither kcachegrind does, but I think it will be a good improvement. Maybe some answers could be found on their mailing-list.
A working but really boring way could be to use option --separate-thread=no
, and update your code to use for each thread a different function name or class name. Depending your code complexity, it could be the answer (using 1computeData(), 2computeData(), ..)