c++ gcc ld callgrind

interpreting _dl_runtime_resolve_xsave'2 in callgrind output


Looking at the callgrind output for a run of my program, I see that 125% (!) of the cycles are spent in _dl_runtime_resolve_xsave'2 (apparently part of the dynamic linker), while 100% is spent in main. It also reports that almost all of the time attributed to _dl_runtime_resolve_xsave'2 is spent in its callees (self = 0%), yet callgrind does not show any callees for this function. Moreover, _dl_runtime_resolve_xsave'2 appears to be called from several places in the program I am profiling.

I can understand that some time is spent outside of main: the program I am profiling uses the prototype pattern, and many prototype objects are built when their dynamic libraries are loaded. But this cannot account for anything close to 25% of the time of this particular run, because the same run with no input data takes orders of magnitude less time than the run I am profiling now.

Also, the program does not use dlopen to open shared objects after startup. Everything should be loaded at the start.

Here is a screenshot of the kcachegrind window: [screenshot not included]

How should I interpret these calls to _dl_runtime_resolve_xsave'2? Do I need to be concerned about the time spent in this function?

Thank you for your help.


Solution

  • _dl_runtime_resolve_xsave is used by the glibc dynamic loader during lazy binding. It looks up the function symbol on the first call to a function and then performs a tail call to the implementation. Unless you set something like LD_BIND_NOT=1 in the environment when launching the program, this is a one-time operation that happens only on the first call to each function. Lazy binding has some cost, but unless you have many functions that are called exactly once, it will not contribute much to the execution cost. The figure you see is more likely a reporting artifact, perhaps related to the tail call or to the rather exotic XSAVE instruction used in _dl_runtime_resolve_xsave.

    You can disable lazy binding by launching the program with LD_BIND_NOW=1 set in the environment. The dynamic loader trampoline will then not be used, because all functions are resolved at startup. Alternatively, you can link with -Wl,-z,now to make this change permanent (at least for the code you link; system libraries may still use lazy binding for their own function symbols).