c debugging assembly compilation

Is a compiler in debug mode forced to reload all registers at every line?


I have been studying how debuggers work and looking at the assembly output of compilers in debug mode. One thing I noticed is that they seem to reload registers at every single line, even if no code could possibly have changed the value:

// push    rbp
// mov     rbp, rsp
// mov     DWORD PTR [rbp-20], edi
int square(int x) {
// mov     eax, DWORD PTR [rbp-20]
// mov     DWORD PTR [rbp-4], eax
    int result = x;
// mov     eax, DWORD PTR [rbp-4]
// imul    eax, DWORD PTR [rbp-20]
// mov     DWORD PTR [rbp-4], eax
    result *= x;
// mov     eax, DWORD PTR [rbp-4]
    return result;
// pop     rbp
// ret
}

In the example above, we can see that x = edi = [rbp-20] and result = eax = [rbp-4]. Between the lines result *= x; and return result; the compiler stores result on the stack only to immediately load it again.

Is this behavior only to facilitate debuggers, so that they work correctly if I set a breakpoint at return result; and modify it, or is there some other reason for it?

If I turn on optimisations, the entire function becomes just imul edi, edi; mov eax, edi; ret; but the debugger jumps over most lines.

The same behavior and equivalent assembly can be seen for all compilers and architectures (GCC/Clang/MSVC, x86/ARM/RISC-V).


Solution

  • There is no such thing as a "debug mode" for a compiler. What you are describing is compilation with optimizations turned off (-O0 for GCC/Clang). It is often conflated with debugging because optimizations tend to interfere with the debugging experience: the more optimizations are applied, the harder it is to map the resulting code back to the original source, and the harder it is to recover the state of local variables from the CPU and memory state while debugging.

    Compilers usually start by translating C into an intermediate representation (IR) in the simplest way possible, and that produces the suboptimal code you see. The compiler would normally then apply numerous optimization passes to clean that up and remove inefficiencies, but -O0 disables almost all of them. To answer your question: no, compilers are not forced to reload registers for each statement; it is just a byproduct of the compilers' usual architecture.

    Compiler authors have long recognized the need for some middle ground between the full-fledged optimization hell of -O2 and the blabbering inefficiency of -O0, and at least GCC and Clang offer the -Og optimization level, which is specifically designed to be debugger-friendly. For your particular example, the resulting code is exactly the same three instructions as in the fully optimized case. However (and this is both funny and tragic) gdb's behaviour does change between the code compiled with -Og, where it skips the first line but steps through the rest, and with -O2, where it skips the multiplication completely, as you pointed out. The machine code is exactly the same, yet the behaviour differs because the compiler emitted more debugging information at -Og. I am not sure why, but I suspect some optimization passes in -O2 destroyed part of that information somewhere during the IR transformations, even though they did not change the resulting machine code.
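
If you want to compare the optimization levels yourself, here is a minimal sketch, assuming GCC (or Clang, which accepts the same flags) is installed. The file name square.c and the output file names are placeholders I made up; the function is the one from the question. The file has no main on purpose, since -S only needs a translation unit to produce assembly.

```c
/* square.c (hypothetical file name)
 *
 * Generate assembly at three optimization levels, each with debug info:
 *
 *   gcc -O0 -g -S square.c -o square_O0.s
 *   gcc -Og -g -S square.c -o square_Og.s
 *   gcc -O2 -g -S square.c -o square_O2.s
 *
 * Then compare the three .s files side by side.
 */

int square(int x) {
    /* At -O0, x and result each get a stack slot, and every statement
     * loads its operands from those slots and stores the result back. */
    int result = x;
    result *= x;
    return result;
}
```

Note that -g only controls whether debug information is emitted, while -O0/-Og/-O2 select the optimization level, so the two can be combined freely. On x86 you can add -masm=intel to get the Intel syntax used in the question. I have not listed the expected output here because the exact instructions depend on the compiler version and target.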