The switch statements in the following two functions
int foo(int value) {
    switch (value) {
    case 0:
        return 0;
    case 1:
        return 0;
    case 2:
        return 1;
    }
}

int bar(int value) {
    switch (value) {
    case 0:
    case 1:
        return 0;
    case 2:
        return 1;
    }
}
compile to different assembly in the x86-64 and RISC-V editions of the GNU C Compiler, when not optimizing.
Why? Is there any way to subvert this? I am using consteval lambdas in function templates to programmatically generate the return values of functions like this, one per output, and the result has a lot more assembly instructions than if I went in, hand-wrote all of the output handling, and merged the identical outputs into one case-case-case chain.
Edit: There are no compile flags used.
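For concreteness, a minimal sketch of the kind of pattern in question (the 0/1 mapping and the lambda body here are placeholders, not the real code):

// C++20. A consteval lambda supplies the return value for each handled
// input, and every input gets its own return statement, so identical
// results (0 for both inputs 0 and 1) remain separate statements.
// The generated function therefore has the shape of foo, not bar.
template <typename T>
T generated(T value) {
    constexpr auto out = [](T i) consteval -> T { return i > 1 ? 1 : 0; };
    switch (value) {
    case 0: return out(0);  // 0
    case 1: return out(1);  // also 0, but a separate statement
    case 2: return out(2);  // 1
    }
    return 0;
}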
Since you used no compile flags, you get the default compiler optimization setting, which is "no optimization" (-O0). From the GCC manual, section "Optimization Options":
Without any optimization option, the compiler’s goal is to reduce the cost of compilation and to make debugging produce the expected results. Statements are independent: if you stop the program with a breakpoint between statements, you can then assign a new value to any variable or change the program counter to any other statement in the function and get exactly the results you expect from the source code.
Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
The same point is repeated in the manual's entry for -O0:
-O0
Reduce compilation time and make debugging produce the expected results. This is the default.
So, because you have not enabled optimizations, the compiler does not perform the optimization of coalescing the identical code between the two cases.
Indeed, in order to attain the -O0 goal of "make debugging produce expected results", the compiler must not coalesce these blocks of code. If you are debugging the function foo and you set a breakpoint on line 4 (the return 0 in the 0: case), you expect the breakpoint to be hit if foo(0) is called, but not if foo(1) is called (since abstractly foo(1) never executes line 4). But if the compiler had coalesced the two return statements, then a breakpoint set on one of them would be hit in both cases, which is not "expected results".
Conversely, in the function bar, the compiler must emit a single block of code for the return statement that covers both the 0: and 1: cases, because if you set a breakpoint on line 16, you expect it to be hit if either bar(0) or bar(1) is called.
See "Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?" for more explanation of why unoptimized compilation tends to result in very inefficient-looking assembly code.
If you do turn on optimization (-O, -O2, etc.), then the compiler will indeed merge the two blocks of code, as it is no longer required to ensure that every line of source code matches a unique location in the machine code. In fact, what it does is reduce both functions to the equivalent of return value > 1;. The tradeoff is that breakpoints no longer work in an intuitive way: with optimizations on, when I set a breakpoint at line 4 of foo, it was never hit at all.
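As a source-level illustration of that reduction (an equivalent rewrite, not the compiler's literal output; the function name is made up):

// Roughly what -O2 turns both foo and bar into: a single comparison.
// Falling off the end of foo/bar without returning (for inputs other than
// 0, 1, and 2) is undefined behavior in C++, which is what lets the
// compiler use this result for every input, not just 0, 1, and 2.
int foo_or_bar(int value) {
    return value > 1;
}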