I was reading the Wikibooks page on optimization and I came across the line:
For pipelined processors, comparisons are slower than differences, because they imply a branch.
Why is it the case that comparisons imply a branch? For example if:
int i = 2;
int x = i<5;
Is there a branch in this? It makes sense to me to branch for if statements with conditionals but I don't understand why a comparison alone causes a branch.
Preamble: Modern compilers are capable of eliminating branches in various ways. Thus, none of the examples necessarily results a branch in the final (assembler or machine) code.
So why does the logic basically imply branches?
The code
bool check_interval_branch(int const i, int const min_i, int const max_i)
{
return min_i <= i && i <= max_i;
}
can be logically rewritten to be:
bool check_interval_branch(int const i, int const min_i, int const max_i)
{
if (min_i <= i)
{
if (i < max_i) return true;
}
return false;
}
Here you obviously have two branches (where the second one is only executed if the first one is true - short circuit) which can be mispredicted by the branch predictor which in turn leads to a reroll of the pipeline.
Visual Studio 2013 (with optimization turned one) generates the following assembly containing two branches for check_interval_branch
:
push ebp
mov ebp, esp
mov eax, DWORD PTR _i$[ebp]
cmp DWORD PTR _min_i$[ebp], eax // comparison
jg SHORT $LN3@check_inte // conditional jump
cmp eax, DWORD PTR _max_i$[ebp] // comparison
jg SHORT $LN3@check_inte // conditional jump
mov al, 1
pop ebp
ret 0
$LN3@check_inte:
xor al, al
pop ebp
ret 0
The code
bool check_interval_diff(int const i, int const min_i, int const max_i)
{
return unsigned(i - min_i) <= unsigned(max_i - min_i);
}
is logically identical to
bool check_interval_diff(int const i, int const min_i, int const max_i)
{
if (unsigned(i – min_i) <= unsigned(max_i – min_i)) { return true; }
return false;
}
which contains only a single branch but executes two differences.
The generated code for check_interval_diff
of Visual Studio 2013 doesn't even contain a conditional jump.
push ebp
mov ebp, esp
mov edx, DWORD PTR _i$[ebp]
mov eax, DWORD PTR _max_i$[ebp]
sub eax, DWORD PTR _min_i$[ebp]
sub edx, DWORD PTR _min_i$[ebp]
cmp eax, edx // comparison
sbb eax, eax
inc eax
pop ebp
ret 0
(The trick here is that the subtraction done by sbb
is different by 1 according to the carry flag which in turn is set to 1 or 0 by the cmp
instruction.)
In fact you see three differences here (2x sub
, 1x sbb
).
See Mysticals answer here about branch prediction.
Your code int x = i<5;
is logically identical to
int x = false;
if (i < 5)
{
x = true;
}
which again contains a branch (x = true
only executed if i < 5
.)