c++ c comparison branch-prediction branchless

Is using the result of comparison as int really branchless?


I have seen code like this in many answers, and the authors say this is branchless:

template <typename T> 
inline T imax (T a, T b)
{
    return (a > b) * a + (a <= b) * b;
}

But is this really branchless on current architectures (x86, ARM, ...)? And does the standard actually guarantee that it is branchless?


Solution

  • x86 has the SETcc family of instructions, which set a byte register to 1 or 0 depending on the state of the condition flags. Compilers commonly use these to implement this kind of code without branches.

    If you use the “naïve” approach

    int imax(int a, int b) {
        return a > b ? a : b;
    }
    

    the compiler can generate even more efficient branchless code using the CMOVcc (conditional move) family of instructions.

    ARM (in its classic 32-bit instruction set) can conditionally execute nearly every instruction, which allows the compiler to compile both your implementation and the naïve one efficiently, with the naïve implementation being faster.
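
To compare the two forms yourself, a minimal sketch along these lines may help. It puts the arithmetic version from the question next to the conditional version and checks that they return the same values. The names `imax_arith`, `imax_select` and `forms_agree` are made up for this illustration. The comments describe what compilers typically emit, not something the code itself can verify.

```cpp
#include <cassert>

// Arithmetic form from the question: each comparison yields 0 or 1,
// so exactly one of the two products survives.
// On x86 this is typically compiled with SETcc plus multiply/add.
template <typename T>
inline T imax_arith(T a, T b)
{
    return (a > b) * a + (a <= b) * b;
}

// Conditional form (the "naive" approach).
// On x86 an optimizing compiler will often turn this into CMOVcc,
// but that is a code generation choice, not a language guarantee.
template <typename T>
inline T imax_select(T a, T b)
{
    return a > b ? a : b;
}

// Hypothetical helper: checks that both forms agree for every pair
// of ints in the closed range [lo, hi].
inline bool forms_agree(int lo, int hi)
{
    for (int a = lo; a <= hi; ++a)
        for (int b = lo; b <= hi; ++b)
            if (imax_arith(a, b) != imax_select(a, b))
                return false;
    return true;
}
```

Nothing in the C or C++ standard says whether either function compiles to a branch. The only way to know for a given compiler, target and optimization level is to inspect the generated assembly, for example with `g++ -O2 -S`.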