In programming, one often needs to check whether a number is odd or even. For that, we usually use:

`n % 2 == 0`

However, my understanding is that the `%` operator actually performs a division and returns its remainder; therefore, for the case above, it would be faster to simply check the last bit instead. Let's say `n = 5`:

`5 = 00000101`

To check whether the number is odd or even, we just need to check the last bit. If it's `1`, the number is odd; otherwise, it is even. In code, it would be expressed like this:

`(n & 1) == 0`

(parenthesized, since in C-like languages `==` binds tighter than `&`). In my understanding this would be faster than `% 2`, as no division is performed; a mere bit comparison is needed.
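For concreteness, here is a minimal C++ sketch of the two checks being compared (the function names are just for illustration):

```cpp
#include <cstdio>

// Even check using the modulo operator.
bool is_even_mod(int n) { return n % 2 == 0; }

// Even check using a bit test. The parentheses matter: `==` binds
// tighter than `&`, so `n & 1 == 0` would parse as `n & (1 == 0)`.
bool is_even_bit(int n) { return (n & 1) == 0; }

int main() {
    int tests[] = {4, 5, -1, 0};
    for (int n : tests)
        std::printf("%2d: mod=%d bit=%d\n", n, is_even_mod(n), is_even_bit(n));
}
```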
I have two questions, then:

1) Is the second way really faster than the first (in all cases)?

2) If the answer to 1 is yes, are compilers (in all languages) smart enough to convert `% 2` into a simple bit comparison? Or do we have to explicitly use the second way if we want the best performance?
Yes, a bit-test is much faster than integer division: by about a factor of 10 to 20, or even 100 for a 128-bit / 64-bit => 64-bit `idiv` on Intel. Especially since x86 has a `test` instruction that sets condition flags based on the result of a bitwise AND, so you don't have to divide and then compare; the bitwise AND *is* the compare.
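As a concrete sketch (my example function, not from the question; the exact asm depends on compiler and version):

```cpp
// With gcc/clang -O2 on x86-64 this typically becomes a `test`
// (the AND that sets flags) plus a conditional move or branch;
// no division and no separate compare instruction.
int classify(int n) {
    return (n & 1) ? -7 : 42;   // arbitrary results for odd/even
}
```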
I decided to actually check the compiler output on Godbolt, and got a surprise: `return n % 2` from a signed `int n` value, instead of just testing it for non-zero (`if (n % 2)`), produces slower code than `return n & 1`. This is because `(-1 % 2) == -1`, while `(-1 & 1) == 1`, so the compiler can't just use a bitwise AND. Compilers still avoid integer division, though, and use some clever shift / and / add / sub sequence instead to get the -1 / 0 / +1 remainder, because that's still cheaper than an integer division. (gcc and clang use different sequences.)
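A minimal pair showing the difference (my function names, for pasting into Godbolt):

```cpp
// Signed remainder must preserve the sign: (-1 % 2) == -1.
// Compilers emit a short shift/and/add/sub sequence here rather than
// an actual division, but it's still more work than a single AND.
int mod2(int n) { return n % 2; }   // -1, 0, or 1

// Not equivalent for negative n: a bitwise AND of a negative
// 2's complement value still yields 0 or 1.
int and1(int n) { return n & 1; }   // 0 or 1
```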
So if you want to return a 0/non-zero `int` value based on `n % 2`, use `return n % 2 != 0`. For 2's complement (and sign/magnitude) signed `int`, that's the same as `n & 1`. Most of us only care how our code optimizes for 2's complement machines, and C++20 dropped the possibility of sign/magnitude and one's complement, making two's complement the only possibility. (To check the opposite condition, use `n % 2 == 0`. Not `n % 2 == 1`: that would force the compiler to check the sign bit as well for signed types, if it can't prove the signed `int` must be non-negative.)
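In code (a sketch; `is_odd` / `is_even` are my names):

```cpp
// Optimizes to the same single-AND code as `return n & 1`,
// even for signed int.
bool is_odd(int n)  { return n % 2 != 0; }

// Also fine: only the low bit needs testing.
bool is_even(int n) { return n % 2 == 0; }

// Worse for signed n: (-1 % 2) is -1, not 1, so the compiler must
// also account for the sign unless it can prove n is non-negative.
bool is_odd_slow(int n) { return n % 2 == 1; }
```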
You get the same benefit from using an `unsigned` type, since that takes negative numbers out of the picture. Unsigned `/` and `%` by powers of 2 are just a right shift and a bitwise AND, unlike signed, where division truncates toward 0 but arithmetic right shift (`>>`) rounds toward -infinity. This also lets the compiler always optimize it to a single AND instruction. (On Godbolt, you can flip to other architectures, like ARM and PowerPC, and see that the `unsigned even` (`%`) function and the `int even_bit` (bitwise `&`) function have the same asm code.)
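The pair of functions that parenthetical refers to would look something like this (signatures reconstructed from the description above, not copied from the Godbolt link):

```cpp
// Unsigned % by a power of 2 is just a bitwise AND, so both of these
// compile to the same asm (a single low-bit test) on x86-64, ARM,
// and PowerPC alike.
bool even(unsigned n) { return n % 2 == 0; }
bool even_bit(int n)  { return (n & 1) == 0; }
```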
Using a `bool` (which must be 0 or 1, not just any non-zero value) is another option: `return n % 2` as a `bool` is equivalent to `return n % 2 != 0`. But the compiler will have to do extra work for `return (bool)(n % 4)` (or any test other than `n % 2`), even for `unsigned` types. The `return n & 3` version of that will be 0, 1, 2, or 3, so the compiler has to turn any non-zero value into a 1. (x86 has an efficient `setcc` instruction that sets a register to 0 or 1 depending on the flags, so it's still only 2 instructions instead of 1. Clang and GCC use this; see `aligned4_bool` in the Godbolt asm output.)
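Something like this (my reconstruction of the `aligned4_bool` case described above; the exact code from the Godbolt link isn't shown here):

```cpp
// A bool must be exactly 0 or 1, so the compiler can't just return
// the masked bits. On x86 it does a test/and plus setcc: still only
// 2 instructions instead of 1.
bool aligned4_bool(unsigned n) { return (bool)(n % 4); }

// For comparison: yields 0, 1, 2, or 3. A single AND, but not a
// strict 0/1 value.
unsigned mod4(unsigned n) { return n & 3; }
```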
With any optimization level higher than `-O0`, gcc and clang optimize `if (n % 2)` to what we expect. The other huge surprise is that icc 13 doesn't. I don't understand WTF icc thinks it's doing with all those branches.
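A minimal function for trying this yourself on Godbolt (my harness, not the exact code behind the answer; compile with `-O2` and compare gcc, clang, and icc):

```cpp
// gcc and clang reduce the % 2 test to a single test/and of the low
// bit; the surprise described above is that icc 13 instead emitted
// a series of branches.
int via_mod(int n) {
    if (n % 2)
        return 42;
    return 0;
}
```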