Here's my code:
int f(double x)
{
    return isnan(x);
}
If I #include <cmath> I get this assembly:
xorl %eax, %eax
ucomisd %xmm0, %xmm0
setp %al
This is reasonably clever: ucomisd sets the parity flag if the comparison of x with itself is unordered, meaning x is NaN. Then setp copies the parity flag into the result (only a single byte, hence the initial clear of %eax).
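To see that "unordered" behavior in plain code: every ordered comparison involving a NaN is false and only != is true, which is exactly what the parity-flag trick exploits. A small standalone demo (not part of my original function):

#include <math.h>
#include <stdio.h>

int main(void)
{
    double n = nan("");                           /* a quiet NaN */
    /* Comparing a NaN with itself is "unordered":
       ==, <, <=, >, >= are all false, only != is true. */
    printf("%d %d %d\n", n == n, n < n, n != n);  /* prints: 0 0 1 */
    return 0;
}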
But if I #include <math.h> I get this assembly:
jmp __isnan
Now the code is not inlined, and the __isnan function is certainly no faster than the ucomisd instruction, so we have incurred a jump for no benefit. I get the same thing if I compile the code as C.
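For context, an out-of-line __isnan typically just inspects the IEEE-754 bit pattern, something along these lines (a rough sketch, not the actual glibc source):

#include <stdint.h>
#include <string.h>

int my_isnan(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);                 /* reinterpret the double's bits   */
    uint64_t exponent = (bits >> 52) & 0x7FF;       /* 11 exponent bits below the sign */
    uint64_t mantissa = bits & 0xFFFFFFFFFFFFFull;  /* low 52 mantissa bits            */
    /* NaN: exponent all ones and mantissa nonzero (a zero mantissa would be infinity). */
    return exponent == 0x7FF && mantissa != 0;
}

So the library routine is a handful of integer operations plus the call overhead, while the inlined version is a single ucomisd.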
Now if I change the isnan() call to __builtin_isnan(), I get the simple ucomisd instruction regardless of which header I include, and it works in C too. Likewise if I just return x != x.
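In other words, both of these variants of f() compile to the branch-free ucomisd/setp sequence with either header (just the code above rewritten, for completeness):

int f_builtin(double x)
{
    return __builtin_isnan(x);   /* inlined to ucomisd + setp in C and C++ */
}

int f_compare(double x)
{
    return x != x;               /* a NaN is the only value that compares unequal to itself */
}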
So my question is: why does the C <math.h> header provide a less efficient implementation of isnan() than the C++ <cmath> header? Are people really expected to use __builtin_isnan(), and if so, why?
I tested GCC 4.7.2 and 4.9.0 on x86-64 with -O2 and -O3 optimization.
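(I was simply looking at the compiler's assembly output; the file name below is just an example:)

g++ -O2 -S -o - isnan_test.cpp   # assembly to stdout, compiled as C++
gcc -O2 -S -o - isnan_test.c     # same, compiled as C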
Looking at <cmath> for libstdc++ shipped with GCC 4.9, you get this:
constexpr bool
isnan(double __x)
{ return __builtin_isnan(__x); }
A constexpr function could be aggressively inlined and, of course, the function just delegates the work over to __builtin_isnan.
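It also means the check can be folded at compile time, at least with GCC and this libstdc++ definition (strictly an extension, since the standard at the time didn't require isnan to be usable in constant expressions). A small illustration, not taken from the header:

#include <cmath>
#include <limits>

// quiet_NaN() is constexpr, and with the definition above std::isnan is too,
// so the whole check can be evaluated by the compiler.
static_assert(std::isnan(std::numeric_limits<double>::quiet_NaN()),
              "a quiet NaN should be reported as NaN");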
The <math.h> header doesn't use __builtin_isnan; rather, it uses an __isnan implementation which is kind of long to paste here, but it's at line 430 of math.h on my machine™. Since the C99 standard requires using a macro for isnan et al. (section 7.12), the 'function' is defined as follows:
#define isnan(x) (sizeof (x) == sizeof (float)    ? __isnanf (x) \
                  : sizeof (x) == sizeof (double) ? __isnan (x)  \
                  : __isnanl (x))
However, I see no reason why it can't use __builtin_isnan instead of __isnan, so I suspect it's an oversight. As Marc Glisse points out in the comments, there is a relevant bug report for a similar issue using isinf instead of isnan.
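For what it's worth, GCC's __builtin_isnan is type-generic, so the macro could in principle be as simple as this sketch (hypothetical, GCC-specific, and not what this glibc version actually ships):

/* Hypothetical GCC-only replacement for the glibc macro shown above. */
#define isnan(x) __builtin_isnan (x)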