c++ c optimization nan constant-folding

Why does GCC implement isnan() more efficiently for C++ <cmath> than C <math.h>?


Here's my code:

int f(double x)
{
  return isnan(x);
}

If I #include <cmath> I get this assembly:

xorl    %eax, %eax
ucomisd %xmm0, %xmm0
setp    %al

This is reasonably clever: ucomisd sets the parity flag if the comparison of x with itself is unordered, meaning x is NaN. Then setp copies the parity flag into the result (only a single byte, hence the initial clear of %eax).

But if I #include <math.h> I get this assembly:

jmp     __isnan

Now the code is not inlined, and the __isnan function is certainly no faster than the ucomisd instruction, so we have incurred a jump for no benefit. I get the same thing if I compile the code as C.

Now if I change the isnan() call to __builtin_isnan(), I get the simple ucomisd instruction regardless of which header I include, and it works in C too. Likewise if I just return x != x.
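
For reference, the two workarounds can be sketched in C as below. The function names isnan_selfcmp, isnan_builtin and check_isnan_helpers are made up for illustration, and __builtin_isnan is a GCC/Clang extension rather than standard C. This is only a minimal sketch; note that the x != x form relies on IEEE 754 semantics, so flags such as -ffast-math may let the compiler assume NaN never occurs.

```c
#include <assert.h>
#include <math.h>   /* NAN, INFINITY */

/* Self-comparison: under IEEE 754, NaN is the only value
   that compares unequal to itself. */
int isnan_selfcmp(double x)
{
    return x != x;
}

/* Compiler builtin (GCC/Clang extension, not standard C);
   returns nonzero when x is NaN. */
int isnan_builtin(double x)
{
    return __builtin_isnan(x);
}

/* Hypothetical self-check showing that both helpers agree. */
void check_isnan_helpers(void)
{
    assert(isnan_selfcmp(NAN));
    assert(isnan_builtin(NAN));
    assert(!isnan_selfcmp(1.0));
    assert(!isnan_builtin(INFINITY));  /* infinity is not NaN */
}
```

Compiling either helper with -O2 -S is the way to see which instructions a given compiler version actually emits.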

So my question is, why does the C <math.h> header provide a less efficient implementation of isnan() than the C++ <cmath> header? Are people really expected to use __builtin_isnan(), and if so, why?

I tested GCC 4.7.2 and 4.9.0 on x86-64 with -O2 and -O3 optimization.


Solution

  • Looking at <cmath> for libstdc++ shipped with gcc 4.9 you get this:

      constexpr bool
      isnan(double __x)
      { return __builtin_isnan(__x); }
    

    A constexpr function could be aggressively inlined and, of course, the function just delegates the work over to __builtin_isnan.

    The <math.h> header doesn't use __builtin_isnan; instead it uses an __isnan implementation that is too long to paste here, but it's at line 430 of math.h on my machine™. Since the C99 standard requires isnan et al. to be macros (section 7.12 of the C99 standard), the 'function' is defined as follows:

    #define isnan(x) (sizeof (x) == sizeof (float) ? __isnanf (x)   \
      : sizeof (x) == sizeof (double) ? __isnan (x) \
      : __isnanl (x))
    

    However, I see no reason why it can't use __builtin_isnan instead of __isnan so I suspect it's an oversight. As Marc Glisse points out in the comments, there is a relevant bug report for a similar issue using isinf instead of isnan.
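
To make the suggestion concrete, here is a rough sketch of what a builtin-based macro could look like. It mirrors the sizeof dispatch of the macro quoted above but calls the GCC/Clang builtins __builtin_isnanf, __builtin_isnan and __builtin_isnanl instead of the library functions. The names my_isnan, f_inline and check_my_isnan are hypothetical and are not what any real <math.h> defines; this is an illustration under those assumptions, not the library's actual code.

```c
#include <assert.h>
#include <math.h>   /* NAN */

/* Hypothetical macro: same sizeof dispatch as the quoted header,
   but delegating to compiler builtins (a GCC/Clang extension). */
#define my_isnan(x) (sizeof (x) == sizeof (float) ? __builtin_isnanf (x) \
  : sizeof (x) == sizeof (double) ? __builtin_isnan (x)                   \
  : __builtin_isnanl (x))

/* Same shape as the function in the question, using the macro above. */
int f_inline(double x)
{
    return my_isnan(x);
}

/* Hypothetical self-check covering all three floating-point types. */
void check_my_isnan(void)
{
    assert(my_isnan((float)NAN));
    assert(my_isnan((double)NAN));
    assert(my_isnan((long double)NAN));
    assert(!my_isnan(1.0f));
    assert(!f_inline(2.5));
}
```

Because the sizeof tests are compile-time constants, the compiler can discard the unused branches, leaving only the builtin call for the argument's type. Whether that builtin is then expanded inline depends on the compiler version and flags.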