c++ floating-point precision

Does the sign affect the precision and accuracy of floating point numbers?


In floating-point arithmetic, if two numbers have the same binary representation, then the result of any operation performed on these numbers should be the same, and equality comparisons using == should work as expected.

For example, if a and b are both computed as 1.0/3.0, they will have the same binary representation in a standard floating-point system. Therefore, x and y calculated as follows should be identical, and the assertion should hold.

double a = 1.0/3.0;
double b = 1.0/3.0;
double x = a*a/Math::Pi;
double y = b*b/Math::Pi;
assert(x==y);

My question is: does the sign of a number affect the accuracy of the result? Will the following always be true?

double a = 1.0/3.0;
double b = -1.0/3.0;
double x = a*a/Math::Pi;
double y = -(-b*b/Math::Pi);
assert(x==y);

How about this? Will the assertion hold?

double a = 1.0/3.0;
double b = 1.0/7.0;
double x = a-b;
double y = -(b-a);
assert(x==y);

I mainly work on x86/x64 machines. I assume C, C++, and assembly all behave the same way here, so I tagged both C and C++.


Solution

  • As a demonstration of how the rounding mode can matter, consider this code:

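    #include <stdio.h>
    #include <fenv.h>
    /* Tell the compiler that this code changes the floating-point environment. */
    #pragma STDC FENV_ACCESS ON
    
    int main()
    {
        /* Default mode (round to nearest): the quotients are exact negatives. */
        double a =  1.0 / 3.0;
        double b = -1.0 / 3.0;
        if(a != -b) printf("unexpectedly unequal #1\n");
    
        /* Round toward negative infinity: both quotients are now rounded in
           the same direction, so their magnitudes no longer match. */
        fesetround(FE_DOWNWARD);
    
        a =  1.0 / 3.0;
        b = -1.0 / 3.0;
        if(a != -b) {
            printf("unexpectedly unequal #2:\n");
            printf("a = % .20f\n", a);
            printf("b = % .20f\n", b);
        }
    }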
    

    When compiled under clang v. 14.0.3 on my Mac (as either C or C++) this code does print "unexpectedly unequal #2", with the values of a and b displayed as:

    a =  0.33333333333333331482
    b = -0.33333333333333337035
    

    [In retrospect, I'm impressed this worked the way it did. Either clang is declining to do floating point constant folding at compile time, or it is evaluating the effect of the fesetround call at compile time.]

    [Note, too, that the change to the rounding mode has affected the way printf renders the numbers. Under normal rounding, they would have been 0.33333333333333331483 and -0.33333333333333337034.]


    Update: this example does not work (it does not print "unexpectedly unequal #2") under gcc, I suspect because gcc folds the constants at compile time. Under gcc v. 13.1.0, at least, it suffices to create global variables double one = 1.0; and double three = 3.0; and then use one and three in the computations of a and b.