I compiled the Paranoia floating-point test suite on a pc386 system using GCC at the -O2 optimization level and got several failures. When I compiled it without optimization, using the same GCC, I got the correct result. I read about the flags that -O2 enables, but none of them seems problematic. What may be the cause? The Paranoia code can be found here, and this is the output with -O2 optimization:
*** PARANOIA TEST ***
paranoia version 1.1 [cygnus]
Program is now RUNNING tests on small integers:
TEST: 0+0 != 0, 1-1 != 0, 1 <= 0, or 1+1 != 2
PASS: 0+0 != 0, 1-1 != 0, 1 <= 0, or 1+1 != 2
TEST: 3 != 2+1, 4 != 3+1, 4+2*(-2) != 0, or 4-3-1 != 0
PASS: 3 != 2+1, 4 != 3+1, 4+2*(-2) != 0, or 4-3-1 != 0
TEST: -1+1 != 0, (-1)+abs(1) != 0, or -1+(-1)*(-1) != 0
PASS: -1+1 != 0, (-1)+abs(1) != 0, or -1+(-1)*(-1) != 0
TEST: 1/2 + (-1) + 1/2 != 0
PASS: 1/2 + (-1) + 1/2 != 0
TEST: 9 != 3*3, 27 != 9*3, 32 != 8*4, or 32-27-4-1 != 0
PASS: 9 != 3*3, 27 != 9*3, 32 != 8*4, or 32-27-4-1 != 0
TEST: 5 != 4+1, 240/3 != 80, 240/4 != 60, or 240/5 != 48
PASS: 5 != 4+1, 240/3 != 80, 240/4 != 60, or 240/5 != 48
-1, 0, 1/2, 1, 2, 3, 4, 5, 9, 27, 32 & 240 are O.K.
Searching for Radix and Precision.
Radix = 2.000000 .
Closest relative separation found is U1 = 5.4210109e-20 .
Recalculating radix and precision
confirms closest relative separation U1 .
Radix confirmed.
TEST: Radix is too big: roundoff problems
PASS: Radix is too big: roundoff problems
TEST: Radix is not as good as 2 or 10
PASS: Radix is not as good as 2 or 10
TEST: (1-U1)-1/2 < 1/2 is FALSE, prog. fails?
ERROR: Severity: FAILURE: (1-U1)-1/2 < 1/2 is FALSE, prog. fails?.
PASS: (1-U1)-1/2 < 1/2 is FALSE, prog. fails?
TEST: Comparison is fuzzy,X=1 but X-1/2-1/2 != 0
PASS: Comparison is fuzzy,X=1 but X-1/2-1/2 != 0
The number of significant digits of the Radix is 64.000000 .
TEST: Precision worse than 5 decimal figures
PASS: Precision worse than 5 decimal figures
TEST: Subtraction is not normalized X=Y,X+Z != Y+Z!
PASS: Subtraction is not normalized X=Y,X+Z != Y+Z!
Subtraction appears to be normalized, as it should be.
Checking for guard digit in *, /, and -.
TEST: * gets too many final digits wrong.
PASS: * gets too many final digits wrong.
TEST: Division lacks a Guard Digit, so error can exceed 1 ulp
or 1/3 and 3/9 and 9/27 may disagree
PASS: Division lacks a Guard Digit, so error can exceed 1 ulp
or 1/3 and 3/9 and 9/27 may disagree
TEST: Computed value of 1/1.000..1 >= 1
PASS: Computed value of 1/1.000..1 >= 1
TEST: * and/or / gets too many last digits wrong
PASS: * and/or / gets too many last digits wrong
TEST: - lacks Guard Digit, so cancellation is obscured
ERROR: Severity: SERIOUS DEFECT: - lacks Guard Digit, so cancellation is obscured.
PASS: - lacks Guard Digit, so cancellation is obscured
Checking rounding on multiply, divide and add/subtract.
TEST: X * (1/X) differs from 1
PASS: X * (1/X) differs from 1
* is neither chopped nor correctly rounded.
/ is neither chopped nor correctly rounded.
TEST: Radix * ( 1 / Radix ) differs from 1
PASS: Radix * ( 1 / Radix ) differs from 1
TEST: Incomplete carry-propagation in Addition
PASS: Incomplete carry-propagation in Addition
Addition/Subtraction neither rounds nor chops.
Sticky bit used incorrectly or not at all.
TEST: lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below
ERROR: Severity: FLAW: lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below.
PASS: lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below
Does Multiplication commute? Testing on 20 random pairs.
No failures found in 20 integer pairs.
Running test of square root(x).
TEST: Square root of 0.0, -0.0 or 1.0 wrong
PASS: Square root of 0.0, -0.0 or 1.0 wrong
Testing if sqrt(X * X) == X for 20 Integers X.
Test for sqrt monotonicity.
ERROR: Severity: DEFECT: sqrt(X) is non-monotonic for X near 2.0000000e+00 .
Testing whether sqrt is rounded or chopped.
Square root is neither chopped nor correctly rounded.
Observed errors run from -5.5000000e+00 to 5.0000000e-01 ulps.
TEST: sqrt gets too many last digits wrong
ERROR: Severity: SERIOUS DEFECT: sqrt gets too many last digits wrong.
PASS: sqrt gets too many last digits wrong
Testing powers Z^i for small Integers Z and i.
ERROR: Severity: DEFECT: computing
(1.30000000000000000e+01) ^ (1.70000000000000000e+01)
yielded 8.65041591938133811e+18;
which compared unequal to correct 8.65041591938133914e+18 ;
they differ by -1.02400000000000000e+03 .
Errors like this may invalidate financial calculations
involving interest rates.
Similar discrepancies have occurred 5 times.
Seeking Underflow thresholds UfThold and E0.
ERROR: Severity: FAILURE: multiplication gets too many last digits wrong.
Smallest strictly positive number found is E0 = 0 .
ERROR: Severity: FAILURE: Either accuracy deteriorates as numbers
approach a threshold = 0.00000000000000000e+00
coming down from 0.00000000000000000e+00
or else multiplication gets too many last digits wrong.
The Underflow threshold is 0.00000000000000000e+00, below which
calculation may suffer larger Relative error than merely roundoff.
Since underflow occurs below the threshold
UfThold = (2.00000000000000000e+00) ^ (-inf)
only underflow should afflict the expression
(2.00000000000000000e+00) ^ (-inf);
actually calculating yields: 0.00000000000000000e+00 .
This computed value is O.K.
Testing X^((X + 1) / (X - 1)) vs. exp(2) = 7.38905609893065041e+00 as X -> 1.
ERROR: Severity: DEFECT: Calculated 1.00000000000000000e+00 for
(1 + (0.00000000000000000e+00) ^ (inf);
differs from correct value by -6.38905609893065041e+00 .
This much error may spoil financial
calculations involving tiny interest rates.
Testing powers Z^Q at four nearly extreme values.
... no discrepancies found.
Searching for Overflow threshold:
This may generate an error.
Can `Z = -Y' overflow?
Trying it on Y = -inf .
finds a ERROR: Severity: FLAW: -(-Y) differs from Y.
Overflow threshold is V = -inf .
Overflow saturates at V0 = inf .
No Overflow should be signaled for V * 1 = -inf
nor for V / 1 = -inf .
Any overflow signal separating this * from the one
above is a DEFECT.
ERROR: Severity: FAILURE: Comparisons involving +--inf, +-inf
and +-0 are confused by Overflow.
ERROR: Severity: SERIOUS DEFECT: X / X differs from 1 when X = 1.00000000000000000e+00
instead, X / X - 1/2 - 1/2 = 1.08420217248550443e-19 .
ERROR: Severity: SERIOUS DEFECT: X / X differs from 1 when X = -inf
instead, X / X - 1/2 - 1/2 = nan .
ERROR: Severity: SERIOUS DEFECT: X / X differs from 1 when X = 0.00000000000000000e+00
instead, X / X - 1/2 - 1/2 = nan .
What message and/or values does Division by Zero produce?
Trying to compute 1 / 0 produces ... inf .
Trying to compute 0 / 0 produces ... nan .
The number of FAILUREs encountered = 4.
The number of SERIOUS DEFECTs discovered = 5.
The number of DEFECTs discovered = 3.
The number of FLAWs discovered = 2.
The arithmetic diagnosed has unacceptable Serious Defects.
Potentially fatal FAILURE may have spoiled this program's subsequent diagnoses.
END OF TEST.
*** END OF PARANOIA TEST ***
EXECUTIVE SHUTDOWN! Any key to reboot...
Optimization with -O2 is not the primary culprit here. The test suite you are running can also fail in a C implementation under other optimization settings. The primary problem in this case appears to be that the Paranoia test checks whether floating-point arithmetic is consistent and has various properties, but the floating-point arithmetic in the C implementation you are using is not consistent: sometimes it uses 80-bit arithmetic and sometimes it uses 64-bit arithmetic (or an approximation to it, such as using 80-bit arithmetic but rounding results to 64-bit floating-point).
Initially, the test finds a number U1 such that 1-U1 differs from 1, and there are no representable values between 1-U1 and 1. That is, U1 is the step size from 1 down to the next representable value in the floating-point format. In your case, the test finds that U1 is about 5.4210109e-20. This U1 is exactly 2^-64. The Intel processor you are running on has an 80-bit floating-point format in which the significand (the fraction part of the floating-point representation) has 64 bits. This 64-bit width of the significand is responsible for the step size being 2^-64, so it is why U1 is 2^-64.
Later, the test evaluates (1-U1)-1/2 and compares it to 1/2. Since 1-U1 is less than 1, subtracting 1/2 should produce a result less than 1/2. However, in this case, your C implementation is evaluating 1-U1 with 64-bit arithmetic, which has a 53-bit significand. With a 53-bit significand, 1-U1 cannot be represented exactly. Since it is very close to 1, the mathematical value of 1-U1 is rounded to 1 in the 64-bit format. Then subtracting 1/2 from this 1 yields 1/2. This 1/2 is not less than 1/2, so the comparison fails, and the program reports an error.
This is a defect of your C implementation. It actually evaluates 1-U1 differently in one place than in another. It uses 80-bit arithmetic in one place and 64-bit in another, and it does not provide a good way to control this. (But there may be switches to use only 64-bit arithmetic; I do not know about your version of GCC.)
Although this is a defect by the standards of people who want good floating-point arithmetic, it is not a defect according to the C standard. The C language standard permits this behavior.
I have not examined failures reported after the first. They likely stem from similar causes.