types, floating-point, precision, ieee-754, single-precision

Why does an IEEE 754 single-precision float have only 7 digits of precision?


Why does a single-precision floating-point number have 7 digits of precision (or a double 15-16 digits of precision)?

Can anyone please explain how we arrive at that figure based on the 32 bits assigned to a float: Sign (31), Exponent (30-23), Fraction (22-0)?


Solution

  • Only the 23 fraction bits (22-0) of the significand appear in the memory format, but the total precision is actually 24 bits because normalized numbers are assumed to have a leading 1. This is equivalent to log10(2^24) ≈ 7.225 decimal digits.

    A double-precision float has 52 fraction bits; with the implicit leading 1, that makes 53. Therefore a double can hold log10(2^53) ≈ 15.955 decimal digits, not quite 16.

    Note: The leading 1 is not the sign bit. The value is actually (-1)^sign * 1.ffffffff * 2^(eeee - bias), where the bias is 127 for single precision and 1023 for double precision. The leading 1 need not be stored in the fraction, but the sign bit must still be stored.
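
    The arithmetic above can be checked with a short sketch. The class and method names (PrecisionDigits, decimalDigits, fields) are made up for illustration; only Float.floatToIntBits and Math.log10 are standard library calls.

    ```java
    public class PrecisionDigits {
        // Decimal digits carried by a significand of the given width: bits * log10(2).
        public static double decimalDigits(int significandBits) {
            return significandBits * Math.log10(2.0);
        }

        // Split a float into {sign, biased exponent, fraction} using the 1/8/23 bit layout.
        public static int[] fields(float f) {
            int bits = Float.floatToIntBits(f);
            int sign = bits >>> 31;              // bit 31
            int exponent = (bits >>> 23) & 0xFF; // bits 30-23, still biased by 127
            int fraction = bits & 0x7FFFFF;      // bits 22-0, the leading 1 is not stored
            return new int[] {sign, exponent, fraction};
        }

        public static void main(String[] args) {
            System.out.println("float digits:  " + decimalDigits(24));
            System.out.println("double digits: " + decimalDigits(53));

            int[] one = fields(1.0f);
            System.out.println("1.0f -> sign=" + one[0] + " exponent=" + one[1] + " fraction=" + one[2]);

            // 2^24 + 1 needs 25 significant bits, so as a float it rounds to 2^24.
            System.out.println((float) 16777217 == 16777216f); // prints true
        }
    }
    ```

    The last line shows where the 24-bit limit bites: 16777217 is the first positive integer a float cannot hold exactly.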


    Some numbers cannot be represented exactly as a finite sum of powers of 2, such as 1/9 (approximated here by a 15-digit decimal literal):

    >>>> double d = 0.111111111111111;
    >>>> System.out.println(d + "\n" + d*10);
    0.111111111111111
    1.1111111111111098
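
    To see what a double really stores, one option is to convert it to a BigDecimal, whose double constructor keeps the exact binary value. This is a minimal sketch; the class name ExactStoredValue and the helper exact are hypothetical.

    ```java
    import java.math.BigDecimal;

    public class ExactStoredValue {
        // Exact decimal expansion of the binary value a double actually holds.
        public static String exact(double x) {
            return new BigDecimal(x).toPlainString();
        }

        public static void main(String[] args) {
            // The literal from the example above is not stored exactly.
            System.out.println(exact(0.111111111111111));
            // 0.5 is a power of 2, so it is stored exactly.
            System.out.println(exact(0.5)); // prints 0.5
        }
    }
    ```

    The first line printed has far more than 15 digits, because the stored value is the nearest sum of powers of 2 rather than the decimal literal itself.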
    

    If a financial program were to do this calculation over and over without self-correcting, there would eventually be discrepancies.

    >>>> double d = 0.111111111111111;
    >>>> double sum = 0;
    >>>> for(int i=0; i<1000000000; i++) {sum+=d;}
    >>>> System.out.println(sum);
    111111108.91914201
    

    After 1 billion summations, the total is over $2 short of the exact result, 111111111.111111.
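
    The drift can be measured against an exact decimal sum. The sketch below is an illustration rather than part of the original answer: it uses java.math.BigDecimal as the exact reference, the names DriftSketch, sumAsDouble and sumAsDecimal are hypothetical, and the loop count is reduced to one million to keep the run short.

    ```java
    import java.math.BigDecimal;

    public class DriftSketch {
        // Repeated binary floating-point addition; rounding error can build up.
        public static double sumAsDouble(double d, int n) {
            double sum = 0;
            for (int i = 0; i < n; i++) {
                sum += d;
            }
            return sum;
        }

        // Same accumulation in exact decimal arithmetic, built from the decimal string.
        public static BigDecimal sumAsDecimal(String d, int n) {
            BigDecimal term = new BigDecimal(d);
            BigDecimal sum = BigDecimal.ZERO;
            for (int i = 0; i < n; i++) {
                sum = sum.add(term);
            }
            return sum;
        }

        public static void main(String[] args) {
            int n = 1_000_000; // assumption: smaller than the 1 billion used above
            double approx = sumAsDouble(0.111111111111111, n);
            BigDecimal exact = sumAsDecimal("0.111111111111111", n);
            System.out.println("double sum:  " + approx);
            System.out.println("decimal sum: " + exact.toPlainString());
            // Exact difference between what the double holds and the true total.
            System.out.println("drift:       " + new BigDecimal(approx).subtract(exact).toPlainString());
        }
    }
    ```

    Building the BigDecimal from a string matters here: new BigDecimal(0.111111111111111) would carry the binary rounding error into the reference sum.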