c floating-point

'float' vs. 'double' precision


The code

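#include <stdio.h>

int main(void)
{
    float x  = 3.141592653589793238;   /* rounded to the nearest float */
    double z = 3.141592653589793238;   /* rounded to the nearest double */
    printf("x=%f\n", x);
    printf("z=%f\n", z);
    printf("x=%20.18f\n", x);
    printf("z=%20.18f\n", z);
    return 0;
}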

will give you the output

x=3.141593
z=3.141593
x=3.141592741012573242
z=3.141592653589793116

where, on the third line of output, the trailing digits 741012573242 are garbage and, on the fourth line, 116 is garbage. Do doubles always have 16 significant figures while floats always have 7 significant figures? Why don't doubles have 14 significant figures?


Solution

  • Floating-point numbers in C use IEEE 754 encoding on virtually all current platforms, although the C standard itself does not require it.

    This type of encoding uses a sign, a significand, and an exponent.

    Because of this encoding, many decimal values (0.1, for example) have no exact binary representation and are rounded to the nearest representable value when stored.

    Also, the number of significant decimal digits varies slightly from value to value, since the representation is binary, not decimal.

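    Single precision (float) gives you 23 stored bits of significand, 8 bits of exponent, and 1 sign bit. With the implicit leading 1 bit of normalized values, that is 24 bits of precision, or about 24 × log10(2) ≈ 7.2 decimal digits.

    Double precision (double) gives you 52 stored bits of significand, 11 bits of exponent, and 1 sign bit. With the implicit leading 1 bit of normalized values, that is 53 bits of precision, or about 53 × log10(2) ≈ 15.95 decimal digits, which is why you see roughly 16 correct digits rather than 14.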