If I run:
>>> import math
>>> print(math.pi)
3.141592653589793
Then pi is printed with 16 significant digits.
However, according to:
>>> import sys
>>> sys.float_info.dig
15
My precision is 15 digits.
So, can I rely on the last digit of that value (i.e. is the value of π indeed 3.141592653589793nnnnnn)?
TL;DR
The last digit of str(float) or repr(float) can be "wrong" in that the decimal representation is not always correctly rounded.
>>> 0.100000000000000040123456
0.10000000000000003
But that value is still closer to the original than the 16-digit 0.1000000000000000 is.
In the case of math.pi, the decimal approximation of pi is 3.141592653589793238463...; here the last printed digit happens to be correctly rounded.
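You can check this yourself with the decimal module, since converting a float to Decimal is exact and reveals the full decimal expansion of the stored double (a quick sketch):

```python
import math
from decimal import Decimal

# Decimal(float) is exact: it shows the full decimal expansion of the
# binary double, with no rounding.
exact = Decimal(math.pi)
print(exact)          # 3.141592653589793115997963468544185161590576171875
print(repr(math.pi))  # 3.141592653589793 -- the shortest round-tripping string
```

The stored double agrees with π only up to its 16th significant digit; everything after that is an artifact of the binary representation.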
sys.float_info.dig tells how many decimal digits are guaranteed to always be accurate.
The default output of both str(float) and repr(float) in Python 3.1+ (and in 2.7 for repr) is the shortest string that, when converted back to float, returns the original value; in case of ambiguity, the last digit is rounded to the closest value. A float provides roughly 15.9 decimal digits of precision, but up to 17 decimal digits are required to represent a 53-binary-digit floating-point number unambiguously.
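The 17-digit claim is easy to demonstrate with format(): 17 significant digits always round-trip, while 16 sometimes do not (sqrt(2) is one value that needs all 17):

```python
import math

x = math.sqrt(2)

# 17 significant digits always round-trip a double exactly ...
assert float(format(x, '.17g')) == x
assert format(x, '.17g') == '1.4142135623730951'

# ... but 16 digits lose information for this value:
assert float(format(x, '.16g')) != x
```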
For example, 0.10000000000000004 lies between 0x1.999999999999dp-4 and 0x1.999999999999cp-4, whose exact decimal expansions are
0.10000000000000004718447854656915296800434589385986328125
and
0.100000000000000033306690738754696212708950042724609375
respectively. The latter is closer, so that binary representation is chosen.
Now, when these two values are converted back to strings with str() or repr(), the shortest string that yields exactly the same value is chosen; for these two values the results are 0.10000000000000005 and 0.10000000000000003 respectively.
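A small sketch to verify which binary neighbour the parser picks, using float.hex() and the exact Decimal expansion:

```python
from decimal import Decimal

x = float('0.10000000000000004')

# The parser picks the closer binary neighbour ...
assert x.hex() == '0x1.999999999999cp-4'

# ... whose exact decimal expansion is the longer of the two strings above ...
assert str(Decimal(x)).startswith('0.10000000000000003330669')

# ... and repr() returns the shortest string that round-trips to it.
assert repr(x) == '0.10000000000000003'
```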
The precision of an IEEE-754 double is 53 binary digits; in decimal, you can calculate the precision by taking the base-10 logarithm of 2^53:
>>> math.log(2 ** 53, 10)
15.954589770191001
meaning almost 16 digits of precision. The sys.float_info.dig value tells how many decimal digits you can always count on being preserved, and this number is 15, because there are some numbers with 16 decimal digits that are indistinguishable as floats.
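For instance, two different 16-digit integers can map to the same double; 2**53 is the first point where consecutive integers stop being representable:

```python
# 2**53 = 9007199254740992 has 16 decimal digits; 2**53 + 1 is not
# representable as a double (it falls exactly halfway between 2**53 and
# 2**53 + 2, and ties round to even), so both strings parse to the
# same value.
assert float('9007199254740992') == float('9007199254740993')
assert float('9007199254740993') == 2.0 ** 53
```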
However, this is not the whole story. Internally, what happens in Python 3.2+ is that float.__str__ and float.__repr__ end up calling the same C function, float_repr:
static PyObject *
float_repr(PyFloatObject *v)
{
    PyObject *result;
    char *buf;

    buf = PyOS_double_to_string(PyFloat_AS_DOUBLE(v),
                                'r', 0,
                                Py_DTSF_ADD_DOT_0,
                                NULL);
    if (!buf)
        return PyErr_NoMemory();
    result = _PyUnicode_FromASCII(buf, strlen(buf));
    PyMem_Free(buf);
    return result;
}
PyOS_double_to_string then, for the 'r' mode (standing for repr), calls _Py_dg_dtoa with mode 0, an internal routine that converts the double to a string; on platforms where _Py_dg_dtoa does not work, it falls back to snprintf with %.17g.
The behaviour of snprintf is entirely platform dependent, but if _Py_dg_dtoa is used (and as far as I understand, it is on most machines), the output is predictable.
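The difference between the two paths is easy to see from Python: a fixed 17-significant-digit format prints trailing noise that the shortest-repr algorithm omits (sketch):

```python
# %.17g always emits 17 significant digits, even when fewer suffice:
assert '%.17g' % 0.1 == '0.10000000000000001'

# repr emits the shortest string that still round-trips:
assert repr(0.1) == '0.1'
assert float(repr(0.1)) == 0.1
```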
Mode 0 of _Py_dg_dtoa is specified as follows:
0 ==> shortest string that yields d when read in and rounded to nearest.
So that is what happens: the yielded string must exactly reproduce the double value when read back in, it must be the shortest such representation, and among multiple candidate decimal representations it is the one closest to the binary value. Note that this also means the last digit of the decimal expansion need not match the original value rounded at that length; it only guarantees that the decimal representation is as close to the original binary representation as possible. Thus YMMV.
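Whatever the last digit looks like, the round-trip guarantee itself always holds; a quick property check:

```python
import random

# repr() of any finite float reads back as exactly the same float.
random.seed(0)  # fixed seed so the sketch is reproducible
for _ in range(10000):
    x = random.uniform(-1e308, 1e308)
    assert float(repr(x)) == x
```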