I'm battling some floating point problems in Pandas read_csv function. In my investigation, I found this:
In [15]: a = 5.9975
In [16]: a
Out[16]: 5.9975
In [17]: np.float64(a)
Out[17]: 5.9974999999999996
Why is builtin float
of Python and the np.float64
type from Python giving different results? I thought they were both C++ doubles?
>>> numpy.float64(5.9975).hex()
'0x1.7fd70a3d70a3dp+2'
>>> (5.9975).hex()
'0x1.7fd70a3d70a3dp+2'
They are the same number. What differs is the textual representation obtained via by their __repr__
method; the native Python type outputs the minimal digits needed to uniquely distinguish values, while NumPy code before version 1.14.0, released in 2018 didn't try to minimise the number of digits output.