python numpy types floating-point formatting

Numpy float64 vs Python float


I'm battling some floating-point problems with the pandas read_csv function. In my investigation, I found this:

In [15]: a = 5.9975

In [16]: a
Out[16]: 5.9975

In [17]: np.float64(a)
Out[17]: 5.9974999999999996

Why are Python's built-in float and NumPy's np.float64 type giving different results? I thought they were both C doubles.


Solution

  • >>> numpy.float64(5.9975).hex()
    '0x1.7fd70a3d70a3dp+2'
    >>> (5.9975).hex()
    '0x1.7fd70a3d70a3dp+2'
    

    They are the same number. What differs is the textual representation produced by their __repr__ methods: the native Python float outputs the fewest digits needed to uniquely identify the value, while NumPy before version 1.14.0 (released in 2018) did not try to minimise the number of digits output.
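
The point can be checked with plain Python floats, without NumPy. This is a minimal sketch: it assumes the older NumPy repr printed about 17 significant digits, which matches the output shown in the question. The variable names are illustrative only.

```python
a = 5.9975

# Shortest string that still round-trips to the same double (Python's repr).
shortest = repr(a)

# 17 significant digits; assumed here to mimic the pre-1.14 NumPy output.
seventeen = format(a, ".17g")

print(shortest)
print(seventeen)

# Both strings parse back to the identical double, so only the text differs.
same_value = float(shortest) == float(seventeen) == a
print(same_value)

# The underlying bit pattern, as in the hex() comparison above.
print(a.hex())
```

If the longer form gets in the way when displaying a np.float64, converting it with float(x) before printing gives the shorter Python representation of the same value.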