Tags: python, numpy, floating-point, precision

Python numpy float16 datatype operations, and float8?


When performing math operations on float16 NumPy numbers, the result is also a float16 number. My question is: how exactly is the result computed? Say I'm multiplying/adding two float16 numbers, does NumPy generate the result in float32 and then truncate/round the result to float16? Or is the calculation performed in 16-bit multiplier/adder hardware all the way?

Another question: is there a float8 type? I couldn't find this one... if not, then why? Thank you all!


Solution

  • To the first question: there's no hardware support for float16 on a typical processor (at least outside the GPU). NumPy does exactly what you suggest: convert the float16 operands to float32, perform the scalar operation on the float32 values, then round the float32 result back to float16. It can be proved that the results are still correctly rounded: the precision of float32 is large enough (relative to that of float16) that double rounding isn't an issue here, at least for the four basic arithmetic operations and square root.

    In the current NumPy source, here is how the four basic arithmetic operations are defined for float16 scalars.

    #define half_ctype_add(a, b, outp) *(outp) = \
            npy_float_to_half(npy_half_to_float(a) + npy_half_to_float(b))
    #define half_ctype_subtract(a, b, outp) *(outp) = \
            npy_float_to_half(npy_half_to_float(a) - npy_half_to_float(b))
    #define half_ctype_multiply(a, b, outp) *(outp) = \
            npy_float_to_half(npy_half_to_float(a) * npy_half_to_float(b))
    #define half_ctype_divide(a, b, outp) *(outp) = \
            npy_float_to_half(npy_half_to_float(a) / npy_half_to_float(b))
    

    The code above is taken from scalarmath.c.src in the NumPy source. You can also take a look at loops.c.src for the corresponding code for array ufuncs. The supporting npy_half_to_float and npy_float_to_half functions are defined in halffloat.c, along with various other support functions for the float16 type.
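
    You can check this equivalence from Python yourself; here's a minimal sketch (the operand values are just illustrative):

    import numpy as np

    a = np.float16(0.1)
    b = np.float16(0.2)

    # Emulate what NumPy does internally: widen to float32, add, round back.
    emulated = np.float16(np.float32(a) + np.float32(b))

    direct = a + b                    # native float16 addition
    assert direct.dtype == np.float16
    assert direct == emulated         # same result either way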

  • For the second question: no, there's no float8 type in NumPy. float16 is a standardized type (described in the IEEE 754 standard) that's already in wide use in some contexts (notably GPUs). There's no IEEE 754 float8 type, and there doesn't appear to be an obvious candidate for a "standard" float8 type. I'd also guess that there just hasn't been that much demand for float8 support in NumPy.
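
    If you want to experiment with a float8-style format anyway, you can emulate one on top of NumPy. Below is a toy sketch assuming a hypothetical 1-4-3 (sign/exponent/mantissa) layout with exponent bias 7, ignoring NaN/infinity encodings; decode_fp8 is an illustrative helper, not part of NumPy.

    import numpy as np

    def decode_fp8(byte):
        # Hypothetical 1-4-3 layout: 1 sign bit, 4 exponent bits (bias 7),
        # 3 mantissa bits. NaN/infinity encodings are ignored for brevity.
        s = (byte >> 7) & 0x1
        e = (byte >> 3) & 0xF
        m = byte & 0x7
        if e == 0:
            value = (m / 8.0) * 2.0 ** -6             # subnormal: no implicit 1
        else:
            value = (1.0 + m / 8.0) * 2.0 ** (e - 7)  # normal: implicit leading 1
        return -value if s else value

    # All 256 values such a format can represent, decoded to float32.
    table = np.array([decode_fp8(b) for b in range(256)], dtype=np.float32)
    print(table.min(), table.max())   # -480.0 480.0

    How many bits to give the exponent versus the mantissa is exactly the kind of design trade-off with no single obvious answer, which is part of why no "standard" float8 has emerged.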