pythonfloating-pointprecisionunderflow

Prevent underflow in floating point division in Python


Suppose both x and y are very small numbers, but I know that the true value of x / y is reasonable.

What is the best way to compute x/y? In particular, I have been doing np.exp(np.log(x) - np.log(y) instead, but I'm not sure if that would make a difference at all?


Solution

  • Python uses the floating-point features of the hardware it runs on, according to Python documentation. On most common machines today, that is IEEE-754 arithmetic or something near it. That Python documentation is not explicit about rounding mode but mentions in passing that the result of a sample division is the nearest representable value, so presumably Python uses round-to-nearest-ties-to-even mode. (“Round-to-nearest” for short. If two representable values are equally close in binary floating-point, the one with a zero in the low bit of its significand is produced.)

    In IEEE-754 arithmetic in round-to-nearest mode, the result of a division is the representable value nearest to the exact mathematical value. Since you say the mathematical value of x/y is reasonable, it is in the normal range of representable values (not below it, in the subnormal range, where precision suffers, and not above it, where results are rounded to infinity). In the normal range, results of elementary operations will be accurate within the normal precision of the format.

    However, since x and y are “very small numbers,” we may be concerned that they are subnormal and have a loss of precision already in them, before division is performed. In the IEEE-754 basic 64-bit binary format, numbers below 2-1022 (about 2.22507•10-308) are subnormal. If x and y are smaller than that, then they have already suffered a loss of precision, and no method can produce a correct quotient from them except by happenstance. Taking the logarithms to calculate the quotient will not help.

    If the machine you are running on happens not to be using IEEE-754, it is still likely that computing x/y directly will produce a better result than np.exp(np.log(x)-np.log(y)). The former is a single operation computing a basic function in hardware that was likely reasonably designed. The latter is several operations computing complicated functions in software that is difficult to make accurate using common hardware operations.

    There is a fair amount of unease and distrust of floating-point operations. Lack of knowledge seems to lead to people being afraid of them. But what should be understood here is that elementary floating-point operations are very well defined and are accurate in normal ranges. The actual problems with floating-point computing arise from accumulating rounding errors over sequences of operations, from the inherent mathematics that compounds errors, and from incorrect expectations about results. What this means is that there is no need to worry about the accuracy of a single division. Rather, it is the overall use of floating-point that should be kept in mind. (Your question could be better answered if it presented more context, illuminating why this division is important, how x and y have been produced from prior data, and what the overall goal is.)

    Note

    A not uncommon deviation from IEEE-754 is to flush subnormal values to zero. If you have some x and some y that are subnormal, some implementations might flush them to zero before performing operations on them. However, this is more common in SIMD code than in normal scalar programming. And, if it were occurring, it would prevent you from evaluating np.log(x) and np.log(y) anyway, as subnormal values would be flushed to zero in those as well. So we can likely dismiss this possibility.