# Is floating-point math broken?


Consider the following code:

```
0.1 + 0.2 == 0.3 -> false
0.1 + 0.2 -> 0.30000000000000004
```

Why do these inaccuracies happen?

### Solution

Binary floating-point math works like this. In most programming languages, it is based on the IEEE 754 standard. The crux of the problem is that numbers are represented in this format as a whole number times a power of two; rational numbers (such as `0.1`, which is `1/10`) whose denominator is not a power of two cannot be exactly represented.

For `0.1` in the standard `binary64` format, the representation can be written exactly as `0.1000000000000000055511151231257827021181583404541015625` in decimal, or `0x1.999999999999ap-4` in C99 hexfloat notation.
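You can inspect the stored value directly; a quick sketch in Python (any language with decimal or hexfloat conversion works similarly):

```python
from decimal import Decimal

# The literal 0.1 is stored as the nearest binary64 value.
# Decimal(float) converts that stored value exactly to decimal digits.
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625

# float.hex() shows the same stored value in C99 hexfloat notation.
print((0.1).hex())
# 0x1.999999999999ap-4
```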

In contrast, the rational number `0.1`, which is `1/10`, can be written exactly as `0.1` in decimal, or `0x1.99999999999999...p-4` in an analog of C99 hexfloat notation, where the `...` represents an unending sequence of 9's.

The constants `0.2` and `0.3` in your program will also be approximations to their true values. It happens that the closest `double` to `0.2` is larger than the rational number `0.2`, but that the closest `double` to `0.3` is smaller than the rational number `0.3`. The sum of `0.1` and `0.2` winds up being larger than the rational number `0.3` and hence disagreeing with the constant in your code.
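These ordering claims can be verified with exact rational arithmetic; a sketch in Python using the standard `fractions` module:

```python
from fractions import Fraction

# Fraction(x) converts a float's stored binary value to an exact rational,
# so these comparisons involve no rounding at all.
assert Fraction(0.2) > Fraction(2, 10)        # nearest double to 0.2 is too large
assert Fraction(0.3) < Fraction(3, 10)        # nearest double to 0.3 is too small
assert Fraction(0.1 + 0.2) > Fraction(3, 10)  # the rounded sum overshoots 3/10
```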

A fairly comprehensive treatment of floating-point arithmetic issues is *What Every Computer Scientist Should Know About Floating-Point Arithmetic*. For an easier-to-digest explanation, see floating-point-gui.de.

Plain old decimal (base 10) numbers have the same issues, which is why numbers like 1/3 end up as 0.333333333...

You've just stumbled on a number (3/10) that happens to be easy to represent with the decimal system but doesn't fit the binary system. It goes both ways (to some small degree) as well: 1/16 is an ugly number in decimal (0.0625), but in binary it looks as neat as a 10,000th does in decimal (0.0001) - if we were in the habit of using a base-2 number system in our daily lives, you'd even look at that number and instinctively understand you could arrive there by halving something, halving it again, and again and again.

Of course, that's not exactly how floating-point numbers are stored in memory (they use a form of scientific notation). However, it does illustrate the point that binary floating-point precision errors tend to crop up because the "real world" numbers we are usually interested in working with are so often powers of ten - but only because we use a decimal number system day-to-day. This is also why we'll say things like 71% instead of "5 out of every 7" (71% is an approximation, since 5/7 can't be represented exactly by any finite decimal number).

So, no: binary floating point numbers are not broken, they just happen to be as imperfect as every other base-N number system :)

In practice, this problem of precision means you need to use rounding functions to round your floating point numbers off to however many decimal places you're interested in before you display them.
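A minimal sketch of rounding for display in Python (round only at the output step, never in intermediate arithmetic):

```python
total = 0.1 + 0.2                 # stored as 0.30000000000000004

# Format for display with the precision you care about.
print(f"{total:.2f}")             # 0.30

# round() maps the sum back to the same double as the literal 0.3.
print(round(total, 2) == 0.3)     # True
```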

You also need to replace equality tests with comparisons that allow some amount of tolerance, which means:

Do **not** do `if (x == y) { ... }`. Instead do `if (abs(x - y) < myToleranceValue) { ... }`, where `abs` is the absolute value function.

`myToleranceValue` needs to be chosen for your particular application. It will have a lot to do with how much "wiggle room" you are prepared to allow, and with the largest number you may be comparing (due to loss-of-precision issues). Beware of "epsilon"-style constants in your language of choice. These **can** be used as tolerance values, but their effectiveness depends on the magnitude of the numbers you're working with, since calculations with large numbers may exceed the epsilon threshold.
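A sketch of why a fixed epsilon fails at larger magnitudes, and of a relative-tolerance comparison instead (Python's `math.isclose`; other languages have equivalents):

```python
import math
import sys

x, y = 0.1 + 0.2, 0.3
eps = sys.float_info.epsilon  # ~2.22e-16: the gap between 1.0 and the next double

# An absolute-epsilon test works for numbers near 1.0 ...
print(abs(x - y) < eps)                       # True

# ... but fails for the same relative error at a larger magnitude,
# because the rounding error scales up with the values.
big_x, big_y = 1e16 * x, 1e16 * y
print(abs(big_x - big_y) < eps)               # False

# A relative tolerance handles both scales.
print(math.isclose(x, y, rel_tol=1e-9))       # True
print(math.isclose(big_x, big_y, rel_tol=1e-9))  # True
```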
