# Floating point less-than-equal comparisons after addition and subtraction

**Is there a "best practice" for less-than-or-equal comparisons with floating-point numbers after a series of floating-point arithmetic operations?**

I have the following example in R (although the question applies to any language that uses floating point). I have a double `x = 1` to which I apply a series of additions and subtractions. In the end, `x` should be exactly one, but it is not, due to floating-point arithmetic (from what I gather). Here is the example:

```
> stop_times <- seq(0.25, 2, by = .25)
> expr <- expression(replicate(100, {
    x <- 1
    for (i in 1:10) {
      tmp <- rexp(1, 1)
      n <- sample.int(1e2, 1)
      delta <- tmp / n
      # subtract tmp in n equal steps, then add it back in one step
      for (j in 1:n)
        x <- x - delta
      x <- x + tmp
    }
    # "correct" answer is 4
    which.max(x <= stop_times)
  }))
> eval(expr)
[1] 5 5 5 4 4 4 5 5 5 4 5 4 4 4 5 5 4 4 5 4 5 4 5 4 5 5 5 4 4 4 4 4 4 4 4 4 5 5 5 5 5 4 5 4 5 5 5 4 4 5 5 5 4 4 5 5 5 4 4 4 4 4 4
[64] 5 4 4 4 5 5 5 4 4 4 5 4 4 4 4 4 4 4 4 5 5 5 5 4 4 4 5 5 5 5 5 4 4 4 5 5 4
```

A (naive?) solution is to add some arbitrary small positive number to the right-hand side of the inequality, as follows:

```
some_arbitrary_factor <- 100
stop_times <- seq(0.25, 2, by = .25) +
some_arbitrary_factor * .Machine$double.eps
eval(expr)
[1] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[64] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
```

**Is this "best practice" and, if so, are there guidelines on how to choose `some_arbitrary_factor`?**

My concrete problem is that I have time periods `(t_0, t_1], (t_1, t_2], ...` and need to find out which period a given observation `x` falls in. `x` may have been set to one of the boundaries `t_i` after a series of floating-point arithmetic operations that would result in exactly `t_i` if exact operations were performed.

Solution

No, there is no best practice. Unfortunately, there cannot be, because almost all floating-point calculations introduce some rounding error, and the consequences of the errors are different for different applications.

Typically, software will perform some calculations that ideally would yield some exact mathematical result *x* but, due to rounding errors (or other issues), produce an approximation *x*'. When comparing floating-point numbers, you want to ask some question about *x*, such as “Is *x* < 1?” or “Is *x* = 3.1415926…?” So the problem you want to solve is “How do I use *x*' to answer this question about *x*?”

There is no general solution for this. Some errors may produce an *x*' that is greater than 1 even though *x* is less than 1. Some errors may produce an *x*' that is less than 1 even though *x* is greater than 1. The solution in any specific instance depends on information about the errors that were generated while calculating *x*' and the specific question to be answered.
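As a small illustration that the error can go in either direction, here is a sketch assuming IEEE 754 double arithmetic (which R uses on common platforms). The variable names `above` and `below` are only for this example:

```r
# Each expression has a simple exact value, but the computed
# result lands on a different side of it.
above <- 0.1 + 0.2   # exact value would be 0.3
below <- 1 - 0.9     # exact value would be 0.1

above > 0.3   # TRUE: the computed value is slightly too large
below < 0.1   # TRUE: the computed value is slightly too small

# Show enough digits to see the discrepancy
print(above, digits = 17)
print(below, digits = 17)
```

Because the sign of the error is not predictable in general, a one-sided fudge factor (such as adding a small amount to the right-hand side) fixes one direction only.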

Sometimes a thorough analysis can demonstrate that certain questions about *x* can be answered using *x*'. For example, in some situations, we might craft calculations so that we know that, if *x*' < 1, then *x* < 1. Or perhaps that, if *x*' < .99875, then *x* < 1. Say we analyze the calculations we used to calculate *x*' and can show that the final error is less than .00125. Then, if *x*' < .99875, then we know *x* < 1, and, if *x*' > 1.00125, then *x* > 1. But, if .99875 < *x*' < 1.00125, then we do not know whether *x* > 1 or *x* < 1. What do we do in that situation? Is it then better for your application to take the path where *x* < 1 or the path where *x* > 1? The answer is specific to each application, and there is no general best practice.
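The three-way outcome described above could be sketched as follows. `compare_with_bound` is a hypothetical helper, not an existing R function, and `err_bound` is assumed to come from your own analysis of the calculation; nothing here derives it for you:

```r
# Hypothetical helper: decide where the exact value x lies relative to
# `threshold`, given the computed value `x_approx` and a proven bound
# `err_bound` with |x_approx - x| < err_bound.
compare_with_bound <- function(x_approx, threshold, err_bound) {
  if (x_approx < threshold - err_bound) {
    "below"          # x < threshold is guaranteed
  } else if (x_approx > threshold + err_bound) {
    "above"          # x > threshold is guaranteed
  } else {
    "undetermined"   # the application must decide what to do here
  }
}

compare_with_bound(0.9980, threshold = 1, err_bound = 0.00125)  # "below"
compare_with_bound(1.0020, threshold = 1, err_bound = 0.00125)  # "above"
compare_with_bound(1.0005, threshold = 1, err_bound = 0.00125)  # "undetermined"
```

How to handle the `"undetermined"` case is exactly the application-specific choice: for the half-open periods in the question, one might decide that such values belong to the boundary `t_i`, but that is a design decision rather than a general rule.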

I will add to this that the amount of rounding error that occurs varies hugely from application to application. This is because rounding error can be compounded in various ways. Some applications with a few floating-point operations will achieve results with small errors. Some applications with many floating-point operations will also achieve results with modest errors. But certain behaviors can lead calculations astray and produce catastrophic errors. So dealing with rounding error is a custom problem for each program.
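To give a sense of how much the error can vary, here is a minimal sketch, again assuming IEEE 754 double arithmetic. The names `small_err` and `cancelled` are only for this example:

```r
# Modest error: ten additions of 0.1 miss 1 by a tiny amount.
x <- 0
for (i in 1:10) x <- x + 0.1
small_err <- abs(x - 1)
small_err > 0        # TRUE: the result is not exactly 1
small_err < 1e-15    # TRUE: but the error is tiny

# Catastrophic error: only two operations, yet the result is entirely wrong.
# 1e16 + 1 is not representable as a double and rounds back to 1e16,
# so the subtraction yields 0 instead of the exact answer 1.
cancelled <- (1e16 + 1) - 1e16
cancelled            # 0
```

A tolerance of `100 * .Machine$double.eps` would comfortably absorb the first error and be useless against the second, which is why the tolerance has to come from an analysis of the specific calculation.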
