c++ floating-point c++17 epsilon

Largest value representable by a floating-point type smaller than 1


Is there a way to obtain the greatest value representable by the floating-point type float that is smaller than 1?

I've seen the following definition:

static const double DoubleOneMinusEpsilon = 0x1.fffffffffffffp-1;
static const float FloatOneMinusEpsilon = 0x1.fffffep-1;

But is this really how we should define these values?

According to the Standard, std::numeric_limits<T>::epsilon() is the machine epsilon, that is, the difference between 1.0 and the next value representable by the floating-point type T. But that doesn't necessarily mean that defining the constant as T(1) - std::numeric_limits<T>::epsilon() would be better.


Solution

  • You can use the std::nextafter function, which, despite its name, can also retrieve the nearest representable value that is arithmetically below a given starting point, provided you pass an appropriate to argument (often -Infinity, 0, or +Infinity).

    This works portably by definition of nextafter, regardless of what floating-point format your C++ implementation uses. (Binary vs. decimal, or width of mantissa aka significand, or anything else.)

    Example: retrieving the closest value less than 1 for the double type (on Windows, using the clang-cl compiler in Visual Studio 2019). The result differs from that of the 1 - ε calculation, which, as discussed in the comments, is incorrect for IEEE 754 numbers: just below any power of 2, representable numbers are twice as close together as just above it.

    #include <iostream>
    #include <iomanip>
    #include <cmath>
    #include <limits>
    
    int main()
    {
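        // Largest double below 1: step from 1.0 toward 0.0.
        double naft = std::nextafter(1.0, 0.0);
        std::cout << std::fixed << std::setprecision(20);
        std::cout << naft << '\n';
        // For comparison: 1 - ε, which is not the nearest value below 1.
        double neps = 1.0 - std::numeric_limits<double>::epsilon();
        std::cout << neps << '\n';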
        return 0;
    }
    

    Output:

    0.99999999999999988898
    0.99999999999999977796
    

    With hexadecimal floating-point output formatting, these could print as 0x1.fffffffffffffp-1 (the nextafter result) and 0x1.ffffffffffffep-1 (1 - ε).
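
    As a minimal sketch of that kind of formatting, assuming a standard library that implements std::hexfloat (C++11 and later), the helpers below turn a double into hexadecimal text and back. The names to_hex and from_hex are made up for this illustration, and the exact digits printed are up to the implementation, so the round trip through std::strtod is the only property the sketch relies on.

    ```cpp
    #include <cstdlib>
    #include <ios>
    #include <sstream>
    #include <string>

    // Hypothetical helper: format v as a hexadecimal floating-point string.
    std::string to_hex(double v)
    {
        std::ostringstream os;
        os << std::hexfloat << v;
        return os.str();
    }

    // Hypothetical helper: parse the text back into a double.
    // std::strtod accepts hexadecimal floating-point input.
    double from_hex(const std::string& s)
    {
        return std::strtod(s.c_str(), nullptr);
    }
    ```

    On an implementation where double is IEEE 754 binary64, to_hex(std::nextafter(1.0, 0.0)) would be expected to produce 0x1.fffffffffffffp-1, matching the literal from the question.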


    Note that the analogous technique for the closest value greater than 1 behaves differently: the nextafter(1.0, 10000.) call gives the same value as the 1 + ε calculation (1.00000000000000022204), as expected from the definition of ε.
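
    The asymmetry around 1 can be checked directly, and the same approach answers the original question for float. The sketch below assumes a radix-2 floating-point type (as with IEEE 754 float and double); the names below_one, above_one, and gaps_match_epsilon are invented for this illustration.

    ```cpp
    #include <cmath>
    #include <limits>

    // Largest representable T strictly less than 1.
    template <typename T>
    T below_one()
    {
        return std::nextafter(T(1), T(0));
    }

    // Smallest representable T strictly greater than 1.
    template <typename T>
    T above_one()
    {
        return std::nextafter(T(1), T(2));
    }

    // True when the gap above 1 is epsilon and the gap below 1 is epsilon / 2,
    // which is what a radix-2 format is expected to give.
    template <typename T>
    bool gaps_match_epsilon()
    {
        const T eps = std::numeric_limits<T>::epsilon();
        return above_one<T>() - T(1) == eps
            && T(1) - below_one<T>() == eps / T(2);
    }
    ```

    With IEEE 754 types, below_one<float>() should equal the 0x1.fffffep-1 literal from the question, and below_one<double>() should equal 0x1.fffffffffffffp-1.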


    Performance

    C++23 requires std::nextafter to be constexpr, but currently only some compilers support that. GCC does constant propagation through it, but clang can't (Godbolt). If you want this to be as fast (with optimization enabled) as a literal constant like 0x1.fffffffffffffp-1 on systems where double is IEEE 754 binary64, then on some compilers you'll have to wait for that part of C++23 support. (It's likely that once compilers are able to do this, they will, like GCC, optimize even without -std=c++23.)

    A definition such as const double DoubleBelowOne = std::nextafter(1.0, 0.); at global scope will at worst run the function once at startup. That defeats constant propagation where the value is used, but otherwise performs about the same as FP literal constants when combined with other runtime variables.
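
    Until constexpr std::nextafter is available everywhere, one possible compile-time alternative (not part of the original answer) follows from the spacing argument earlier: for a radix-2 IEEE 754 type, the gap just below 1 is ε/2, so 1 - ε/2 is exactly representable and should equal the value nextafter returns. The sketch below assumes such a type and rejects anything else with a static_assert; the name one_minus_half_epsilon is invented for this illustration.

    ```cpp
    #include <limits>

    // Hypothetical helper: largest T below 1, computed without std::nextafter.
    // Assumes a radix-2 IEEE 754 type, where values just below 1 are
    // epsilon / 2 apart.
    template <typename T>
    constexpr T one_minus_half_epsilon()
    {
        static_assert(std::numeric_limits<T>::is_iec559
                          && std::numeric_limits<T>::radix == 2,
                      "sketch only covers radix-2 IEEE 754 types");
        return T(1) - std::numeric_limits<T>::epsilon() / T(2);
    }

    // Usable in constant expressions without relying on constexpr std::nextafter.
    constexpr double DoubleBelowOne = one_minus_half_epsilon<double>();
    constexpr float FloatBelowOne = one_minus_half_epsilon<float>();
    ```

    Unlike std::nextafter, this does not carry over to other formats as written, so the portable runtime call remains the safer default when compile-time evaluation is not needed.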