c++floating-pointcomparison

Sort IEEE 754 floats in total order


I've written a algorithm which involves sorting and calls to std::lower_bound and std::upper_bound.

In case value type float or double all NaNs compare unequal and my algorithm fails.

To provide strong ordering of IEEE 754 floating point numbers, I'm copying the number to a signed integer of the same byte size using std::memcpy. In case of negative numbers I'm also reversing the number (x = INT_MIN - x - 1;). After sorting I'm recovering the original floating point numbers in a similar way.

The resulting code works fine and it runs at the exact same performance as with floating point comparision.

Are there better ways to get total ordering of float and double in C++?

Edit: I should have mentioned, that total ordering is a keyword, described in IEEE 754. See: https://en.wikipedia.org/wiki/IEEE_754#Total-ordering_predicate


Solution

  • A template friendly way to sort float, double, and long double with IEEE total ordering is to use std::strong_order. Allowing an array to be sorted such that -sNaN < -qNaN < -inf < -0 < +0 < +inf < +qNaN < +sNaN.

    Here is an example of how one can sort and remove duplicate floating point values using std::strong_order in C++20:

    #include <algorithm>
    #include <compare>
    #include <vector>
    
    template<typename T>
    void remove_duplicate_floating_point(std::vector<T>& values) {
        std::sort(
            values.begin(), values.end(),
            [](T x, T y) { return std::strong_order(x, y) < 0; }
        );
        auto result = std::unique(
            values.begin(), values.end(),
            [](T x, T y) { return std::strong_order(x, y) == 0; }
        );
        values.erase(result, values.end());
    }
    

    Result for x87 long double:

    FFFFE000000000000000 FFFFC000000000000000
    FFFFA000000000000000 FFFF8000000000000000
    FFFEFFFFFFFFFFFFFFFF FFFEFFFFFFFFFFFFFFFE
    FFFEA2F9836E4E44152A FFFEA2F9836E4E441529
    ...
    7FFEA2F9836E4E441529 7FFEA2F9836E4E44152A
    7FFEFFFFFFFFFFFFFFFE 7FFEFFFFFFFFFFFFFFFF
    7FFF8000000000000000 7FFFA000000000000000
    7FFFC000000000000000 7FFFE000000000000000
    

    This also works for GCC's __float128 type.