c++language-lawyerc++20spaceship-operatorpartial-ordering

Is the result of std::ranges::sort well defined for ranges of types supporting only partial_ordering and having indeed unordered elements at runtime?


Here's a relatively short example:

#include <cassert>
#include <algorithm>
#include <compare>
#include <iostream>
#include <optional>
#include <vector>

struct Foo {
    std::optional<int> oi;
    Foo() : oi{} {}
    Foo(int i) : oi{i} {}
    friend bool operator==(Foo const&, Foo const&) {
        // not even invoked by the example
        assert(false);
        return {};
    }
    friend std::partial_ordering operator<=>(Foo const& a, Foo const& b) {
        return a.oi.has_value() && b.oi.has_value()
             ? a.oi <=> b.oi
             : std::partial_ordering::unordered;
    }
};


int main() {
    std::vector<Foo> v{3, 1, {}, 2, {}, 7, 6, 7, 9, 8, 10};
    std::ranges::sort(v);
    for (auto i : v) {
        if (i.oi) {
            std::cout << *i.oi << std::endl;
        } else {
            std::cout << "-" << std::endl;
        }
    }
}

It prints this:

1
3
-
2
-
6
7
7
8
9
10

which feels an intuitive result, if I start from the assumption that std::ranges::sort is smart enough to limit the sorting only to contiguous subranges containing no element that is unsorted with respect to any other in that subrange, but I didn't expect that. I expected something to break.

So the question is: am I correct and I'm just observing UB which "just happens not to break", or is the behavior I observe the defined behavior? Based on this answer I would assume the latter is the case, but I'm not confident enough.


Solution

  • This is undefined behavior, because you're violating the requirements of a strict weak order for the elements that you're sorting.

    A strict weak order requires three things:

    1. x < x is false for all x (irreflexive)
    2. x < y and y < z implies x < z (transitive)
    3. x ~ y and y ~ z implies x ~ z, if you define a ~ b as not (a < b) and not (b < a) (equivalence relation)

    Now, the first bullet holds. Given a Foo f, if it's engaged then f < f will be f.oi.value() < f.oi.value() which is false. If it's disengaged, then the comparison is unordered, which means that < evaluates as false.

    And the second bullet holds, because x < y being true only happens when both Foos are engaged, which means that we're just comparing ints. So we have transitivity.

    But the third bullet doesn't hold. Foo() is equivalent to Foo(i) for all i, because they're unordered. That is Foo() ~ Foo(1) holds. But so does Foo() ~ Foo(2) and Foo() ~ Foo(3), etc. But it is not the case that Foo(1) ~ Foo(2). So we don't have an equivalence relation.

    Since we're violating the semantic requirements of the algorithm, that's undefined behavior. That you happen to see behavior that you can back out something that looks vaguely intuitive is probably coincidental. Maybe it's some emergent property of the underlying algorithm.

    Note that libc++ with hardening enabled (-D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_DEBUG) catches this, which is pretty cool:

    include/c++/v1/__debug_utils/strict_weak_ordering_check.h:59: assertion __comp(*(__first + __a), *(__first + __b)) failed: Your comparator is not a valid strict-weak ordering