c++ undefined-behavior twos-complement c++20

Ramifications of C++20 requiring two's complement


C++20 will specify that signed integral types must use two's complement. This doesn't seem like a big change given that (virtually?) every implementation currently uses two's complement.

But I was wondering if this change might shift some "undefined behaviors" to be "implementation defined" or even "defined."

Consider the absolute value function, std::abs(int), and some of its overloads. The C++ standard includes this function by reference to the C standard, which says that the behavior is undefined if the result cannot be represented.

In two's complement, there is no positive counterpart to INT_MIN:

abs(INT_MIN) == -INT_MIN == undefined behavior

In sign-magnitude representation, there is:

-INT_MIN == INT_MAX

Thus it seemed reasonable that abs() was left with some undefined behavior.
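To make the asymmetry concrete, here is a minimal sketch of computing the magnitude without ever negating a signed value. The helper name unsigned_abs is hypothetical (it is not a standard function), and the sketch assumes unsigned int has one more value bit than int, which is true on mainstream platforms but is checked rather than taken for granted.

```cpp
#include <climits>

// Assumption: unsigned int can hold INT_MAX + 1 (true on mainstream platforms).
static_assert(UINT_MAX > static_cast<unsigned int>(INT_MAX),
              "sketch assumes unsigned int is wider in range than int");

// Hypothetical helper, not part of the standard library: returns |x| as an
// unsigned int, so the magnitude of INT_MIN is representable.
constexpr unsigned int unsigned_abs(int x) {
    // int -> unsigned conversion is modular, and 0u - u is modular negation,
    // so no signed overflow happens even when x == INT_MIN.
    return x < 0 ? 0u - static_cast<unsigned int>(x)
                 : static_cast<unsigned int>(x);
}

static_assert(unsigned_abs(-5) == 5u, "ordinary negative value");
static_assert(unsigned_abs(INT_MIN) == static_cast<unsigned int>(INT_MAX) + 1u,
              "magnitude of INT_MIN is INT_MAX + 1");
```

The result for INT_MIN is INT_MAX + 1, which fits in unsigned int but not in int. That is exactly why std::abs(INT_MIN), which must return an int, has no representable result.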

Once two's complement is required, it would seem to make sense that abs(INT_MIN)'s behavior could be fully specified or, at least, implementation defined, without any issue of backward compatibility. But I don't see any such change proposed.

The only drawback I see is that the C++ Standard would need to specify abs() explicitly rather than referencing the C Standard's description of abs(). (As far as I know, C is not mandating two's complement.)

Was this just not a priority for the committee or are there still reasons not to take advantage of the simplification and certainty that the two's complement mandate provides?


Solution

  • One of the specific questions considered by the committee was what to do about -INT_MIN, and the results of that poll were:

    addition / subtraction / multiplication and -INT_MIN overflow is currently undefined behavior, it should instead be:

    4: wrap
    6: wrap or trap
    5: intermediate values are mathematical integers
    14: status quo (remain undefined behavior)

    This was explicitly considered and people felt that the best option was keeping it undefined behavior.

    To clarify "intermediate values are mathematical integers": another part of the paper explains that this means (int)a + (int)b > INT_MAX might be true.


    Note that implementations are free to define specific behavior in these cases if they so choose. I don't know if any of them do.
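As a rough illustration of the "intermediate values are mathematical integers" option from the poll, the sketch below emulates it by doing the addition in a wider type. This is only an approximation of the idea, not what the paper specifies: the function name sum_exceeds_int_max is hypothetical, and the code assumes long long is wider than int.

```cpp
#include <climits>

// Assumption: long long is wide enough to hold the sum of any two ints.
static_assert(sizeof(long long) > sizeof(int),
              "sketch assumes long long is wider than int");

// Hypothetical helper: evaluates a + b as if the intermediate sum were a
// mathematical integer, then compares it against INT_MAX.
constexpr bool sum_exceeds_int_max(int a, int b) {
    // Widening both operands first means the addition cannot overflow.
    return static_cast<long long>(a) + static_cast<long long>(b) > INT_MAX;
}

static_assert(sum_exceeds_int_max(INT_MAX, 1), "INT_MAX + 1 exceeds INT_MAX");
static_assert(!sum_exceeds_int_max(1, 2), "3 does not exceed INT_MAX");
```

With plain int arithmetic, the expression a + b > INT_MAX can never be observed as true without first triggering undefined behavior. Under the "mathematical integers" model the comparison could legitimately succeed, which is what the widened version above mimics.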