Still struggling with C (C99) undefined and unspecified behaviours.
This time it is the following Unspecified Behaviour (Annex J.1):
The representation used when storing a value in an object that has more than one object representation for that value (6.2.6.1).
The corresponding section 6.2.6.1 states:
Where an operator is applied to a value that has more than one object representation, which object representation is used shall not affect the value of the result.(43) Where a value is stored in an object using a type that has more than one object representation for that value, it is unspecified which representation is used, but a trap representation shall not be generated.
with the following note 43:
It is possible for objects x and y with the same effective type T to have the same value when they are accessed as objects of type T, but to have different values in other contexts. In particular, if == is defined for type T, then x == y does not imply that memcmp(&x, &y, sizeof(T)) == 0. Furthermore, x == y does not necessarily imply that x and y have the same value; other operations on values of type T may distinguish between them.
I don't even understand what a value that has more than one object representation would be. Is it related, for example, to the floating-point representations of 0 (negative and positive zero)?
Most of this language is the C standard going well out of its way to allow for continued use on Burroughs B-series mainframes (AFAICT the only surviving ones-complement architecture). Unless you have to work with those, or certain uncommon microcontrollers, or you're seriously into retrocomputing, you can safely assume that the integer types have only one object representation per value, and that they have no padding bits. You can also safely assume that all integer types have no trap representations, except that you must take this line of J.2
[the behavior is undefined if ...] the value of an object ~~with automatic storage duration~~ is used while it is indeterminate
as if it were normative and as if the crossed-out words ("with automatic storage duration") were not present. (This rule is not supported by a close reading of the actual normative text, but it is nonetheless the rule adopted by all of the current generation of optimizing compilers.)
Concrete examples of types that can have more than one object representation for a value on a modern, non-exotic implementation include:
_Bool: the effect of overwriting a _Bool object with the representation of an integer value other than an appropriately sized 0 or 1 is unspecified.
pointer types: some architectures ignore the low bits of a pointer to a type whose minimum alignment is greater than 1 (e.g. (int*)0x8000_0000 and (int*)0x8000_0001 might be treated as referring to the same int object; this is an intentional hardware feature, facilitating the use of tagged pointers)
floating point types: IEC 60559 allows all of the many representations of NaN to be treated identically (and possibly squashed together) by the hardware. (Note: +0 and −0 are distinct values in IEEE floating point, not different representations of the same value.)
These are also the scalar types that can have trap representations in modern implementations. In particular, Annex F specifically declares the behavior of signaling NaN to be undefined, even though it's well-defined in an abstract implementation of IEC 60559.