I am studying ISO/IEC 9899:2023 (E) 6.2.6 Representations of types: 6.2.6.1 General, paragraph 6:
Certain object representations need not represent a value of the object type. If such a representation is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined. Such a representation is called a non-value representation.
I am interested in the third sentence (the one about a representation produced by a side effect). Does this mean that, in the following example:
union {
int i;
float f;
} u;
(where, say, sizeof(int) == sizeof(float) == 4 and the 'int' type has no padding bits),
the line of code
u.i = some_value; // some_value has 'int' type
will contain undefined behavior? After all, with such a line I can (potentially) produce a non-value representation for the 'f' member of the union object (e.g. a NaN).
Does this mean that in example:
union { int i; float f; } u;
(where, say, sizeof(int) == sizeof(float) == 4 and the 'int' type has no padding bits),
the line of code
u.i = some_value; // some_value has 'int' type
will contain undefined behavior?
There are two possible questions here:
Does such an assignment necessarily produce UB? That seems to be what you are proposing, but it is not plausible that the committee intended that each and every assignment to the int member of such a union has undefined behavior. But,
Could such an assignment in some cases produce UB? That's a more interesting consideration. More on that momentarily.
After all, with such a line I can (potentially) produce a non-value representation for the 'f' member of the union object (e.g. a NaN).
If the implementation's float type affords non-value representations, then yes, by that approach you could conceivably cause u.f to contain a non-value representation. If u.f were read while containing such a representation, then that would definitely produce UB, per the first part of the provision you quoted.
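The quoted provision only restricts lvalue expressions that do not have character type, so the bytes of u.f can still be examined without reading u.f as a float. A minimal sketch of that idea, assuming equal sizes as in the question; the names `int_or_float`, `copy_float_bytes`, and `members_share_bytes` are made up for illustration:

```c
#include <assert.h>
#include <string.h>

/* Same shape as the union in the question, given a tag for convenience. */
union int_or_float { int i; float f; };

/* Assumption taken from the question: both members have the same size. */
static_assert(sizeof(int) == sizeof(float), "sketch assumes equal sizes");

/* Copies the object representation of u->f into out.
   memcpy works byte by byte, so u->f is never read as a float. */
static void copy_float_bytes(const union int_or_float *u,
                             unsigned char out[sizeof(float)])
{
    memcpy(out, &u->f, sizeof(float));
}

/* Returns nonzero if the two members currently occupy identical bytes. */
static int members_share_bytes(const union int_or_float *u)
{
    unsigned char int_bytes[sizeof(int)];
    unsigned char float_bytes[sizeof(float)];

    memcpy(int_bytes, &u->i, sizeof int_bytes);
    copy_float_bytes(u, float_bytes);
    return memcmp(int_bytes, float_bytes, sizeof int_bytes) == 0;
}
```

After `u.i = some_value;`, calling `members_share_bytes(&u)` should report that both members see the same bytes, whatever those bytes would mean as a float.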
HOWEVER, the IEEE 754 binary32 format does not afford any non-value representations. In particular, its NaNs are not non-value representations for a float type that uses that format (as most modern C implementations do). Although NaNs do not represent members of the set of real numbers, they are nevertheless well-defined values of their type.
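To make the NaN point concrete, here is a small sketch. It assumes a 4-byte float in IEEE 754 binary32 format, and it uses `uint32_t` in place of `int` so that the bit pattern can be written down exactly. The helper names `float_from_bits` and `bits_are_nan` are hypothetical:

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Assumption: float is 32 bits wide and uses IEEE 754 binary32. */
static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Stores the bits through the integer member, then reads the float member. */
static float float_from_bits(uint32_t bits)
{
    union { uint32_t i; float f; } u;
    u.i = bits;
    return u.f;
}

/* Returns nonzero if the bit pattern decodes to a NaN. */
static int bits_are_nan(uint32_t bits)
{
    return isnan(float_from_bits(bits)) != 0;
}
```

Under those assumptions, `bits_are_nan(0x7FC00000u)` is expected to be true: the pattern is a quiet NaN, which is an ordinary value of the type that `isnan` can classify, not a non-value representation.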
A non-value representation is one to which the relevant data type assigns no meaning whatever. For example, consider an integer type with built-in parity, and a representation that does not satisfy the type's parity rule. Or a pointer type that contains flag bits, and a value with an invalid combination of those flags.
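The parity example can be modelled in ordinary C. The sketch below simulates an entirely hypothetical 8-bit format (7 value bits plus one even-parity bit); no real C implementation is claimed to work this way, and `has_valid_parity` and `decode_parity7` are invented names:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical format: bits 0..6 hold the value, bit 7 is chosen so that
   the total number of set bits is even. */
static int has_valid_parity(uint8_t rep)
{
    unsigned ones = 0;
    for (unsigned bit = 0; bit < 8; ++bit)
        ones += (rep >> bit) & 1u;
    return ones % 2u == 0;
}

/* Decodes the 7 value bits. A pattern with odd parity has no meaning in
   this format, which the assert stands in for. */
static unsigned decode_parity7(uint8_t rep)
{
    assert(has_valid_parity(rep));
    return rep & 0x7Fu;
}
```

Here 0x81 decodes to 1, while 0x01 satisfies no rule of the format: it plays the role of a non-value representation, a bit pattern to which the type assigns no meaning at all.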
So, what if an assignment to one union member causes another to contain a non-value representation?
If we suppose that the provision you're asking about is not wholly redundant, then I think yes, we must interpret the spec to attribute UB to such an assignment. There are very few things expressible in C that could have UB as a result of that provision, but do not also have UB for another reason, such as violating the strict-aliasing rule or reading a non-value representation. The remaining possibilities are the only ones to which your provision could uniquely apply. These are all the ones I have come up with:
(Note that structure and union types do not themselves have non-value representations, though their members can have.)
I don't see a good reason to distinguish among those with respect to the question at hand.