We recently had a lecture in university about programming specials in several languages.
The lecturer wrote down the following function:
inline u64 Swap_64(u64 x)
{
u64 tmp;
(*(u32*)&tmp) = Swap_32(*(((u32*)&x)+1));
(*(((u32*)&tmp)+1)) = Swap_32(*(u32*) &x);
return tmp;
}
While I totally understand that this is also really bad style in terms of readability, his main point was that this part of code worked fine in production code until they enabled a high optimization level. Then, the code would just do nothing.
He said that all the assignments to the variable tmp
would be optimized out by the compiler. But why would this happen?
I understand that there are circumstances where variables need to be declared volatile so that the compiler doesn't touch them even if he thinks that they are never read or written but I wouldn't know why this would happen here.
This code violates the strict aliasing rules which makes it illegal to access an object through a pointer of a different type, although access through a *char ** is allowed. The compiler is allowed to assume that pointers of different types do not point to the same memory and optimize accordingly. It also means the code invokes undefined behavior and could really do anything.
One of the best references for this topic is Understanding Strict Aliasing and we can see the first example is in a similar vein to the OP's code:
uint32_t swap_words( uint32_t arg )
{
uint16_t* const sp = (uint16_t*)&arg;
uint16_t hi = sp[0];
uint16_t lo = sp[1];
sp[1] = hi;
sp[0] = lo;
return (arg);
}
The article explains this code violates strict aliasing rules since sp
is an alias of arg
but they have different types and says that although it will compile, it is likely arg
will be unchanged after swap_words
returns. Although with simple tests, I am unable to reproduce that result with either the code above nor the OPs code but that does not mean anything since this is undefined behavior and therefore not predictable.
The article goes on to talk about many different cases and presents several working solution including type-punning through a union, which is well-defined in C991 and may be undefined in C++ but in practice is supported by most major compilers, for example here is gcc's reference on type-punning. The previous thread Purpose of Unions in C and C++ goes into the gory details. Although there are many threads on this topic, this seems to do the best job.
The code for that solution is as follows:
typedef union
{
uint32_t u32;
uint16_t u16[2];
} U32;
uint32_t swap_words( uint32_t arg )
{
U32 in;
uint16_t lo;
uint16_t hi;
in.u32 = arg;
hi = in.u16[0];
lo = in.u16[1];
in.u16[0] = lo;
in.u16[1] = hi;
return (in.u32);
}
For reference the relevant section from the C99 draft standard on strict aliasing is 6.5
Expressions paragraph 7 which says:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:76)
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
and footnote 76 says:
The intent of this list is to specify those circumstances in which an object may or may not be aliased.
and the relevant section from the C++ draft standard is 3.10
Lvalues and rvalues paragraph 10
The article Type-punning and strict-aliasing gives a gentler but less complete introduction to the topic and C99 revisited gives a deep analysis of C99 and aliasing and is not light reading. This answer to Accessing inactive union member - undefined? goes over the muddy details of type-punning through a union in C++ and is not light reading either.
Footnotes: