Tags: c, casting, compiler-optimization, strict-aliasing, pointer-aliasing

Consequences of the warning “dereferencing type-punned pointer will break strict-aliasing rules”


I have gone through some questions on similar topics and some related material, but my question is mainly about understanding the warning for the code below. I do not want a fix! I understand there are two ways to avoid it: a union or memcpy.

uint32 localval;
void * DataPtr;
localval = something;
(*(float32*)(DataPtr))= (*(const float32*)((const void*)(&localval)));

Please note the following significant points:
1. Both types involved in the cast are 32 bits wide (or am I wrong?).
2. Both are local variables.

Compiler-specific points:
1. The code is supposed to be platform independent; this is a requirement!
2. I compiled with GCC and it worked as expected (I could reinterpret the int as a float), which is why I ignored the warning.

My questions:
1. What optimizations could the compiler perform in this aliasing case?
2. As both would occupy the same size (correct me if not), what could be the side effects of such a compiler optimization?
3. Can I safely ignore the warning or turn off aliasing?
4. If the compiler hasn't performed an optimization and my program is not broken after my first compilation, can I safely assume that the compiler will behave the same way every time (that is, not perform the optimization)?
5. Does aliasing apply to a void * cast too, or only to casts between standard types (int, float, etc.)?
6. What are the effects if I disable the aliasing rules?

Edited
1. Based on R's and Matt McNabb's corrections
2. Added new questions


Solution

  • Language standards try to strike a balance between the sometimes competing interests of programmers who use the language and compiler writers who want to apply a broad set of optimizations to generate reasonably fast code. Keeping variables in registers is one such optimization. For variables that are "live" in a section of a program, the compiler tries to allocate them in registers. A store through a pointer could, in principle, write anywhere in the program's address space, which would invalidate every variable held in a register. Sometimes the compiler can analyze a program and figure out where a pointer could or could not be pointing, but the C (and C++) language standards consider this an undue burden, and for "system" type programs it is often an impossible task. So the language standards relax the constraints by specifying that certain constructs lead to "undefined behavior"; the compiler writer can then assume they don't happen and generate better code under that assumption. In the case of strict aliasing, the compromise reached is that if you store to memory through one pointer type, variables of a different type are assumed to be unchanged. They can therefore be kept in registers, and stores and loads of those other types can be reordered with respect to the pointer store.
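
    As a rough illustration of that compromise, here is a minimal sketch (the function name store_then_reload is hypothetical and not from the question). Because ip and fp have different pointed-to types, the compiler is allowed to assume the second store cannot modify *ip, so it may return the constant 1 without reloading from memory. Called with two distinct objects, as any conforming caller must, the result is the same either way.

    ```c
    /* Hypothetical example: shows what the strict-aliasing assumption permits. */
    int store_then_reload(int *ip, float *fp)
    {
        *ip = 1;     /* store through a pointer to int */
        *fp = 2.0f;  /* store through a pointer to float: assumed not to touch *ip */
        return *ip;  /* the compiler may fold this to "return 1" without a reload */
    }
    ```

    If a caller passed the same address for both parameters (through a cast), the behavior would be undefined, and the function might return 1 or the reinterpreted bits of 2.0f depending on the compiler and optimization level.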

    There are many examples of this kind of optimization in the paper "Undefined Behavior: What Happened to My Code?"

    http://pdos.csail.mit.edu/papers/ub:apsys12.pdf

    One example there is a strict-aliasing violation in the Linux kernel. The kernel avoids the problem by telling the compiler not to use the strict-aliasing rule for optimization; as the paper puts it, "The Linux kernel uses -fno-strict-aliasing to disable optimizations based on strict aliasing."

    struct iw_event {
        uint16_t len; /* Real length of this stuff */
        ...
    };
    static inline char * iwe_stream_add_event(
        char * stream, /* Stream of events */
        char * ends, /* End of stream */
        struct iw_event *iwe, /* Payload */
        int event_len ) /* Size of payload */
    {
        /* Check if it's possible */
        if (likely((stream + event_len) < ends)) {
            iwe->len = event_len;
            memcpy(stream, (char *) iwe, event_len);
            stream += event_len;
        }
        return stream;
    }
    

    Figure 7: A strict aliasing violation, in include/net/iw_handler.h of the Linux kernel, which uses GCC’s -fno-strict-aliasing to prevent possible reordering.

    2.6 Type-Punned Pointer Dereference

    C gives programmers the freedom to cast pointers of one type to another. Pointer casts are often abused to reinterpret a given object with a different type, a trick known as type-punning. By doing so, the programmer expects that two pointers of different types point to the same memory location (i.e., aliasing). However, the C standard has strict rules for aliasing. In particular, with only a few exceptions, two pointers of different types do not alias [19, 6.5]. Violating strict aliasing leads to undefined behavior. Figure 7 shows an example from the Linux kernel. The function first updates iwe->len, and then copies the content of iwe, which contains the updated iwe->len, to a buffer stream using memcpy. Note that the Linux kernel provides its own optimized memcpy implementation. In this case, when event_len is a constant 8 on 32-bit systems, the code expands as follows.

    iwe->len = 8;
    *(int *)stream = *(int *)((char *)iwe);
    *((int *)stream + 1) = *((int *)((char *)iwe) + 1);
    

    The expanded code first writes 8 to iwe->len, which is of type uint16_t, and then reads iwe, which points to the same memory location of iwe->len, using a different type int. According to the strict aliasing rule, GCC concludes that the read and the write do not happen at the same memory location, because they use different pointer types, and reorders the two operations. The generated code thus copies a stale iwe->len value. The Linux kernel uses -fno-strict-aliasing to disable optimizations based on strict aliasing.

    Answers

    1) What optimizations could the compiler perform in this aliasing case?

    The language standard is very specific about the semantics (behavior) of a strictly conforming program; the burden is on the compiler writer or language implementor to get that right. Once the programmer crosses the line and invokes undefined behavior, the standard is clear that the burden of proving the program works as intended falls on the programmer, not on the compiler writer. The compiler in this case has been nice enough to warn that undefined behavior has been invoked, although it is under no obligation to do even that. Sometimes, annoyingly, people will tell you that at this point "anything can happen", usually followed by some joke or exaggeration.

    In the case of your program, the compiler could generate code that is "typical for the platform": store the value of something to localval, then load from localval and store at DataPtr, as you intended. Understand that it is under no obligation to do so. It sees the assignment to localval as a store to an object of type uint32, and it sees (*(const float32*)((const void*)(&localval))) as a load from an object of type float32, so it may conclude that the two do not refer to the same location. localval can then live in a register holding something, while the load reads the stack slot reserved for localval (the "automatic" storage it would use should it need to "spill" that register), which may still be uninitialized. It may or may not store localval to memory before dereferencing the pointer and loading from memory.

    Depending on what follows in your code, it may also decide that localval is never used and that the assignment of something has no side effect, treat the assignment as "dead code", and not even perform the assignment to a register.

    2) As both would occupy the same size (correct me if not), what could be the side effects of such a compiler optimization?

    The effect could be that an undefined value is stored at the address pointed to by DataPtr.

    3) Can I safely ignore the warning or turn off aliasing?

    That is specific to the compiler you are using. If the compiler documents a way to turn off strict-aliasing optimizations, then yes, subject to whatever caveats the compiler documentation states.

    4) If the compiler hasn't performed an optimization and my program is not broken after my first compilation, can I safely assume that the compiler will behave the same way every time (that is, not perform the optimization)?

    Maybe. Sometimes very small changes in another part of your program can change what the compiler does with this code. Consider, for example, what happens if the function is inlined: it gets thrown into the mix with some other part of your code; see this SO question.

    5) Does the aliasing apply to a void * typecast too ? or is it applicable only for the standard typecasts (int,float etc...) ?

    You cannot dereference a void * so the compiler just cares about the type of your final cast (and in C++ it would gripe if you convert a const to non-const and vice-versa).
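
    To make that concrete, here is a small sketch with hypothetical helper names (read_via_void and first_byte are not from the question; uint32_t from <stdint.h> stands in for the question's uint32). Passing a pointer through void * is harmless as long as the final access uses the object's real type. The sketch also shows one of the standard's exceptions: any object may be inspected through a pointer to unsigned char.

    ```c
    #include <stdint.h>

    /* Well-defined: the object is a uint32_t and is read as a uint32_t.
       The detour through void * does not change which access type is used. */
    uint32_t read_via_void(void *p)
    {
        uint32_t *q = p;  /* implicit conversion from void * is allowed in C */
        return *q;
    }

    /* Also well-defined: character types may alias any object. */
    unsigned char first_byte(const void *p)
    {
        const unsigned char *bytes = p;
        return bytes[0];
    }
    ```

    What would still be undefined is converting the same void * to float * and dereferencing it while the underlying object is a uint32_t; the intermediate void * cast neither helps nor hurts.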

    6) What are the effects if I disable the aliasing rules?

    See your compiler's documentation. In general you will get slower code. If you do this (as the Linux kernel chose to do in the example from the paper above), limit it to a small compilation unit containing only the functions where it is necessary.

    Conclusion

    I understand your questions come from curiosity and from trying to better understand how this works (or might not work). You mentioned that it is a requirement that the code be portable; by implication, it is a requirement that the program be compliant and not invoke undefined behavior (remember, the burden is on you if it does). In this case, as you pointed out in the question, one solution is to use memcpy. As it turns out, that not only makes your code compliant and therefore portable, it also does what you intend in the most efficient way possible: on current gcc at optimization level -O3, the compiler converts the memcpy into a single instruction storing the value of localval at the address pointed to by DataPtr. See it live in coliru here and look for the movl %esi, (%rdi) instruction.
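
    For reference, a memcpy-based version of the assignment from the question might look like the sketch below. The function name store_bits_as_float is hypothetical, and uint32_t and float stand in for the question's uint32 and float32 typedefs. It assumes float is exactly 32 bits wide, which the static assertion checks at compile time; it also assumes DataPtr points to storage large enough to hold a float.

    ```c
    #include <stdint.h>
    #include <string.h>

    /* Assumption: float and uint32_t have the same size on the target. */
    _Static_assert(sizeof(float) == sizeof(uint32_t),
                   "float must be 32 bits wide");

    /* Hypothetical helper: copies the bit pattern of localval into the
       object DataPtr points to, with no type-punned dereference. */
    void store_bits_as_float(void *DataPtr, uint32_t localval)
    {
        memcpy(DataPtr, &localval, sizeof localval);
    }
    ```

    On a platform with IEEE 754 single-precision floats, calling store_bits_as_float(&f, 0x3f800000u) on a float f leaves f equal to 1.0f. The resulting float value depends on the platform's floating-point representation, but the copy itself is well-defined everywhere.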