
Does casting a char * to another pointer type break the strict aliasing rule when the memory is from malloc?


I read that char * (and its signed and unsigned counterparts) can alias any type without violating the strict aliasing rule. However, having a char * point to an int variable and then casting that char * to a double * breaks the rule, because the underlying object is of type int. But what if the memory comes from malloc? For example:

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    void *buffer = malloc(32);
    unsigned char *ptr = buffer;

    *ptr = 10;
    *((double *)(ptr + 1)) = 3.14;
    *((double *)(ptr + 9)) = 2.718;

    printf("*ptr: %d\n", *ptr);
    printf("*(ptr + 1): %lf\n", *((double *)(ptr + 1)));
    printf("*(ptr + 9): %lf\n", *((double *)(ptr + 9)));
    
    return 0;
}

This prints the following:

*ptr: 10
*(ptr + 1): 3.140000
*(ptr + 9): 2.718000

Correct me if I'm wrong, but as far as I know, the memory from malloc is untyped and can store any data, unlike an int array, which can only store data of type int.

I haven't received any warning from gcc but apparently it is not very reliable at warning you when you break the strict aliasing rules. So does my example break them?


Solution

  • Your code has undefined behavior because it accesses memory through misaligned pointers.

    The pointer returned by malloc is guaranteed to be suitably aligned for any object type with a fundamental alignment requirement, so the address can be used for different types without issues. That is why the following is valid:

    double* ptr = malloc(42*sizeof(double));
    

    On the other hand, in your code (where ptr is an unsigned char *), ptr+1 is not guaranteed to be properly aligned for double. In fact, given that ptr itself is properly aligned, ptr+1 is very likely misaligned; the undefined behavior is reported here.

    The next question is: if we change the code so that the pointers are properly aligned, what is the answer to your question?
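
    The alignment argument can be checked at run time. The following is a minimal sketch, not part of the original answer: is_aligned and align_up are hypothetical helper names, and the check assumes the usual flat address space, since converting a pointer to uintptr_t is implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if the address in p is a multiple of
 * `alignment`. Assumes a flat address space, where the integer value of
 * a pointer reflects its alignment (implementation-defined in ISO C). */
static int is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}

/* Hypothetical helper: rounds `offset` up to the next multiple of
 * `alignment`, which must be a power of two (as alignments are). */
static size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

    With ptr pointing at the start of a malloc'ed block, is_aligned(ptr, _Alignof(double)) holds, is_aligned(ptr + 1, _Alignof(double)) fails on any implementation where _Alignof(double) is greater than 1, and ptr + align_up(1, _Alignof(double)) is again a valid place for a double.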

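    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        void *buffer = malloc(32);
        if (buffer == NULL)
            return 1; /* allocation failed */
        unsigned char *ptr = buffer;
    
        *ptr = 10;
        /* ptr is maximally aligned, so ptr + _Alignof(double) is aligned for double */
        *((double *)(ptr + _Alignof(double))) = 3.14; //(*)
    
        printf("*ptr: %d\n", *ptr);
        printf("*(ptr + _Alignof(double)): %lf\n", *((double *)(ptr + _Alignof(double))));
        
        free(buffer);
        return 0;
    }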
    

    For the modified code above, the behavior is well-defined, assuming there is no out-of-bounds access to the allocated buffer. This can be seen from the following quote from the latest standard draft (6.5, paragraph 6, in https://open-std.org/JTC1/SC22/WG14/www/docs/n3096.pdf):

    The effective type of an object for an access to its stored value is the declared type of the object, if any.[98] If a value is stored into an object having no declared type through an lvalue having a type that is not a non-atomic character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

    Where footnote 98 says

    Allocated objects have no declared type.

    At the line marked (*), the second sentence of the quote above applies: the effective type becomes double for the double object that now lives at the address ptr+_Alignof(double).

    After (*), the following read would violate the strict aliasing rule, even if the pointer is properly aligned.

    int i = *((int *)(ptr+_Alignof(double)));
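
    If the goal is only to inspect the bytes of the stored double, one well-defined alternative is to copy them out with memcpy, which accesses the memory as characters. The sketch below is an illustration, not part of the original answer: double_bits_at is a hypothetical name, and it assumes double and uint64_t have the same size.

```c
#include <stdint.h>
#include <string.h>

/* This sketch only makes sense when a double occupies exactly 64 bits. */
_Static_assert(sizeof(double) == sizeof(uint64_t),
               "double_bits_at assumes a 64-bit double");

/* Hypothetical helper: returns the object representation of the double
 * stored at p. memcpy reads the bytes as characters, so the read does
 * not violate strict aliasing and p does not need to be aligned. */
static uint64_t double_bits_at(const void *p)
{
    uint64_t bits;
    memcpy(&bits, p, sizeof bits);
    return bits;
}
```

    For example, double_bits_at(ptr + _Alignof(double)) would return the bit pattern of the 3.14 stored at (*) without reading it through an incompatible lvalue.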
    

    Writing to the memory through an lvalue of a different type is allowed according to the second sentence of the quote above, and the intent is made quite clear by the explicit mention of "subsequent accesses that do not modify the stored value". This is in fact how memory pools work: they recycle memory that was earlier used by other objects and reassign it to objects with new types. So the following is valid, and it updates the effective type of the associated memory (assuming that the pointer is properly aligned).

    *((int *)(ptr+_Alignof(double))) = 42;
    

    However, as @JohnBollinger points out in a comment, many common C implementations are deficient with regard to effective type updates via a write like the one above. Such implementations may perform incorrect type-based aliasing analysis and miscompile the code. So although the C standard says the above is valid, it is probably wiser not to do it directly. The case of a memory pool implementation is different: the code that updates the effective type is often located in a different translation unit (TU), where incorrect type-based aliasing analysis by such deficient implementations cannot do much harm.
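
    One commonly used way to sidestep both the alignment problem and the effective type question is to move values in and out of the buffer with memcpy. The following sketch is an assumption on my part rather than something from the comment thread; store_double and load_double are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the bytes of `value` into buf at `offset`.
 * No double lvalue ever refers to the buffer, so the offset may be
 * unaligned (for example 1 or 9, as in the question). */
static void store_double(unsigned char *buf, size_t offset, double value)
{
    memcpy(buf + offset, &value, sizeof value);
}

/* Hypothetical helper: copies the bytes at `offset` back into a real
 * double object and returns it. */
static double load_double(const unsigned char *buf, size_t offset)
{
    double value;
    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

    With these, the layout from the question becomes store_double(ptr, 1, 3.14) and store_double(ptr, 9, 2.718), read back with load_double(ptr, 1) and load_double(ptr, 9). Mainstream compilers typically turn such fixed-size memcpy calls into plain loads and stores, so the cost is usually negligible.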