c++ undefined-behavior strict-aliasing type-punning

How can you repurpose an array of floats as an array of doubles without undefined behavior?


In one particular C++ function, I happen to have a pointer to a big buffer of floats that I want to temporarily use to store half the number of doubles. Is there a method to use this buffer as scratch space for storing the doubles, which is also allowed (i.e., not undefined behaviour) by the standard?

In summary, I would like this:

void f(float* buffer)
{
  double* d = reinterpret_cast<double*>(buffer);
  // make use of d
  d[i] = 1.;
  // done using d as scratch, start filling the buffer
  buffer[j] = 1.;
}

As far as I can see, there's no easy way to do this: if I understand correctly, a reinterpret_cast<double*> like this causes undefined behaviour because of type aliasing, and using memcpy or a float/double union is not possible without copying the data and allocating extra space, which defeats the purpose and happens to be costly in my case (and using a union for type punning is not allowed in C++ anyway).

It can be assumed the float buffer is correctly aligned for using it for doubles.


Solution

  • I think the following code is a valid way to do it (it is really just a small example about the idea):

    #include <cstddef>  // std::size_t
    #include <new>      // placement new
    
    void f(float* buffer, std::size_t buffer_size_in_bytes)
    {
        double* d = new (buffer)double[buffer_size_in_bytes / sizeof(double)];
    
        // we have started the lifetime of the doubles.
        // "d" is a new pointer pointing to the first double object in the array.        
        // now you can use "d" as a double buffer for your calculations
        // you are not allowed to access any object through the "buffer" pointer anymore since the floats are "destroyed"       
        d[0] = 1.;
        // do some work here on/with the doubles...
    
    
        // conceptually we need to destroy the doubles here... but they are trivially destructible
    
        // now we need to start the lifetime of the floats again
        // (sized from the byte count instead of a hard-coded element count)
        new (buffer) float[buffer_size_in_bytes / sizeof(float)];
    
    
        // here we are unsure about whether we need to update the "buffer" pointer to
        // the one returned by the placement new of the floats
        // if it is necessary, we could return the new float pointer or take the input pointer
        // by reference and update it directly in the function
    }
    
    int main()
    {
        float* floats = new float[10];
        f(floats, sizeof(float) * 10);
        return 0;
    }
    

    It is important that you only use the pointer you receive from placement new. It is just as important to placement-new the floats back afterwards: even if the construction is a no-op, you need to start the lifetimes of the floats again.

    Forget about std::launder and reinterpret_cast in the comments. Placement new will do the job for you.

    edit: Make sure you have proper alignment when creating the buffer in main.
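
    The alignment requirement can be checked at run time before the buffer is reused. The following is only a minimal sketch: the helper names `is_aligned_for_double` and `check_scratch_buffer` are made up for illustration, and it assumes that `std::uintptr_t` is available and that converting a pointer to it yields the numeric address (this mapping is implementation-defined, although it holds on common platforms).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `p` satisfies the alignment requirement of double.
// Assumes the pointer-to-integer conversion yields the numeric address.
bool is_aligned_for_double(const void* p) noexcept
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(double) == 0;
}

// Hypothetical helper: stops (in debug builds) if the float buffer cannot be
// reused as scratch space for at least one double.
void check_scratch_buffer(const float* buffer, std::size_t buffer_size_in_bytes)
{
    assert(is_aligned_for_double(buffer));
    assert(buffer_size_in_bytes >= sizeof(double));
}
```

    Calling `check_scratch_buffer(buffer, buffer_size_in_bytes)` as the first statement of `f` would catch a misaligned or too-small buffer before any double is created in it.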

    Update:

    I just wanted to give an update on things that were discussed in the comments.

    1. The first thing mentioned was that we may need to update the initially created float pointer to the pointer returned by the re-placement-new'ed floats (the question is whether the initial float pointer can still be used to access the floats, because the floats are now "new" floats obtained by an additional new expression).

    To do this, we can either a) pass the float pointer by reference and update it, or b) return the newly obtained float pointer from the function:

    a)

    void f(float*& buffer, std::size_t buffer_size_in_bytes)
    {
        double* d = new (buffer)double[buffer_size_in_bytes / sizeof(double)];    
        // do some work here on/with the doubles...
        buffer = new (buffer) float[buffer_size_in_bytes / sizeof(float)];
    }
    

    b)

    float* f(float* buffer, std::size_t buffer_size_in_bytes)
    {
        /* same as initial example... */
        return new (buffer) float[buffer_size_in_bytes / sizeof(float)];
    }
    
    int main()
    {
        float* floats = new float[10];
        floats = f(floats, sizeof(float) * 10);
        return 0;
    }
    
    2. The next and more crucial thing to mention is that the array form of placement new is allowed to have a memory overhead: the implementation may place some metadata in front of the returned array. If that happens, the naive calculation of how many doubles fit into our memory will obviously be wrong. The problem is that we don't know beforehand how many bytes the implementation will acquire for a specific call, yet we would need that number to work out how many doubles fit into the remaining storage. Here ( https://stackoverflow.com/a/8721932/3783662 ) is another SO post where Howard Hinnant provided a test snippet. I tested this using an online compiler and saw that for trivially destructible types (for example doubles), the overhead was 0. For more complex types (for example std::string), there was an overhead of 8 bytes. But this may vary for your platform/compiler. Test it beforehand with the snippet by Howard.
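
    One way to observe such an overhead is to route an array new-expression through a user-defined placement allocation function that records the requested size. The sketch below is not Howard Hinnant's snippet; `SizeProbe`, `NonTrivial` and `array_new_overhead` are hypothetical names introduced here. It also measures a user-defined placement form rather than the library's `operator new[](std::size_t, void*)`, so it should be read as an indication of what a given compiler does, not as a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical tag type that records how many bytes an array
// new-expression requests from its allocation function.
struct SizeProbe {
    std::size_t requested = 0;
};

// User-defined placement allocation function: stores the requested size
// and hands back the caller's storage unchanged.
void* operator new[](std::size_t bytes, SizeProbe& probe, void* where) noexcept
{
    probe.requested = bytes;
    return where;
}

// A type with a non-trivial destructor, for comparison with double.
struct NonTrivial {
    int value = 0;
    ~NonTrivial() {}
};

// Returns how many bytes beyond count * sizeof(T) were requested.
template <class T>
std::size_t array_new_overhead(std::size_t count)
{
    alignas(std::max_align_t) unsigned char storage[1024];
    // Assumption: any bookkeeping data fits into 64 extra bytes.
    assert(count * sizeof(T) + 64 <= sizeof storage);

    SizeProbe probe;
    T* p = new (probe, storage) T[count];
    for (std::size_t i = 0; i < count; ++i) {
        p[i].~T();  // end the lifetimes again; a no-op for double
    }
    return probe.requested - count * sizeof(T);
}
```

    Comparing `array_new_overhead<double>(8)` with `array_new_overhead<NonTrivial>(8)` on the target compiler shows whether a type with a non-trivial destructor makes the implementation reserve extra bytes.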

    3. As for the question of why we need some kind of placement new (either new[] or single-element new): we are allowed to cast pointers in any way we want. But in the end, when we access the value, we need to use the right type to avoid violating the strict aliasing rules. Simply put: it is only allowed to access an object when an object of the pointed-to type really lives at the location given by the pointer. So how do you bring objects to life? The standard says:

    https://timsong-cpp.github.io/cppwp/intro.object#1 :

    "An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created."

    There is an additional section which may seem interesting:

    https://timsong-cpp.github.io/cppwp/basic.life#1:

    "An object is said to have non-vacuous initialization if it is of a class or aggregate type and it or one of its subobjects is initialized by a constructor other than a trivial default constructor. The lifetime of an object of type T begins when:

    So now we may argue: because doubles are trivial, do we need to take any action to bring the trivial objects to life and change the actual living objects? I say yes, because we initially obtained storage for the floats, and accessing the storage through a double pointer would violate strict aliasing. So we need to tell the compiler that the actual type has changed. This whole last point 3 was discussed quite controversially. You may form your own opinion. You have all the information at hand now.
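
    The lifetime argument can be shown in isolation with the single-object form of placement new, which avoids the array-overhead question entirely. This is a minimal sketch with a made-up function name (`reuse_storage_demo`); it uses a local byte array as the storage instead of the float buffer from the question.

```cpp
#include <cassert>
#include <new>

// Hypothetical demo: one block of raw storage is used first for a double
// and then for a float, each created by a new-expression.
float reuse_storage_demo()
{
    alignas(double) unsigned char storage[sizeof(double)];

    // The new-expression creates a double object in the storage.
    double* d = new (storage) double(1.5);
    assert(*d == 1.5);  // accessed only through the pointer returned by new

    // Reusing the storage ends the double's lifetime and starts a float's.
    float* f = new (storage) float(2.5f);
    return *f;          // from here on, *d must not be read any more
}
```

    Each access goes through the pointer returned by the most recent new-expression, which is the same rule the answer applies to the float buffer.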