c++ unions

Is "container punning" defined behavior when using unions?


Having read a fair bit about the purpose of unions and about type punning (that it's basically not allowed, and that you should instead rely on the compiler optimizing away memcpy calls), I'm wondering whether the following use of a union is undefined behavior, or whether it is in fact perfectly compliant and safe.
The idea is that the underlying data is all of the same fundamental type (int, double, etc.), and I just want to view its layout in different ways, or name its elements in different ways.
I would call this "container punning" rather than "type punning".

// main.cpp
// compile with `c++ main.cpp -std=c++23`

#include <iostream>
#include <array>
#include <format>

struct Mat2 {
  union {
    std::array<float, 4> data;
    std::array<std::array<float, 2>, 2> rows;
    struct { float x, y, z, w; };
  };
};

int main() {

  // sometimes you want to think about it in rows
  Mat2 m{ .rows = {{ {0, 1},
                     {2, 3} }}};

  for (auto &row : m.rows) {
    for (auto &col : row) {
      std::cout << col << " ";
    }
    std::cout << "\n";
  }

  // sometimes you want to think about it as a block of data
  std::cout << "\n";
  for (auto &d : m.data) {
    std::cout << d << " ";
  }
  std::cout << "\n\n";

  // sometimes you want to access elements based on semantic names
  std::cout << std::format("{}, {}, {}, {}", m.x, m.y, m.z, m.w);
  std::cout << std::endl;
}

Solution

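  • C++ unions do not allow type punning: reading any member other than the active one is undefined behavior. The only exception is a very specific case (reading the common initial sequence of standard-layout struct members), and it does not apply here. Moreover, the memory layout of a std::array is not guaranteed to be what you expect (see Is the address of a std::array guaranteed the same as its data?).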

    However, you don't need a union. You can provide different accessors:

    #include <array>
    #include <cstddef>

    struct Mat2 {
        std::array<float, 4> data;
        // flat view
        float& get(std::size_t i) { return data[i]; }
        // row/column view (row-major)
        float& get(std::size_t i, std::size_t j) { return data[i*2+j]; }
        // named view
        float& x() { return data[0]; }
        float& y() { return data[1]; }
        float& z() { return data[2]; }
        float& w() { return data[3]; }
    };
    

    Generally, wanting different views of the same data should not affect the data layout more than necessary.
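To make the accessor approach concrete, here is a minimal sketch of how the three views from the question could be used without a union. The const overloads and the `row_sum` helper are hypothetical additions for illustration, not part of the answer's original struct.

```cpp
#include <array>
#include <cstddef>

// One storage member, several views of it. No union, so no inactive-member reads.
struct Mat2 {
    std::array<float, 4> data{};

    // Flat view: element i of the underlying block.
    float&       get(std::size_t i)       { return data[i]; }
    const float& get(std::size_t i) const { return data[i]; }

    // Row/column view, row-major: element (i, j) lives at index i*2 + j.
    float&       get(std::size_t i, std::size_t j)       { return data[i * 2 + j]; }
    const float& get(std::size_t i, std::size_t j) const { return data[i * 2 + j]; }

    // Named view.
    float& x() { return data[0]; }
    float& y() { return data[1]; }
    float& z() { return data[2]; }
    float& w() { return data[3]; }
};

// Hypothetical helper: sums one row through the row/column view.
inline float row_sum(const Mat2& m, std::size_t row) {
    return m.get(row, 0) + m.get(row, 1);
}
```

A write through one view, such as `m.w() = 7.f;`, is visible through the others (`m.get(3)` and `m.get(1, 1)`), because every accessor refers to the same `data` member.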