Tags: c++, arrays, c++11, std, memory-alignment

How to align std::array contained data?


Since std::array has no allocator parameter, is there a way to ensure that the pointer returned by data() is aligned to a given boundary?

For instance, with GNU g++ 4.8.4 and 6.1.0, the code below

#include <array>
#include <iostream>

int main(void)
{
  std::array<bool, 10> a;
  std::array<char, 10> b;
  std::array<int,10> c;
  std::array<long long, 10> d;
  std::array<float, 10> e;
  std::array<double, 10> f;

  std::cout << "array<bool,10>.data()       = " << a.data() << std::endl;
  std::cout << "array<char,10>.data()       = " << (void*) b.data() << std::endl;
  std::cout << "array<int,10>.data()        = " << c.data() << std::endl;
  std::cout << "array<long long, 10>.data() = " << d.data() << std::endl;
  std::cout << "array<float, 10>.data()     = " << e.data() << std::endl;
  std::cout << "array<double, 10>.data()    = " << f.data() << std::endl;

  return 0;
}

produces the following output, which shows that the container data is aligned to 16-byte boundaries regardless of the element type when compiling for x86-64.

array<bool,10>.data()       = 0x7ffe660a2e40
array<char,10>.data()       = 0x7ffe660a2e30
array<int,10>.data()        = 0x7ffe660a2e00
array<long long, 10>.data() = 0x7ffe660a2db0
array<float, 10>.data()     = 0x7ffe660a2d80
array<double, 10>.data()    = 0x7ffe660a2d30

However, with Intel's icpc v16.0.3 the result is as shown below, even when compiling with -align. While most of the containers are aligned to 16-byte boundaries, some (the char and float arrays) are aligned to smaller boundaries (2-byte and 8-byte, respectively).

array<bool,10>.data()       = 0x7ffdedcb6bf0
array<char,10>.data()       = 0x7ffdedcb6bfa
array<int,10>.data()        = 0x7ffdedcb6ba0
array<long long, 10>.data() = 0x7ffdedcb6b00
array<float, 10>.data()     = 0x7ffdedcb6bc8
array<double, 10>.data()    = 0x7ffdedcb6b50

EDIT

To illustrate the proposal from RustyX, this is the modified code

#include <array>
#include <iostream>

int main(void)
{
  alignas(16) std::array<bool, 10> a;
  alignas(16) std::array<char, 10> b;
  alignas(16) std::array<int,10> c;
  alignas(16) std::array<long long, 10> d;
  alignas(16) std::array<float, 10> e;
  alignas(16) std::array<double, 10> f;

  std::cout << "array<bool,10>.data()       = " << a.data() << std::endl;
  std::cout << "array<char,10>.data()       = " << (void*) b.data() << std::endl;
  std::cout << "array<int,10>.data()        = " << c.data() << std::endl;
  std::cout << "array<long long, 10>.data() = " << d.data() << std::endl;
  std::cout << "array<float, 10>.data()     = " << e.data() << std::endl;
  std::cout << "array<double, 10>.data()    = " << f.data() << std::endl;

  return 0;
}

and this is the result when compiling it with Intel's icpc v16.0.3. All six addresses are now multiples of 16.

array<bool,10>.data()       = 0x7ffe42433500
array<char,10>.data()       = 0x7ffe42433510
array<int,10>.data()        = 0x7ffe424334a0
array<long long, 10>.data() = 0x7ffe42433400
array<float, 10>.data()     = 0x7ffe424334d0
array<double, 10>.data()    = 0x7ffe42433450

Solution

  • By default the compiler will do the right thing when it comes to alignment: each array is aligned at least as strictly as its element type requires (alignof(T)).

    But you can override (increase) it with alignas:

    alignas(16) std::array<char, 10> b;
    

    It is interesting that the Intel compiler considers it sufficient to align the char array on 2 bytes and the float array on 8 bytes. Both satisfy the language requirement, and it is as if the compiler knows that on an x86 platform you gain little by aligning them any wider.

    Keep in mind that too much alignment can reduce performance, because the extra padding increases memory use and reduces cache efficiency. Modern x86 architectures (Sandy Bridge and newer) work very efficiently with unaligned data, but cannot compensate for partially used cache lines.