How to store (empty) allocator in my container without taking up size?

I have an allocator aware container that has to store the allocator if it is a stateful one (e.g. polymorphic allocator). Otherwise, the allocator should take up no space. I thought the stdlib would use something like empty base optimization for this, however, I can't seem to get it to work:

Demo

#include <cstdio>
#include <memory>
#include <memory_resource>
#include <iostream>


template <typename T, typename Allocator>
struct MyContainerBase {
    Allocator allocator;
};

template <typename T, typename Allocator = std::allocator<T>>
struct MyContainer : private MyContainerBase<T, Allocator>
{
    using base_ = MyContainerBase<T, Allocator>;

    T myint_;
};

int main() {
    MyContainer<int> c;

    std::cout << "size = " << sizeof(MyContainer<int>) << std::endl;
}

Yields

size = 8

I had a look at libc++ and libstdc++ implementations of vector, but I couldn't find the allocator stored at all? It still works with pmr allocators though. How do I do this? Also, I must at least have some allocator object reference to use with std::allocator_traits<Allocator>(myalloc);

Solution

There are some techniques to avoid that a stateless allocator occupy space in an allocator-aware container.

1. Empty Base Optimization

To avoid the memory increment due to the allocator instance, and empty structures in general, it is used a particular optimization, Empty Base Optimization (EBO). It is based on the assumption that, if a class inherits from an empty base class, the space occupied by the latter can be optimized by the compiler to result in no increase in the overall size of the derived class.

Thus, the container is implemented in order that its class or a sub-class containing some or all of the attributes inherit from the allocator. If the latter is stateless, the compiler will optimize the structure not to increase the size; otherwise, the overall size will be equivalent to the sum of the size of the allocator and structure.

For example, the std::vector container may implement optimization in the following way.

struct impl
: public allocator_type
{
    pointer         start;
    pointer         finish;
    pointer         end;
};

impl _impl;

A downside of this design is the need to convert the instance of the derived class to access the base class, which in this case is the allocator. For this reason, many implementations prefer to introduce helper functions that allow access to the allocator without having to perform any explicit cast operation.

For example, two helper functions can be created in the following way.

allocator_type& _get_allocator() noexcept
{ return this->_impl; }

const allocator_type& _get_allocator() const noexcept
{ return this->_impl; }

Although the design seems robust, the introduction of the keyword final, in C++17, allows classes to be made non-derivable, completely breaking the original EBO implementation. Indeed, the attempt to inherit an allocator, and in general an empty structure, that has been made non-derivable produces a compilation error. In this sense, the libstdc++ (GCC) implementation of all standard allocator-aware containers is broken.

One way to solve the problem, trying to retain the benefits of EBO, involves specializing the derived class in order that it inherit the allocator, if it is derivable, or store the allocator as an attribute, otherwise.

To take the example of the std::vector container, the structure can be re-implemented in the following way.

template <bool = !std::is_final_v<allocator_type>>
struct impl
: public allocator_type
{
    pointer         start;
    pointer         finish;
    pointer         end;
};

template <>
struct impl<false>
{
    pointer         start;
    pointer         finish;
    pointer         end;
    allocator_type  alloc;
};

impl<> _impl;

This compromise requires, among other things, the almost mandatory introduction of helper functions to access the allocator, since their absence would require implementing a series of checks at compile time to know which of the two specializations has been selected and then access the allocator in the most appropriate way.

2. Compressed pair

To reduce complexity, it is possible to create an abstraction layer, compressed pair. It is just an external class that, implemented as a pair class, allows storage of two objects on which EBO is performed whenever possible.

This class makes it possible to move the EBO implementation under the hood and reduce both the complexity of the container implementation and the repetition of code. The latter is relevant when the same optimization must be applied to two or more allocator-aware containers. Two member functions, first() and second(), can be used to access the two objects stored in the compressed pair.

In the example of the std::vector container, the previous implementation can be replaced with the following, which for simplicity uses the Boost version of compressed pair.

struct impl
{
    pointer         start;
    pointer         finish;
    pointer         end;
};

boost::compressed_pair<impl, allocator_type> _impl;

Although the design is better, this still has an important limitation: optimization can not be performed if the allocator is non-derivable. Furthermore, EBO itself is a complex idiom, which requires increasing the size of the code and pulls the programmer away from the most important aspects of the implementation to focus on optimizations.

3. no_unique_address

Since C++20, however, these problems have been solved with the introduction of a new standard compiler attribute, no_unique_address. It suggests to the compiler which attributes of a structure can be optimized to not take up memory. Thus, it is sufficient to directly instantiate the allocator, and in general any empty structure, and juxtapose the above attribute with the type declaration.

To conclude the example of the std::vector container, its implementation would be the following.

struct impl
{
    pointer         start;
    pointer         finish;
    pointer         end;
[[no_unique_address]]
    allocator_type  alloc;
};

impl _impl;