c++ c++11 optimization undefined-behavior unions

Is placement new in a union UB for SOO?

I am writing a simplified version of std::move_only_function in C++11.

My implementation provides Small Object Optimization (a.k.a. SOO), which is basically a union that stores either a function pointer or a pointer to a type-erased wrapper.

I realized that when the received function object is no larger than the union, I can use placement new to construct the wrapped object directly in the storage the union occupies, then access it through reinterpret_cast.

Here is my implementation:

template<typename Signature>
class UniqueFunction;

template<typename Ret, typename... Args>
class UniqueFunction<Ret( Args... )> {
  struct AnyFn {
    virtual ~AnyFn() noexcept              = default;
    virtual Ret operator()( Args... args ) = 0;
  };
  template<typename Fn>
  struct FnContainer : public AnyFn {
    Fn fntor_;

    FnContainer( Fn fntor ) : fntor_ { std::move( fntor ) } {}

    FnContainer( const FnContainer& )              = delete;
    FnContainer& operator=( const FnContainer& ) & = delete;
    virtual ~FnContainer() noexcept                = default;

    Ret operator()( Args... args ) override { return fntor_( std::forward<Args>( args )... ); }
  };

  union {
    typename std::add_pointer<Ret( Args... )>::type fptr_;
    AnyFn* ftor_;
  } data_; // The union is here.
  // Tag that identifies the type of data currently stored in the union.
  enum class Tag : std::uint8_t { None, Fptr, FtorInline, FtorDync } tag_;

  // other methods...
};

In the constructor, if the size of the object is no larger than the union, I construct it directly in the union with placement new and discard the pointer that placement new returns.

template<typename F>
UniqueFunction( F functor )
{
  if ( sizeof( FnContainer<F> ) <= sizeof data_ ) {
    // Small enough: construct the wrapper in the union's storage.
    new ( &data_ ) FnContainer<F>( std::move( functor ) );
    tag_ = Tag::FtorInline;
  } else {
    // Too large: allocate the wrapper on the heap.
    data_.ftor_ = new FnContainer<F>( std::move( functor ) );
    tag_ = Tag::FtorDync;
  }
}

In the inline case, I reinterpret the address of the union as AnyFn* via reinterpret_cast, so that I can access the derived object stored there through a base class pointer.

if ( tag_ == Tag::FtorInline ) {
  // When I'm trying to destroy it, I do this:
  // reinterpret_cast<AnyFn*>( &data_ )->~AnyFn();

  // Other cases, I do this:
  ( *reinterpret_cast<AnyFn*>( &data_ ) );
}

I know that there's a rule called strict aliasing in the C++ standard, so I suspect that my optimization is violating the standard and causing some sort of UB; I'm not so sure, though, since this code worked well in testing.

Here is the online test.

The problem is:

Is this UB? If so, is there another way for me to achieve the small object optimization I want?

Solution

The placement new call itself is not UB. However, it ends the lifetime of the union object, because it reuses the storage that the union object occupies. See [basic.life]/2.5. After that, &data_ is a pointer to an object that is no longer within its lifetime.

The reinterpret_cast doesn't produce a pointer to the AnyFn object. It is equivalent to a static_cast to void* followed by a second static_cast to AnyFn*. The first cast is fine. The second one will only actually produce a pointer to the AnyFn object if that object is pointer-interconvertible with the original pointee (the dead union object). See [expr.static.cast]/13. This condition will not be satisfied. Access through the resulting pointer will be UB.

It is sometimes possible to obtain a pointer to the AnyFn base class subobject by passing the result of the cast to std::launder. However, this will work only if the AnyFn base class subobject is actually located at the beginning of the storage for data_. This condition is not guaranteed to hold.
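To illustrate why a base class subobject need not start at the address of the complete object, here is a minimal sketch. The types First, Second and Both and the function second_base_is_offset are hypothetical names invented for this illustration, and it uses multiple inheritance (unlike the question's single-inheritance FnContainer) because that makes the offset visible on common ABIs. The standard does not fix the exact layout.

```cpp
#include <cassert>

// Hypothetical types, used only to illustrate base subobject placement.
struct First  { int a; virtual ~First() = default; };
struct Second { int b; virtual ~Second() = default; };
struct Both : First, Second {};

// Returns true when the Second base subobject does not start at the
// address of the complete Both object. On common ABIs First is placed
// at the start and Second after it, so the two addresses differ.
bool second_base_is_offset() {
  Both obj;
  void* whole = static_cast<void*>( &obj );
  // static_cast to a base pointer applies the required adjustment;
  // a reinterpret_cast of &obj would not.
  void* base = static_cast<void*>( static_cast<Second*>( &obj ) );
  return whole != base;
}
```

With a single polymorphic base, as in the question, implementations typically do place the base at offset zero, but that is an implementation detail rather than a guarantee.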

To implement the small object optimization, the usual suggestion is to make an array of unsigned char (or std::byte, from C++17 on) one of the members of the union. The small object is then constructed in that array with placement new. The pointer returned by placement new is stored in a separate member and used for every later access to the created object.
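The approach described above could look roughly like the following C++11 sketch. It is not the asker's actual class. The names SmallFn, BufSize, FitsInline, emplace, target_, inline_, Big and small_fn_demo are all hypothetical, the buffer size of three pointers is an arbitrary assumption, and for brevity the byte array is a plain member rather than a union member, with the function-pointer case left out. Copy and move are deleted because a move would have to re-create the inline object and update the stored pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

template<typename Signature>
class SmallFn; // hypothetical name for this sketch

template<typename Ret, typename... Args>
class SmallFn<Ret( Args... )> {
  struct AnyFn {
    virtual ~AnyFn() noexcept              = default;
    virtual Ret operator()( Args... args ) = 0;
  };
  template<typename Fn>
  struct FnContainer final : AnyFn {
    Fn fntor_;
    explicit FnContainer( Fn fntor ) : fntor_( std::move( fntor ) ) {}
    Ret operator()( Args... args ) override { return fntor_( std::forward<Args>( args )... ); }
  };

  // Assumed inline capacity; any size works as long as it is checked.
  static constexpr std::size_t BufSize = 3 * sizeof( void* );
  alignas( std::max_align_t ) unsigned char buf_[BufSize];
  AnyFn* target_ = nullptr; // pointer returned by new / placement new
  bool inline_   = false;   // true when *target_ lives inside buf_

  // Both size and alignment must fit before the buffer may be used.
  template<typename C>
  struct FitsInline : std::integral_constant<bool,
      sizeof( C ) <= BufSize && alignof( C ) <= alignof( std::max_align_t )> {};

  template<typename C, typename G>
  void emplace( G&& g, std::true_type ) {
    // Keep the pointer placement new returns; never cast buf_ later.
    target_ = ::new ( static_cast<void*>( buf_ ) ) C( std::forward<G>( g ) );
    inline_ = true;
  }
  template<typename C, typename G>
  void emplace( G&& g, std::false_type ) {
    target_ = new C( std::forward<G>( g ) );
    inline_ = false;
  }

public:
  template<typename F>
  SmallFn( F functor ) {
    emplace<FnContainer<F>>( std::move( functor ), FitsInline<FnContainer<F>> {} );
  }
  SmallFn( const SmallFn& )            = delete;
  SmallFn& operator=( const SmallFn& ) = delete;

  ~SmallFn() {
    if ( target_ == nullptr ) return;
    if ( inline_ ) target_->~AnyFn(); // destroy in place, buffer is not freed
    else delete target_;              // destroy and release heap memory
  }

  Ret operator()( Args... args ) {
    assert( target_ != nullptr );
    return ( *target_ )( std::forward<Args>( args )... );
  }
  bool stored_inline() const { return inline_; }
};

// Hypothetical callable that is deliberately too large for the buffer.
struct Big {
  char pad[256];
  int operator()( int x ) { return x + 2; }
};

// Hypothetical usage: one callable stored inline, one on the heap.
int small_fn_demo() {
  int base = 40;
  SmallFn<int( int )> small( [base]( int x ) { return base + x; } );
  SmallFn<int( int )> big( Big {} );
  return small( 2 ) + big( 40 );
}
```

Because every access goes through target_, no reinterpret_cast or std::launder is needed, and the stored pointer is correct even if the AnyFn base subobject is not at the start of the buffer.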