Tags: c++, functional-programming, unique-ptr, range-v3, value-categories

How can I concatenate several vectors of unique pointers?


The original use-case

The use case is that I have a function that retrieves some factories derived from FooFactory,

std::vector<std::unique_ptr<FooFactory>> facts = getFactories();

and each factory can create some derived objects of base class Foo,

struct FooFactory {
    std::vector<std::unique_ptr<Foo>> getFoos();
};

and I want to call getFoos on every factory in facts and collect all the Foos in a single std::vector.

A naive solution

A naive solution is the following, where the Foos are moved into the foos vector one by one with push_back in a nested for loop.

constexpr auto getFoos = [](auto const& fooFactories) {
    std::vector<std::unique_ptr<Foo>> foos;
    for (auto const& factory : fooFactories) {
        for (auto&& foo : factory->getFoos()) {
            foos.push_back(std::move(foo));
        }
    }
    return foos;
};

auto foos = getFoos(facts);

However, this may reallocate foos repeatedly as it grows. A small improvement is to call foos.reserve(foos.size() + factory->getFoos().size()) right before the inner for loop, which means that foos is reallocated "only" fooFactories.size() times. Another improvement would be to sum the sizes of the vectors returned by all the factories first, reserve that much space in foos, and then run the nested for loop above. (Either way, getFoos() should be called only once per factory and its result stored, since each call returns a fresh vector by value.)
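The "sum the sizes first, then reserve once" idea could look roughly like the sketch below. The Foo and FooFactory definitions here are minimal stand-ins (the count member and the collectFoos name are hypothetical, not from the original code), and each factory's getFoos() is called exactly once, with the intermediate vectors kept around until the final move.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical minimal stand-ins for the types in the question.
struct Foo {
    virtual ~Foo() = default;
};

struct FooFactory {
    std::size_t count = 0;  // hypothetical: how many Foos this factory creates

    std::vector<std::unique_ptr<Foo>> getFoos() {
        std::vector<std::unique_ptr<Foo>> out;
        out.reserve(count);
        for (std::size_t i = 0; i < count; ++i) {
            out.push_back(std::make_unique<Foo>());
        }
        return out;
    }
};

// Call getFoos() once per factory, sum the sizes, reserve once, then move.
std::vector<std::unique_ptr<Foo>>
collectFoos(std::vector<std::unique_ptr<FooFactory>> const& factories) {
    std::vector<std::vector<std::unique_ptr<Foo>>> batches;
    batches.reserve(factories.size());
    std::size_t total = 0;
    for (auto const& factory : factories) {
        batches.push_back(factory->getFoos());
        total += batches.back().size();
    }

    std::vector<std::unique_ptr<Foo>> foos;
    foos.reserve(total);  // the only allocation made for the result
    for (auto& batch : batches) {
        // move_iterator turns each element access into an rvalue,
        // so the unique_ptrs are moved rather than copied.
        foos.insert(foos.end(),
                    std::make_move_iterator(batch.begin()),
                    std::make_move_iterator(batch.end()));
    }
    assert(foos.size() == total);
    return foos;
}
```

Calling collectFoos(facts) returns the flattened vector. The trade-off is one extra vector of vectors holding the intermediate results, in exchange for a single allocation of the final buffer.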

However, any such solution seems overly verbose and low-level, when all I really want to do is call getFoos() on each element of fooFactories and join the results together in a single vector.

The functional programming way

Ideal - not compiling

In the spirit of functional programming, this would amount to the following (I've used a generic range in the comments to avoid writing out the actual, long types):

auto foos = facts                           // vector<unique_ptr<FooFactory>>
          | transform(&FooFactory::getFoos) // range<vector<unique_ptr<Foo>>>
          | join                            // range<unique_ptr<Foo>>
          | to_vector;                      // vector<unique_ptr<Foo>>, COMPILE-TIME ERROR

However, this is far from working, as the comment indicates, because std::unique_ptr is not copyable.
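The copyability problem can be reproduced without any range library. The sketch below (Foo is a stand-in type, and FooPtr and moveAll are hypothetical names) checks at compile time that a unique_ptr cannot be constructed from an lvalue of its own type but can be constructed from an rvalue, and then builds a vector by moving elements instead of copying them.

```cpp
#include <cassert>
#include <iterator>
#include <memory>
#include <type_traits>
#include <vector>

struct Foo {};  // hypothetical stand-in for the question's base class

using FooPtr = std::unique_ptr<Foo>;

// Constructing an element from an lvalue reference means copying,
// which unique_ptr forbids.
static_assert(!std::is_constructible_v<FooPtr, FooPtr&>);
// Constructing from an rvalue is a move, which is allowed.
static_assert(std::is_constructible_v<FooPtr, FooPtr&&>);

// Hypothetical helper: build a new vector by moving every element out of src.
std::vector<FooPtr> moveAll(std::vector<FooPtr>& src) {
    std::vector<FooPtr> out(std::make_move_iterator(src.begin()),
                            std::make_move_iterator(src.end()));
    assert(out.size() == src.size());  // src keeps its now-null elements
    return out;
}
```

After moveAll(src), the pointers in src are null and the returned vector owns the objects. A range whose elements are lvalue references puts a container conversion in the first situation; one whose elements are rvalue references puts it in the second.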

Sub-optimal solution - but is it legal?

Playing around with the above snippet, I eventually found a solution: releasing the pointers and reconstructing them right after join:

auto foos = facts                                         // vector<unique_ptr<FooFactory>>
          | transform(&FooFactory::getFoos)               // range<vector<unique_ptr<Foo>>>
          | join                                          // range<unique_ptr<Foo>>
          | transform(&std::unique_ptr<Foo>::release)     // range<Foo*>
          | transform(construct¹<std::unique_ptr<Foo>>()) // range<unique_ptr<Foo>>
          | to_vector;                                    // vector<unique_ptr<Foo>>

My question(s)

  1. Is the last snippet above legal? Or does it invoke UB?
  2. If it is UB (or invalid in any other way), why is that?
  3. If it is legal, can you explain why the trick of release+reconstruct works?

The complete working code is here.


(¹) That construct is boost::hof::construct from the Boost.HOF library. In short, construct<std::unique_ptr<Foo>>() is a wrapper around the constructor of std::unique_ptr, i.e. it behaves more or less like this lambda: [](auto* p){ return std::unique_ptr<Foo>{p}; }.


Solution

  • As you've noted, the issue is that views::join is a view over lvalues, so to_vector tries to copy each unique_ptr. Just move from them:

    auto foos = facts                    // vector<unique_ptr<FooFactory>>&
      | transform(&FooFactory::getFoos)  // range<vector<unique_ptr<Foo>>>
      | join                             // range<unique_ptr<Foo>&>
      | ranges::views::move              // range<unique_ptr<Foo>&&>
      | to_vector;                       // vector<unique_ptr<Foo>>
    

    Note that the composition of &std::unique_ptr<Foo>::release followed by construct<std::unique_ptr<Foo>>() is equivalent to [](std::unique_ptr<Foo>& x) -> std::unique_ptr<Foo> { return std::move(x); }, which is very similar to what ranges::views::move does (lvalue -> xvalue instead of lvalue -> prvalue):

    auto foos = facts
              | transform(&FooFactory::getFoos)
              | join
              // | transform(&std::unique_ptr<Foo>::release)
              // | transform(hof::construct<std::unique_ptr<Foo>>())
              | transform([](std::unique_ptr<Foo>& x) { return std::move(x); })
              | to_vector;
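The claimed equivalence between release+reconstruct and a plain move can be checked in isolation. The following is a small sketch using only the standard library; Foo is a stand-in type and the two function names are hypothetical. It assumes the default deleter, which is stateless; with a custom stateful deleter, release followed by reconstruction would drop the deleter, whereas a move would carry it along.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Foo {};  // hypothetical stand-in for the question's base class

// Variant A: give up ownership as a raw pointer, then wrap it again.
std::unique_ptr<Foo> viaReleaseAndConstruct(std::unique_ptr<Foo>& x) {
    return std::unique_ptr<Foo>{x.release()};
}

// Variant B: plain move construction from the lvalue.
std::unique_ptr<Foo> viaMove(std::unique_ptr<Foo>& x) {
    return std::move(x);
}

// Both variants leave the source null and hand the same object to the result.
inline void checkEquivalence() {
    auto a = std::make_unique<Foo>();
    Foo* rawA = a.get();
    auto a2 = viaReleaseAndConstruct(a);
    assert(a == nullptr && a2.get() == rawA);

    auto b = std::make_unique<Foo>();
    Foo* rawB = b.get();
    auto b2 = viaMove(b);
    assert(b == nullptr && b2.get() == rawB);
}
```

In both cases ownership is transferred exactly once and nothing is deleted twice, which is why the release+reconstruct pipeline is well-defined as long as each element of the joined range is visited only once.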