c++iteratorcopy-constructordefault-copy-constructor

Is C++'s default copy-constructor inherently unsafe? Are iterators fundamentally unsafe too?


I used to think C++'s object model is very robust when best practices are followed.
Just a few minutes ago, though, I had a realization that I hadn't had before.

Consider this code:

class Foo
{
    std::set<size_t> set;
    std::vector<std::set<size_t>::iterator> vector;
    // ...
    // (assume every method ensures p always points to a valid element of s)
};

I have written code like this. And until today, I hadn't seen a problem with it.

But, thinking about it a more, I realized that this class is very broken:
Its copy-constructor and copy-assignment copy the iterators inside the vector, which implies that they will still point to the old set! The new one isn't a true copy after all!

In other words, I must manually implement the copy-constructor even though this class isn't managing any resources (no RAII)!

This strikes me as astonishing. I've never come across this issue before, and I don't know of any elegant way to solve it. Thinking about it a bit more, it seems to me that copy construction is unsafe by default -- in fact, it seems to me that classes should not be copyable by default, because any kind of coupling between their instance variables risks rendering the default copy-constructor invalid.

Are iterators fundamentally unsafe to store? Or, should classes really be non-copyable by default?

The solutions I can think of, below, are all undesirable, as they don't let me take advantage of the automatically-generated copy constructor:

  1. Manually implement a copy constructor for every nontrivial class I write. This is not only error-prone, but also painful to write for a complicated class.
  2. Never store iterators as member variables. This seems severely limiting.
  3. Disable copying by default on all classes I write, unless I can explicitly prove they are correct. This seems to run entirely against C++'s design, which is for most types to have value semantics, and thus be copyable.

Is this a well-known problem, and if so, does it have an elegant/idiomatic solution?


Solution

  • Is this a well-known problem?

    Well, it is known, but I would not say well-known. Sibling pointers do not occur often, and most implementations I have seen in the wild were broken in the exact same way than yours is.

    I believe the problem to be infrequent enough to have escaped most people's notice; interestingly, as I follow more Rust than C++ nowadays, it crops up there quite often because of the strictness of the type system (ie, the compiler refuses those programs, prompting questions).

    does it have an elegant/idiomatic solution?

    There are many types of sibling pointers situations, so it really depends, however I know of two generic solutions:

    Let's review them in order.

    Pointing to a class-member, or pointing into an indexable container, then one can use an offset or key rather than an iterator. It is slightly less efficient (and might require a look-up) however it is a fairly simple strategy. I have seen it used to great effect in shared-memory situation (where using pointers is a no-no since the shared-memory area may be mapped at different addresses).

    The other solution is used by Boost.MultiIndex, and consists in an alternative memory layout. It stems from the principle of the intrusive container: instead of putting the element into the container (moving it in memory), an intrusive container uses hooks already inside the element to wire it at the right place. Starting from there, it is easy enough to use different hooks to wire a single elements into multiple containers, right?

    Well, Boost.MultiIndex kicks it two steps further:

    1. It uses the traditional container interface (ie, move your object in), but the node to which the object is moved in is an element with multiple hooks
    2. It uses various hooks/containers in a single entity

    You can check various examples and notably Example 5: Sequenced Indices looks a lot like your own code.