Tags: c++, c++11, move-semantics, move-assignment-operator

Move assignment operator and `if (this != &rhs)`


In the assignment operator of a class, you usually need to check if the object being assigned is the invoking object so you don't screw things up:

Class& Class::operator=(const Class& rhs) {
    if (this != &rhs) {
        // do the assignment
    }

    return *this;
}

Do you need the same thing for the move assignment operator? Is there ever a situation where this == &rhs would be true?

? Class::operator=(Class&& rhs) {
    ?
}

Solution

  • First, Copy and Swap is not always the correct way to implement Copy Assignment. In the case of dumb_array, it is almost certainly a sub-optimal solution.

    The use of Copy and Swap for dumb_array is a classic example of putting the most expensive operation with the fullest features at the bottom layer. It is perfect for clients who want the fullest feature and are willing to pay the performance penalty. They get exactly what they want.

    But it is disastrous for clients who do not need the fullest feature and are instead looking for the highest performance. For them dumb_array is just another piece of software they have to rewrite because it is too slow. Had dumb_array been designed differently, it could have satisfied both clients with no compromises to either client.

    The key to satisfying both clients is to build the fastest operations in at the lowest level, and then to add API on top of that for fuller features at more expense. I.e. you need the strong exception guarantee, fine, you pay for it. You don't need it? Here's a faster solution.

    Let's get concrete: Here's the fast, basic exception guarantee Copy Assignment operator for dumb_array:

    dumb_array& operator=(const dumb_array& other)
    {
        if (this != &other)
        {
            if (mSize != other.mSize)
            {
                delete [] mArray;
                // leave *this empty but valid in case new throws below
                mArray = nullptr;
                mSize = 0;
                mArray = other.mSize ? new int[other.mSize] : nullptr;
                mSize = other.mSize;
            }
            std::copy(other.mArray, other.mArray + mSize, mArray);
        }
        return *this;
    }
    

    Explanation:

    One of the more expensive things you can do on modern hardware is make a trip to the heap. Anything you can do to avoid a trip to the heap is time & effort well spent. Clients of dumb_array may well assign arrays of the same size often. And when they do, all you need is a memcpy (hidden under std::copy). You don't want to allocate a new array of the same size and then deallocate the old one of the same size!
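
    To make the allocation argument concrete, here is a minimal sketch of a dumb_array with just enough scaffolding to observe the copy assignment operator at work. The constructors, destructor, the data() accessor and the helper same_size_assignment_reuses_buffer are assumptions added for illustration; only the operator= follows the approach shown above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

class dumb_array
{
public:
    explicit dumb_array(std::size_t size = 0)
        : mSize(size), mArray(size ? new int[size]() : nullptr) {}

    dumb_array(const dumb_array& other)
        : mSize(other.mSize), mArray(other.mSize ? new int[other.mSize] : nullptr)
    {
        std::copy(other.mArray, other.mArray + mSize, mArray);
    }

    ~dumb_array() { delete [] mArray; }

    // Fast copy assignment with the basic exception guarantee.
    dumb_array& operator=(const dumb_array& other)
    {
        if (this != &other)
        {
            if (mSize != other.mSize)
            {
                // Only a size change costs a trip to the heap.
                delete [] mArray;
                mArray = nullptr;
                mSize = 0;
                mArray = other.mSize ? new int[other.mSize] : nullptr;
                mSize = other.mSize;
            }
            std::copy(other.mArray, other.mArray + mSize, mArray);
        }
        return *this;
    }

    int* data() { return mArray; }

private:
    std::size_t mSize;
    int* mArray;
};

// Returns true when assigning an equally sized array kept the same buffer
// and still copied the elements.
bool same_size_assignment_reuses_buffer()
{
    dumb_array a(4), b(4);
    b.data()[0] = 42;
    int* before = a.data();
    a = b;
    return a.data() == before && a.data()[0] == 42;
}
```

    With a Copy and Swap operator= the same assignment would allocate a fresh buffer and free the old one, so the pointer comparison in the helper would normally fail.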

    Now for your clients who actually want strong exception safety:

    template <class C>
    C&
    strong_assign(C& lhs, C rhs)
    {
        using std::swap;  // fall back to std::swap when ADL finds no overload
        swap(lhs, rhs);
        return lhs;
    }
    

    Or maybe if you want to take advantage of move assignment in C++11 that should be:

    template <class C>
    C&
    strong_assign(C& lhs, C rhs)
    {
        lhs = std::move(rhs);
        return lhs;
    }
    

    If dumb_array's clients value speed, they should call the operator=. If they need strong exception safety, there are generic algorithms they can call that will work on a wide variety of objects and need only be implemented once.
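
    As a rough illustration of where the strong guarantee comes from, the sketch below uses a hypothetical type, flaky, whose copy constructor can be told to throw. The helper value_after_strong_assign is also made up for this example. The point is that the only throwing step, the copy into the by-value parameter, happens before strong_assign touches lhs.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

template <class C>
C&
strong_assign(C& lhs, C rhs)
{
    lhs = std::move(rhs);
    return lhs;
}

// Hypothetical type whose copy constructor can be made to throw.
struct flaky
{
    int value = 0;
    bool throw_on_copy = false;

    flaky() = default;
    explicit flaky(int v) : value(v) {}
    flaky(const flaky& other)
        : value(other.value), throw_on_copy(other.throw_on_copy)
    {
        if (other.throw_on_copy)
            throw std::runtime_error("copy failed");
    }
    flaky(flaky&&) noexcept = default;
    flaky& operator=(const flaky&) = default;
    flaky& operator=(flaky&&) noexcept = default;
};

// Returns the value held by lhs after an attempted strong_assign from src.
int value_after_strong_assign(int lhs_value, const flaky& src)
{
    flaky lhs(lhs_value);
    try
    {
        // The copy of src into the by-value parameter happens at this call.
        strong_assign(lhs, src);
    }
    catch (const std::runtime_error&)
    {
        // The exception escaped before strong_assign's body ran,
        // so lhs still holds its original value.
    }
    return lhs.value;
}
```

    When the copy succeeds, lhs takes the new value through a non-throwing move; when it fails, lhs is exactly as it was.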

    Now back to the original question (which has a typo at this point in time):

    Class&
    Class::operator=(Class&& rhs)
    {
        if (this == &rhs)  // is this check needed?
        {
           // ...
        }
        return *this;
    }
    

    This is actually a controversial question. Some will say yes, absolutely, some will say no.

    My personal opinion is no, you don't need this check.

    Rationale:

    When an object binds to an rvalue reference it is one of two things:

    1. A temporary.
    2. An object the caller wants you to believe is a temporary.

    If you have a reference to an object that is an actual temporary, then by definition, you have a unique reference to that object. It can't possibly be referenced from anywhere else in your entire program. I.e. this == &temporary is not possible.

    Now if your client has lied to you and promised you that you're getting a temporary when you're not, then it is the client's responsibility to be sure that you don't have to care. If you want to be really careful, I believe that this would be a better implementation:

    Class&
    Class::operator=(Class&& other)
    {
        assert(this != &other);
        // ...
        return *this;
    }
    

    I.e. if you are passed a self-reference, this is a bug on the part of the client that should be fixed.

    For completeness, here is a move assignment operator for dumb_array:

    dumb_array& operator=(dumb_array&& other)
    {
        assert(this != &other);
        delete [] mArray;
        mSize = other.mSize;
        mArray = other.mArray;
        other.mSize = 0;
        other.mArray = nullptr;
        return *this;
    }
    

    In the typical use case of move assignment, *this will be a moved-from object and so delete [] mArray; should be a no-op. It is critical that implementations make delete on a nullptr as fast as possible.

    Caveat:

    Some will argue that swap(x, x) is a good idea, or just a necessary evil. And this, if the swap goes to the default swap, can cause a self-move-assignment.

    I disagree that swap(x, x) is ever a good idea. If found in my own code, I will consider it a performance bug and fix it. But in case you want to allow it, realize that swap(x, x) only does self-move-assignment on a moved-from value. And in our dumb_array example this will be perfectly harmless if we simply omit the assert, or constrain it to the moved-from case:

    dumb_array& operator=(dumb_array&& other)
    {
        assert(this != &other || mSize == 0);
        delete [] mArray;
        mSize = other.mSize;
        mArray = other.mArray;
        other.mSize = 0;
        other.mArray = nullptr;
        return *this;
    }
    

    If you self-move-assign a moved-from (empty) dumb_array, you don't do anything incorrect aside from inserting useless instructions into your program. This same observation can be made for the vast majority of objects.
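
    The claim that swap(x, x) only self-move-assigns a moved-from value can be checked with a small sketch. Both names here are hypothetical: tracer records what its move assignment operator sees, and default_like_swap is assumed to mirror the usual three-move default swap rather than being any particular library's implementation.

```cpp
#include <cassert>
#include <utility>

// Hypothetical tracer type: records what its move assignment operator sees.
struct tracer
{
    int value = 0;
    bool moved_from = false;

    static int self_moves;          // self-move-assignments observed
    static int self_moves_on_live;  // of those, on an object not moved-from

    explicit tracer(int v) : value(v) {}

    tracer(tracer&& other) noexcept
        : value(other.value), moved_from(other.moved_from)
    {
        other.moved_from = true;
    }

    tracer& operator=(tracer&& other) noexcept
    {
        if (this == &other)
        {
            ++self_moves;
            if (!moved_from)
                ++self_moves_on_live;
            return *this;
        }
        value = other.value;
        moved_from = other.moved_from;
        other.moved_from = true;
        return *this;
    }
};

int tracer::self_moves = 0;
int tracer::self_moves_on_live = 0;

// Assumed shape of a default swap: move-construct, then two move assignments.
template <class T>
void default_like_swap(T& x, T& y)
{
    T tmp(std::move(x));   // x is now moved-from
    x = std::move(y);      // the self-move-assignment when &x == &y
    y = std::move(tmp);    // restores the original value
}

// Swaps an object with itself and returns the value it ends up with.
int self_swap_value()
{
    tracer x(5);
    default_like_swap(x, x);
    return x.value;
}
```

    After one call to self_swap_value(), the counters show a single self-move-assignment, and none on a live (not moved-from) object.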

    <Update>

    I've given this issue some more thought, and changed my position somewhat. I now believe that assignment should be tolerant of self assignment, but that the post conditions on copy assignment and move assignment are different:

    For copy assignment:

    x = y;
    

    one should have a postcondition that the value of y should not be altered. When &x == &y then this postcondition translates into: self copy assignment should have no impact on the value of x.

    For move assignment:

    x = std::move(y);
    

    one should have a postcondition that y has a valid but unspecified state. When &x == &y then this postcondition translates into: x has a valid but unspecified state. I.e. self move assignment does not have to be a no-op. But it should not crash. This postcondition is consistent with allowing swap(x, x) to just work:

    template <class T>
    void
    swap(T& x, T& y)
    {
        // assume &x == &y
        T tmp(std::move(x));
        // x and y now have a valid but unspecified state
        x = std::move(y);
        // x and y still have a valid but unspecified state
        y = std::move(tmp);
        // x and y have the value of tmp, which is the value they had on entry
    }
    

    The above works, as long as x = std::move(x) doesn't crash. It can leave x in any valid but unspecified state.

    I see three ways to program the move assignment operator for dumb_array to achieve this:

    dumb_array& operator=(dumb_array&& other)
    {
        delete [] mArray;
        // set *this to a valid state before continuing
        mSize = 0;
        mArray = nullptr;
        // *this is now in a valid state, continue with move assignment
        mSize = other.mSize;
        mArray = other.mArray;
        other.mSize = 0;
        other.mArray = nullptr;
        return *this;
    }
    

    The above implementation tolerates self assignment, but *this and other end up being a zero-sized array after the self-move assignment, no matter what the original value of *this is. This is fine.
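
    A minimal sketch of that behavior, assuming a stripped-down dumb_array with only a sizing constructor, a destructor and two accessors (all added for illustration). The self-move goes through a reference alias, which is just a way to write x = std::move(x) without tripping compiler self-move warnings.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

class dumb_array
{
public:
    explicit dumb_array(std::size_t size = 0)
        : mSize(size), mArray(size ? new int[size]() : nullptr) {}
    dumb_array(const dumb_array&) = delete;  // copying left out of this sketch
    ~dumb_array() { delete [] mArray; }

    dumb_array& operator=(dumb_array&& other)
    {
        delete [] mArray;
        // set *this to a valid state before continuing
        mSize = 0;
        mArray = nullptr;
        // on self-move, other is *this, so these read back the empty state
        mSize = other.mSize;
        mArray = other.mArray;
        other.mSize = 0;
        other.mArray = nullptr;
        return *this;
    }

    std::size_t size() const { return mSize; }
    const int* data() const { return mArray; }

private:
    std::size_t mSize;
    int* mArray;
};

// Self-move-assigns an array of the given size and reports whether it
// ended up as an empty, null array.
bool empty_after_self_move(std::size_t size)
{
    dumb_array a(size);
    dumb_array& alias = a;
    a = std::move(alias);
    return a.size() == 0 && a.data() == nullptr;
}
```

    The destructor then runs delete [] on a null pointer, which is a no-op, so there is no double delete.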

    dumb_array& operator=(dumb_array&& other)
    {
        if (this != &other)
        {
            delete [] mArray;
            mSize = other.mSize;
            mArray = other.mArray;
            other.mSize = 0;
            other.mArray = nullptr;
        }
        return *this;
    }
    

    The above implementation tolerates self assignment the same way the copy assignment operator does, by making it a no-op. This is also fine.

    dumb_array& operator=(dumb_array&& other)
    {
        swap(other);
        return *this;
    }
    

    The above is ok only if dumb_array does not hold resources that should be destructed "immediately". For example if the only resource is memory, the above is fine. If dumb_array could possibly hold mutex locks or the open state of files, the client could reasonably expect those resources on the lhs of the move assignment to be immediately released and therefore this implementation could be problematic.
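
    The resource concern can be sketched with a hypothetical handle class whose static live_count stands in for something like an open file or a held lock. Nothing here comes from dumb_array itself; it only shows that a swap-based move assignment parks the old lhs resource in the source object instead of releasing it.

```cpp
#include <cassert>
#include <utility>

// Hypothetical resource holder: live_count tracks resources currently held.
class handle
{
public:
    static int live_count;

    explicit handle(bool acquire = false) : mOwns(acquire)
    {
        if (mOwns)
            ++live_count;
    }
    handle(const handle&) = delete;
    ~handle()
    {
        if (mOwns)
            --live_count;
    }

    // Swap-based move assignment: nothing is released here.
    handle& operator=(handle&& other) noexcept
    {
        std::swap(mOwns, other.mOwns);
        return *this;
    }

private:
    bool mOwns;
};

int handle::live_count = 0;

// Returns live_count right after the move assignment, while the source
// object is still in scope.
int live_after_swap_based_move()
{
    handle lhs(true);
    handle rhs(true);
    lhs = std::move(rhs);
    // lhs's old resource now sits in rhs and is still unreleased.
    return handle::live_count;
}
```

    A client expecting immediate release would expect one live resource after the assignment; with the swap-based operator there are still two until rhs is destroyed.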

    The cost of the first is two extra stores. The cost of the second is a test-and-branch. Both work. Both meet all of the MoveAssignable requirements in Table 22 of the C++11 standard. The third also works, modulo the non-memory-resource concern.

    All three implementations can have different costs depending on the hardware: How expensive is a branch? Are there lots of registers or very few?

    The take-away is that self-move-assignment, unlike self-copy-assignment, does not have to preserve the current value.

    </Update>

    One final (hopefully) edit inspired by Luc Danton's comment:

    If you're writing a high level class that doesn't directly manage memory (but may have bases or members that do), then the best implementation of move assignment is often:

    Class& operator=(Class&&) = default;
    

    This will move assign each base and each member in turn, and will not include a this != &other check. This will give you the very highest performance and basic exception safety assuming no invariants need to be maintained among your bases and members. For your clients demanding strong exception safety, point them towards strong_assign.
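
    For instance, a sketch of such a high-level class might look like the following. The record class, its members and the helper move_assign_from are hypothetical; the only memory management is done by std::string and std::vector.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical high-level class: its members manage memory, it does not.
class record
{
public:
    record() = default;
    record(std::string name, std::vector<int> samples)
        : mName(std::move(name)), mSamples(std::move(samples)) {}

    record(record&&) = default;
    record& operator=(record&&) = default;  // member-wise move assignment

    const std::string& name() const { return mName; }
    const std::vector<int>& samples() const { return mSamples; }

private:
    std::string mName;
    std::vector<int> mSamples;
};

// Move-assigns src into a default-constructed record and returns it.
record move_assign_from(record src)
{
    record dst;
    dst = std::move(src);
    return dst;
}
```

    The defaulted operator simply forwards to the move assignment operators of mName and mSamples, so any self-assignment behavior is whatever those members provide.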