c++vectoriteratorstdstdvector

Can the back() iterator of a vector be safely assumed to be the end() iterator after a pop_back()?


My problem is the following :

std::vector<struct pollfd> vec = { ... }; // Actually a member variable on a Server object

for (auto iter = vec.begin(); iter != vec.end(); ) {
    if (client_closed_connection(*iter)) {
        *iter = std::move(*vec.back()); // Move item to delete to the end
        vec.pop_back(); // perform deletion
        // Dont increment cause we dont want to skip the newly moved element
    } else {
        process_client_request(*iter);
        iter++;
    }
}

I'm not using "remove and erase" idiom or similar, because the order of the elements doesn't matter, so I dont want to waste resources trying to preserve it (its code for a list of epoll() fds of clients connected to a server, so deletions while iterating the vector is expected and will happen quite often as clients get their response and disconnect).

While this makes sense to me, and seems to work in practice, I can't help but wonder if i'm potentially relying on UB.

The only issue, is if client_closed_connection() returns false for the last valid element (for when iter == back()). In this case, *iter = std::move(*vec.back()) will essentially do nothing (self assignment), but then we pop_back() which according to cpprefernce "invalidates iterator to last element, as well as end() iterator". In our case, iter would thus be invalidated, but is then compared to vec.end() to end the loop.

My questions are as follow :

And also, why isn't there a standard way to remove elements without preserving the order ? I dont think there are that many usecases where the ordering would matter, and at least having the choice in routines would be nice (eg: erase and quick_erase)


Solution

  • To answer the specific question:

    Can the back() iterator of a vector be safely assumed to be the end() iterator after a pop_back()?

    According to the standard: no.

    From the standard:

    From: n4981
    24.3.11.5 Modifiers [vector.modifiers]
    constexpr void pop_back();
    Effects: Invalidates iterators and references at or after the point of the erase.

    The pop_back removes the last element of the vector. Thus, any iterators that refer to the last element or end are "invalid".

    I could not find a definition for "Invalidates" in the standard so we can not assume anything, which includes that we can not assume that an iterator that was previously on end is valid.

    So I would say this is likely an invalid program according to the standard.


    Alternative 1.

    Rather than using pop_back() you could use erase(--end()).

    Note this question: https://stackoverflow.com/a/263958/14065

    Basically: In C++11 erase() was modified to return the iterator of the next item (because otherwise it is not intuaitve how to erase and continue iterating a vector).

    for (auto loop = std::begin(cont); loop != std::end(cond);)
    {
        if (shouldIeraseElement(*loop)) {
            loop = cont.erase(loop);
        }
        else {
            ++loop;
        }
    }
    

    Not the exact problem you have, but a similar pattern.


    Alternative 2.

    You can test if you are about to remove the last element and take explicit action to break out of the loop.


    Alternative 3.

    I would change how you write this:

    In C++, there is a standard idiom called erase/remove.

    1. You do a remove which moves all the bad items to end of the container.
    2. You erase (delete from the container) all the bad items.

    This is how I might write it:

    template<typename I, typename A>
    I remove_non_stable_if(I begin, I end, A&& action)
    {
        for(auto loop = begin; loop != end;)
        {
            if (action(*loop)) {
                std::iter_swap(loop, --end);
            }
            else {
                ++loop;
            }
        }
        return end;
    }
    .....
    
    auto bad = remove_non_stable_if(std::begin(vec), std::end(vec), [](pollfd const& fd)
        {
            if (client_closed_connection(fd)) {
                return true;
            }
            process_client_request(fd);
            return false;
        }
    );
    vec.erase(bad, std::end(vec));
    

    The advantage:

    Note: std::remove_if is stable, thus it will do more moves than your initial algorithm. BUT It should be simple to write a non-stable version, but still uses the erase/remove idiom.