c++ iterator language-lawyer c++20

Does the standard require `operator->()` to be defined for past-the-end non-contiguous iterators?


Does the standard require that operator->() is defined for non-contiguous past-the-end iterators?

Background:

It is less clear whether non-contiguous iterators are allowed to implement operator->() so that its behavior is undefined for past-the-end iterators. Here are various pieces of "evidence"/"hints" I found related to this:

So it is easy to get confused: I see quite a bit of "evidence"/"hints" related to non-contiguous iterators, operator->(), and past-the-end, but no explicit requirement, as far as I could find, that settles whether non-contiguous iterators are allowed to exhibit undefined behavior in operator->() when the iterator is past-the-end. Does anyone have a more definite answer?

Edit: Thanks for several helpful comments. To give some background, possibly answering some of the replies: The practical side, slightly simplified, is that I defined iterator wrappers, i.e., custom iterator classes whose implementation depends on an existing ("wrapped") iterator. The type of the wrapped iterator is given as a template parameter and the wrapped iterator is given as a constructor argument. I carelessly assumed that I could define operator->() in the wrapper class as &*wrapped_iterator. This worked on clang, gcc, MSVC release builds, and even in MSVC debug builds for non-contiguous iterators. However, it resulted in assertion failures in MSVC debug builds for contiguous iterators, in functions like std::vector::assign(first, last) where first and last are wrappers around contiguous iterators and are both past-the-end. The reason was that MSVC's vector::assign invokes std::to_address(first) even if first == last, which I didn't anticipate. In turn, std::to_address invoked first.operator->() on my contiguous iterator wrapper. Since I had defined operator->() as &*wrapped_iterator, it invoked operator* on the wrapped iterator, whose behavior is undefined for a past-the-end iterator. In this particular case it resulted in an assertion failure, because MSVC's debug mode has special code that checks things like this (_ITERATOR_DEBUG_LEVEL).

So I need to change my iterator wrapper's implementation of operator->(). My first idea was to make it invoke operator->() on the wrapped iterator. However, that is not guaranteed to be defined (see the accepted answer). What I have to do instead is invoke std::to_address on the wrapped iterator.


Solution

  • By my reading of the standard, a random_access_iterator is not required to define operator->. Thus, generic algorithms must only use operator*. Of course, concrete iterators are allowed to define additional member functions; AFAIK the standard imposes no requirements on those. So a non-contiguous iterator with UB in end.operator->() should be fine; C++20 algorithms should not be calling operator-> at all.

    (I think the rationale here is that some iterators might want to return elements by value, but operator-> is required to return a pointer. So those iterators are forced to not define any operator->.)

    It's only contiguous_iterator that introduces the operator-> requirement (indirectly via std::to_address). There, despite the common "obviously it's undefined behavior" reflex in the comments here, it must not have undefined behavior: std::to_address(c) == std::to_address(a) + std::iter_difference_t<I>(c - a) must hold even when c is a past-the-end iterator. This makes sense when you consider that a past-the-end contiguous iterator can be converted to a past-the-end pointer via std::to_address(c).

    The situation is different for legacy iterators (pre-C++20 algorithms): these expect that a->m is equivalent to (*a).m ([input.iterators] table). Since (*a).m is only defined when a is dereferenceable, this also allows UB in operator-> for past-the-end non-contiguous iterators.

    Note that these legacy iterator requirements are still used in C++20, e.g. for std::sort; it's only the newer C++20-only functions like std::ranges::sort that use the new concept-based iterator requirements.

    (By the way, language-lawyer questions like this one should be based on the actual standard text, not cppreference. Though in this case it turns out there's no significant difference between them.)