Does the standard require that operator->() is defined for non-contiguous past-the-end iterators?
Background:
- It is allowed for operator*() to exhibit undefined behavior when the iterator is past-the-end. That is explicit at https://en.cppreference.com/w/cpp/iterator, section "Dereferenceability and validity", which says: "Values of an iterator i for which the expression *i is defined are called dereferenceable. The standard library never assumes that past-the-end values are dereferenceable."
- For contiguous iterators, it is not allowed for operator->() to exhibit undefined behavior when the iterator is past-the-end. This can be inferred from two sections at cppreference: (1) At https://en.cppreference.com/w/cpp/iterator/contiguous_iterator, the "Semantic requirements" section defines the non-dereferenceable iterator c and states requirements for std::to_address(c) which imply that std::to_address(c) does not exhibit undefined behavior. (2) At https://en.cppreference.com/w/cpp/memory/to_address, a "Possible implementation" is given in which std::to_address depends on operator->(). EDIT: The "possible implementation" does not use operator->() if pointer_traits is defined for the iterator; in that case it seems allowed for operator->() to not be defined for end iterators.

It is less clear whether non-contiguous iterators are allowed to implement operator->() so that its behavior is undefined for past-the-end iterators. Here are various pieces of "evidence"/"hints" I found related to this:
- std::random_access_iterator does not require that operator->() is defined at all. It is simply not mentioned among the requirements defined at https://en.cppreference.com/w/cpp/iterator/random_access_iterator, or the requirements they depend on, AFAICS.
- The stronger concept, std::contiguous_iterator, requires operator->(): as mentioned above, that can be inferred since std::to_address must be defined, and the "Possible implementation" for std::to_address uses operator->(). The page https://en.cppreference.com/w/cpp/memory/to_address explicitly mentions std::contiguous_iterator. Since it does not mention other iterator types, this does not imply anything for non-contiguous iterators, IIUC.
- operator->() is defined for LegacyInputIterator and stronger (see the table at https://en.cppreference.com/w/cpp/named_req/InputIterator), and this is qualified by "Precondition: i is dereferenceable". In other words, there is an explicit exception allowing past-the-end iterators to exhibit undefined behavior. I don't see this precondition removed for any of the stronger legacy iterator categories (FWIW not even https://en.cppreference.com/w/cpp/named_req/ContiguousIterator, so that seems like a difference between LegacyContiguousIterator and std::contiguous_iterator).
- The "Dereferenceability and validity" section quoted above says that operator*() does not need to be defined for past-the-end iterators, but it does not mention operator->(). So this may suggest that the exception that allows some operations on past-the-end iterators to be undefined does not necessarily apply to operator->().

So it is easy to get confused, and I see quite a bit of "evidence"/"hints" related to non-contiguous iterators, operator->(), and past-the-end. But I could not find an explicit requirement that settles whether non-contiguous iterators are allowed to exhibit undefined behavior in operator->() when the iterator is past-the-end. Does anyone have a more definite answer?
Edit: Thanks for several helpful comments. To give some background, possibly answering some of the replies: The practical side, slightly simplified, is that I defined iterator wrappers, i.e., custom iterator classes whose implementation depends on an existing ("wrapped") iterator. The type of the wrapped iterator is given as a template parameter and the wrapped iterator is given as a constructor argument. I carelessly assumed that I could define operator->() in the wrapper class as &*wrapped_iterator. This worked on clang, gcc, MSVC release builds, and even in MSVC debug builds for non-contiguous iterators. However, it resulted in assertion failures in MSVC debug builds for contiguous iterators, in functions like std::vector::assign(first,last) where first and last are wrappers around contiguous iterators and both are past-the-end iterators. The reason was that MSVC's vector::assign invokes std::to_address(first) even if first==last, which I didn't anticipate. In turn, std::to_address invoked first.operator->() for my contiguous iterator wrapper. Since I had defined operator->() as &*wrapped_iterator, it invoked operator* on the wrapped iterator, whose behavior under the given conditions is undefined. In this particular case it resulted in an assertion failure, because MSVC's debug mode has special code that checks things like this (_ITERATOR_DEBUG_LEVEL).
So I need to change my iterator wrapper's implementation of operator->(). My first idea was to make it invoke operator->() on the wrapped iterator. However, that is not guaranteed to be defined (see the accepted answer). What I have to do instead is invoke std::to_address on the wrapped iterator.
By my reading of the standard, a random_access_iterator is not required to define operator->. Thus, generic algorithms must only use operator*. Of course, concrete iterators are allowed to define additional member functions; AFAIK the standard poses no requirements on those. So a non-contiguous iterator with UB in end.operator->() should be fine; C++20 algorithms should not be calling operator-> at all.
(I think the rationale here is that some iterators might want to return elements by value, but operator-> is required to return a pointer. So those iterators are forced to not define any operator->.)
It's only contiguous_iterator that introduces the operator-> requirement (indirectly via std::to_address). There, despite the common "obviously it's undefined behavior" reflex in the comments here, it must not have undefined behavior: std::to_address(c) == std::to_address(a) + std::iter_difference_t<I>(c - a) must hold even when c is a past-the-end iterator. This makes sense when you consider that a past-the-end contiguous iterator can be converted to a past-the-end pointer via c.operator->().
The situation is different for legacy iterators (pre-C++20 algorithms): these expect that a->m is equivalent to (*a).m (see the table in [input.iterators]). Since (*a).m requires a to be dereferenceable, this also allows UB in operator-> for non-contiguous past-the-end iterators.
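For dereferenceable iterators the legacy equivalence is easy to observe. The following sketch uses a std::list of pairs as an example of a non-contiguous container; the function name arrow_matches_star is hypothetical.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// For every dereferenceable iterator a in the list, checks that
// a->second and (*a).second refer to the same object.
// The loop never touches end(), where both expressions would be UB.
inline bool arrow_matches_star(
    const std::list<std::pair<int, std::string>>& items) {
    for (auto a = items.begin(); a != items.end(); ++a) {
        if (&a->second != &(*a).second) return false;
    }
    return true;
}
```

The check deliberately stops before end(): the equivalence only says something where *a itself is defined.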
Note that these legacy iterator requirements are still used in C++20, e.g. for std::sort; it's only the newer C++20-only functions like std::ranges::sort that use the new concept-based iterator requirements.
(By the way, language-lawyer questions like this one should be based on the actual standard text, not cppreference. In this case, though, it turns out there is no significant difference between them.)