c++ string language-lawyer std

Can I dereference std::string.end()?


I believe the common answer is "no": the end() iterator of a container is a "past-the-end" value, and dereferencing it is undefined behavior. I can't find an explicit statement in the standard that exempts strings from this rule, even though strings get special treatment that other containers do not.

Since C++11, the standard allows reading one index past the last character of a string: string[size()] refers to a null terminator that is effectively read-only.

24.3.2.5 basic_string element access [string.access]

const_reference operator[](size_type pos) const;

reference operator[](size_type pos);

(1) Requires: pos <= size().

(2) Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object to any value other than charT() leads to undefined behavior.
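To make the quoted wording concrete, here is a minimal sketch, assuming a C++11 or later compiler. The helper names (read_terminator, index_matches_iterator, demo_element_access) are hypothetical and exist only for illustration. The sketch reads the element at pos == size() and checks that indexing and iterator arithmetic name the same object for every pos < size(), without ever dereferencing end().

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reads the element at index size().
// Since C++11, operator[](size()) is specified to refer to an object with value charT().
char read_terminator(const std::string& s) {
    return s[s.size()];  // pos == size() satisfies "Requires: pos <= size()"
}

// Hypothetical helper: for every pos < size(), checks that s[pos] and
// *(s.begin() + pos) refer to the same object. end() is never dereferenced.
bool index_matches_iterator(const std::string& s) {
    typedef std::string::difference_type diff_t;
    for (std::string::size_type pos = 0; pos < s.size(); ++pos) {
        if (&s[pos] != &*(s.begin() + static_cast<diff_t>(pos))) {
            return false;
        }
    }
    return true;
}

// Small driver so the checks can be run from main() if desired.
void demo_element_access() {
    const std::string s = "abc";
    assert(read_terminator(s) == '\0');
    assert(read_terminator(std::string()) == '\0');  // empty string: s[0] is the terminator
    assert(index_matches_iterator(s));
}
```

Calling demo_element_access() from main() should complete without an assertion firing. Note that the loop deliberately stops at pos < size(), because that is the only range for which the excerpt equates operator[] with *(begin() + pos).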

front() is defined to be equivalent to return operator[](0), which for an empty string is the same as return operator[](size()).

end() - begin() is well defined and equals the length of the string, so in any sane implementation end() must refer to the position at index size() for that arithmetic to work.
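The arithmetic claim can be checked without dereferencing anything. The following is a small sketch with hypothetical helper names (iterator_span, demo_iterator_span). It only shows that end() - begin() equals size(); it says nothing about whether *end() is defined.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: distance from begin() to end(), computed purely with
// iterator arithmetic. Neither iterator is dereferenced.
std::string::difference_type iterator_span(const std::string& s) {
    return s.end() - s.begin();
}

// Small driver so the check can be run from main() if desired.
void demo_iterator_span() {
    const std::string s = "hello";
    typedef std::string::difference_type diff_t;
    assert(iterator_span(s) == static_cast<diff_t>(s.size()));
    assert(iterator_span(std::string()) == 0);  // empty string: begin() == end()
}
```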

The excerpt above states that operator[](pos) is equivalent to *(begin() + pos) if pos < size(). It does not say that you can dereference begin() + size(). Is it nevertheless reasonable to assume that this is well defined? Or better yet, do you know of wording that exempts string iterators from the past-the-end constraint?

Additionally, can it be proven that *(begin() + i) for any i is equivalent to operator[](i)?


Solution

  • From the definition of string.end():

    Returns: An iterator which is the past-the-end value.

    and from the definition for past-the-end:

    ... Such a value is called a past-the-end value. Values of an iterator i for which the expression *i is defined are called dereferenceable. The library never assumes that past-the-end values are dereferenceable. ...

    The emphasis is mine, and I would expect any exception made for std::string to be stated in the definition of end() quoted first. Since no such exception appears there, dereferencing std::string.end() is undefined by omission.
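If the goal is simply to read the terminator or the last character, there are ways to do so that do not rely on *end() at all. The sketch below assumes C++11 or later, where c_str() returns a pointer p with p + i == &operator[](i) for every i in [0, size()]. The helper names (terminator_via_pointer, last_char_or, demo_safe_access) are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reads the terminator through the c_str() pointer
// instead of through the end() iterator.
char terminator_via_pointer(const std::string& s) {
    return *(s.c_str() + s.size());
}

// Hypothetical helper: returns the last character, or a caller-supplied
// fallback for an empty string, so that only end() - 1 is ever dereferenced.
char last_char_or(const std::string& s, char fallback) {
    if (s.empty()) {
        return fallback;
    }
    return *(s.end() - 1);  // dereferenceable because the string is non-empty
}

// Small driver so the checks can be run from main() if desired.
void demo_safe_access() {
    const std::string s = "abc";
    assert(terminator_via_pointer(s) == '\0');
    assert(last_char_or(s, '?') == 'c');
    assert(last_char_or(std::string(), '?') == '?');
}
```

The design choice here is to stay within operations the standard explicitly defines (indexing up to size(), pointer access through c_str(), and iterators strictly before end()) rather than depend on how a particular implementation happens to represent end().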