I'm working on code examples that demonstrate how you can "shoot yourself in the foot" using pointers in C++.
It's easy to create code that crashes. But now I'm trying to write code that would change the value of a constant, and it's not working.
Here's a sample code:
#include <iostream>
using namespace std;

int main()
{
int first = 1;
int second = 2;
const int the_answer = 42;
int third = 3;
int fourth = 4;
cout << "First : " << &first << " -> " << first << endl;
cout << "Second : " << &second << " -> " << second << endl;
cout << "TheAns : " << &the_answer << " -> " << the_answer << endl;
cout << "Third : " << &third << " -> " << third << endl;
cout << "Fourth : " << &fourth << " -> " << fourth << endl << endl;
for (int *pc = &second; pc < &third; pc++) {
*pc = 33;
cout << pc << "->" << * pc << endl;
}
cout << "First : " << &first << " -> " << first << endl;
cout << "Second : " << &second << " -> " << second << endl;
cout << "TheAns : " << &the_answer << " -> " << the_answer << endl;
cout << "Third : " << &third << " -> " << third << endl;
cout << "Fourth : " << &fourth << " -> " << fourth << endl << endl;
return 0;
}
I can see in the output that the memory at the constant's address (0x56F40FF574) gets overwritten:
First : 00000056F40FF534 -> 1
Second : 00000056F40FF554 -> 2
TheAns : 00000056F40FF574 -> 42
Third : 00000056F40FF594 -> 3
Fourth : 00000056F40FF5B4 -> 4
00000056F40FF554->33
00000056F40FF558->33
00000056F40FF55C->33
00000056F40FF560->33
00000056F40FF564->33
00000056F40FF568->33
00000056F40FF56C->33
00000056F40FF570->33
00000056F40FF574->33 <---
00000056F40FF578->33
00000056F40FF57C->33
00000056F40FF580->33
00000056F40FF584->33
00000056F40FF588->33
00000056F40FF58C->33
00000056F40FF590->33
First : 00000056F40FF534 -> 1
Second : 00000056F40FF554 -> 33
TheAns : 00000056F40FF574 -> 42
Third : 00000056F40FF594 -> 3
Fourth : 00000056F40FF5B4 -> 4
I stepped through the code with the debugger and saw the value of the constant the_answer change in the "Locals" window. But then cout displays the original value.
Your code has multiple issues, and what you observe is the result of undefined behavior. Unless you understand how much undefined behavior matters and carefully avoid it, your C++ code will not do what you naively expect it to do. C++ is not just glorified assembly (nor is C); it has its own very strict rules.
You wrote that you want examples that demonstrate how you can "shoot yourself in the foot" using pointers in C++.
Yes, you have just demonstrated some footguns in C++, although not the ones you expected.
Let's check what the C++ standard has to say.
In dcl.type.cv:
Any attempt to modify ([expr.ass], [expr.post.incr], [expr.pre.incr]) a const object ([basic.type.qualifier]) during its lifetime ([basic.life]) results in undefined behavior.
followed by some examples. In your case, even if *pc = 33 did assign to the object the_answer (it actually does not, and we'll soon see why), it would be undefined behavior, because you would be attempting to modify a const object within its lifetime.
In any case, a compiler is free to assume that the value of the object the_answer never changes, because it is a const object, and is free to optimize
cout << "TheAns : " << &the_answer << " -> " << the_answer << endl;
to
cout << "TheAns : " << &the_answer << " -> " << 42 << endl;
If this optimization is performed (it is a well-known compiler technique called constant propagation), it is quite natural that you observe the memory at &the_answer change while the output is still 42.
Now that we have clarified this part, let's look at the other problems in your code.
In C++, you cannot compare two arbitrary pointers using the built-in relational operators and get a reliable result. You may expect them to simply compare the addresses, but that is not what the standard says. Let's check expr.rel:
The result of comparing unequal pointers to objects is defined in terms of a partial order consistent with the following rules:
- If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript is required to compare greater.
- If two pointers point to different non-static data members of the same object, or to subobjects of such members, recursively, the pointer to the later declared member is required to compare greater provided neither member is a subobject of zero size and their class is not a union.
- Otherwise, neither pointer is required to compare greater than the other.
If two operands p and q compare equal ([expr.eq]), p<=q and p>=q both yield true and p<q and p>q both yield false. Otherwise, if a pointer to object p compares greater than a pointer q, p>=q, p>q, q<=p, and q<p all yield true and p<=q, p<q, q>=p, and q>p all yield false. Otherwise, the result of each of the operators is unspecified.
In particular, this means an implementation can give you the result that pc > &third and pc < &third are both true, or both false. This is not undefined behavior, though, only unspecified. If you really mean to compare addresses, you can use reinterpret_cast<std::uintptr_t>(p) < reinterpret_cast<std::uintptr_t>(q), which makes the result implementation-defined (note that the standard does not guarantee that these reinterpret_casts give you the addresses either). Or you can consider std::less{}(p, q), which at least guarantees a total order (so that std::less{}(p, q) and std::less{}(q, p) can never both be true).
Now let's further assume your compiler does what you expect and compares the addresses. We then run into undefined behavior related to pointer arithmetic.
The expression pc++ is equivalent to pc = pc + 1, so let's check the validity of that. expr.add says:
When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
- If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
- Otherwise, if P points to a (possibly-hypothetical) array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
- Otherwise, the behavior is undefined.
That means that, except for the case of adding 0 to nullptr, pointer arithmetic is only valid within a single array (including its past-the-end pointer). In particular, a variable of non-array type can be considered an array of size one, and since pc originally points to the single element of that hypothetical array, after pc++ it becomes the past-the-end pointer of the array. The pointer arithmetic here is fine so far.
However, expr.unary.op states,
The unary * operator performs indirection. Its operand shall be a prvalue of type “pointer to T”, where T is an object or function type. The operator yields an lvalue of type T. If the operand points to an object or function, the result denotes that object or function; otherwise, the behavior is undefined except as specified in [expr.typeid].
and basic.compound has this to say:
A value of a pointer type that is a pointer to or past the end of an object represents the address of the first byte in memory ([intro.memory]) occupied by the object or the first byte in memory after the end of the storage occupied by the object, respectively.
[Note 2: A pointer past the end of an object ([expr.add]) is not considered to point to an unrelated object of the object's type, even if the unrelated object is located at that address. — end note]
In particular, Note 2 rules out the possibility of using *pc to refer to the object the_answer even if it happens to be at the same address (which is, again, not guaranteed at all; see footnote 1). So this makes your *pc = 33 (after incrementing pc) undefined behavior by itself, regardless of whether the_answer is const or not, and regardless of whether printing &the_answer and pc shows you the same address.
Since your code has at least two instances of undefined behavior, there is no guarantee about the observed behavior of your program. While most answers merely state that modifying a const object is UB (which it is), this answer explains why *pc = 33 is not even a valid attempt to modify the_answer (*pc does not refer to the_answer after pc is incremented).
1. As @Peter comments, the_answer may not even exist in memory in general if it is never ODR-used. In your case, std::cout << &the_answer counts as an ODR-use of the_answer, so it is guaranteed to be in memory, but << the_answer is not an ODR-use and the compiler is free not to emit any memory read instructions.