I've inherited a codebase with a fondness for const references. I found a devilish bug and am wondering how it ever slipped past the compiler. My C++ is super rusty, so I would appreciate some guidance.
As a MRE, suppose you have a struct and a function that returns an instance by value:
typedef struct
{
int a;
int b;
} Data;
Data Make_Data(int aVal)
{
Data ret = { aVal, 2*aVal };
return ret;
}
It is illegal to pass that to a method expecting a reference:
class Bar
{
public:
Bar(Data& theData) : data(theData) {};
Data& data;
};
int main()
{
Bar *bar1 = new Bar(Make_Data(5)); //error: candidate constructor not viable: expects an lvalue for 1st argument
}
But legal if it's a const ref:
class Foo
{
public:
Foo(const Data& theData) : data(theData) {};
const Data& data;
};
int main()
{
Foo *foo1 = new Foo(Make_Data(1));
}
The problem is, I believe, the return value of Make_Data
is on the stack. A reference to that will promptly go out of scope. Sure enough, if I put this code into an online repl the data is corrupted for certain optimisation settings.
I understand the various arguments for pass by value vs pass by const ref. But I suppose I'm missing the bit about why it's okay to pass a return value (which is a temporary, right?) as a const reference. Not from an for-optimal-performance point of view, but from why-at-all point of view. Is this a unique use-case for a const, where it prevents an additional full copy but still enables individual field access? Or just a case of caveat emptor?
By the way, I think another way to phrase the question would be, why is the restriction on const reference lifetime expansion not enforced? I've already found myself applying Guideline C.12 a few times, but this is the first instance of "ref to temp" I've found.
Full MRE for your own experimentation if the repl link above is broken:
#include <iostream>
#include <string>
typedef struct
{
int a;
int b;
} Data;
Data Make_Data(int aVal, int bVal)
{
Data ret = { aVal, bVal };
return ret;
}
class Foo
{
public:
Foo(const Data& theData) : data(theData) {};
const Data& data;
};
class Bar
{
public:
Bar(Data& theData) : data(theData) {};
Data& data;
};
int main()
{
Foo *foo1 = new Foo(Make_Data(1, 2));
Foo *foo2 = new Foo(Make_Data(3, 4));
//Bar *bar1 = new Bar(Make_Data(5, 6)); //error: candidate constructor not viable: expects an lvalue for 1st argument
std::cout << "a=" << foo1->data.a << ", b=" << foo1->data.b << std::endl;
std::cout << "a=" << foo2->data.a << ", b=" << foo2->data.b << std::endl;
delete foo1;
delete foo2;
}
The problem is, I believe, the return value of Make_Data is on the stack. A reference to that will promptly go out of scope.
No, the problem is that the reference goes out of scope. Your phrasing suggests that going out of scope is a consequence of the problem, but it is the problem. With a small change that extends the lifetime, a value on the stack is not problematic.
int main()
{
const Data& value_ref = Make_Data(1); // Still binding to a const reference
Foo *foo1 = new Foo(value_ref);
// Do stuff -- reference is valid
delete foo1;
}
Thus, the problem is not "on the stack". The problem is "go out of scope".
Another red herring is the use of a temporary object. The dangling reference is not exclusive to temporary objects. In the above code, I gave the temporary a name, which had the side-effect of keeping the object alive. With another small change, the named object can be forced out of scope early, leaving you with the same dangling reference.
int main()
{
Foo *foo1 = nullptr;
{
// This time, make `value` a non-reference so it is clearly not a
// temporary object.
Data value = Make_Data(1);
foo1 = new Foo(value);
// The non-temporary `value` goes out of scope here.
}
// Reference is dangling at this point.
delete foo1; // A smart pointer would simplify this part
}
While the above is somewhat contrived, it could easily happen organically in a more complex program, especially since the Foo
object is dynamically allocated (so it easily outlives the current scope).
In summary, your proposed remedy (disallow binding a temporary object to a const reference) does not address the root problems, so it would be an inadequate solution.
But I suppose I'm missing the bit about why it's okay to pass a return value (which is a temporary, right?) as a const reference.
Because it's useful. For example, if binding a temporary object to a const reference was illegal, many compound expressions – something with more than one operator — would no longer be possible without unnecessary copying of parameters.
Consider string concatenation. Let's say you have three strings that you want to combine.
std::string hello = "Hello";
std::string world = " world";
std::string bang = "!";
// Parentheses are not needed, but might add clarity.
std::string greeting = (hello + world) + bang;
The initialization of greeting
twice uses an operator+
with the following signature.
std::string operator+(const std::string& lhs, const std::string& rhs);
This operator returns a temporary object, just like your Make_Data
. So when the sub-expression hello + world
is evaluated, the result is a temporary object. This temporary object becomes the first argument to the second invocation of operator+
(where the second argument is bang
). That is, the temporary object binds to const std::string& lhs
, while bang
binds to const std::string& rhs
(it is not copied). Useful and perfectly safe (because operator+
does not try to retain a reference, unlike your Foo
constructor).
In other words, your proposed remedy (disallow binding a temporary object to a const reference) would cause more problems than it solves.