c++

Why is assigning a temporary value to a const ref allowed?


I've inherited a codebase with a fondness for const references. I found a devilish bug and am wondering how it ever slipped past the compiler. My C++ is super rusty, so I would appreciate some guidance.

As a MRE, suppose you have a struct and a function that returns an instance by value:

typedef struct
{
    int a;
    int b;
} Data;

Data Make_Data(int aVal)
{
    Data ret = { aVal, 2*aVal };
    return ret;
}

It is illegal to pass that to a method expecting a reference:

class Bar
{
public:
    Bar(Data& theData) : data(theData) {};
    
    Data& data;
};

int main()
{
  Bar *bar1 = new Bar(Make_Data(5)); //error: candidate constructor not viable: expects an lvalue for 1st argument
}

But legal if it's a const ref:

class Foo
{
public:
    Foo(const Data& theData) : data(theData) {};
    
    const Data& data;
};

int main()
{
  Foo *foo1 = new Foo(Make_Data(1));
}

The problem is, I believe, the return value of Make_Data is on the stack. A reference to that will promptly go out of scope. Sure enough, if I put this code into an online repl the data is corrupted for certain optimisation settings.

I understand the various arguments for pass by value vs pass by const ref. But I suppose I'm missing the bit about why it's okay to pass a return value (which is a temporary, right?) as a const reference. Not from an for-optimal-performance point of view, but from why-at-all point of view. Is this a unique use-case for a const, where it prevents an additional full copy but still enables individual field access? Or just a case of caveat emptor?

By the way, I think another way to phrase the question would be, why is the restriction on const reference lifetime expansion not enforced? I've already found myself applying Guideline C.12 a few times, but this is the first instance of "ref to temp" I've found.


Full MRE for your own experimentation if the repl link above is broken:

#include <iostream>
#include <string>

typedef struct
{
    int a;
    int b;
} Data;

Data Make_Data(int aVal, int bVal)
{
    Data ret = { aVal, bVal };
    return ret;
}

class Foo
{
public:
    Foo(const Data& theData) : data(theData) {};
    
    const Data& data;
};

class Bar
{
public:
    Bar(Data& theData) : data(theData) {};
    
    Data& data;
};

int main()
{
  Foo *foo1 = new Foo(Make_Data(1, 2));
  Foo *foo2 = new Foo(Make_Data(3, 4));
  //Bar *bar1 = new Bar(Make_Data(5, 6)); //error: candidate constructor not viable: expects an lvalue for 1st argument
  std::cout << "a=" << foo1->data.a << ", b=" << foo1->data.b << std::endl;
  std::cout << "a=" << foo2->data.a << ", b=" << foo2->data.b << std::endl;
  delete foo1;
  delete foo2;
}

Solution

  • The problem is, I believe, the return value of Make_Data is on the stack. A reference to that will promptly go out of scope.

    No, the problem is that the reference goes out of scope. Your phrasing suggests that going out of scope is a consequence of the problem, but it is the problem. With a small change that extends the lifetime, a value on the stack is not problematic.

    int main()
    {
      const Data& value_ref = Make_Data(1); // Still binding to a const reference
      Foo *foo1 = new Foo(value_ref);
      // Do stuff -- reference is valid
      delete foo1;
    }
    

    Thus, the problem is not "on the stack". The problem is "go out of scope".

    Another red herring is the use of a temporary object. The dangling reference is not exclusive to temporary objects. In the above code, I gave the temporary a name, which had the side-effect of keeping the object alive. With another small change, the named object can be forced out of scope early, leaving you with the same dangling reference.

    int main()
    {
      Foo *foo1 = nullptr;
      {
        // This time, make `value` a non-reference so it is clearly not a
        // temporary object.
        Data value = Make_Data(1);
        foo1 = new Foo(value);
        // The non-temporary `value` goes out of scope here.
      }
      // Reference is dangling at this point.
      delete foo1; // A smart pointer would simplify this part
    }
    

    While the above is somewhat contrived, it could easily happen organically in a more complex program, especially since the Foo object is dynamically allocated (so it easily outlives the current scope).

    In summary, your proposed remedy (disallow binding a temporary object to a const reference) does not address the root problems, so it would be an inadequate solution.


    But I suppose I'm missing the bit about why it's okay to pass a return value (which is a temporary, right?) as a const reference.

    Because it's useful. For example, if binding a temporary object to a const reference was illegal, many compound expressions – something with more than one operator — would no longer be possible without unnecessary copying of parameters.

    Consider string concatenation. Let's say you have three strings that you want to combine.

      std::string hello = "Hello";
      std::string world = " world";
      std::string bang = "!";
    
      // Parentheses are not needed, but might add clarity.
      std::string greeting = (hello + world) + bang;
    

    The initialization of greeting twice uses an operator+ with the following signature.

    std::string operator+(const std::string& lhs, const std::string& rhs);
    

    This operator returns a temporary object, just like your Make_Data. So when the sub-expression hello + world is evaluated, the result is a temporary object. This temporary object becomes the first argument to the second invocation of operator+ (where the second argument is bang). That is, the temporary object binds to const std::string& lhs, while bang binds to const std::string& rhs (it is not copied). Useful and perfectly safe (because operator+ does not try to retain a reference, unlike your Foo constructor).

    In other words, your proposed remedy (disallow binding a temporary object to a const reference) would cause more problems than it solves.