Suppose we implement a string class which represents, uhm, strings. We then want to add an operator+ which concatenates two strings, and decide to implement that via expression templates to avoid multiple allocations when doing str1 + str2 + ... + strN.
The operator will look like this:
stringbuilder<string, string> operator+(const string &a, const string &b);
stringbuilder is a template class, which in turn overloads operator+ and has an implicit string conversion operator. Pretty much the standard textbook exercise:
template<class T, class U> class stringbuilder;
template<> class stringbuilder<string, string> {
public:
    stringbuilder(const string &a, const string &b) : a(a), b(b) {}
    operator string() const;
    // ...
private:
    const string &a; // references only: the operands are never copied
    const string &b;
};
// recursive case similar,
// building a stringbuilder<stringbuilder<...>, string>
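To make the discussion concrete, here is a minimal compilable sketch of the scheme described above. It is not the exact code from the question: the `sketch` namespace, the tiny `string` wrapper around `std::string`, the helpers `size()` and `append_to()`, and `concat_demo()` are all hypothetical names introduced for illustration, and the base and recursive specializations are collapsed into one primary template for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// Hypothetical stand-in for the question's `string` class.
struct string {
    std::string data;
    string(const char *s = "") : data(s) {}
    std::size_t size() const { return data.size(); }
    void append_to(std::string &out) const { out += data; }
};

// Assumption: one primary template covers both the base case
// (string, string) and the recursive case (stringbuilder<...>, string).
template<class T, class U>
class stringbuilder {
public:
    stringbuilder(const T &a, const U &b) : a(a), b(b) {}
    std::size_t size() const { return a.size() + b.size(); }
    void append_to(std::string &out) const { a.append_to(out); b.append_to(out); }
    // Implicit conversion: this is where the concatenation actually happens.
    operator string() const {
        string result;
        result.data.reserve(size());  // reserve the full length up front
        append_to(result.data);
        return result;
    }
private:
    const T &a;  // references only, nothing is copied
    const U &b;
};

inline stringbuilder<string, string> operator+(const string &a, const string &b) {
    return stringbuilder<string, string>(a, b);
}

template<class T, class U>
stringbuilder<stringbuilder<T, U>, string>
operator+(const stringbuilder<T, U> &a, const string &b) {
    return stringbuilder<stringbuilder<T, U>, string>(a, b);
}

// The whole chain is converted within one full-expression, so every
// temporary builder it refers to is still alive at conversion time.
inline std::string concat_demo() {
    string str1 = "foo", str2 = "bar", str3 = "baz";
    string result = str1 + str2 + str3;
    assert(result.size() == 9);
    return result.data;
}

} // namespace sketch
```

Calling `sketch::concat_demo()` yields `"foobarbaz"`. The safe case works because the conversion to `string` runs before any of the temporaries are destroyed.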
The above implementation works perfectly as long as someone does
string result = str1 + str2 + ... + strN;
However, it has a subtle bug. Assigning the result to a variable of the right type will make that variable hold references to all the strings that compose the expression. That means, for instance, that changing one of the strings will change the result:
void print(string);
string str1 = "foo";
string str2 = "bar";
right_type result = str1 + str2;
str1 = "fie";
print(result);
This will print fiebar, because of the str1 reference stored inside the expression template. It gets worse:
string f();
right_type result = str1 + f();
print(result); // kaboom
Now the expression template will contain a reference to a destroyed temporary. Using it is undefined behavior, which typically crashes your program straight away.
Now what's that right_type? It is of course stringbuilder<stringbuilder<...>, string>, i.e. the type the expression template magic is generating for us.
Now why would one use a hidden type like that? In fact, one doesn't use it explicitly -- but C++11's auto does!
auto result = str1 + str2 + ... + strN; // guess what's going on here?
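The aliasing effect can be reproduced without invoking undefined behavior, as long as only two named strings are involved. The sketch below is a stripped-down, hypothetical version of the classes from the question (the `sketch` namespace, the `string` wrapper and `aliasing_demo()` are assumptions made for illustration). It checks at compile time that `auto` deduces the builder type, then shows the later assignment leaking into the result.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for the question's `string` class.
struct string {
    std::string data;
    string(const char *s = "") : data(s) {}
};

template<class T, class U> class stringbuilder;

// Base case only: enough to show what `auto` captures.
template<> class stringbuilder<string, string> {
public:
    stringbuilder(const string &a, const string &b) : a(a), b(b) {}
    operator string() const {
        string result;
        result.data = a.data + b.data;
        return result;
    }
private:
    const string &a;
    const string &b;
};

inline stringbuilder<string, string> operator+(const string &a, const string &b) {
    return stringbuilder<string, string>(a, b);
}

inline std::string aliasing_demo() {
    string str1 = "foo";
    string str2 = "bar";
    auto result = str1 + str2;  // deduces the builder, not string
    static_assert(
        std::is_same<decltype(result), stringbuilder<string, string>>::value,
        "auto keeps the expression template type");
    str1 = "fie";               // visible through the stored reference
    string printed = result;    // the concatenation only happens here
    assert(printed.data == "fiebar");
    return printed.data;
}

} // namespace sketch
```

Both operands are named variables that outlive `result`, so this version is well defined and merely surprising. With three or more operands, or with a temporary operand such as the return value of `f()`, the stored references dangle and the same code has undefined behavior.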
The bottom line is: it seems that this way of implementing expression templates (by storing cheap references instead of copying values or using shared pointers) gets broken as soon as one tries to store the expression template itself.
Therefore, I'd very much like a way of detecting whether I'm building an rvalue or an lvalue, and to provide different implementations of the expression template depending on whether an rvalue is built (keep references) or an lvalue is built (make copies).
Is there an established design pattern to handle this situation?
The only things I was able to figure out during my research were that
One can overload member functions depending on this being an lvalue or rvalue, i.e.
class C {
    void f() &;  // called on lvalues
    void f() &&; // called on temporaries
};
however, it seems I can't do that on constructors as well.
In C++ one cannot really do "type overloads", i.e. offer multiple implementations of the same type, depending on how the type is going to be used (instances created as lvalues or rvalues).
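For reference, here is a small sketch of the ref-qualifier overloading mentioned in the first point. The return values and the three helper functions are made up for illustration. Each call shows which overload is selected, including the case where a named object is turned into an rvalue with `std::move`.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct C {
    std::string f() &  { return "lvalue"; }  // *this is an lvalue
    std::string f() && { return "rvalue"; }  // *this is an rvalue
};

inline std::string call_on_lvalue() {
    C c;
    return c.f();
}

inline std::string call_on_temporary() {
    return C().f();
}

inline std::string call_on_moved() {
    C c;
    return std::move(c).f();  // named object, but an rvalue expression
}

inline void ref_qualifier_demo() {
    assert(call_on_lvalue() == "lvalue");
    assert(call_on_temporary() == "rvalue");
    assert(call_on_moved() == "rvalue");
}
```

A constructor cannot be declared with a ref-qualifier, which matches the observation above: the qualifier describes the value category of an already existing object expression, and during construction there is no such expression yet.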
I started this in a comment but it was a bit big for that. So let's make it an answer (even though it doesn't really answer your question).
This is a known issue with auto. For instance, it has been discussed by Herb Sutter here and in more detail by Motti Lanzkron here.
As they say, there were discussions in the committee to add operator auto to C++ to tackle this problem. The idea would be, instead of (or in addition to) providing
operator string() const;
as you mentioned, one would provide
string operator auto() const;
to be used in type deduction contexts. In this case,
auto result = str1 + str2 + ... + strN;
would not deduce the type of result to be the "right type" but rather the type string, because that's what operator auto() returns.
AFAICT this is not going to happen in C++14. C++17 perhaps...