I mean, what exactly is generated when you instantiate a variable template that is not constexpr?
Consider a basic variable template that calculates a factorial:
template<int N>
int fat = N*(fat<N-1>);
template<>
int fat<0> = 1;
int main() {
return fat<5>;
}
My intuition is that it would generate something like this:
int fat0 = 1;
int fat1 = 1*fat0;
int fat2 = 2*fat1;
int fat3 = 3*fat2;
int fat4 = 4*fat3;
int fat5 = 5*fat4;
int main() {
return fat5;
}
I tried to have a look at it on C++ Insights, but the generated code looks like this:
template<int N>
const int fat = N*(fat<N-1>);
template<>
const int fat<0> = 1;
int main()
{
return fat<5>;
}
... which doesn't help at all.
My next attempt was to look at the (optimized) generated assembly on godbolt.org to see whether there is any difference.
To my surprise, there is! The templated version has roughly twice as many lines of assembly as the hand-written one. GCC seems to generate an extra "guard variable" for each instantiation. Clang also does this.
Now, considering the zero-overhead principle, these variables should be doing something important. Specifically, something that I missed when writing my "unrolled" version. What is this "something" I'm missing?
P.S.: To further hurt my brain, MSVC goes the opposite way: the generated assembly for the templated version is actually 3x smaller than for the version without templates. I can't make much sense of that assembly, though, so I left it out of the main question.
The compiler does exactly what you are asking. The variable fat has external linkage, so every instantiation must be accessible to any other program that is dynamically linked to this one. The compiler must therefore generate the code for each instantiation.
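As a rough illustration of what the "guard variable" from the question is for, here is a hand-written approximation of a single guarded instantiation. This is only a sketch under the assumption that the guard acts as a run-once flag, so that the initializer of an externally visible instantiation executes a single time even if several translation units emit it. The names fat0, fat1, guard_fat1, init_fat1 and run_twice are hypothetical and are not what the compiler actually emits.

```cpp
// Hand-written approximation of one guarded instantiation.
// All names here are hypothetical stand-ins, not real compiler output.

int fat0 = 1;             // stands in for fat<0> (constant-initialized)
int fat1;                 // stands in for fat<1> (dynamically initialized)
bool guard_fat1 = false;  // stands in for the guard variable of fat<1>

// Stands in for the initialization code that each translation unit
// instantiating fat<1> would contribute.
void init_fat1() {
    if (guard_fat1) return;  // initializer already ran: do nothing
    guard_fat1 = true;
    fat1 = 1 * fat0;
}

// Runs the initializer twice to show that the second call has no effect.
int run_twice() {
    init_fat1();
    fat0 = 7;     // would change fat1 if the initializer ran again
    init_fat1();  // skipped because the guard is already set
    return fat1;
}
```

The hand-unrolled version in the question has no such flag because each of its variables is defined exactly once, in one translation unit.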
But if you declare it static, the optimizer can remove the extra instantiations:
template<int N>
static int fat = N*(fat<N-1>);
template<>
int fat<0> = 1;
int main() {
return fat<5>;
}
NB: This code can be used as an example of why Clang should not be used for C++. Clang's template instantiation is screwed up: with Clang, the program returns 0!
Clang bug reported since 2016: bug 29033
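For comparison with the non-constexpr case the question asks about, here is a minimal sketch of a constexpr variant. The names fat_c and fat5_value are hypothetical and not from the original code. Because every instantiation is a constant expression, the value can be checked with static_assert, so no run-time initialization of the variables should be needed.

```cpp
// constexpr variant of the factorial variable template (requires C++14).
// fat_c and fat5_value are hypothetical names used only for illustration.
template<int N>
constexpr int fat_c = N * fat_c<N - 1>;

template<>
constexpr int fat_c<0> = 1;

// The whole chain is evaluated by the compiler.
static_assert(fat_c<5> == 120, "fat_c<5> is a compile-time constant");

int fat5_value() {
    return fat_c<5>;
}
```

Comparing the assembly of this variant with that of the original on godbolt.org is one way to see how much of the extra output comes from dynamic initialization.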