Given the following C++ code:
struct vertex_type {
    float x, y, z;
    //vertex_type() {}
    //vertex_type(float x, float y, float z) : x(x), y(y), z(z) {}
};

typedef struct {
    vertex_type vertex[10000];
} obj_type;

obj_type cube = {
    {
        {-1, -1, -1},
        {1, -1, -1},
        {-1, 1, -1},
        {1, 1, -1},
        {-1, -1, 1},
        {1, -1, 1},
        {-1, 1, 1},
        {1, 1, 1}
    }
};

int main() {
    return 0;
}
When I added the (currently commented out) constructors to the vertex_type struct, compilation time abruptly rose by 10-15 seconds.
Stumped, I looked at the assembly generated by gcc (using -S) and noticed that the generated code was several hundred times larger than before:
...
movl $0x3f800000, cube+84(%rip)
movl $0x3f800000, cube+88(%rip)
movl $0x3f800000, cube+92(%rip)
movl $0x00000000, cube+96(%rip)
...
movl $0x00000000, cube+119996(%rip)
...
With the constructor definitions left out, the generated assembly was completely different:
.globl cube
.data
.align 32
.type cube, @object
.size cube, 120
cube:
.long 3212836864
.long 3212836864
.long 3212836864
.long 1065353216
.long 3212836864
.long 3212836864
.long 3212836864
.long 1065353216
.long 3212836864
.long 1065353216
.long 1065353216
.long 3212836864
.long 3212836864
.long 3212836864
.long 1065353216
.long 1065353216
.long 3212836864
.long 1065353216
.long 3212836864
.long 1065353216
.long 1065353216
.long 1065353216
.long 1065353216
.long 1065353216
.zero 24
.text
Obviously there is a significant difference in the code generated by the compiler. Why is that? Also, why does gcc zero all the elements in one situation and not the other?
Edit: I am using g++ 4.5.2 with the compiler flag -std=c++0x.
This is a long-standing missing optimization in GCC. It should be able to generate the same code for both cases, but it can't.
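Without the constructors, your vertex_type is a POD structure, and GCC can initialize static/global instances of it at compile time by emitting the values directly into the data section. With the constructors, the best it can do is generate code that initializes the global at program startup, which is the long run of movl stores in your first listing.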
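One possible workaround, sketched here under a few assumptions, is to mark the constructors constexpr so the compiler is allowed to evaluate them during constant initialization. This needs a compiler with C++11 constexpr support (to my knowledge GCC 4.6 or later, so not the 4.5.2 mentioned in the question). The default constructor below zeroes its members, which differs from the empty one in the question; C++11 requires a constexpr constructor to initialize every member. The name probe is hypothetical and exists only for the compile-time check.

```cpp
// Sketch: constexpr constructors let the initializer be evaluated at compile time.
struct vertex_type {
    float x, y, z;
    // Assumption: zero-initialize here, unlike the empty constructor in the question.
    constexpr vertex_type() : x(0), y(0), z(0) {}
    constexpr vertex_type(float x, float y, float z) : x(x), y(y), z(z) {}
};

struct obj_type {
    vertex_type vertex[10000];
};

// Same initializer as in the question; the unlisted elements use the default constructor.
obj_type cube = {
    {
        {-1, -1, -1},
        {1, -1, -1},
        {-1, 1, -1},
        {1, 1, -1},
        {-1, -1, 1},
        {1, -1, 1},
        {-1, 1, 1},
        {1, 1, 1}
    }
};

// Hypothetical compile-time check that the constructor is usable in a constant expression.
constexpr vertex_type probe(1.0f, 2.0f, 3.0f);
static_assert(probe.z == 3.0f, "constructor must be usable at compile time");
```

Whether a particular GCC version then emits cube as plain data instead of startup code is worth confirming with -S, as in the question.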