c++c++11gccg++

Strange GCC Behaviour


Given the following C++ code:

struct vertex_type {
    float x, y, z;

    //vertex_type() {}
    //vertex_type(float x, float y, float z) : x(x), y(y), z(z) {}
};

typedef struct {
    vertex_type vertex[10000];
} obj_type;

obj_type cube = {
    {
        {-1, -1, -1},
        {1, -1, -1},
        {-1, 1, -1},
        {1, 1, -1},

        {-1, -1, 1},
        {1, -1, 1},
        {-1, 1, 1},
        {1, 1, 1}
    }
};

int main() {
    return 0;
}

When I added the (currently commented out) constructors into the vertex_type struct, it abruptly 10-15 second rise in compilation time. Stumped, I looked to the assembly generated by gcc (using -S), and noticed that code-gen size was several hundred times bigger than before.

...
movl    $0x3f800000, cube+84(%rip)
movl    $0x3f800000, cube+88(%rip)
movl    $0x3f800000, cube+92(%rip)
movl    $0x00000000, cube+96(%rip)
...
movl    $0x00000000, cube+119996(%rip)
...

By leaving out the constructor definition, the generated assembly was completely different.

.globl cube
    .data
    .align 32
    .type   cube, @object
    .size   cube, 120
cube:
    .long   3212836864
    .long   3212836864
    .long   3212836864
    .long   1065353216
    .long   3212836864
    .long   3212836864
    .long   3212836864
    .long   1065353216
    .long   3212836864
    .long   1065353216
    .long   1065353216
    .long   3212836864
    .long   3212836864
    .long   3212836864
    .long   1065353216
    .long   1065353216
    .long   3212836864
    .long   1065353216
    .long   3212836864
    .long   1065353216
    .long   1065353216
    .long   1065353216
    .long   1065353216
    .long   1065353216
    .zero   24
    .text

Obviously there is a significant difference in the code generated by the compiler. Why is that? Also, why does gcc zero all the elements in one situation and not the other?

edit: I am using the following compiler flags: -std=c++0x with g++ 4.5.2.


Solution

  • This is a long-standing missing optimization in GCC. It should be able to generate the same code for both cases, but it can't.

    Without the constructors, your vertex_type is a POD structure, which GCC can initialize static/global instances of at compile time. With the constructors, the best it can do is generate code to initialize the global at program startup.