c++ compiler-optimization static-constructor

Why can't the compiler optimize out an unused static std::string?


If I compile this code with GCC or Clang and enable -O2 optimizations, I still get some global object initialization. Is it even possible for any code to reach these variables?

#include <string>
static const std::string s = "";

int main() { return 0; }

Compiler output:

main:
        xor     eax, eax
        ret
_GLOBAL__sub_I_main:
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:s
        mov     edi, OFFSET FLAT:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev
        mov     QWORD PTR s[rip], OFFSET FLAT:s+16
        mov     QWORD PTR s[rip+8], 0
        mov     BYTE PTR s[rip+16], 0
        jmp     __cxa_atexit

Specifically, I was not expecting the _GLOBAL__sub_I_main: section.


Edit: Even with a simple custom defined type, the compiler still generates some code.

class Aloha
{
public:
    Aloha () : i(1) {}
    ~Aloha() = default;
private:
    int i;
};
static const Aloha a;
int main() { return 0; }

Compiler output:

main:
        xor     eax, eax
        ret
_GLOBAL__sub_I_main:
        ret

Solution

  • With the short string optimization (SSO), constructing that string may be equivalent to taking the address of one of std::string's own member variables: the object stores a pointer to its internal buffer. To remove it, the compiler would have to analyze the string length at compile time, decide whether the string fits into the internal storage of the std::string object or has to be allocated dynamically, and then notice that the result is never read, so that the initialization code can be optimized out.
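
    The self-pointer is visible in the question's assembly, where s+16 is stored into s itself. A minimal sketch of how one might observe this from C++ follows; stored_inline and report are hypothetical helper names, and the address comparison through std::uintptr_t is a practical check rather than something the standard specifies. Whether a short string is stored inline depends on the standard library implementation.

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <iostream>
    #include <string>

    // Hypothetical helper: returns true when s.data() points into the bytes of
    // the std::string object itself, i.e. the characters are in the SSO buffer.
    bool stored_inline(const std::string& s) {
        const auto begin = reinterpret_cast<std::uintptr_t>(&s);
        const auto end   = begin + sizeof(std::string);
        const auto data  = reinterpret_cast<std::uintptr_t>(s.data());
        return data >= begin && data < end;
    }

    // Hypothetical demo function: prints the result for an empty string.
    void report() {
        const std::string empty = "";
        const std::string longer(1000, 'x');

        // 1000 characters cannot fit inside the object, so they must live elsewhere.
        static_assert(sizeof(std::string) < 1000, "assumed by this sketch");
        assert(!stored_inline(longer));

        // Implementation-dependent: an SSO implementation is expected to keep
        // the empty string inside the object.
        std::cout << std::boolalpha
                  << "empty string stored inline: " << stored_inline(empty) << '\n';
    }
    ```

    Calling report() from main shows whether the library in use keeps the empty string inside the object, which is the case where the object ends up holding its own address.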

    The lack of optimization in this case might be an optimizer limitation that only shows up in simple, unusual examples such as this one:

    const int i = 3;
    
    int main()
    {
        return (long long)(&i);  // to make sure that address was used
    }
    

    GCC generates code:

    i:
            .long   3     ; this is a variable
    main:
            push    rbp
            mov     rbp, rsp
            mov     eax, OFFSET FLAT:i
            pop     rbp
            ret
    

    GCC does not optimize this code either:

    const int i = 3;
    const int *p = &i;
    int main() {  return 0; }
    

    Static variables declared at file scope, especially const-qualified ones, can be optimized out under the as-if rule unless their address is used; GCC does that only for const-qualified ones, regardless of use case. Once a variable's address is taken, the compiler has to treat the variable as potentially observable, because the address can be passed elsewhere. Logic that traces where the address ends up would be too complex to implement and would be of little practical value.

    Of course, code that does not use the address

    const int i = 3;
    int main() { return i; }
    

    results in the reserved storage being optimized out:

    main:
        mov     eax, 3
        ret
    

    What about constexpr construction of std::string, which is available as of C++20? Under the older rules it could not be a compile-time expression if the result depended on the arguments. It is possible that std::string would allocate memory dynamically if the string is too long, which is not a compile-time action. It appears that the only mainstream compiler that supports the C++20 features required for this at the moment is MSVC, under certain conditions.
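
    Where constexpr std::string is not available, a constant-initialized type avoids the dynamic initializer altogether. The following is a minimal sketch, assuming C++17 and that a read-only view of the text is enough; kEmpty and kEmptyChars are hypothetical names used only for illustration.

    ```cpp
    #include <string_view>

    // constexpr forces constant initialization, and std::string_view has a
    // trivial destructor, so no startup code or __cxa_atexit registration
    // is required for this object.
    static constexpr std::string_view kEmpty = "";

    // A plain character array has the same property.
    static constexpr char kEmptyChars[] = "";

    // Both checks are evaluated by the compiler, not at program startup.
    static_assert(kEmpty.empty(), "view over an empty literal");
    static_assert(sizeof(kEmptyChars) == 1, "only the terminating NUL");
    ```

    With such a definition the compiler should have nothing to place into a _GLOBAL__sub_I_ function, although that is worth confirming in the generated assembly for the toolchain in use.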