I am seeing LTO optimize some global objects out of a TU when no function in that TU is explicitly called from another TU.
The following excerpt describes the key classes and files involved (please note that it is just for demonstration purposes and may not be accurate in every detail):
I have a singleton class Registrar that maintains a list of all the objects of type Foo that have been constructed. To avoid the static initialization order fiasco, I dynamically construct the instance of this class when the first object of type Foo is constructed.
// Registrar.hpp
class Registrar
{
public:
    static Registrar * sRegistrar;
    std::vector<Foo *> objectList;
    Registrar() = default;
};
Next, we have the class Foo. Instances of this class register with Registrar as noted above.
// Foo.hpp
class Foo
{
public:
    Foo()
    {
        if (Registrar::sRegistrar == nullptr)
            Registrar::sRegistrar = new Registrar();
        Registrar::sRegistrar->objectList.push_back(this);
    }
};
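For reference, the construct-on-first-use idea can also be expressed with a function-local static instead of a raw pointer and new. This is only a sketch of that variant: instance() and registeredCount() are hypothetical names that are not in my real code, and it does not change the linking behaviour described below.

```cpp
#include <cstddef>
#include <vector>

class Foo;

class Registrar
{
public:
    // Hypothetical accessor replacing the raw sRegistrar pointer.
    // The local static is constructed on the first call, so it exists
    // before any Foo constructor tries to use it.
    static Registrar & instance()
    {
        static Registrar registrar;
        return registrar;
    }

    std::vector<Foo *> objectList;

private:
    Registrar() = default;
};

class Foo
{
public:
    // Every Foo registers itself on construction.
    Foo() { Registrar::instance().objectList.push_back(this); }
};

namespace
{
    // Two globals, as in the real project (here in a single TU).
    Foo a;
    Foo b;
}

// Hypothetical helper: number of Foo objects registered so far.
std::size_t registeredCount()
{
    return Registrar::instance().objectList.size();
}
```

Called from main, registeredCount() returns 2 for the two globals above.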
The instances of Foo are globals that may be created from several files. In one such file, we happen to have another function defined that gets called from elsewhere:
// file1.hpp
void someFunctionThatIsCalledExplicitly()
{
    doSomething();
}
namespace
{
    __attribute__((used, retain))
    Foo f1;
}
But in another file, we just have an instance of Foo being created:
// file2.hpp
namespace
{
    __attribute__((used, retain))
    Foo f2;
}
What I am seeing is that f2 is getting optimized out while f1 is not, despite adding __attribute__((used, retain)) to every declaration of a Foo object.
How should I prevent LTO from optimizing out these instances? Why are the attributes making no difference?
EDIT: I was able to write a small example to reproduce said issue.
// main.cpp
#include <iostream>
#include "Registrar.hpp"
#ifdef FORCE_LINKAGE
extern int i;
#endif
extern void someFunctionThatIsCalledExplicitly();
int main()
{
#ifdef FORCE_LINKAGE
    i++;
#endif
    someFunctionThatIsCalledExplicitly();
    if (Registrar::sRegistrar == nullptr)
    {
        std::cout << "No instances of foo";
    }
    else
    {
        std::cout << Registrar::sRegistrar->objectList.size() << " instances of foo\n";
    }
    return 0;
}
// Foo.hpp
#pragma once
class Foo
{
public:
    Foo();
};
// Foo.cpp
#include "Foo.hpp"
#include "Registrar.hpp"
Foo::Foo()
{
    if (Registrar::sRegistrar == nullptr)
    {
        Registrar::sRegistrar = new Registrar();
    }
    Registrar::sRegistrar->objectList.push_back(this);
}
// Registrar.hpp
#pragma once
#include <vector>
#include "Foo.hpp"
class Registrar
{
public:
    static Registrar * sRegistrar;
    std::vector<Foo *> objectList;
    Registrar() = default;
};
// Registrar.cpp
#include "Registrar.hpp"
Registrar * Registrar::sRegistrar = nullptr;
// File1.cpp
#include <iostream>
#include "Foo.hpp"
void someFunctionThatIsCalledExplicitly()
{
    std::cout << "someFunctionThatIsCalledExplicitly() called\n";
}
namespace
{
    __attribute__((used, retain))
    Foo f1;
}
// File2.cpp
#include "Foo.hpp"
#ifdef FORCE_LINKAGE
int i = 0;
#endif
namespace
{
    __attribute__((used, retain))
    Foo f2;
}
# Makefile
CC = clang++
LIBTOOL = libtool
BUILDDIR = build
BINFILE = lto
BUILDFLAGS = -flto -std=c++17
LINKFLAGS = -flto

.PHONY: all
all: $(BUILDDIR) $(BINFILE)

.PHONY: force
force: def all

.PHONY: def
def:
	$(eval BUILDFLAGS += -DFORCE_LINKAGE)

$(BINFILE): foo files
	$(CC) -o $(BUILDDIR)/$@ $(LINKFLAGS) -L$(BUILDDIR) $(addprefix -l, $^)

foo: Foo.o main.o Registrar.o
	$(LIBTOOL) $(STATIC) -o $(BUILDDIR)/lib$@.a $(addprefix $(BUILDDIR)/, $^)

files: File1.o File2.o
	$(LIBTOOL) $(STATIC) -o $(BUILDDIR)/lib$@.a $(addprefix $(BUILDDIR)/, $^)

%.o: %.cpp
	$(CC) $(BUILDFLAGS) -c -o $(addprefix $(BUILDDIR)/, $@) $<

.PHONY: $(BUILDDIR)
$(BUILDDIR):
	mkdir -p $(BUILDDIR)

.PHONY: clean
clean:
	rm -rf $(BUILDDIR)
I have two variants: one that matches the code above (only 1 instance is registered), and another where I force linkage by defining a global variable next to f2 and referencing it from main (here both instances are registered):
$ make
$ ./build/lto
someFunctionThatIsCalledExplicitly() called
1 instances of foo
$ make force
$ ./build/lto
someFunctionThatIsCalledExplicitly() called
2 instances of foo
OK, I did some digging: the culprit here is the fact that you are linking a .a library, not LTO or any other optimization.
This has been brought up on SO before, by the way; see: Static initialization and destruction of a static library's globals not happening with g++
When linking the .o files (as I did on godbolt) everything goes in and it works.
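For .a files, the linker only extracts the archive members that resolve an otherwise undefined symbol; the rest are never linked. That is also why __attribute__((used, retain)) makes no difference: the attributes only protect a symbol once its object file is part of the link, and the object file containing f2 is never pulled out of the archive. Creating a dummy variable is one workaround, but the proper one is passing --whole-archive to the linker (that is the GNU ld/lld spelling; Apple's ld64 uses -all_load or -force_load <archive> instead).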
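The dummy-symbol workaround can be made a bit more explicit with an anchor function. Below is a minimal sketch, collapsed into a single file for brevity (in the real project each commented section lives in its own .cpp file). anchorFile2 and countRegisteredFoos are hypothetical names, not something from your code. The call to anchorFile2 leaves an undefined reference that only File2.o can resolve, so the linker has to extract that member, and f2 comes along with it.

```cpp
#include <cstddef>
#include <vector>

class Foo;

// --- Registrar.hpp / Registrar.cpp ---
class Registrar
{
public:
    static Registrar * sRegistrar;
    std::vector<Foo *> objectList;
};
// Constant-initialized, so it is nullptr before any Foo constructor runs.
Registrar * Registrar::sRegistrar = nullptr;

// --- Foo.hpp / Foo.cpp ---
class Foo
{
public:
    Foo()
    {
        if (Registrar::sRegistrar == nullptr)
            Registrar::sRegistrar = new Registrar();
        Registrar::sRegistrar->objectList.push_back(this);
    }
};

// --- File1.cpp: already referenced from main, so its member is extracted ---
void someFunctionThatIsCalledExplicitly() {}
namespace { Foo f1; }

// --- File2.cpp: nothing referenced it before; add an anchor ---
namespace { Foo f2; }
void anchorFile2() {} // hypothetical; exists only to be referenced

// --- main.cpp (body of main, wrapped in a helper for this sketch) ---
std::size_t countRegisteredFoos()
{
    someFunctionThatIsCalledExplicitly();
    anchorFile2(); // the reference that forces File2.o out of the archive
    return Registrar::sRegistrar ? Registrar::sRegistrar->objectList.size() : 0;
}
```

Called from main, countRegisteredFoos() returns 2 in this single-file form. The downside is that every registering file needs its own anchor and a matching call, which is why I would still prefer --whole-archive.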
I could not run your makefile-based example due to issues with libtool, but have a look at my CMake config:
cmake_minimum_required(VERSION 3.18)
project(LINK)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY "${PROJECT_BINARY_DIR}")
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${PROJECT_BINARY_DIR}")
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "${PROJECT_BINARY_DIR}")
add_library(Files File1.cpp File2.cpp)
target_include_directories(Files
    INTERFACE ${CMAKE_CURRENT_SOURCE_DIR}
)
target_compile_definitions(Files PUBLIC ${FORCE})
add_executable(test Foo.cpp main.cpp Registrar.cpp)
# note the line below
target_link_libraries(test -Wl,--whole-archive Files -Wl,--no-whole-archive)
target_compile_definitions(test PUBLIC ${FORCE})
When linking, it will invoke a command that looks more or less like this:
g++ -o test -Wl,--whole-archive -l:libFiles.a -Wl,--no-whole-archive Foo.o Registrar.o main.o