c++ linker static-libraries

Linker drops globals (and their constructors) from static library


I've got a singleton registry that maps names to function pointers. I also have a registrar object whose constructor registers a function pointer.

My goal is to have the functions and registry in a static library, but I found that the linker omits the registrations when built like that.

Here's a simplified illustration. For brevity, I've replaced the function pointers with pointers to global ints, and the registry's associative container with a vector.

// registry.h
#include <vector>
std::vector<int const *> &GetRegistry();

// registry.cpp
#include "registry.h"

std::vector<int const *> &GetRegistry() {
    static std::vector<int const *> registry;
    return registry;
}

// thingadder.h
#include "registry.h"

class ThingAdder {
    public:
        explicit ThingAdder(int const *thing) {
            GetRegistry().push_back(thing);
        }
};

// things.cpp
#include "thingadder.h"

int g_thing1 = 1;
ThingAdder g_adder1(&g_thing1);

// main.cpp
#include "registry.h"
#include <print>
#include <vector>

int main() {
    std::print("registry size: {}\n", GetRegistry().size());
    return 0;
}

If all of the files are compiled and linked as a single project, g_adder1's constructor adds g_thing1's address to the registry, and the program reports that the registry size is 1.

But when everything except main.cpp is built into a static library, and then main.cpp is compiled and linked against that library, the reported registry size is 0. It appears the globals in things.cpp have been omitted by the link.

I can sort of understand why that's happening: Nothing outside things.cpp directly references either the global or its registrar object. But that's true even when built as a monolith. I wouldn't have expected bundling that portion into a static library to change the behavior.

Solutions?

The only one I've found is to have main.cpp reference a symbol defined in the things.cpp translation unit. In the actual library, there will be more files of objects to be registered, and I don't want to make each user of the library add a reference to a symbol for each of them.


Solution

  • See the Stack Overflow tag wiki for static-libraries.

    From this you'll see the difference between inputting things.o[bj] directly to your linkage and inputting a static library one of whose members is things.o[bj].

    An object file input explicitly is always linked unconditionally. The linker doesn't skip it for failing to define any unresolved references (because if it did then linkage could never get started). Object files that are members of an input static library are offered to the linker as needed. An object file will be extracted and linked only if the linkage refers to an external symbol that is defined by that library member.

    When you link your main.o[bj] with a static library containing things.o[bj] and registry.o[bj], the offer of things.o[bj] is superfluous to the linkage, because main.o[bj] (the sole explicit object file) does not refer to anything defined in things.o[bj]. The offer of registry.o[bj] is needed, because main.o[bj] refers to GetRegistry. So registry.o[bj] is extracted and linked; things.o[bj] is ignored.
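
    The selection rule can be illustrated with a toy model. This is only a sketch of the "extract a member only if it defines something still undefined" idea, not how any real linker is implemented: ObjectFile, LinkedObjects and QuestionScenario are names made up for the illustration, and it ignores command-line ordering between several libraries, weak symbols and everything else a real linker deals with.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model of an object file: the external symbols it defines and refers to.
struct ObjectFile {
    std::string name;
    std::set<std::string> defines;
    std::set<std::string> references;
};

// Returns the names of the object files that end up in the linkage, given the
// explicitly input object files and the members of one static library.
std::vector<std::string> LinkedObjects(std::vector<ObjectFile> const &explicit_objects,
                                       std::vector<ObjectFile> const &archive_members) {
    std::vector<std::string> linked;
    std::set<std::string> defined;
    std::set<std::string> undefined;

    // Linking an object file resolves what it defines and may add new
    // unresolved references.
    auto link = [&](ObjectFile const &obj) {
        linked.push_back(obj.name);
        for (auto const &sym : obj.defines) {
            defined.insert(sym);
            undefined.erase(sym);
        }
        for (auto const &sym : obj.references) {
            if (defined.count(sym) == 0) undefined.insert(sym);
        }
    };

    // Explicit inputs are linked unconditionally.
    for (auto const &obj : explicit_objects) link(obj);

    // Library members are linked only if they define a symbol that is
    // currently undefined. Repeat until no further member is needed.
    std::vector<bool> taken(archive_members.size(), false);
    bool progress = true;
    while (progress) {
        progress = false;
        for (std::size_t i = 0; i < archive_members.size(); ++i) {
            if (taken[i]) continue;
            bool needed = false;
            for (auto const &sym : archive_members[i].defines) {
                if (undefined.count(sym) != 0) { needed = true; break; }
            }
            if (needed) {
                link(archive_members[i]);
                taken[i] = true;
                progress = true;
            }
        }
    }
    return linked;
}

// The linkage from the question: main.o is explicit, the rest is in the library.
std::vector<std::string> QuestionScenario() {
    std::vector<ObjectFile> explicit_objects = {
        {"main.o", {"main"}, {"GetRegistry"}},
    };
    std::vector<ObjectFile> archive_members = {
        {"registry.o", {"GetRegistry"}, {}},
        {"things.o", {"g_thing1", "g_adder1"}, {"GetRegistry"}},
    };
    return LinkedObjects(explicit_objects, archive_members);
}
```

    In this model QuestionScenario() yields main.o and registry.o only. things.o is never selected, because nothing already in the linkage refers to g_thing1 or g_adder1. Moving things.o into explicit_objects would link it unconditionally.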

    Solutions?

    1. Just link the object files you need

    By default, when you want your program to contain a particular object file, you link it explicitly in the program's build system: that makes it one of the starting points of the linkage. You'd want some supervening reason for first placing it in a static library. There are such reasons (e.g. serving the linking of a unit-test driver as well as the program), but then you'll need to reckon with the difference you see that it makes to the linkage of the program.

    2. /WHOLEARCHIVE | --whole-archive

    Sometimes, you really do want all the members of a static library to be linked into the output image. For that case the linker provides an option that coerces it to link all the members of a static library to which the option applies, whether they are needed or not. For MS LINK that option is /WHOLEARCHIVE and for GNU ld it is --whole-archive.

    But...

    The foremost use case for this option is the linkage of a dynamic library that is simply to be a dynamic implementation of some static library. The normal motive for maintaining a static library of registerable things for linkage with programs would be that you want to facilitate different programs registering different selections of them, as-needed.

    If you maintain such a portmanteau static library of all thing + thing-registration object files and link it whole-archive with programs then they will all link all the things in the library, and their registrations, whether the program functionally wants them or not.

    A custom build-step for a client program can avoid linking any dead wood by extracting from the portmanteau library just the object files it needs, using your toolchain's archive manager, then re-archiving the chosen object files in an application-specific static library that you link with your program. But the application-specific library serves no purpose. You might as well just extract the needed object files and input them to program linkage.

    3. A partially linked object file

    @CraigEstey comments that the GNU linker ld provides the -r|--relocatable option, which will partially (a.k.a. incrementally) link the input object files into a single output object file combining them all, without faulting undefined references.

    This option is available to you on Windows in a GCC toolchain for Windows, to create a single object file that defines any or all things at your disposal and their registrations, but it does not have an equivalent for MS LINK. So to use it you will need to build any client program with the GCC toolchain, not Microsoft's, because the C++ object files produced by GCC on Windows are not binary compatible with those produced by MS cl.

    The use of such a partially linked object file combining multiple others is for present purposes equivalent to the use of a static library that contains the same others, to which whole-archive is applied. So it is subject to the same considerations: If it combines more object files than a client program needs then you link dead wood, and if it is custom-created per client program then you might as well just link the object files combined in it.

    4. Separate things from their registrations in the linkage

    As I said, one would normally maintain a static library of registerable things so that different programs can register different ones. You'd have exactly one thing defined per member object file. In that case you would compile the registrations of the things needed by a client program within one or more object files that are not in the library and explicitly link those object files with the client program, and the static library. Each registration will refer to the thing it is registering and oblige the linker to extract the object files that define the registered things from the static library, but no others. You would provide the static library with a header file that externally declares all the registerable things, for inclusion in the registration source code.
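
    A minimal sketch of that layout, collapsed into one listing so that it compiles on its own. The file names things.h, thing1.cpp, thing2.cpp, thing3.cpp and registrations.cpp, and the extra globals g_thing2 and g_thing3, are hypothetical and only there for illustration. GetRegistry and ThingAdder are as in the question.

```cpp
#include <vector>

// --- registry.h / registry.cpp (as in the question; member of the library) ---
std::vector<int const *> &GetRegistry() {
    static std::vector<int const *> registry;
    return registry;
}

// --- thingadder.h (as in the question) ---
class ThingAdder {
public:
    explicit ThingAdder(int const *thing) { GetRegistry().push_back(thing); }
};

// --- things.h (hypothetical): header shipped with the static library,
// externally declaring every registerable thing ---
extern int g_thing1;
extern int g_thing2;
extern int g_thing3;

// --- thing1.cpp, thing2.cpp, thing3.cpp (hypothetical): exactly one thing
// defined per member object file, with no registration alongside it ---
int g_thing1 = 1;
int g_thing2 = 2;
int g_thing3 = 3;

// --- registrations.cpp (hypothetical): NOT in the library; compiled and linked
// explicitly with the client program. Each registration refers to its thing, so
// in a real build the linker has to extract thing1.o[bj] and thing3.o[bj] from
// the library, while thing2.o[bj] is never needed ---
ThingAdder g_adder1(&g_thing1);
ThingAdder g_adder3(&g_thing3);
```

    With this split, a client whose main.cpp prints GetRegistry().size() as in the question would see 2 entries, &g_thing1 and &g_thing3. A different client would ship its own registrations file naming whichever things it wants.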