I have a test library. Header file ObjectA.h:
#pragma once
namespace sarora::testing {
void testingObject();
void againTesting();
} // namespace sarora::testing
Cpp file ObjectA.cpp
#include "ObjectA.h"
#include <iostream>
namespace sarora::testing {
void testingObject() {
std::cout << "testingObject" << std::endl;
}
void againTesting() {
std::cout << "againTesting" << std::endl;
}
} // namespace sarora::testing
Now, I have the buck for this defined as
cpp_library(
name = "object",
srcs = ["ObjectA.cpp"],
headers = ["ObjectA.h"],
link_whole = True,
)
Once I am done with the cpp library, I add it to the main.cpp
#include "ObjectA.h"
#include <iostream>
using namespace std;
int main(int argc, char* argv[]) {
sarora::testing::testingObject();
}
This is the buck for the final main
cpp_binary(
name = "test",
srcs = ["test.cpp"],
deps = [
":object",
],
)
Now, note that I only use testingObject in main.cpp. When I check the symbol table with "nm main_executable_path | grep testingObject", I get the symbol.
But when I grep for againTesting, I don't see the symbol. So what is the function of link_whole, as defined in buck here: https://buck.build/rule/cxx_library.html#link_whole
What you are expecting to achieve here is to build a library libobject with option link_whole, and then a program test that is linked against libobject, having libobject as a whole statically linked into test, so that you will be able to see that all the symbols defined in libobject are likewise defined in test.
Why you can't do that with a shared library
This is impossible by the nature of a shared library, as distinct from a static library. (I'll stick with the usual unix-style naming conventions for libraries - libfoo.so is the shared library build of the foo library; libfoo.a is the static library build - although the buck build system has slightly different ones.)
When you link an executable against a shared library libfoo.so, no part of libfoo.so is statically transcribed into your program. If your program contains no undefined references to symbols defined by libfoo.so then by default absolutely nothing about libfoo.so is written into your program. It might as well not exist. If your program does make an undefined reference to any symbol sym defined by libfoo.so - and libfoo.so is the first library the static linker finds that defines sym - then the static linker merely copies sym from the global symbol table of the executable into its dynamic symbol table, and writes a note in the executable saying that it needs libfoo.so.
By default that is all that happens. sym remains an undefined symbol in the executable. It is left to the runtime linker to notice that note when the executable is loaded to run the program, search for libfoo.so, load it into the program's address space and resolve any undefined dynamic symbols in the program that are defined by libfoo.so, or by any other shared libraries that the program needs.
You can override the default behaviour by passing the option --no-as-needed to the static linker. But that will merely make it note "This program needs libfoo.so" in the executable even if it is not true, i.e. if the program does not actually make undefined references to symbols defined by libfoo.so. That's all. A shared library is a library whose linkage with a program leaves dynamic symbol resolution entirely to the runtime linker. You don't even have to get the static linker to write notes in an executable to tell the runtime linker what shared libraries it needs. The program itself can call the runtime linker to find and load libfoo.so and give it the addresses of symbols defined therein. That's doing it the first-principles way.
Why you can do that with a static library.
On the other hand, when you link an executable against a static library libfoo.a, what goes on is completely different, as described by the Stack Overflow tag-wiki for static-libraries.
libfoo.a is a bag of object files from which the static linker will select just the ones it needs to resolve symbols referenced but not already defined in the executable, take them out of the bag and statically link them into the executable like any other object files in the linkage. Nothing but an object file can be statically linked into an executable. That means that nothing but the linkage of an object file can make the linker physically incorporate symbol definitions into an executable.
Sometimes you may want the static linker to take all of the object files out of the bag and link them in the executable, whether they are needed or not. To do that you use the linker options:
--whole-archive libfoo.a ... --no-whole-archive
if you're invoking the static linker directly. Or:
-Wl,--whole-archive libfoo.a ... -Wl,--no-whole-archive
if you're invoking it via GCC/Clang, as usual. (Vital to turn off --whole-archive after all the libraries you want it to apply to, because it will continue to apply to subsequent libraries until you do so.)
libfoo.a can just be replaced with the usual linkage option -lfoo if static linkage is in effect when you do this: linker option -Bstatic, activated by GCC/Clang linkage option -static. While -Bstatic is in effect the linker will not resolve -lfoo to a shared library libfoo.so - which it does by default - and will only accept the static library libfoo.a, if it can find it. (-Bstatic also continues to be in effect until and unless the default behaviour is restored with -Bdynamic).
Why your BUCK file doesn't build what you expect.
The link_whole = True option in your:
cpp_library(
name = "object",
srcs = ["ObjectA.cpp"],
headers = ["ObjectA.h"],
link_whole = True,
)
should mean that:
-Wl,--whole-archive <object-library-name> -Wl,--no-whole-archive
gets written in the toolchain's linkage commandline for the program test as built by your:
cpp_binary(
name = "test",
srcs = ["test.cpp"],
deps = [
":object",
],
)
That's what would happen if <object-library-name> was a static library. But your <object-library-name> = libobject.so, a shared library. And that's because you haven't told buck, in the library rule, that you want the static library libobject.a (the default is shared), and you haven't told buck that static linkage is preferred for the program test. If you had, then it would have inferred that you want to link against libobject.a and built it instead of libobject.so, but buck's default preference is shared libraries, because that's the default preference of the linker.
So by default, buck builds libobject.so and links test against it. It knows that --whole-archive means nothing as applied to libobject.so so it ignores link_whole = True: no error, no warning. Even if:
-Wl,--whole-archive libobject.so -Wl,--no-whole-archive
was passed to the linker it would just ignore --[no-]whole-archive; no error, no warning.
Buck completes the build of test successfully. test has a dynamic dependency on libobject.so, represented by the undefined symbol sarora::testing::testingObject() in the dynamic symbol table of test, and by a note in test that says libobject.so is needed. That's all it gets from libobject.so.
What you'd need to do to your BUCK file to see what you expected
Before going here, remember that if you link a program against a shared library in order to resolve symbol sym, then you don't want and don't need to have a definition of sym in your program, and won't get one. If there was a definition of sym in your linked program, it could only have got there from an object file that defined sym before any shared library that defined it was reached, and any such shared library definition would have been ignored, because sym was already defined.
To see the outcome you expect for link_whole = True, you'd need to tell buck that the object library is to be provided in linkages as libobject.a, rather than the default libobject.so. That would take:
cxx_library(
...
preferred_linkage = "static",
...
)
or:
cxx_binary(
...
link_style = "static",
...
)
Either way (or both together), the object library will be built as the static library libobject.a, and then --whole-archive will be meaningful, and buck will apply it. The one and only object file libobject.a(object.o) will be extracted from libobject.a and statically linked into test, bringing with it all the symbol definitions in object.o, and you will see them in the global symbol table of test. (But not in its dynamic symbol table, because they don't need runtime resolution any more.)
Since there will only be one object file in libobject.a, --whole-archive is of course redundant in this particular case: the linkage will need libobject.a(object.o) to resolve sarora::testing::testingObject(), so it will extract and link that object file without coercion, and that object file will bring with it all the symbols it defines or references, including those that test does not need. When the linker consumes an object file, it consumes all of it. [1]
For the same reason libobject.a itself is redundant in this particular case. You might as well just compile the object file object.o from ObjectA.cpp and link it directly.
Bottom line: link_whole is meaningful if and only if you make sure the library you are applying it to is a static library. link_whole is useful if and only if you want to link all the object files in the static library, whether or not the linker needs them.
No need to read on unless you're interested in seeing all this demonstrated.
Demo all that with buck
Source files:
$ cat foo.cpp
#include <iostream>
void hello_world()
{
std::cout << "Hello World" << std::endl;
}
void goodbye_world()
{
std::cout << "Goodbye World" << std::endl;
}
$ cat main.cpp
#include <iostream>
extern void hello_world();
int main() {
hello_world();
return 0;
}
BUCK file, v1:
$ cat BUCK
cxx_library(
name = "foo",
srcs = ["foo.cpp"],
link_whole = True,
)
cxx_binary(
name = "main",
srcs = ["main.cpp"],
deps = [
':foo',
],
)
# toolchains/BUCK
load("@prelude//toolchains:cxx.bzl", "system_cxx_toolchain")
load("@prelude//toolchains:python.bzl", "system_python_bootstrap_toolchain")
system_cxx_toolchain(
name = "cxx",
visibility = ["PUBLIC"],
)
system_python_bootstrap_toolchain(
name = "python_bootstrap",
visibility = ["PUBLIC"],
)
Build, take #1:
$ buck2 build //...
Starting new buck2 daemon...
Connected to new buck2 daemon.
Build ID: b0ed2f4f-3d43-47cc-b9e4-19a53158dc3e
Jobs completed: 62. Time elapsed: 0.3s.
Cache hits: 0%. Commands: 4 (cached: 0, remote: 0, local: 4)
BUILD SUCCEEDED
Run the program:
$ ./buck-out/v2/gen/root/904931f735703749/__main__/main
Hello World
All good. Now look at its global symbol table and dynamic symbol table (demangled) for hits on hello_world():
$ readelf -W --syms ./buck-out/v2/gen/root/904931f735703749/__main__/main | \
c++filt | egrep '(Symbol table|Ndx|hello_world)'
Symbol table '.dynsym' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
7: 0000000000000000 0 FUNC GLOBAL DEFAULT UND hello_world()
Symbol table '.symtab' contains 32 entries:
Num: Value Size Type Bind Vis Ndx Name
31: 0000000000000000 0 FUNC GLOBAL DEFAULT UND hello_world()
hello_world() is an undefined (Ndx = UND) symbol mentioned once in the dynamic symbol table (.dynsym) and once in the global symbol table (.symtab).
The runtime linker (ld.so) was able to define hello_world() and run the program because the static linker wrote the following dynamic section in the executable:
$ readelf --dynamic ./buck-out/v2/gen/root/904931f735703749/__main__/main
Dynamic section at offset 0x840 contains 31 entries:
Tag Type Name/Value
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN/./__main__shared_libs_symlink_tree]
0x0000000000000001 (NEEDED) Shared library: [lib_foo.so]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libgcc_s.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
...[cut]...
which informed ld.so that lib_foo.so was needed before any of the other dynamic dependencies, and also told it where to look to find lib_foo.so, namely:
(RUNPATH) Library runpath: [$ORIGIN/./__main__shared_libs_symlink_tree]
where indeed we find a symlink [2]:
$ ls -l ./buck-out/v2/gen/root/904931f735703749/__main__/__main__shared_libs_symlink_tree
total 0
lrwxrwxrwx 1 imk imk 24 Apr 22 11:49 lib_foo.so -> ../../__foo__/lib_foo.so
to the actual shared library:
./buck-out/v2/gen/root/904931f735703749/__foo__/lib_foo.so
The uncalled function void goodbye_world():
$ readelf -W --syms ./buck-out/v2/gen/root/904931f735703749/__main__/main | \
c++filt | egrep '(Symbol table|Ndx|goodbye_world)'
Symbol table '.dynsym' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
Symbol table '.symtab' contains 32 entries:
Num: Value Size Type Bind Vis Ndx Name
does not appear in either symbol table.
And as for the static library that link_whole = True might apply to:
$ find . -name lib*.a; echo Done
Done
it doesn't exist. Let's look at the actual linkage arguments:
$ cat ./buck-out/v2/gen/root/904931f735703749/__main__/main.linker.argsfile;
"-fuse-ld=lld"
-o
buck-out/v2/gen/root/904931f735703749/__main__/main
"-Wl,-rpath,\$ORIGIN/./__main__shared_libs_symlink_tree"
buck-out/v2/gen/root/904931f735703749/__main__/__objects__/main.cpp.pic.o
buck-out/v2/gen/root/904931f735703749/__foo__/lib_foo.so
lib_foo.so is linked, i.e. the NEEDED note was written into main; its runtime path (-rpath) was also written in main; --whole-archive is absent.
Build take #2. The cxx_library preferred_linkage option
Now let's change the build to request that libfoo is built as libfoo.a.
cxx_library(
name = "foo",
srcs = ["foo.cpp"],
preferred_linkage = "static", # New
link_whole = True,
)
Clean and rebuild:
$ buck2 clean
...
$ buck2 build //...
...
BUILD SUCCEEDED
The program runs as before:
$ ./buck-out/v2/gen/root/904931f735703749/__main__/main
Hello World
But:
$ find . -name lib*.so; echo Done
Done
No shared library was built. Instead:
$ find . -name lib*.a; echo Done
./buck-out/v2/tmp/root/904931f735703749/__foo__/archive/libfoo.pic.a
./buck-out/v2/gen/root/904931f735703749/__foo__/libfoo.pic.a
Done
The static library libfoo.pic.a was built, which contains the object file:
$ ar -t ./buck-out/v2/gen/root/904931f735703749/__foo__/libfoo.pic.a
foo.cpp.pic.o
in which are defined:
$ nm -C ./buck-out/v2/gen/root/904931f735703749/__foo__/libfoo.pic.a | egrep '(foo.cpp.pic.o|world)'
foo.cpp.pic.o:
0000000000000000 T hello_world()
0000000000000030 T goodbye_world()
T = defined in the text section of the program. And both definitions were linked into the program:
$ readelf -W --syms ./buck-out/v2/gen/root/904931f735703749/__main__/main | \
c++filt | egrep '(Symbol table|Ndx|world)'
Symbol table '.dynsym' contains 11 entries:
Num: Value Size Type Bind Vis Ndx Name
Symbol table '.symtab' contains 38 entries:
Num: Value Size Type Bind Vis Ndx Name
32: 0000000000001940 40 FUNC GLOBAL DEFAULT 14 hello_world()
37: 0000000000001970 40 FUNC GLOBAL DEFAULT 14 goodbye_world()
But only in the .symtab, not in the .dynsym: the runtime linker does not need to define them. And the definition of goodbye_world() is dead weight.
Check out the dynamic section of the new executable:
$ readelf --dynamic buck-out/v2/gen/root/904931f735703749/__main__/main
Dynamic section at offset 0xa20 contains 29 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libgcc_s.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
...[cut]...
It's the same as before except that:
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN/./__main__shared_libs_symlink_tree]
0x0000000000000001 (NEEDED) Shared library: [lib_foo.so]
is now gone. And see the new linkage arguments:
$ cat ./buck-out/v2/gen/root/904931f735703749/__main__/main.linker.argsfile
"-fuse-ld=lld"
-o
buck-out/v2/gen/root/904931f735703749/__main__/main
buck-out/v2/gen/root/904931f735703749/__main__/__objects__/main.cpp.pic.o
-Wl,--whole-archive
buck-out/v2/gen/root/904931f735703749/__foo__/libfoo.pic.a
-Wl,--no-whole-archive
libfoo.pic.a is statically linked, wrapped in --whole-archive libfoo.pic.a --no-whole-archive.
What's with this .pic. sub-extension as in libfoo.pic.a and foo.cpp.pic.o?
That's a hint that the object file libfoo.pic.a(foo.cpp.pic.o) has been compiled as Position Independent Code, compiler option -fPIC, and is suitable for static linkage into a position independent binary, i.e. a shared library; not just into a program, which doesn't require PIC code. We don't in fact need PIC code for static linkage into our main program; but we've got it anyway. In the next build we'll see that go away.
Build take #3. The cxx_binary link_style option
Let's change the build again to say that static libraries are preferred in the linkage of the main program. The BUCK file now has:
cxx_library(
name = "foo",
srcs = ["foo.cpp"],
link_whole = True,
)
cxx_binary(
name = "main",
srcs = ["main.cpp"],
link_style = "static", # New
deps = [
':foo',
],
)
with the cxx_library reverted to original.
Clean and rebuild:
$ buck2 clean
...
$ buck2 build //...
...
BUILD SUCCEEDED
The program runs as before:
$ ./buck-out/v2/gen/root/904931f735703749/__main__/main
Hello World
But:
$ find . -name lib*.a
./buck-out/v2/tmp/root/904931f735703749/__foo__/archive/libfoo.a
./buck-out/v2/gen/root/904931f735703749/__foo__/libfoo.a
Now we've got the regular libfoo.a rather than libfoo.pic.a, and it contains the regular:
$ ar -t ./buck-out/v2/gen/root/904931f735703749/__foo__/libfoo.a
foo.cpp.o
We told buck the main program prefers static libraries; programs don't need PIC code, so buck has ditched the -fPIC compilation. Nothing else is different from Build #2.
[1] But it's possible to compile object files with finer granularity than the default, enabling the linker to discard definitions that come in from object files if it finally determines they're not needed, so they never appear in the global symbol table.
[2] $ORIGIN is meaningful to the runtime linker. It means: the directory containing the file in which $ORIGIN is written.