I am trying to understand more about linking and shared libraries.
Ultimately, I wonder if it's possible to add a method to a shared library. For instance, suppose one has a source file a.c and a library lib.so (without its source file). Let's furthermore assume, for simplicity, that a.c declares a single method whose name is not present in lib.so. I thought it might be possible, at link time, to link a.o with lib.so while instructing the linker to create newLib.so and to export all methods/variables in lib.so, so that newLib.so is basically lib.so with the added method from a.o.
More generally, if one has some source file depending on a shared library, can one create a single output file (library or executable) that is not dependent on the shared library anymore? (That is, all the relevant methods/variables from the library would have been exported/linked/inlined into the new executable, making the dependency void.) If that's not possible, what is technically preventing it?
A somewhat similar question has been asked here: Merge multiple .so shared libraries. One of the replies includes the following text: "If you have access to either source or object files for both libraries, it is straightforward to compile/link a combined SO from them." without explaining the technical details. Was that a mistake, or does it hold? If it holds, how does one do it?
Once you have a shared library libfoo.so, the only ways you can use it in the linkage of anything else are:
Link a program that dynamically depends on it, e.g.
$ gcc -o prog bar.o ... -lfoo
Or, link another shared library that dynamically depends on it, e.g.
$ gcc -shared -o libbar.so bar.o ... -lfoo
In either case the product of the linkage, prog or libbar.so, acquires a dynamic dependency on libfoo.so. This means that prog|libbar.so has information inscribed in it by the linker that instructs the OS loader, at runtime, to find libfoo.so, load it into the address space of the current process and bind the program's references to libfoo's exported symbols to the addresses of their definitions.
So libfoo.so must continue to exist as well as prog|libbar.so.
It is not possible to link libfoo.so with prog|libbar.so in such a way that libfoo.so is physically merged into prog|libbar.so and is no longer a runtime dependency.
It doesn't matter whether or not you have the source code of the other linkage input files - bar.o ... - that depend on libfoo.so. The only kind of linkage you can do with a shared library is dynamic linkage.
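This is in complete contrast with the linkage of a static library, from which the linker extracts the object files it needs and physically merges them into the output file.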
You wonder about the statement in this answer where it says:
If you have access to either source or object files for both libraries, it is straightforward to compile/link a combined SO from them.
The author is just observing that if I have source files
foo_a.c foo_b.c... bar_a.c bar_b.c...
which I compile to the corresponding object files:
foo_a.o foo_b.o... bar_a.o bar_b.o...
or if I simply have those object files, then as well as - or instead of - linking them into two shared libraries:
$ gcc -shared -o libfoo.so foo_a.o foo_b.o...
$ gcc -shared -o libbar.so bar_a.o bar_b.o...
I could link them into one:
$ gcc -shared -o libfoobar.so foo_a.o foo_b.o... bar_a.o bar_b.o...
which would have no dependency on libfoo.so or libbar.so even if they exist.
And although that could be straightforward, it could also fail. If there is any symbol name that is globally defined in any of foo_a.o foo_b.o... and also globally defined in any of bar_a.o bar_b.o... then it will not matter to the linkage of either libfoo.so or libbar.so (and it need not be dynamically exported by either of them). But the linkage of libfoobar.so will fail for multiple definition of name.
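The failure case can be sketched like this, assuming gcc with the GNU linker. The sources are hypothetical: both define a global function named helper, which is harmless while they go into separate libraries but clashes when the objects are linked together. The exact wording of the error depends on the linker in use, so the script just prints whatever was reported:

```shell
workdir=$(mktemp -d)   # scratch directory for the hypothetical example files
cd "$workdir"

cat > foo_a.c <<'EOF'
int helper(void) { return 1; }
int foo(void) { return helper(); }
EOF
cat > bar_a.c <<'EOF'
int helper(void) { return 2; }
int bar(void) { return helper(); }
EOF

gcc -fPIC -c foo_a.c bar_a.c

# Separately, each library links without complaint.
gcc -shared -o libfoo.so foo_a.o && echo "libfoo.so linked"
gcc -shared -o libbar.so bar_a.o && echo "libbar.so linked"

# Together, the two global definitions of helper collide.
if gcc -shared -o libfoobar.so foo_a.o bar_a.o 2> link_errors.txt; then
    combined=linked
else
    combined=failed
fi
echo "libfoobar.so: $combined"
cat link_errors.txt
```

Making helper static in each source file (or otherwise giving the two definitions distinct names) would be one way to let the combined linkage succeed.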
If we build a shared library libbar.so that depends on libfoo.so and has itself been linked with libfoo.so:
$ gcc -shared -o libbar.so bar.o ... -lfoo
and we then want to link a program with libbar.so, we can do that in such a way that we don't need to mention its dependency libfoo.so:
$ gcc -o prog main.o ... -lbar -Wl,-rpath=<path/to/libfoo.so>
Here the -rpath value is the directory that contains libfoo.so, not the file itself. See this answer to follow that up. But this doesn't change the fact that libbar.so has a runtime dependency on libfoo.so.
If that's not possible, what is technically preventing it?
What technically prevents linking a shared library with some program or shared library targ in a way that physically merges it into targ is that a shared library (like a program) is not the sort of thing that a linker knows how to physically merge into its output file.
Input files that the linker can physically merge into targ need to have structural properties that guide the linker in doing that merging. That is the structure of object files.
They consist of named input sections of object code or data that are tagged with various attributes.
Roughly speaking, the linker cuts up the object files into their sections and distributes them into
output sections of the output file according to their attributes, and makes
binary modifications to the merged result to resolve static symbol references
or enable the OS loader to resolve dynamic ones at runtime.
This is not a reversible process. The linker can't consume a program or shared library and reconstruct the object files from which it was made to merge them again into something else.
But that's really beside the point. When input files are physically merged into targ, that is called static linkage. When input files are just externally referenced in targ to make the OS loader map them into a process it has launched for targ, that is called dynamic linkage. Technical development has given us
a file-format solution to each of these needs: object files for static linkage, shared libraries
for dynamic linkage. Neither can be used for the purpose of the other.