I just finished a tutorial on building a DLL. From the tutorial I learnt that a DLL also has an associated lib file, which the linker uses to statically link information into the client program. The lib file contains information such as the memory locations of functions inside the DLL.
My confusion comes in when using Python. With Python we seem to use pyd files, which are DLL-format files with added information that makes them callable from Python. In addition, I have seen code examples that use the ctypes library to call into DLL files without using the associated lib file. So I am confused about why we need the lib file when using the DLL in the Microsoft tutorial, yet this file does not seem to be used when calling into DLLs from Python via either a pyd or the ctypes library.
A DLL import library (Microsoft extension .lib) is required only by the Microsoft linker to link a program against a DLL. That means, of course, that it is only required at link time; it is not required at runtime. An import library has no runtime function at all; it is not a loadable file.
Furthermore, an import library is only required by the Microsoft linker, for historical reasons. It is not technically necessary to use an import library to link a program against a DLL. The MSYS2/MinGW-w64 linker that is invoked by Windows GCC does not need import libraries: it can link directly against a DLL, although it can also use an import library if it finds one first.
An import library name.lib serves the linker as a statically linkable proxy for name.dll, which itself cannot be statically linked. In the simplest terms, name.lib is an archive of little object files which between them convey to the linker information such as this:
Here are some symbol names symbol1, symbol2,...,symbolN which may not be defined by any regular object files available to the linkage but are defined in a DLL name.dll.
Usually symbol1, symbol2,...,symbolN are names of functions exported by name.dll.
The linker extracts these little object files from the archive and links the information statically into the program. Then at runtime, the runtime linker will detect this information when it is asked to load the program; it will search its runtime library path for name.dll and - if successful - it will load name.dll into the address space of the program and resolve the program's references to symbol1, symbol2,...,symbolN to the definitions that are (hopefully) provided by name.dll. If that too is successful (for all DLLs that the program needs) the program is finally allowed to run. The role of the import library in the life of the program is finished once the static linker has created the program.
This process of obtaining information about a DLL that a program depends on, and about the symbols that the program references that are defined in the DLL, and statically linking this information into the program as instructions to the runtime linker - that's what we mean by linking against a DLL.
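The two stages described above can be pictured with a toy model. This is only an illustrative sketch in plain Python: real import libraries and executable import tables are binary formats, and the names IMPORT_LIBRARY, static_link and runtime_load are hypothetical, invented for this example.

```python
# Toy model only: dictionaries stand in for the binary formats used by real tools.

# What name.lib conveys to the static linker: these symbols live in name.dll.
IMPORT_LIBRARY = {"dll": "name.dll", "symbols": ["symbol1", "symbol2"]}


def static_link(undefined_symbols, import_library):
    """Link time: record in the program which DLL provides its undefined symbols."""
    missing = [s for s in undefined_symbols if s not in import_library["symbols"]]
    if missing:
        # Models an "unresolved external symbol" link error.
        raise LookupError(f"unresolved symbols: {missing}")
    # The program carries only instructions for the runtime linker, no DLL code.
    return {"imports": {import_library["dll"]: list(undefined_symbols)}}


def runtime_load(program, installed_dlls):
    """Run time: find each DLL and bind the program's references to its definitions."""
    bound = {}
    for dll_name, symbols in program["imports"].items():
        if dll_name not in installed_dlls:
            raise FileNotFoundError(f"{dll_name} was not found")
        exports = installed_dlls[dll_name]
        for symbol in symbols:
            bound[symbol] = exports[symbol]
    return bound


# The import library is used once, at link time...
program = static_link(["symbol1"], IMPORT_LIBRARY)
# ...and plays no part when the program is loaded and its imports are resolved.
installed = {"name.dll": {"symbol1": lambda: 42, "symbol2": lambda: 7}}
bound = runtime_load(program, installed)
```

Note that runtime_load never sees IMPORT_LIBRARY: only the information that was linked into the program survives to run time.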
As noted already, it is not necessary to use an import library to accomplish this process. Import libraries are the Microsoft way of doing it. In Linux and other OSes - Python runs on all of them - import libraries aren't used at all to link against dynamic libraries. Instead, the dynamic library itself is input to the static linker. The dynamic library cannot be statically linked, but the linker just examines it to see what undefined symbols in the program are defined by the dynamic library, and then statically links the necessary instructions to the runtime linker into the program.
Not only is an import library unnecessary, it is also unnecessary to statically link any information about name.dll into a program to enable the program to load name.dll and call functions in it.
If a program believes that name.dll exists on the system and wants to reference a symbol it believes is defined in name.dll (usually, call a function defined in name.dll), it can itself request the OS to find and load name.dll, using the LoadLibrary Windows API function, and request the runtime linker to give it the address of the symbol it wants to reference from name.dll, using the GetProcAddress Windows API function.
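As a rough sketch of that programmatic route, here are the same two calls made from Python through ctypes. The helper name resolve_symbol is hypothetical, and the function only does real work on Windows, since LoadLibrary and GetProcAddress live in kernel32.dll; elsewhere it raises OSError.

```python
import ctypes
import sys


def resolve_symbol(library_name: str, symbol_name: str) -> int:
    """Load a DLL by name and return the address of one symbol it exports."""
    if sys.platform != "win32":
        raise OSError("LoadLibrary and GetProcAddress are Windows-only")

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

    # Declare the signatures so handles and addresses are not truncated to 32 bits.
    kernel32.LoadLibraryW.argtypes = [ctypes.c_wchar_p]
    kernel32.LoadLibraryW.restype = ctypes.c_void_p
    kernel32.GetProcAddress.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    kernel32.GetProcAddress.restype = ctypes.c_void_p

    # Step 1: ask the OS to find the DLL and map it into this process.
    handle = kernel32.LoadLibraryW(library_name)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())

    # Step 2: ask for the address of a symbol exported by that DLL.
    address = kernel32.GetProcAddress(handle, symbol_name.encode("ascii"))
    if not address:
        raise ctypes.WinError(ctypes.get_last_error())
    return address
```

On Windows, resolve_symbol("kernel32.dll", "GetTickCount") would be one way to try it. No import library is involved at any point: the DLL's name and the symbol's name are ordinary strings supplied at run time.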
So, the process of statically linking instructions to the runtime linker into a program - whether it is done using an import library or not - is a way in which you can avoid doing the runtime linkage of a DLL programmatically and have it all done instead by the OS when the program is run. And an import library is an optional means of achieving that.
Python, of course, is a runtime interpreter. It does not invoke the static linker at all. If you ask it to import your .pyd module, Python loads the DLL programmatically. If you use ctypes to call into a C or C++ DLL, Python either loads the DLL programmatically for you, or you can explicitly load it programmatically yourself by calling cdll.LoadLibrary(libname).
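For illustration, here is a minimal ctypes sketch that loads a shared library by name and calls a function in it. It assumes the C runtime is available as msvcrt on Windows and can be located with ctypes.util.find_library("c") elsewhere; strlen is used only because it is a convenient function to call.

```python
import ctypes
import ctypes.util
import sys

# Pick the C runtime library for this platform (an assumption of this sketch).
if sys.platform == "win32":
    libname = "msvcrt"
else:
    # May be None on some systems; ctypes then opens the current process,
    # which already has the C library loaded.
    libname = ctypes.util.find_library("c")

# Load the library programmatically; no import library is involved.
libc = ctypes.cdll.LoadLibrary(libname)

# Describe the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Attribute access looked the symbol up by name in the loaded library.
length = libc.strlen(b"hello")
print(length)  # prints 5
```

The library name and the function name are both resolved while the script runs, which is why nothing needs to be known about the DLL at link time.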
Any DLL that Python needs to load, either for its own purposes or the user's purposes, is either a DLL that it loads programmatically, or a DLL that Python was linked against when Python was built, which will be automatically loaded with Python. In either case, as a user of Python you never need the import library for that DLL, and this is true whether you invoke Python by a > python3 myscript.py shell command or invoke it embedded in a C or C++ program that has been linked with [lib]python3dll.