python c++ package python-packaging python-extensions

How does a Python package link to shared library (.so) files?


I am creating a Python package based on this repo. The package has a few C++ files, which are compiled when I build the package with setup.py by running pip install . in the project directory. This generates a _C.cpython-36m-x86_64-linux-gnu.so file in my package installation directory. To import this shared library (.so) file, all I have to do is

from . import _C (something like this)

Now the imported _C object points to _C.cpython-36m-x86_64-linux-gnu.so. I don't understand how the _C object gets linked to that specific .so file. Is that information written to any of the metadata files while the package is being built?


Solution

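  • No. Nothing in the package metadata maps _C to that file. The lookup is done by Python's own import system: when you write import _C (or from . import _C) on your Linux system, it searches the relevant directory for a file named _C followed by one of the interpreter's recognized extension-module suffixes, such as .cpython-36m-x86_64-linux-gnu.so, loads that shared library, and calls its PyInit__C function. pybind11 supplies that entry point: the PYBIND11_MODULE macro described in the documentation (https://pybind11.readthedocs.io/en/stable/basics.html) defines the initialization function, and the documentation shows that the correct syntax in your .py file is to import the prefix of the library's file name.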

    All of the C++ files used to build importable Python modules in the repo you refer to ultimately include torch/extension.h (via vision.h: https://github.com/microsoft/scene_graph_benchmark/blob/main/maskrcnn_benchmark/csrc/cuda/vision.h#L3). If you explore the PyTorch source, extension.h includes torch/python.h, which in turn includes pybind.h (https://pytorch.org/cppdocs/api/program_listing_file_torch_csrc_api_include_torch_python.h.html).
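To see the name-based lookup for yourself, the following is a minimal sketch using only the standard library. The helper names candidate_filenames and describe are made up for illustration and are not part of any package; the suffix list comes from importlib.machinery.EXTENSION_SUFFIXES, which is what the interpreter's file finder consults for compiled modules. The exact suffixes printed depend on your interpreter and platform.

```python
import importlib.machinery
import importlib.util


def candidate_filenames(module_name):
    """Hypothetical helper: list the file names the import system would
    accept for an extension module called `module_name`."""
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


def describe(module_name):
    """Hypothetical helper: return (loader class name, origin) for an
    importable module, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    loader = spec.loader
    # Some loaders (e.g. for built-in modules) are classes, not instances.
    loader_name = loader.__name__ if isinstance(loader, type) else type(loader).__name__
    return loader_name, spec.origin


# File names that would satisfy `import _C` on this interpreter.
print(candidate_filenames("_C"))

# A pure-Python module for comparison; an installed compiled module
# would typically report ExtensionFileLoader and a path ending in one
# of the suffixes printed above.
print(describe("json"))
```

Calling describe on the full dotted name of your installed _C module (an assumption here, since it depends on your package name) should show which file on disk was matched, without any metadata file being involved.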