I am creating a Python package based on this repo. The package has a few C++ files, which are compiled when I build the package with setup.py by running pip install .
This generates a _C.cpython-36m-x86_64-linux-gnu.so file in my package installation directory. To import this shared library (.so), all I have to do is
from . import _C
(something like this)
Now the imported _C object points to _C.cpython-36m-x86_64-linux-gnu.so. I don't understand how the _C object gets linked to that specific .so file. Is that information written to any of the metadata files while the package is being built?
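No. Nothing about the .so file is recorded in the package metadata; the file is found by name at import time. The C++ side is exposed to Python through pybind11. In the documentation (https://pybind11.readthedocs.io/en/stable/basics.html), you will see that to import a C++ library built with the pybind11 API, you import the prefix of the library's file name. Thus, when you write import _C in your Python code on your Linux system, the interpreter looks for a file named _C.<whatever>.so, where <whatever> is one of the extension suffixes that interpreter accepts (here cpython-36m-x86_64-linux-gnu), and loads the module from that file by calling the PyInit__C entry point that pybind11 generates.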
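To make the name-based lookup concrete, here is a minimal sketch using only the standard library. It creates an empty placeholder file named _C plus the interpreter's first extension suffix in a temporary directory, then asks the import machinery which file it would choose for the name _C. The helper names locate_extension and demo are made up for this illustration, and the placeholder file is never actually loaded (it is not a real shared library).

```python
import importlib.machinery
import os
import tempfile


def locate_extension(module_name, directory):
    """Return the path the import system would pick for an extension
    module called `module_name` inside `directory`, or None if no file
    with a recognised extension suffix exists there."""
    finder = importlib.machinery.FileFinder(
        directory,
        # Pair the extension-module loader with the suffixes this
        # interpreter accepts, e.g. '.cpython-36m-x86_64-linux-gnu.so'.
        (importlib.machinery.ExtensionFileLoader,
         importlib.machinery.EXTENSION_SUFFIXES),
    )
    spec = finder.find_spec(module_name)
    return None if spec is None else spec.origin


def demo():
    """Create a placeholder '_C<suffix>' file and locate it by name only."""
    suffix = importlib.machinery.EXTENSION_SUFFIXES[0]
    with tempfile.TemporaryDirectory() as tmp:
        placeholder = os.path.join(tmp, "_C" + suffix)
        open(placeholder, "wb").close()  # empty file; found but never loaded
        found = locate_extension("_C", tmp)
        return suffix, os.path.basename(found)


if __name__ == "__main__":
    print("accepted suffixes:", importlib.machinery.EXTENSION_SUFFIXES)
    print("match for '_C':", demo())
```

For a real, already imported extension module, _C.__file__ and _C.__spec__.loader show the same information: the path that was matched and the loader that handled it.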
All of the C++ files used to build importable Python modules in the repo you refer to ultimately include torch/extension.h (via vision.h, see https://github.com/microsoft/scene_graph_benchmark/blob/main/maskrcnn_benchmark/csrc/cuda/vision.h#L3), and if you explore the PyTorch source, extension.h includes python.h, which in turn includes pybind.h (https://pytorch.org/cppdocs/api/program_listing_file_torch_csrc_api_include_torch_python.h.html).