There are multiple questions about how to call C/C++ code from Python, but I would like to understand what exactly happens when this is done and what the performance concerns are. What is the theory underneath? Some questions I hope to get answered by understanding the principle are:
Consider large data (e.g. 2 GB) that needs to be passed from Python to C/C++ and then back. How is the data transferred from Python to C when the function is called? How is the result transferred back after the function returns? Is everything done in memory, or are UNIX/TCP sockets or files used to transfer the data? Is there some translation and copying done (e.g. to convert data types)? Do I need 2 GB of memory to hold the data in Python and roughly another 2 GB to hold a C version of the data that is passed to the C function? Do the C code and the Python code run in different processes?
You can call between C, C++, Python, and a bunch of other languages without spawning a separate process or copying much of anything.
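As a rough illustration of the "no separate process, no copying" point, here is a minimal sketch using the standard-library ctypes module rather than a hand-written extension (that choice, and the tiny 16-byte buffer standing in for a large one, are my assumptions for the demo). The C function memset writes directly into the memory that the Python bytearray already owns:

```python
import ctypes

# A small stand-in for a large buffer owned by Python.
buf = bytearray(16)

# Expose the bytearray's own memory to C. from_buffer shares the
# memory through the buffer protocol; it does not copy it.
view = (ctypes.c_char * len(buf)).from_buffer(buf)

# memset is an ordinary C function. It receives a raw address and
# writes straight into the bytearray's storage, in this process.
ctypes.memset(ctypes.addressof(view), ord("A"), len(buf))

print(buf)  # bytearray(b'AAAAAAAAAAAAAAAA')
```

The Python object sees the change immediately because there was only ever one block of memory. A 2 GB buffer would work the same way, with no second 2 GB allocation.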
In Python (CPython, at least) basically everything is reference-counted, so if you want to use a Python object in C++ you can simply take part in the same reference count to manage its lifetime (e.g. hold a reference so the object stays alive, without copying it, even if Python decides it doesn't need the object anymore). If you want the reverse, you may need to use a C++ std::shared_ptr or similar to hold your objects in C++, so that Python can also reference them.
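To make the shared reference count concrete, here is a small sketch that calls the C API functions Py_IncRef and Py_DecRef through ctypes.pythonapi. This assumes CPython; a real extension would normally use the Py_INCREF / Py_DECREF macros from C, and the variable names are just for illustration:

```python
import ctypes
import sys

# Bind the C API functions. py_object passes the PyObject* itself.
inc_ref = ctypes.pythonapi.Py_IncRef
inc_ref.argtypes = [ctypes.py_object]
inc_ref.restype = None

dec_ref = ctypes.pythonapi.Py_DecRef
dec_ref.argtypes = [ctypes.py_object]
dec_ref.restype = None

data = [1, 2, 3]

before = sys.getrefcount(data)
inc_ref(data)                    # "C side" takes ownership of one reference
held = sys.getrefcount(data)
dec_ref(data)                    # "C side" releases it again
after = sys.getrefcount(data)

print(held - before, after - before)
```

While the extra reference is held, the list cannot be freed even if every Python name for it goes away. Nothing is copied; both sides point at the same object.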
In some cases things are even simpler than this, such as if you have a pure function in C or C++ which takes some values from Python and returns a result with no side effects and no storing of the inputs. In such a case, you certainly do not need to copy anything, because you can read the Python values directly: the call is an ordinary synchronous function call in the same thread of the same process, so the interpreter is not executing Python code in that thread while your C or C++ code runs (and unless your code releases the GIL, other Python threads do not run either).
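One way to see that the C code runs in the calling thread is to ask the C side for its thread identifier. This sketch assumes CPython and uses the C API function PyThread_get_thread_ident via ctypes.pythonapi; treating its return value as an unsigned long is based on its documented C signature:

```python
import ctypes
import threading

# C API function that returns the identifier of the calling thread.
get_ident_c = ctypes.pythonapi.PyThread_get_thread_ident
get_ident_c.argtypes = []
get_ident_c.restype = ctypes.c_ulong

c_side = get_ident_c()             # evaluated inside C code
py_side = threading.get_ident()    # evaluated by the interpreter

same_thread = (c_side == py_side)
print(same_thread)
```

On the common platforms I'm aware of, both calls should report the same identifier, since the C function is simply called on the interpreter's own thread.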
There is an extensive Python (and NumPy, by the way) C API for this, plus the excellent Boost.Python for C++ integration including smart pointers.