I am creating a custom library (written in C++) that does some numerical stuff with ARPACK-NG. The function is wrapped in pybind11 to provide access to the method from Python in a package. I observe strange behavior.
import numpy as np
from mylib import mymethod
mymethod() # Segfault
The same segfault occurs if the import order is swapped.
from mylib import mymethod
import numpy as np
mymethod() # Segfault
However, if mymethod() is called before NumPy is imported, everything works.
from mylib import mymethod
mymethod() # Works fine
import numpy as np
# Further calls to NumPy or my library also work.
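Because a segfault kills the interpreter, it can help to run each import order in a fresh process and compare exit statuses. The following is only a sketch of how that could be done: `mylib` and `mymethod` are the names from above, while `SCENARIOS`, `run_snippet`, and `describe` are hypothetical helpers made up for illustration.

```python
import subprocess
import sys

# The three import orders described above, each as a standalone script.
# `mylib` and `mymethod` come from the question; the rest is illustrative.
SCENARIOS = {
    "numpy first": "import numpy as np\nfrom mylib import mymethod\nmymethod()\n",
    "mylib first": "from mylib import mymethod\nimport numpy as np\nmymethod()\n",
    "call before numpy": "from mylib import mymethod\nmymethod()\nimport numpy as np\n",
}


def run_snippet(code):
    """Run `code` in a fresh interpreter and return its exit status."""
    result = subprocess.run([sys.executable, "-c", code], capture_output=True)
    return result.returncode


def describe(returncode):
    """Translate an exit status into a short label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        return f"killed by signal {-returncode}"
    return f"exit code {returncode}"
```

Looping over `SCENARIOS.items()` and printing `describe(run_snippet(code))` for each entry gives one line per import order; on Linux a segfault would be expected to show up as a negative status corresponding to SIGSEGV.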
The backtrace looks like this.
#0 0x00007fffec59d2ef in mkl_blas.cdotc () from /home/myname/.conda/envs/mylib/lib/./libmkl_intel_lp64.so.1
#1 0x00007ffff7281974 in cneupd_ () from /home/myname/.conda/envs/mylib/lib/libarpack.so.2
#2 0x00007ffff72af228 in cneupd_c () from /home/myname/.conda/envs/mylib/lib/libarpack.so.2
#3 0x00007ffff76b25cd in void complex_symmetric_runner<float>(double const&) ()
from /home/myname/Documents/mylib/build/lib.linux-x86_64-3.9/mylib/libmylib.so
#4 0x00007ffff76b102b in mymethod() ()
The test code is essentially the same as the C++ example provided by ARPACK-NG, with the main method replaced by mymethod(). The minimal binding code is:
#include <pybind11/pybind11.h>

// ARPACK-NG C++ example code goes here. The main method is replaced with mymethod so it can be called from pybind11.
void mymethod() {
    // ...Contents of the main() function in the example ARPACK-NG code...
}

PYBIND11_MODULE(mylib, m) {
    m.def("mymethod", mymethod);
}
I believe there is some issue with NumPy and MKL's initialization, similar to this issue. From what I've gathered, NumPy links to MKL dynamically through libmkl_rt.so (the mkl_rt library), as shown in the NumPy config below.
import numpy
numpy.show_config()
blas_mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/myname/.conda/envs/mylib/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/myname/.conda/envs/mylib/include']
blas_opt_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/myname/.conda/envs/mylib/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/myname/.conda/envs/mylib/include']
lapack_mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/myname/.conda/envs/mylib/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/myname/.conda/envs/mylib/include']
lapack_opt_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/myname/.conda/envs/mylib/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/myname/.conda/envs/mylib/include']
Supported SIMD extensions in this NumPy install:
baseline = SSE,SSE2,SSE3
found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2
not found = AVX512F,AVX512CD,AVX512_KNL,AVX512_KNM,AVX512_SKX,AVX512_CNL
My library reaches BLAS dynamically through ARPACK-NG's shared library and, per the GDB trace, ends up calling into libmkl_intel_lp64.so. This is confusing, however, because the output of ldd /home/myname/.conda/envs/mylib/lib/libarpack.so.2 makes no mention of libarpack.so linking to MKL.
linux-vdso.so.1 (0x0000697945afd000)
libblas.so.3 => /home/myname/.conda/envs/mylib/lib/./libblas.so.3 (0x0000697945200000)
libgfortran.so.4 => /home/myname/.conda/envs/mylib/lib/./libgfortran.so.4 (0x0000697945972000)
libm.so.6 => /usr/lib/libm.so.6 (0x00006979450db000)
libc.so.6 => /usr/lib/libc.so.6 (0x0000697944ed1000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x000069794596d000)
libquadmath.so.0 => /home/myname/.conda/envs/mylib/lib/./libquadmath.so.0 (0x0000697944e97000)
libgcc_s.so.1 => /home/myname/.conda/envs/mylib/lib/./libgcc_s.so.1 (0x0000697944e82000)
/usr/lib64/ld-linux-x86-64.so.2 (0x0000697945aff000)
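One thing worth checking here (an assumption on my part, not something the output above confirms) is whether libblas.so.3 in the environment is itself a symlink that resolves to an MKL library; ldd prints the name it found, not necessarily the final target. A minimal sketch, where `resolve_library` is a hypothetical helper name:

```python
import os


def resolve_library(path):
    """Return (is_symlink, final_target) for a shared-library path."""
    # os.path.realpath follows every symlink in the chain.
    return os.path.islink(path), os.path.realpath(path)
```

Calling `resolve_library("/home/myname/.conda/envs/mylib/lib/libblas.so.3")` would show whether the file that libarpack.so.2 depends on is a regular library or a link to something else.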
If I were to guess what is happening, NumPy checks whether some BLAS library is already loaded when it is imported. If my code is called first, it loads libblas.so, and NumPy happens to use that. However, if NumPy is imported first, it loads MKL as the BLAS library, which somehow interferes with libarpack.so.
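A guess like this can be probed by listing which BLAS-related shared objects are actually mapped into the process at different points. The sketch below reads /proc/self/maps, so it is Linux-only; `mapped_libraries` and its default keywords are hypothetical names chosen for illustration.

```python
def mapped_libraries(keywords=("mkl", "blas", "arpack"), maps_text=None):
    """Return sorted paths of mapped shared objects whose file name contains a keyword.

    Reads /proc/self/maps (Linux only) unless `maps_text` is supplied.
    """
    if maps_text is None:
        with open("/proc/self/maps") as handle:
            maps_text = handle.read()
    found = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1].lower()
        if ".so" in name and any(key in name for key in keywords):
            found.add(path)
    return sorted(found)
```

Calling `mapped_libraries()` once before and once after `import numpy` (and likewise around the first call to `mymethod()`) shows which libraries each step brings in, and in which order.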
Is my assessment correct, and is there a way to solve this problem?
As far as I can tell, my guess at the root cause was correct. I have arrived at a solution that, while not completely satisfactory, solves the problem: create the Anaconda environment with the nomkl package (i.e. conda create -n mylib_nomkl nomkl python=3.9 numpy). NumPy will then no longer swap the BLAS out from under ARPACK-NG.