python conda pyinstaller openblas

Reducing the number of identical BLAS DLLs in a PyInstaller-generated distributable


I created an executable with PyInstaller and noticed that even after some size-reduction tricks (creating a custom environment, using OpenBLAS instead of MKL) the package is still quite big. Looking into the _internal directory, I found that the same DLL has been copied there four times under different names. I used WinMerge to verify that the files are indeed binary identical.

dir /Os

[...]
08-May-25  12:07         7,280,128 python313.dll
08-May-25  12:07        27,951,616 liblapack.dll
08-May-25  12:07        27,951,616 openblas.dll
08-May-25  12:07        27,951,616 libcblas.dll
08-May-25  12:07        27,951,616 libblas.dll
             145 File(s)    161,490,356 bytes

Out of a total of 247 MB for the package, those four libraries make up 106 MB.
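For anyone who wants to check their own build without WinMerge, here is a minimal sketch that groups byte-identical files by size and SHA-256 hash. The function names (`file_digest`, `find_duplicates`, `wasted_bytes`) and the example path in the comment are my own illustrative choices, not part of PyInstaller or conda.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under `root` that have identical content.

    Returns a list of groups; each group is a sorted list of at least
    two Paths whose contents hash to the same digest.
    """
    # Cheap pre-filter: only files of equal size can be identical.
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def wasted_bytes(groups):
    """Bytes taken up by the redundant copies (all but one per group)."""
    return sum((len(g) - 1) * g[0].stat().st_size for g in groups)


# Example usage (hypothetical path, adjust to your own dist folder):
#   for group in find_duplicates("dist/myapp/_internal"):
#       print([p.name for p in group])
```

For the listing above, three of the four 27,951,616-byte files are redundant copies, which comes to 83,854,848 bytes. The sketch only reports duplicates; it does not say whether removing any of them is safe.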

How can I avoid shipping the same file multiple times? Can I tell PyInstaller to deduplicate it? Can I avoid the copies when installing the Python packages into the environment?

Creating symbolic links with mklink is not an option on my Windows installation ("You do not have sufficient privilege to perform this operation."), so any solution that targets the root cause would be appreciated.

Digging into the conda-created environment reveals that the multiple copies of the same library are already present there. numpy links against all of them, which makes it impossible to untie the dependency for the existing build.


Solution

  • This behavior seems to be caused specifically by using conda as the package manager, which ships multiple copies of the same library. It is also reproducible on Linux and macOS builds.

    The solution is to drop conda and rely purely on venv and pip to set up your environment in the first place.

    Note: numpy also lets you specify which BLAS library to use at build time via pkg-config.
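As a rough sketch of the venv-plus-pip route, the following uses only the standard library to create the environment and install the build dependencies. The helper names (`venv_python`, `pip_install_command`, `create_build_env`), the directory name `build-env`, and the default package list are assumptions for illustration; adjust them to your project.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir):
    """Path of the interpreter inside a venv (the layout differs on Windows)."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def pip_install_command(env_dir, packages):
    """Build the `python -m pip install ...` command for that venv."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", *packages]


def create_build_env(env_dir, packages=("numpy", "pyinstaller")):
    """Create a venv with pip and install the given packages into it.

    The package list is an assumption; add whatever your application imports.
    Requires network access to reach the package index.
    """
    venv.EnvBuilder(with_pip=True).create(env_dir)
    subprocess.run(pip_install_command(env_dir, packages), check=True)
    return venv_python(env_dir)


# Example usage (hypothetical directory name):
#   python_exe = create_build_env("build-env")
#   subprocess.run([str(python_exe), "-m", "PyInstaller", "your_script.py"], check=True)
```

After building from such an environment, it is worth checking the _internal directory again to confirm that only one BLAS library ends up in it. Calling `numpy.show_config()` inside the environment shows which BLAS/LAPACK numpy was built against.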