I created an executable with pyinstaller and noticed that, even after some size-reduction tricks (creating a custom environment, using OpenBLAS instead of MKL), the package comes out quite big. Looking into the _internal directory, I found that the same DLL has been copied there four times under different names. I used WinMerge to verify that the files are indeed binary identical.
dir /Os
[...]
08-May-25 12:07 7,280,128 python313.dll
08-May-25 12:07 27,951,616 liblapack.dll
08-May-25 12:07 27,951,616 openblas.dll
08-May-25 12:07 27,951,616 libcblas.dll
08-May-25 12:07 27,951,616 libblas.dll
145 File(s) 161,490,356 bytes
Out of a total of 247MB for the package those libraries make up 106MB.
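The manual WinMerge comparison can also be scripted. Below is a minimal sketch, not part of the original post, that groups files with identical content by size and SHA-256 hash and reports how many bytes the extra copies take up. The function names `file_digest`, `find_duplicates` and `wasted_bytes` are made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root that have byte-identical content."""
    # Group by size first so that only same-sized files need hashing.
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[file_digest(p)].append(p)
        # Keep only hashes shared by more than one file.
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def wasted_bytes(groups: list[list[Path]]) -> int:
    """Bytes that would be saved if each group kept a single copy."""
    return sum(g[0].stat().st_size * (len(g) - 1) for g in groups)
```

Pointing it at the bundle, for example `find_duplicates(Path("dist/myapp/_internal"))` (a hypothetical path), should list the four BLAS/LAPACK DLLs as one group if they really are identical.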
How can I avoid shipping the same file multiple times? Can I tell pyinstaller to avoid it? Can I avoid it during Python package installation in the environment?
Creating symbolic links with mklink is not an option on my Windows installation ("You do not have sufficient privilege to perform this operation."), so any solution that targets the root cause would be appreciated.
Digging into the conda-created environment reveals that the multiple copies of the same library are already present in the environment itself. numpy links against all of them, which makes it impossible to untie the dependency for the existing build.
This behavior seems to be caused specifically by using conda as the package manager, which ships multiple copies of the same library. It is also reproducible on Linux and macOS builds.
The solution is to drop conda and rely purely on venv and pip to set up your environment in the first place.
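As a rough sketch of that setup, the environment can be created with the standard-library `venv` module and populated with pip. The helper names `venv_python` and `create_env`, the directory name and the package list are assumptions for illustration. It also assumes that the numpy wheels on PyPI bundle a single BLAS/LAPACK library, which is what this answer relies on.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Path to the interpreter inside a venv (layout differs per platform)."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_env(env_dir: Path, packages: list[str], with_pip: bool = True) -> Path:
    """Create a venv, pip-install the given packages, return its interpreter."""
    venv.create(env_dir, with_pip=with_pip)
    python = venv_python(env_dir)
    if packages:
        # Run pip through the venv's own interpreter so packages land there.
        subprocess.run(
            [str(python), "-m", "pip", "install", *packages], check=True
        )
    return python
```

A call such as `create_env(Path("build-env"), ["numpy", "pyinstaller"])` would then give an interpreter from which pyinstaller can be run. Checking the resulting bundle for duplicate DLLs afterwards is still worthwhile.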
Note: numpy also allows you to specify which BLAS library to use at build time via pkg-config.