I'm trying to install mxnet with GPU support on Colab.
Colab currently seems to have CUDA 11.1 installed by default, as
!nvcc --version
gives
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
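For reference, here is a minimal sketch of reading the release number out of that output in Python and deriving the matching wheel suffix. The helper names `parse_cuda_release` and `mxnet_wheel_suffix` are made up for illustration. The `mxnet-cuXYZ` naming is inferred from packages such as mxnet-cu112 and mxnet-cu102, so a wheel with the derived suffix may not exist.

```python
import re
from typing import Optional


def parse_cuda_release(nvcc_output: str) -> Optional[str]:
    """Return the 'major.minor' release found in `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def mxnet_wheel_suffix(release: str) -> str:
    """Turn '11.2' into 'cu112' (assumed naming convention, not an official API)."""
    major, minor = release.split(".")
    return f"cu{major}{minor}"


# One line copied from the nvcc output above.
sample = "Cuda compilation tools, release 11.1, V11.1.105"
release = parse_cuda_release(sample)
print(release, mxnet_wheel_suffix(release))
```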
I've tried 3 different approaches to achieve the goal but none of them worked.
Firstly, I tried this set of commands from the nvidia docs:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda-repo-ubuntu1804-11-2-local_11.2.0-460.27.04-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-2-local_11.2.0-460.27.04-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu1804-11-2-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
The installation itself went fine, but I ended up with the latest version of CUDA, 11.4, instead of 11.2.
Secondly, I tried the runfile
!wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run
!sh ./cuda_11.2.2_460.32.03_linux.run --toolkit --silent --override
The installation went well, and it looks like I managed to install CUDA 11.2, as this command
!nvcc --version
gives
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
and then I ran this command
!pip install mxnet-cu112
and got
Collecting mxnet-cu112
Downloading mxnet_cu112-1.8.0.post0-py2.py3-none-manylinux2014_x86_64.whl (495.7 MB)
|████████████████████████████████| 495.7 MB 15 kB/s
Collecting graphviz<0.9.0,>=0.8.1
Downloading graphviz-0.8.4-py2.py3-none-any.whl (16 kB)
Requirement already satisfied: numpy<2.0.0,>1.16.0 in /usr/local/lib/python3.7/dist-packages (from mxnet-cu112) (1.19.5)
Requirement already satisfied: requests<3,>=2.20.0 in /usr/local/lib/python3.7/dist-packages (from mxnet-cu112) (2.23.0)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (3.0.4)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (1.24.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (2021.5.30)
Installing collected packages: graphviz, mxnet-cu112
Attempting uninstall: graphviz
Found existing installation: graphviz 0.10.1
Uninstalling graphviz-0.10.1:
Successfully uninstalled graphviz-0.10.1
Successfully installed graphviz-0.8.4 mxnet-cu112-1.8.0.post0
Finally, I tested the installation with this command
import mxnet as mx
and I got this libnvrtc error:
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-7-265f02e9c062> in <module>()
----> 1 import mxnet as mx
4 frames
/usr/lib/python3.7/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
362
363 if handle is None:
--> 364 self._handle = _dlopen(self._name, mode)
365 else:
366 self._handle = handle
OSError: libnvrtc.so.11.2: cannot open shared object file: No such file or directory
So, I tried to check the existence of the library
!find /usr/ -name "libnvrtc*"
and I got
/usr/local/lib/python3.7/dist-packages/torch/lib/libnvrtc-08c4863f.so.10.2
/usr/local/lib/python3.7/dist-packages/torch/lib/libnvrtc-builtins.so
/usr/local/lib/python2.7/dist-packages/torch/lib/libnvrtc-5e8a26c9.so.10.1
/usr/local/lib/python2.7/dist-packages/torch/lib/libnvrtc-builtins.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc-builtins.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc-builtins.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc-builtins.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.0
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc.so.11.0.221
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.0.221
/usr/local/cuda-11.0/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc.so.11.0
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so.11.1
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.1.105
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.1
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so.11.1.105
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.0.130
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc.so.10.0.130
/usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.0
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc.so.10.0
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.1
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc.so.10.1
/usr/local/cuda-10.1/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc.so.10.1.243
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.1.243
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc-builtins.so
and another command
%ll /usr/local/cuda/lib64/libnvrtc*
gives
lrwxrwxrwx 1 root 25 Sep 22 00:58 /usr/local/cuda/lib64/libnvrtc-builtins.so -> libnvrtc-builtins.so.11.2*
lrwxrwxrwx 1 root 29 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc-builtins.so.11.2 -> libnvrtc-builtins.so.11.2.152*
-rwxr-xr-x 1 root 6122648 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc-builtins.so.11.2.152*
lrwxrwxrwx 1 root 16 Sep 22 00:58 /usr/local/cuda/lib64/libnvrtc.so -> libnvrtc.so.11.2*
lrwxrwxrwx 1 root 20 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc.so.11.2 -> libnvrtc.so.11.2.152*
-rwxr-xr-x 1 root 43954832 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc.so.11.2.152*
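The listing shows a chain of symlinks, so one sanity check is to confirm that the chain ends at a real file. This is a small sketch of that check. The helper name `resolve_library` is hypothetical, and the directory and file name are the ones from the listing above.

```python
import os
from typing import Optional


def resolve_library(lib_dir: str, soname: str) -> Optional[str]:
    """Follow symlinks for lib_dir/soname and return the real file path, or None."""
    candidate = os.path.join(lib_dir, soname)
    real_path = os.path.realpath(candidate)
    # A dangling symlink or a missing entry does not resolve to a regular file.
    return real_path if os.path.isfile(real_path) else None


# Prints the resolved path, or None if the name does not lead to a real file.
print(resolve_library("/usr/local/cuda/lib64", "libnvrtc.so.11.2"))
```

This only shows that the file exists on disk. It does not show that the dynamic loader will search that directory.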
Does this mean I already have the library that mxnet-cu112 needs?
I tried to point mxnet at the directory where "libnvrtc.so.11.2" is located,
%env LD_LIBRARY_PATH=/usr/local/cuda/lib64/
but it didn't work either.
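One possible explanation, which I have not verified on Colab: the dynamic loader reads LD_LIBRARY_PATH when a process starts, so changing it inside an already running kernel does not affect later library lookups in that process. Under that assumption, a workaround to try is loading the library by absolute path with ctypes before importing mxnet. The sketch below does this. `preload_libraries` is a made-up helper, and other CUDA libraries may need to be added to the list.

```python
import ctypes
import os
from typing import List, Tuple


def preload_libraries(lib_dir: str, sonames: List[str]) -> Tuple[List[str], List[str]]:
    """Try to load each library by absolute path; return (loaded, failed) names."""
    loaded, failed = [], []
    for soname in sonames:
        path = os.path.join(lib_dir, soname)
        try:
            # RTLD_GLOBAL makes the symbols visible to libraries loaded later.
            ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
            loaded.append(soname)
        except OSError:
            failed.append(soname)
    return loaded, failed


loaded, failed = preload_libraries("/usr/local/cuda/lib64", ["libnvrtc.so.11.2"])
print("loaded:", loaded, "failed:", failed)
# `import mxnet as mx` would be attempted only after the preload step.
```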
I also tried this
!apt-get install -y libnvrtc=11.2
and I got this
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package libnvrtc
How do I fix the "libnvrtc" error?
cuda 10.2
I factory-reset the runtime and tried these commands:
!wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
!sh ./cuda_10.2.89_440.33.01_linux.run --toolkit --silent --override
!pip install mxnet-cu102
Everything went well until this command
import mxnet as mx
gives
OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory
and this command
%ll /usr/local/cuda/lib64/libcudart*
gives this
lrwxrwxrwx 1 root 17 Sep 22 01:36 /usr/local/cuda/lib64/libcudart.so -> libcudart.so.10.2*
lrwxrwxrwx 1 root 20 Sep 22 01:35 /usr/local/cuda/lib64/libcudart.so.10.2 -> libcudart.so.10.2.89*
-rwxr-xr-x 1 root 509248 Sep 22 01:35 /usr/local/cuda/lib64/libcudart.so.10.2.89*
-rw-r--r-- 1 root 902366 Sep 22 01:36 /usr/local/cuda/lib64/libcudart_static.a
I also tried the suggestions in this thread, but none of them worked for me.
How do I fix the error?
Another possible solution might be to install a different version of mxnet, though there seems to be no mxnet binary for CUDA 11.1.
The following approach works for cuda-10.0 and cuda-11.0:
!sudo ln -sfT /usr/local/cuda-10.0/ /usr/local/cuda
!pip install mxnet-cu100mkl
import mxnet
mxnet.__version__
For cuda-11.0, just replace the first two lines with:
!sudo ln -sfT /usr/local/cuda-11.0/ /usr/local/cuda
!pip install mxnet-cu110
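To check where the symlink ends up, the relinking step can also be sketched in Python. This is a rough equivalent of `ln -sfT`, not a replacement for it. `repoint_symlink` is a hypothetical helper, and changing `/usr/local/cuda` itself would need root permissions.

```python
import os


def repoint_symlink(target: str, link_path: str) -> str:
    """Point link_path at target, replacing an existing symlink; return the resolved path."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    # Renaming over the old link swaps it in one step instead of delete-then-create.
    os.replace(tmp_link, link_path)
    return os.path.realpath(link_path)


# Example call (commented out because it modifies a system path):
# print(repoint_symlink("/usr/local/cuda-11.0", "/usr/local/cuda"))
```

The returned path should be one of the versioned CUDA directories. If it points to a directory that does not exist, the symlink is dangling and the target path should be double-checked.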