python google-colaboratory mxnet

How to install mxnet on google colab?


I'm trying to install mxnet with GPU support on Colab.

Colab currently seems to have CUDA 11.1 installed by default, since

!nvcc --version

gives

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
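
For reference, mxnet's GPU wheels are named after the CUDA release they target (for example mxnet-cu102 or mxnet-cu112). A minimal sketch of deriving a candidate package name from the nvcc output follows; the helper name `mxnet_package_for` is hypothetical, and the name it produces is only a candidate, since a wheel may not be published for every CUDA release.

```python
import re


def mxnet_package_for(nvcc_output: str) -> str:
    """Map `nvcc --version` output to a candidate pip package name."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release found in nvcc output")
    major, minor = match.groups()
    # Naming convention: mxnet-cu<major><minor>, e.g. 11.2 -> mxnet-cu112.
    return f"mxnet-cu{major}{minor}"


sample = "Cuda compilation tools, release 11.1, V11.1.105"
print(mxnet_package_for(sample))  # mxnet-cu111
```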

I've tried three different approaches, but none of them worked.

First try - cuda 11.2, local deb

First, I tried this set of commands from the NVIDIA docs:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda-repo-ubuntu1804-11-2-local_11.2.0-460.27.04-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-2-local_11.2.0-460.27.04-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu1804-11-2-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda

The installation completed without errors, but it installed the latest version of CUDA (11.4) instead of 11.2.

Second try - cuda 11.2, runfile

Next, I tried the runfile installer:

!wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run
!sh ./cuda_11.2.2_460.32.03_linux.run --toolkit --silent --override

The installation went well, and CUDA 11.2 appears to be installed, since this command

!nvcc --version

gives

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

and then I ran this command

!pip install mxnet-cu112

and got

Collecting mxnet-cu112
  Downloading mxnet_cu112-1.8.0.post0-py2.py3-none-manylinux2014_x86_64.whl (495.7 MB)
     |████████████████████████████████| 495.7 MB 15 kB/s 
Collecting graphviz<0.9.0,>=0.8.1
  Downloading graphviz-0.8.4-py2.py3-none-any.whl (16 kB)
Requirement already satisfied: numpy<2.0.0,>1.16.0 in /usr/local/lib/python3.7/dist-packages (from mxnet-cu112) (1.19.5)
Requirement already satisfied: requests<3,>=2.20.0 in /usr/local/lib/python3.7/dist-packages (from mxnet-cu112) (2.23.0)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (3.0.4)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (1.24.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (2021.5.30)
Installing collected packages: graphviz, mxnet-cu112
  Attempting uninstall: graphviz
    Found existing installation: graphviz 0.10.1
    Uninstalling graphviz-0.10.1:
      Successfully uninstalled graphviz-0.10.1
Successfully installed graphviz-0.8.4 mxnet-cu112-1.8.0.post0

Finally, I tested the installation with this command

import mxnet as mx

and I got the libnvrtc error

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-7-265f02e9c062> in <module>()
----> 1 import mxnet as mx

4 frames
/usr/lib/python3.7/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
    362 
    363         if handle is None:
--> 364             self._handle = _dlopen(self._name, mode)
    365         else:
    366             self._handle = handle

OSError: libnvrtc.so.11.2: cannot open shared object file: No such file or directory

So, I tried to check the existence of the library

!find /usr/ -name "libnvrtc*"

and I got

/usr/local/lib/python3.7/dist-packages/torch/lib/libnvrtc-08c4863f.so.10.2
/usr/local/lib/python3.7/dist-packages/torch/lib/libnvrtc-builtins.so
/usr/local/lib/python2.7/dist-packages/torch/lib/libnvrtc-5e8a26c9.so.10.1
/usr/local/lib/python2.7/dist-packages/torch/lib/libnvrtc-builtins.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc-builtins.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc-builtins.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc-builtins.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.0
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc.so.11.0.221
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.0.221
/usr/local/cuda-11.0/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc.so.11.0
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so.11.1
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.1.105
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.1
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so.11.1.105
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.0.130
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc.so.10.0.130
/usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.0
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc.so.10.0
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.1
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc.so.10.1
/usr/local/cuda-10.1/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc.so.10.1.243
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.1.243
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc-builtins.so
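
The same search can be sketched in Python, which makes it easier to filter the results. The helper name `find_shared_libs` is hypothetical. It skips `stubs` directories on the assumption that stub libraries are meant for linking at build time, not for loading at runtime.

```python
from pathlib import Path
from typing import List


def find_shared_libs(root: str, soname: str) -> List[str]:
    """Return sorted paths under `root` whose file name equals `soname`."""
    return sorted(
        str(path)
        for path in Path(root).rglob(soname)
        if "stubs" not in path.parts  # ignore link-time stub libraries
    )


# Look for the exact file name from the error message.
for lib in find_shared_libs("/usr/local", "libnvrtc.so.11.2"):
    print(lib)
```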

and another command

%ll /usr/local/cuda/lib64/libnvrtc*

gives

lrwxrwxrwx 1 root       25 Sep 22 00:58 /usr/local/cuda/lib64/libnvrtc-builtins.so -> libnvrtc-builtins.so.11.2*
lrwxrwxrwx 1 root       29 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc-builtins.so.11.2 -> libnvrtc-builtins.so.11.2.152*
-rwxr-xr-x 1 root  6122648 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc-builtins.so.11.2.152*
lrwxrwxrwx 1 root       16 Sep 22 00:58 /usr/local/cuda/lib64/libnvrtc.so -> libnvrtc.so.11.2*
lrwxrwxrwx 1 root       20 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc.so.11.2 -> libnvrtc.so.11.2.152*
-rwxr-xr-x 1 root 43954832 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc.so.11.2.152*

Does this mean I already have the library that mxnet-cu112 needs?

I tried to point mxnet at the directory where "libnvrtc.so.11.2" is located,

%env LD_LIBRARY_PATH=/usr/local/cuda/lib64/

but it didn't work either.
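
One possible explanation, offered as an assumption and not a confirmed diagnosis: on Linux the dynamic loader reads LD_LIBRARY_PATH when a process starts, so changing it with %env inside an already running kernel may not affect later library loads in that kernel. A workaround sometimes used in this situation is to load the library by absolute path before importing the package, so the loader finds it already in memory. The sketch below illustrates that idea; `preload_shared_libs` is a hypothetical helper, and mxnet may need other CUDA libraries preloaded the same way.

```python
import ctypes
import os
from typing import Iterable, List


def preload_shared_libs(paths: Iterable[str]) -> List[str]:
    """Load each existing library by absolute path; return the paths that were loaded."""
    loaded = []
    for path in paths:
        if not os.path.exists(path):
            continue  # skip libraries that are not present on this machine
        # RTLD_GLOBAL makes the library's symbols visible to libraries loaded later.
        ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
        loaded.append(path)
    return loaded


# Path taken from the listing above; adjust it to the installed CUDA version.
print(preload_shared_libs(["/usr/local/cuda/lib64/libnvrtc.so.11.2"]))
# import mxnet as mx  # run the import only after the preload
```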

I also tried this

!apt-get install -y libnvrtc=11.2

and I got this

Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package libnvrtc

How do I fix the "libnvrtc" error?

Third try - cuda 10.2

I factory-reset the runtime and tried these commands:

!wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
!sh ./cuda_10.2.89_440.33.01_linux.run --toolkit --silent --override
!pip install mxnet-cu102

Everything went well until this command

import mxnet as mx

gives

OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory

and this command

%ll /usr/local/cuda/lib64/libcudart*

gives this

lrwxrwxrwx 1 root     17 Sep 22 01:36 /usr/local/cuda/lib64/libcudart.so -> libcudart.so.10.2*
lrwxrwxrwx 1 root     20 Sep 22 01:35 /usr/local/cuda/lib64/libcudart.so.10.2 -> libcudart.so.10.2.89*
-rwxr-xr-x 1 root 509248 Sep 22 01:35 /usr/local/cuda/lib64/libcudart.so.10.2.89*
-rw-r--r-- 1 root 902366 Sep 22 01:36 /usr/local/cuda/lib64/libcudart_static.a

I also tried the suggestions in this thread, but none of them worked for me.

How do I fix the error?

Another possible solution might be to install a different version of mxnet, though it seems there is no mxnet binary for CUDA 11.1.


Solution

  • The following approach works for cuda-10.0 and cuda-11.0. It repoints the /usr/local/cuda symlink at a CUDA version that is already installed on Colab, then installs the matching mxnet build:

    !sudo ln -sfT /usr/local/cuda-10.0/ /usr/local/cuda
    !pip install mxnet-cu100mkl

    import mxnet
    mxnet.__version__

    For cuda-11.0, just replace the first two lines with:

    !sudo ln -sfT /usr/local/cuda-11.0/ /usr/local/cuda
    !pip install mxnet-cu110
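
Printing the version only confirms that the import succeeded. To check that the GPU is actually usable, a small allocation on the device can help. This is a minimal sketch assuming the mxnet 1.x API (`mx.context.num_gpus`, `mx.nd.ones`, `mx.gpu`); the function name `gpu_status` is hypothetical.

```python
def gpu_status() -> str:
    """Report whether mxnet imports and can place an array on the first GPU."""
    try:
        import mxnet as mx
    except (ImportError, OSError) as err:
        # OSError covers missing CUDA shared libraries such as libnvrtc or libcudart.
        return f"mxnet import failed: {err}"
    count = mx.context.num_gpus()
    if count == 0:
        return f"mxnet {mx.__version__} imported, but no GPU is visible"
    # Allocate a small array on the first GPU to confirm the CUDA libraries load.
    x = mx.nd.ones((2, 2), ctx=mx.gpu(0))
    return f"mxnet {mx.__version__} sees {count} GPU(s); sum on GPU = {x.sum().asscalar()}"


print(gpu_status())
```

If this reports an import failure, the message names the shared library that could not be found, which indicates which CUDA version the installed wheel expects.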