docker ubuntu gpu nvidia nvidia-docker

Cannot run non-root docker container with GPU


I did a fresh install of Docker Desktop, following the official Docker Desktop docs.

Then I installed the NVIDIA Container Toolkit, following the official NVIDIA Container Toolkit docs.

When I run a GPU container as a non-root user (without sudo):

docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi

It gave the following error:

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

The error does not occur if I run the same command with sudo. Running a non-GPU container (without the --gpus all option) as a non-root user also works fine.

Some information about my PC:

I have tried several suggestions:

sudo chmod 666 /dev/nvidia*
sudo chmod 666 /dev/nvidia-uvm*

sudo chown root:video /usr/local/nvidia/lib64/libnvidia-ml.so.1
sudo chmod 664 /usr/local/nvidia/lib64/libnvidia-ml.so.1

I am really desperate now. Does anyone have any suggestions? Any help will be highly appreciated.


Solution

  • After trying countless methods, I came across this post: docker desktop problem.

    As pointed out in that post, this is a known, unresolved issue with Docker Desktop. So I switched to using Docker Engine only, and now I can use the GPU as a non-root user without any problems.

    Until Docker Desktop is fixed, Docker Engine is more than enough for me.
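If you are not sure which daemon your non-root docker command is actually talking to, a rough check like the one below may help. It is only a sketch: it assumes that Docker Desktop reports "Docker Desktop" in the OperatingSystem field of docker info, and that Docker Desktop on Linux registers its own CLI context (commonly named desktop-linux). The classify_daemon helper is a made-up name for illustration, not part of Docker.

```shell
#!/bin/sh
# classify_daemon is a hypothetical helper: it guesses the daemon type from
# the OperatingSystem string printed by `docker info`.
# Assumption: Docker Desktop reports "Docker Desktop" in that field.
classify_daemon() {
    case "$1" in
        *"Docker Desktop"*) echo "desktop" ;;
        "")                 echo "unknown" ;;
        *)                  echo "engine" ;;
    esac
}

# Only query the daemon if the docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
    os=$(docker info --format '{{.OperatingSystem}}' 2>/dev/null || true)
    echo "Current daemon looks like: $(classify_daemon "$os")"

    # List the CLI contexts; the active one is marked with an asterisk.
    docker context ls || true

    # If the active context belongs to Docker Desktop, switching back to the
    # default context (Docker Engine socket) would look like this:
    #   docker context use default
fi
```

If this prints "desktop" without sudo but "engine" when run with sudo, that would be consistent with the behaviour described in the question: the two invocations may simply be reaching different daemons.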