I'm new to Stack Overflow and the NVIDIA runtime, and I'm trying to run a Docker container with the NVIDIA runtime using Docker Compose. However, I'm getting an error that I don't get when running the container directly with docker run.
Here's the relevant section of my docker-compose.yml file:
services:
  nvidia-test:
    image: nvidia/cuda:11.5.2-base-ubuntu20.04
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
When I run docker-compose up, I get the following error:
Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
However, when I run the container directly with the following docker run command, I don't get any error:
sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.5.2-base-ubuntu20.04 nvidia-smi
I'm not sure what could be causing this error. Can someone help me understand the issue and how to resolve it so that I can run the container with the NVIDIA runtime using Docker Compose? I am currently using docker-compose version v2.16.0, and I installed the NVIDIA Container Toolkit following this link. Here are the NVIDIA driver and CUDA versions installed on my machine:
Please let me know if you need additional information from me to better understand the issue.
I already ran sudo systemctl status nvidia-persistenced to check the persistence daemon, and it is active (running).
Adding sudo in front of docker-compose up solved the problem. I assume that elevated privileges are required for Docker to properly access the necessary NVIDIA tools and libraries, the same as for sudo docker run .... Also, note that --runtime=nvidia in sudo docker run ... is no longer needed with newer nvidia-container-toolkit versions.
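For reference, here is a minimal sketch of the two invocations that worked for me, wrapped in shell functions so that nothing runs until you call one. The names run_compose, run_direct, and IMAGE are only illustrative (they are not part of Docker), and the sketch assumes that docker-compose.yml is in the current directory and that the installed toolkit is recent enough for --gpus all to work without --runtime=nvidia.

```shell
#!/bin/sh
# Image tag taken from the question; IMAGE is an illustrative variable name.
IMAGE="nvidia/cuda:11.5.2-base-ubuntu20.04"

# Start the Compose service with elevated privileges.
# Assumes docker-compose.yml is in the current directory.
run_compose() {
  sudo docker-compose up
}

# Equivalent direct run. --runtime=nvidia is omitted on the assumption
# that a newer nvidia-container-toolkit is installed.
run_direct() {
  sudo docker run --rm --gpus all "$IMAGE" nvidia-smi
}
```

After sourcing the snippet, calling run_compose or run_direct should print the nvidia-smi table if the GPU is visible inside the container.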