I'm trying to run a Docker container created from the image nvidia/cuda:12.0.1-cudnn8-runtime-ubuntu22.04, using Ubuntu 22.04 under WSL 2 (version 1.1.3.0) on Windows 11 with Docker Desktop 4.17.1. Running lsb_release -a confirms the Ubuntu version:
user@desktop:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy
In Docker Desktop, the option "Use the WSL 2 based engine" is checked in Settings -> General, as is "Enable integration with my default WSL distro" in Settings -> Resources -> WSL integration. On the same page, "Enable integration with additional distros:" is switched on for Ubuntu-22.04.
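For anyone trying to reproduce this, the WSL version reported above (1.1.3.0) and the fact that Ubuntu-22.04 runs under WSL 2 rather than WSL 1 can be double-checked from a PowerShell terminal (output omitted here):

wsl --version
wsl --list --verbose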
Running nvidia-smi from an Ubuntu terminal produces:
user@desktop:~$ nvidia-smi
Tue Mar 21 22:43:15 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 528.49       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A200...   On  | 00000000:F3:00.0 Off |                  N/A |
| N/A   61C    P8     4W /  17W |     40MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A        32      G   /Xwayland                         N/A    |
|    0   N/A  N/A        34      G   /Xwayland                         N/A    |
|    0   N/A  N/A     13020      G   /Xwayland                         N/A    |
+-----------------------------------------------------------------------------+
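As far as I understand the WSL 2 GPU setup, the user-mode driver libraries backing the nvidia-smi output above (libcuda.so, libnvidia-ml.so, and the nvidia-smi binary itself) are mounted into the distro under /usr/lib/wsl/lib, which can be checked with:

ls -l /usr/lib/wsl/lib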
For what it's worth, running nvidia-smi.exe from a PowerShell terminal produces a similar but not identical result; the NVIDIA-SMI version shows as 528.49 on Windows instead of 525.89.02 as seen above in Ubuntu.
Running the container without --gpus produces the expected result right away, i.e., a working container without GPU functionality:
user@desktop:~$ docker run -it nvidia/cuda:12.0.1-cudnn8-runtime-ubuntu22.04
==========
== CUDA ==
==========
CUDA Version 12.0.1
Container image Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use the NVIDIA Container Toolkit to start this container with GPU support; see
https://docs.nvidia.com/datacenter/cloud-native/ .
root@80a1fd519f3a:/#
Multiple attempts to run the container with --gpus 0, --gpus 1, or --gpus all produced no output within one hour, after which I closed the terminal window; Ctrl+C did not stop execution.
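For concreteness, a representative invocation that hangs like this is the usual GPU smoke test against the same image (--rm added here only for cleanup):

docker run --rm --gpus all nvidia/cuda:12.0.1-cudnn8-runtime-ubuntu22.04 nvidia-smi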
The outcomes above were also observed with Ubuntu 20.04 and with other variants of the CUDA image, such as nvidia/cuda:11.6.0-devel-ubuntu20.04 and nvidia/cuda:12.1.0-ubuntu22.04. I also tried breaking the run of the container into separate docker create and docker start steps (sketched below); the issues described still occur at the start step.
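The split looked roughly like this (the container name cuda-test is only illustrative); it is the docker start command that hangs:

docker create -it --gpus all --name cuda-test nvidia/cuda:12.0.1-cudnn8-runtime-ubuntu22.04
docker start -ai cuda-test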
I have benefited from answers to this question, in particular the last answer, from August 2, 2020. Related questions such as "pytorch cannot detect gpu in nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04 base image" refer to issues after the container starts, but I never get to that point.
It seems that Docker Desktop 4.17.1 is broken. CUDA containers worked fine for me on 4.17.0 and earlier but, after upgrading to 4.17.1, the container start-up process just hangs.
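If it helps others confirm whether they are on the affected release, running the following from the Ubuntu terminal reports the Docker Desktop release in its Server section (in addition to the engine version), so the installed version can be checked without opening the GUI:

docker version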