I've been trying to get Clang OpenMP GPU offloading working in the Docker image. The image itself builds fine with this command line:
~/llvm_project/llvm/utils/docker:$ bash build_docker_image.sh \
    --source nvidia-cuda \
    --docker-repository clang-cuda --docker-tag "latest" \
    -p clang -i stage2-install-clang -i stage2-install-clang-resource-headers \
    -- \
    -DLLVM_TARGETS_TO_BUILD="host;NVPTX" \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_ENABLE_RUNTIMES="openmp;offload" \
    -DLIBOMPTARGET_DEVICE_ARCHITECHTURES="sm_86" \
    -DBOOTSTRAP_CMAKE_BUILD_TYPE=Release \
    -DCLANG_ENABLE_BOOTSTRAP=ON \
    -DCLANG_BOOTSTRAP_TARGETS="install-clang;install-clang-resource-headers" \
    -DLLVM_ENABLE_PROJECTS="clang;clang-tools;lld;lldb;openmp"
But when I try to compile an example with clang++ -fopenmp -fopenmp-targets=nvptx64 -O3 run.cpp, I get:
clang++: error: cannot determine nvptx64 architecture: nvptx-arch: posix_spawn failed: No such file or directory; consider passing it via '--offload-arch'; environment variable CLANG_TOOLCHAIN_PROGRAM_TIMEOUT specifies the tool timeout (integer secs, <=0 is infinite)
When I pass --offload-arch="sm_86" as suggested, the error changes to:
clang++: error: no library 'libomptarget-nvptx.bc' found in the default clang lib directory or in LIBRARY_PATH; use '--libomptarget-nvptx-bc-path' to specify nvptx bitcode library
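Both messages suggest that something is missing from the installed toolchain rather than from the GPU or driver: the first says the nvptx-arch helper could not be spawned, the second that libomptarget-nvptx.bc is not in the clang lib directory. Below is a minimal sketch for checking this inside the container. The function names check_tool and find_device_bc are hypothetical, and the sketch assumes clang++ is on PATH and that the lib directory sits two levels above clang's resource directory.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting what the install actually contains.

# Report whether a tool is reachable on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# List OpenMP device bitcode libraries anywhere under a directory tree.
find_device_bc() {
    find "$1" -name 'libomptarget-*.bc' 2>/dev/null
}

check_tool clang++
check_tool nvptx-arch

if command -v clang++ >/dev/null 2>&1; then
    # -print-resource-dir prints something like <prefix>/lib/clang/<version>;
    # going up two levels gives <prefix>/lib (an assumption about the layout).
    find_device_bc "$(dirname "$(dirname "$(clang++ -print-resource-dir)")")"
fi
```

If nvptx-arch is reported missing and no .bc file is listed, the problem lies in what was built and installed, not in the container's GPU access.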
I tried both the current master branch and the 20.1.5 tag, on multiple machines and GPUs (sm_60, sm_75, sm_86), under Debian and NixOS (all x86_64).
The Docker container is run with the nvidia runtime (docker run -it --runtime=nvidia --gpus all -v $PWD:/app clang-cuda:latest /bin/bash), and nvidia-smi shows the correct result:
Mon Jun 2 09:39:09 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   42C    P8             14W /  150W |     593MiB /  16384MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
As per this discussion on the LLVM Forum, the solution is to build with -DLLVM_TARGETS_TO_BUILD="host;NVPTX;AMDGPU".
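Relative to the original command line, this changes only the target list. Here is a small sketch of adding a target to a semicolon-separated list without duplicating it. The add_target helper is hypothetical, and the printed flag would replace the existing -DLLVM_TARGETS_TO_BUILD argument passed after `--` to build_docker_image.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a target to a semicolon-separated list
# unless the list already contains it.
add_target() {
    local list="$1" target="$2"
    case ";$list;" in
        *";$target;"*) echo "$list" ;;          # already present, keep as-is
        *)             echo "$list;$target" ;;  # append the new target
    esac
}

targets="$(add_target "host;NVPTX" "AMDGPU")"
echo "-DLLVM_TARGETS_TO_BUILD=\"$targets\""
```

The quotes around the value matter on the real command line, since an unquoted semicolon would be treated by the shell as a command separator.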
The Dockerfile can be found on GitHub.