python ubuntu theano theano-cuda

WSL, Theano - nvcc compiler not found on $PATH


Before this question gets marked as a duplicate - I have tried all of the existing solutions available on SO and some other websites, but none of them are working.

I'm getting an error using Theano (Python 2.7.18, Theano 0.6.0, WSL2 Ubuntu 22, GeForce RTX 3070 Ti Laptop): ERROR (theano.sandbox.cuda): nvcc compiler not found on $PATH. Check your nvcc installation and try again. I looked it up and found a lot of existing SO answers, but none of them worked. I have CUDA installed in so many different ways that I don't really know what I'm doing.

I know my configuration is outdated, but I would like to try to make it work on this configuration, and I know it's been done before. I'm just cloning someone's git repository and trying to run it with their exact versions.

$PATH:

/usr/local/cuda/bin:/usr/local/cuda/bin:/home/william/anaconda3/envs/kg/bin:/home/william/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/libnvvp:/mnt/c/Windows/system32:/mnt/c/Windows:/mnt/c/Windows/System32/Wbem:/mnt/c/Windows/System32/WindowsPowerShell/v1.0:/mnt/c/Windows/System32/OpenSSH:/mnt/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/mnt/c/Program Files/NVIDIA Corporation/NVIDIA NvDLISR:/mnt/c/Program Files/Git/cmd:/mnt/c/Program Files/NVIDIA Corporation/Nsight Compute 2023.2.0:/mnt/c/Users/willi/AppData/Local/Programs/Python/Python310/Scripts:/mnt/c/Users/willi/AppData/Local/Programs/Python/Python310:/mnt/c/Users/willi/AppData/Local/Programs/Python/Python311/Scripts:/mnt/c/Users/willi/AppData/Local/Programs/Python/Python311:/mnt/c/Users/willi/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/willi/miniconda3/Library/bin:/mnt/c/Users/willi/miniconda3/Scripts:/mnt/c/Users/willi/miniconda3:/mnt/c/Program Files/Graphviz/bin:/snap/bin

$LD_LIBRARY_PATH:

/usr/local/cuda/lib64

~/.theanorc:

[global]
floatX = float32
device = gpu0

[cuda]
root = /usr/local/cuda

I somehow have CUDA in /usr/lib/cuda, /usr/lib/nvidia-cuda-toolkit, /usr/local/cuda, /usr/local/cuda-12, and /usr/local/cuda-12.2. Some of them only have a few files in their bin directory, and some of them have a lot more.

ls /usr/local/cuda/bin shows a lot of stuff, including nvcc. /usr/local/cuda-12/bin and /usr/local/cuda-12.2/bin both have the same files as /usr/local/cuda/bin. /usr/lib/cuda/bin has nothing in it. /usr/lib/nvidia-cuda-toolkit/bin has g++ and gcc, which aren't in the above CUDA installations, and has nvcc as well.

I also have an nvcc installation in /usr/bin.
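With this many copies around, it helps to see which nvcc binaries are actually reachable through $PATH and in what order. A minimal sketch, assuming a plain directory-by-directory lookup; find_all_on_path is a name I made up, and it is written to run on both Python 2.7 and Python 3:

```python
import os


def find_all_on_path(name, path_string):
    """Return every executable called `name` along `path_string`, in lookup order.

    Hypothetical helper for illustration; duplicate and empty PATH
    entries are skipped.
    """
    matches = []
    seen = set()
    for directory in path_string.split(os.pathsep):
        if not directory or directory in seen:
            continue
        seen.add(directory)
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    for hit in find_all_on_path("nvcc", os.environ.get("PATH", "")):
        # realpath resolves symlinks such as /usr/local/cuda
        print("%s -> %s" % (hit, os.path.realpath(hit)))
```

The first line printed is the nvcc a shell would pick up; the rest are shadowed copies.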

List of stuff I tried:

I am on WSL2 running Ubuntu. Using the same installation on Windows, running import theano in a Python console imports successfully with no nvcc error.

One other odd thing I noticed is that running lspci doesn't show any NVIDIA devices. I think this is a red flag, but I have no idea how to fix it. I have tried reinstalling my GPU drivers and following the steps here. The sample application in step 4 works.

Using the latest versions of Theano and Python, and downgrading numpy to work with them, I get the error pygpu.gpuarray.GpuArrayException: b'cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected'. Although I'm not trying to get the most recent version working, I think fixing this error might help with the other one.

How can I fix this?


Solution

  • I ended up changing "gpu0" in ~/.theanorc to "cuda0", which I think makes Theano use a different backend for the GPU.
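
The same change can be scripted instead of edited by hand. A rough sketch, assuming the file has the [global] section shown in the question; set_theano_device is a made-up helper name, and rewriting the file through the config parser drops any comments it contained:

```python
import os

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7


def set_theano_device(path, device):
    """Set `device` in the [global] section of a .theanorc-style file.

    Hypothetical helper for illustration; comments in the file are lost.
    """
    parser = configparser.RawConfigParser()
    parser.read(path)
    if not parser.has_section("global"):
        parser.add_section("global")
    parser.set("global", "device", device)
    with open(path, "w") as handle:
        parser.write(handle)


if __name__ == "__main__":
    rc_path = os.path.expanduser("~/.theanorc")
    if os.path.isfile(rc_path):
        set_theano_device(rc_path, "cuda0")
```

Back up ~/.theanorc before running it, since the file is overwritten in place.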