I have a strange problem in my GitHub workflow that only started occurring today. These are the relevant commands:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
pip3 install mmengine==0.6.0 mmcv==2.0.0rc3 mmdet==3.0.0rc5 mmaction2==1.0rc3
The former succeeds. The latter stops with the following error:
Collecting mmengine==0.6.0
Using cached mmengine-0.6.0-py3-none-any.whl (360 kB)
Collecting mmcv==2.0.0rc3
Using cached mmcv-2.0.0rc3.tar.gz (424 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [18 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-uml22xq3/mmcv_89a43e000b91495e88399ffe3c493514/setup.py", line 329, in <module>
ext_modules=get_extensions(),
^^^^^^^^^^^^^^^^
File "/tmp/pip-install-uml22xq3/mmcv_89a43e000b91495e88399ffe3c493514/setup.py", line 290, in get_extensions
ext_ops = extension(
^^^^^^^^^^
File "/home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/envs/heavi-analytic/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1048, in CUDAExtension
library_dirs += library_paths(cuda=True)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/envs/heavi-analytic/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1179, in library_paths
if (not os.path.exists(_join_cuda_home(lib_dir)) and
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/envs/heavi-analytic/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2223, in _join_cuda_home
raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Any idea?
UPDATE 1: It turns out that the installed PyTorch version is 2.0.0, which is not the version I want.
Torch 2 was released yesterday, March 15, and because the first command does not pin a version, the continuous build automatically picked up the latest release.
Pinning the torch version fixes everything:
pip3 install torch==1.13.1+cu117 torchvision==0.14.1+cu117 \
torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
This installs torch 1.13.1 built for CUDA 11.7:
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu117
Collecting torch==1.13.1+cu117
Using cached https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp310-cp310-linux_x86_64.whl (1801.8 MB)
Collecting torchvision==0.14.1+cu117
Using cached https://download.pytorch.org/whl/cu117/torchvision-0.14.1%2Bcu117-cp310-cp310-linux_x86_64.whl (24.3 MB)
Collecting torchaudio==0.13.1
Using cached https://download.pytorch.org/whl/cu117/torchaudio-0.13.1%2Bcu117-cp310-cp310-linux_x86_64.whl (4.2 MB)
Collecting typing-extensions
Using cached typing_extensions-4.5.0-py3-none-any.whl (27 kB)
Collecting pillow!=8.3.*,>=5.3.0
Using cached Pillow-9.4.0-cp310-cp310-manylinux_2_28_x86_64.whl (3.4 MB)
Requirement already satisfied: requests in /home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/lib/python3.10/site-packages (from torchvision==0.14.1+cu117) (2.28.1)
Collecting numpy
Using cached numpy-1.24.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
Requirement already satisfied: certifi>=2017.4.17 in /home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/lib/python3.10/site-packages (from requests->torchvision==0.14.1+cu117) (2022.12.7)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/lib/python3.10/site-packages (from requests->torchvision==0.14.1+cu117) (1.26.13)
Requirement already satisfied: charset-normalizer<3,>=2 in /home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/lib/python3.10/site-packages (from requests->torchvision==0.14.1+cu117) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in /home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/lib/python3.10/site-packages (from requests->torchvision==0.14.1+cu117) (3.4)
Installing collected packages: typing-extensions, pillow, numpy, torch, torchvision, torchaudio
Successfully installed numpy-1.24.2 pillow-9.4.0 torch-1.13.1+cu117 torchaudio-0.13.1+cu117 torchvision-0.14.1+cu117 typing-extensions-4.5.0
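To catch this kind of silent upgrade earlier, the workflow could check the installed version right after the install step. This is only a minimal sketch, assuming a POSIX shell; `check_torch_version` is a hypothetical helper name, not something provided by pip or torch.

```shell
# Hypothetical helper: succeed only if the given version string is a 1.13.x build.
check_torch_version() {
  case "$1" in
    1.13.*) echo "ok: torch $1" ;;
    *) echo "unexpected torch version: $1" >&2; return 1 ;;
  esac
}

# Intended use in the workflow, after the pip3 install step:
#   check_torch_version "$(python3 -c 'import torch; print(torch.__version__)')"
```

With the pinned install above, the version string passed in would be `1.13.1+cu117`, so the check passes. A `2.0.0` build would make the step fail before mmcv is installed.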
EDIT 1:
Sometimes the pip3 install does not succeed. In that case, use conda instead:
conda install pytorch==1.13.1 torchvision==0.14.1 \
torchaudio==0.13.1 cudatoolkit=11.7 pytorch-cuda=11.7 -c pytorch -c nvidia
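If the workflow should try pip first and only fall back to conda when pip fails, the two commands from this post can be chained. This is a rough sketch: the function names are made up for illustration, and `-y` is an addition so that conda does not wait for confirmation in a non-interactive build.

```shell
# Hypothetical wrappers around the two install commands from this post.
install_with_pip() {
  pip3 install torch==1.13.1+cu117 torchvision==0.14.1+cu117 \
    torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
}

install_with_conda() {
  conda install -y pytorch==1.13.1 torchvision==0.14.1 \
    torchaudio==0.13.1 cudatoolkit=11.7 pytorch-cuda=11.7 -c pytorch -c nvidia
}

# Try pip first; run conda only if pip exits with a non-zero status.
install_torch() {
  install_with_pip || install_with_conda
}
```

The workflow step would then just call `install_torch`. The step fails only if both installers fail.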