c++ cmake libtorch

Error when using CMake to compile CUDA project: "tuple ... error: wrong number of template arguments"


Project Structure

The structure is listed below. _ext is a library used by the PointConv project (the root project) and is divided into include and src directories. The libtorch directory is the PyTorch library for C++.

CMakeLists.txt

| (root)
| main.cpp
| CMakeLists.txt
----| (_ext)
----| CMakeLists.txt
--------| (src)
--------| group_points_gpu.cu
--------| group_points.cpp
--------| CMakeLists.txt
--------| (include)
--------| group_points.h
----| (libtorch)
----| ...(omitted)

Here's the CMakeLists.txt in root dir.

############ CMakeLists.txt in root dir ############
cmake_minimum_required (VERSION 3.23.0)
set(CMAKE_C_COMPILER "/usr/bin/gcc-7")    # compiler
set(CMAKE_CXX_COMPILER "/usr/bin/g++-7")
set(CMAKE_CXX_STANDARD 11)                # standard
project (PointConv LANGUAGES CXX CUDA)

add_executable(PointConv main.cpp)

target_compile_features(PointConv PUBLIC cxx_std_11)
set_target_properties(PointConv PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

# include
include_directories("libtorch/include/")
include_directories("libtorch/include/torch/csrc/api/include/")
include_directories("/data_HDD/zhuxingyu/anaconda3/envs/p11/include/python3.8/")
include_directories("${PROJECT_SOURCE_DIR}/_ext/include")
# subdir
add_subdirectory(_ext)
target_link_libraries(PointConv _ext)

file(GLOB LIBTORCH_LIBS "libtorch/lib/*.a" "libtorch/lib/*.so")
target_link_libraries(PointConv ${LIBTORCH_LIBS})
############ CMakeLists.txt in root/_ext/src dir ############
project(_ext CXX CUDA)

set(CUDA_TOOLKIT_ROOT_DIR /usr/local/cuda)
find_package(CUDA 11.4 REQUIRED)

file(GLOB CUDA_FILES "*.cu")
file(GLOB CXX_FILES "*.cpp")
# CMake will not compile .cu files without this statement
add_library(_ext ${CUDA_FILES} ${CXX_FILES})

target_compile_features(_ext PUBLIC cxx_std_11)
set_target_properties(_ext PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

C++/CUDA files

### main.cpp ###
#include<stdio.h>
#include "group_points.h"

int main(){
    printf("1\n");
    return 0;
}
### group_points.h ###
#pragma once
#include <torch/extension.h>

at::Tensor group_points(at::Tensor points, at::Tensor idx);
at::Tensor group_points_grad(at::Tensor grad_out, at::Tensor idx, const int n);

Error

The CMake configure step works fine, but I failed to compile main.cpp. In summary, this might be a compatibility issue between g++ and PyTorch. Here's part of the error log:

/data_HDD/zhuxingyu/vscode/meshwatermark/model/pointconv/libtorch/include/ATen/ExpandUtils.h:173:169:   required from here
/usr/include/c++/6/tuple:495:244: error: wrong number of template arguments (4, should be 2)
       return  __and_<__not_<is_same<tuple<_Elements...>,

The rest of the errors follow the same pattern: xxx: error: wrong number of template arguments (y, should be z).

Analysis

As I said above, this might be a compatibility issue. I'm using libtorch 1.12.0 with CUDA 11.3. I have both g++ 6.5 and g++ 7.5 installed, but the build always includes /usr/include/c++/6/tuple. I don't know how to solve this problem without downgrading PyTorch and CUDA.
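One way to confirm which compilers CMake actually selected is to print them right after the project() call. This is a minimal diagnostic sketch using standard CMake variables; CMAKE_CUDA_HOST_COMPILER may print as empty if it was never set explicitly, in which case nvcc chooses its own host compiler.

```cmake
# Diagnostic sketch: place these lines after project(... LANGUAGES CXX CUDA).
# They only print information at configure time and do not change the build.
message(STATUS "C++ compiler:       ${CMAKE_CXX_COMPILER} (${CMAKE_CXX_COMPILER_VERSION})")
message(STATUS "CUDA compiler:      ${CMAKE_CUDA_COMPILER} (${CMAKE_CUDA_COMPILER_VERSION})")
message(STATUS "CUDA host compiler: ${CMAKE_CUDA_HOST_COMPILER}")
```

If the reported CUDA compiler version does not match the toolkit you expect, the mismatch is in compiler selection rather than in the source files.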


Solution

  • It turned out to be an issue with NVCC versions older than 9.2. It was fixed once NVCC 11.4 was used.

    There are two CUDA versions (9.1 and 11.4) on my system. find_package(CUDA 11.4 REQUIRED) does not make CMake use the nvcc 11.4 compiler. Instead, CMake uses nvcc 9.1, which causes the issue.

    Tell CMake the path of the preferred nvcc compiler before the project() statement:

    set(CMAKE_CUDA_COMPILER /usr/local/cuda/bin/nvcc) # path to nvcc 11.4
    set(CUDA_TOOLKIT_ROOT_DIR /usr/local/cuda/)
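To make the ordering concrete, here is a hedged sketch of how the top of the root CMakeLists.txt might look with the fix applied. It assumes /usr/local/cuda points at the 11.4 toolkit, as in the answer; the version guard is an illustrative addition, not part of the original fix.

```cmake
cmake_minimum_required(VERSION 3.23.0)

# Compiler choices must be set before project(), because project()
# is where CMake enables the CXX and CUDA languages and detects compilers.
set(CMAKE_C_COMPILER    "/usr/bin/gcc-7")
set(CMAKE_CXX_COMPILER  "/usr/bin/g++-7")
set(CMAKE_CUDA_COMPILER "/usr/local/cuda/bin/nvcc")  # assumed to be nvcc 11.4

project(PointConv LANGUAGES CXX CUDA)

# Illustrative guard (an addition): stop at configure time if an old nvcc
# was picked up anyway, instead of failing later with template errors.
if(CMAKE_CUDA_COMPILER_VERSION VERSION_LESS "9.2")
  message(FATAL_ERROR
    "nvcc ${CMAKE_CUDA_COMPILER_VERSION} found at ${CMAKE_CUDA_COMPILER}; expected 9.2 or newer")
endif()
```

The same choice can be made without editing the file, by passing -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc on the cmake command line or by setting the CUDACXX environment variable before the first configure. In either case, start from a clean build directory, since the detected compiler is cached.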