I'm trying to profile an OpenCL application, a.out, on a system with an NVIDIA TITAN X and CUDA 8.0.
If it were a CUDA application, nvprof ./a.out would be enough. But I found this does not work with an OpenCL application: it prints the message "No kernels were profiled."
Until CUDA 7.5, I successfully used COMPUTE_PROFILE=1 following this. Unfortunately, the documentation says "The support for command-line profiler using the environment variable COMPUTE_PROFILE has been dropped in the CUDA 8.0 release."
The question is: is there any way, other than downgrading CUDA, to profile an OpenCL application with nvprof?
To the best of my knowledge, nvprof has never supported OpenCL profiling.
Running code with COMPUTE_PROFILE=1 invokes a driver-based profiling mechanism which predates the introduction of nvprof. That driver-based mechanism was deprecated a while ago and has now been removed as of CUDA 8 in favour of nvprof.
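For reference, the legacy driver-based mechanism was driven entirely by environment variables, so it could be wrapped in a small shell helper. The following is only a minimal sketch, assuming a CUDA 7.5 or earlier installation; per the documentation quoted in the question, it is not expected to produce a profile on CUDA 8.0. The helper name profile_legacy and the log file name are hypothetical; COMPUTE_PROFILE and COMPUTE_PROFILE_LOG are the variables of the old command-line profiler.

```shell
#!/bin/sh
# Sketch: run a command under the legacy driver-based profiler.
# Assumption: CUDA 7.5 or earlier; support was dropped in CUDA 8.0.
# profile_legacy is a hypothetical helper name, not an NVIDIA tool.
profile_legacy() {
    log_file="$1"   # file the driver should write the profile log to
    shift           # remaining arguments form the command to run
    # The variables are set only in the environment of the profiled command.
    COMPUTE_PROFILE=1 COMPUTE_PROFILE_LOG="$log_file" "$@"
}

# Usage (a.out is the OpenCL binary from the question):
#   profile_legacy opencl_profile.log ./a.out
```

Setting the variables as a command prefix keeps them out of the calling shell, so later commands run unprofiled.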
As a result, it would appear that there is no way to profile OpenCL code running on NVIDIA hardware using the CUDA toolkit.