I am developing a fairly big project (around 1200 kernels so far). One kernel possibly has a memory race, which is why it gives different answers on every run. I want to find it by running cuda-memcheck on that specific kernel, so naturally I am trying to use the --filter option of cuda-memcheck together with the --tool racecheck option. The codebase is big, and running cuda-memcheck on all kernels, especially with racecheck enabled, would take an eternity.
The official documentation says to use key-value pairs of the form {key1=val1}[{,key2=val2}].
I am not really sure what exactly this means, and everything I have tried resulted in an invalid-options message. I could not find any example online or in the NVIDIA cuda-samples provided with the toolkit.
So far, I have tried these (and probably all combinations of these):
cuda-memcheck --filter <kernel_name>,kns <Executable>
cuda-memcheck --filter key1=<kernel_name>, key2=kns <Executable>
cuda-memcheck --filter key1='<kernel_name>', key2='kns' <Executable>
cuda-memcheck --filter <kernel_name>,[kns] <Executable>
I am not sure exactly how to interpret the documentation. An example would be great. Thanks.
Note: I can use cuda-memcheck with other options, and my executable is compiled correctly with flags like -Xcompiler, -lineinfo, etc.
As answered by @paleonix in the comments, the correct format is:
cuda-memcheck --filter kns=<kernel_name_substring> <Executable>
or
cuda-memcheck --filter kne=<kernel_name> <Executable>
The key difference between these two options (according to the NVIDIA documentation) is:
kns:
User specifies a substring in the mangled kernel name.
kne:
User specifies the complete mangled kernel name.
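To make the distinction concrete, here is a small sketch that models the documented matching rules in plain shell: kne as an exact comparison against the mangled name, kns as a substring test. It illustrates the semantics only and is not how cuda-memcheck is implemented. The kernel sumReduce(float*, float*, int) and its mangled name are hypothetical.

```shell
#!/bin/sh
# Hypothetical mangled name for a kernel `sumReduce(float*, float*, int)`.
MANGLED="_Z9sumReducePfS_i"

# kne: the filter value must equal the complete mangled name.
matches_kne() { [ "$1" = "$2" ]; }

# kns: the filter value only needs to appear somewhere in the mangled name.
matches_kns() {
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Try the plain source-level name and the full mangled name with each rule.
for value in "sumReduce" "$MANGLED"; do
    if matches_kne "$MANGLED" "$value"; then
        echo "kne=$value selects the kernel"
    else
        echo "kne=$value does not select the kernel"
    fi
    if matches_kns "$MANGLED" "$value"; then
        echo "kns=$value selects the kernel"
    else
        echo "kns=$value does not select the kernel"
    fi
done
```

In practice this means kns with the plain source-level function name is usually the convenient choice, while kne is useful when several kernels share a substring and only one exact instantiation should be checked.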
Note: starting with CUDA 11.0, the cuda-memcheck tool is deprecated and replaced by compute-sanitizer, which works as a drop-in replacement.
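For the original goal of race-checking a single kernel, the filter can be combined with the racecheck tool from the question. The script below is a minimal sketch: the kernel substring and executable path are hypothetical placeholders, and it only runs the tool if it is actually installed. Whether compute-sanitizer accepts the same --filter syntax on a given toolkit version is an assumption worth checking against its --help output before changing TOOL.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own values.
TOOL="cuda-memcheck"
KERNEL_SUBSTRING="sumReduce"   # substring of the mangled kernel name
EXECUTABLE="./my_app"          # binary built with -lineinfo

# racecheck restricted to kernels whose mangled name contains the substring.
CMD="$TOOL --tool racecheck --filter kns=$KERNEL_SUBSTRING $EXECUTABLE"

# Always show the assembled command so it can be inspected or copied.
echo "$CMD"

# Run it only when both the tool and the executable are present.
if command -v "$TOOL" >/dev/null 2>&1 && [ -x "$EXECUTABLE" ]; then
    $CMD
fi
```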