I am compiling the following fragment of code with nvcc -g -G gdbfail.cu:
#include <cstdio>
#include <cinttypes>
__global__ void mykernel() {
    uint8_t* ptr = (uint8_t*) malloc(8);
    for (int i = 0; i < 8; i++) {
        ptr[i] = 7 - i;
    }
    for (int i = 0; i < 8; i++) { // PUT BREAKPOINT HERE
        printf("%" PRIx8 " ", ptr[i]);
    }
    printf("\n");
}
int main() {
    uint8_t* ptr = (uint8_t*) malloc(8);
    for (int i = 0; i < 8; i++) {
        ptr[i] = 7 - i;
    }
    for (int i = 0; i < 8; i++) { // PUT BREAKPOINT HERE
        printf("%" PRIx8 " ", ptr[i]);
    }
    printf("\n");
    mykernel<<<1,1>>>();
    cudaDeviceSynchronize();
}
When I run cuda-gdb ./a.out, put a breakpoint at line 10 (b 10), run the code (r), and try to print the values at the address stored in ptr, I get surprising results:
(cuda-gdb) x/8b ptr
0x7fffcddff920: 7 6 5 4 3 2 1 0
(cuda-gdb) x/8b 0x7fffcddff920
0x7fffcddff920: 0 0 0 0 0 0 0 0
When I do the same thing in the host code (b 23, r), I get the expected results:
(cuda-gdb) x/8b ptr
0x5555556000a0: 7 6 5 4 3 2 1 0
(cuda-gdb) x/8b 0x5555556000a0
0x5555556000a0: 7 6 5 4 3 2 1 0
Why doesn't cuda-gdb show the correct memory values when it is given the address as a number (0x7fffcddff920) instead of a symbol (ptr)?
Evidently, not all gdb command features that are usable in host code are also usable in device code. When used in device code, the supported commands may have different syntax or expectations. This is indicated in the cuda-gdb docs.
Those docs indicate that the way to inspect memory is the print command, and they describe some additional decode syntax that is needed for a "bare" address/pointer. Here is your example:
$ cuda-gdb ./t1869
NVIDIA (R) CUDA Debugger
11.4 release
Portions Copyright (C) 2007-2021 NVIDIA Corporation
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./t1869...
(cuda-gdb) b 10
Breakpoint 1 at 0x403b05: file t1869.cu, line 14.
(cuda-gdb) r
Starting program: /home/user2/misc/t1869
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
7 6 5 4 3 2 1 0
[Detaching after fork from child process 25822]
[New Thread 0x7fffef475700 (LWP 25829)]
[New Thread 0x7fffeec74700 (LWP 25830)]
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
Thread 1 "t1869" hit Breakpoint 1, mykernel<<<(1,1,1),(1,1,1)>>> () at t1869.cu:10
10 for (int i = 0; i < 8; i++) { // PUT BREAKPOINT HERE
(cuda-gdb) x/8b ptr
0x7fffbcdff920: 7 6 5 4 3 2 1 0
(cuda-gdb) p/x *(@global unsigned char *)0x7fffbcdff920@8
$1 = {0x7, 0x6, 0x5, 0x4, 0x3, 0x2, 0x1, 0x0}
(cuda-gdb)
Note that the above print command needs some help in interpreting which "space" you expect the memory address to refer to (e.g. @shared, @global, etc.).
If we give your command the same "help", we get the expected result:
(cuda-gdb) x/8b ptr
0x7fffbcdff920: 7 6 5 4 3 2 1 0
(cuda-gdb) x/8b (@global unsigned char *)0x7fffbcdff920
0x7fffbcdff920: 7 6 5 4 3 2 1 0
(cuda-gdb)
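To make the "space" idea more concrete, here is a minimal sketch that is not part of your original code. It keeps one buffer on the device heap (global space) and one in shared memory, so the same cast trick can be tried with both @global and @shared. The names spaces_demo and run_spaces_demo are hypothetical, and the addresses in the comments are placeholders that you would replace with whatever cuda-gdb reports on your machine.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstdint>

// Hypothetical demo kernel: one buffer in the global space (device heap),
// one buffer in the shared space.
__global__ void spaces_demo(int* ok) {
    __shared__ uint8_t tile[8];             // shared space
    uint8_t* heap = (uint8_t*) malloc(8);   // device heap, global space
    if (heap == nullptr) { *ok = 0; return; }
    for (int i = 0; i < 8; i++) {
        heap[i] = 7 - i;
        tile[i] = i;
    }
    // Put a breakpoint on the next loop. In cuda-gdb, read the real
    // addresses with "p heap" and "p &tile", then try the bare numbers
    // with an explicit space (placeholders shown in angle brackets):
    //   p/x *(@global unsigned char *)<heap address>@8
    //   p/x *(@shared unsigned char *)<tile address>@8
    int sum = 0;
    for (int i = 0; i < 8; i++) {
        sum += heap[i] + tile[i];
    }
    *ok = (sum == 56);                      // 28 from heap + 28 from tile
    free(heap);
}

// Launches the kernel once and reports whether it ran and saw the expected bytes.
bool run_spaces_demo() {
    int* d_ok = nullptr;
    int h_ok = 0;
    if (cudaMalloc(&d_ok, sizeof(int)) != cudaSuccess) return false;
    cudaMemset(d_ok, 0, sizeof(int));
    spaces_demo<<<1, 1>>>(d_ok);
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess) {
        err = cudaMemcpy(&h_ok, d_ok, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_ok);
    return err == cudaSuccess && h_ok == 1;
}

int main() {
    bool ok = run_spaces_demo();
    printf("%s\n", ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

Compile it the same way as your example (nvcc -g -G) so that the device-side variables stay visible to the debugger. The point of having two buffers is only to show that the cast has to name the space the number belongs to; the symbol-based forms (x/8b heap, p tile) already carry that information through the variable's type.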