I'm trying to debug what appears to be a completion queue issue:
Apr 14 18:39:15 ST2035 kernel: Call Trace:
Apr 14 18:39:15 ST2035 kernel: [<ffffffff8049b295>] schedule_timeout+0x1e/0xad
Apr 14 18:39:15 ST2035 kernel: [<ffffffff8049a81c>] wait_for_common+0xd5/0x13c
Apr 14 18:39:15 ST2035 kernel: [<ffffffffa01ca32b>]
ib_unregister_mad_agent+0x376/0x4c9 [ib_mad]
Apr 14 18:39:16 ST2035 kernel: [<ffffffffa03058f4>] ib_umad_close+0xbd/0xfd
Is it possible to turn those hex numbers into something close to line numbers?
Not exactly but if you have a vmlinux image built with debugging info, (e.g., in RHEL, you should be able to install the kernel-debug or kernel-dbg or something like that) you can get close. So assuming you have that vmlinux file available. Do the following:
objdump -S vmlinux
This will try it's hardest to match the object code to individual lines of source code.
e.g. for the following C code:
#include <stdio.h>
main() {
int a = 1;
int b = 2;
// This is a comment
printf("This is the print line %d\n", b);
}
compiled with : cc -g test.c
and then running objdump -S on the resulting executable, I get a large output describing the various parts of the executable inclding the following section:
00000000004004cc <main>:
#include <stdio.h>
main() {
4004cc: 55 push %rbp
4004cd: 48 89 e5 mov %rsp,%rbp
4004d0: 48 83 ec 20 sub $0x20,%rsp
int a = 1;
4004d4: c7 45 f8 01 00 00 00 movl $0x1,-0x8(%rbp)
int b = 2;
4004db: c7 45 fc 02 00 00 00 movl $0x2,-0x4(%rbp)
// This is a comment
printf("This is the print line %d\n", b);
4004e2: 8b 75 fc mov -0x4(%rbp),%esi
4004e5: bf ec 05 40 00 mov $0x4005ec,%edi
4004ea: b8 00 00 00 00 mov $0x0,%eax
4004ef: e8 cc fe ff ff callq 4003c0 <printf@plt>
}
You can match the addresses of the object code in the first column against the addresses in your stack trace. Combine that with the line number info interleaved in the assembly output... and you're there.
Now keep in mind that this will not always succeed 100% because the kenrel is normally compiled at -O2 optimization level and the compiler would have done a lot of code re-ordering etc. But if you are familiar with the code that you're trying to debug and have some comfort with deciphering the assembly of the platform you're working on... you should be able to pin down most of your crashes etc.