[SOLVED] why does GDB not tab-complete mmx register name(mm0-mm7)

In GDB I type info registers and press Tab to list all the registers, but the MMX registers are not among the completions.

[Screenshot: GDB i r <tab> completion list, without the MMX registers mm0-mm7]

My CPU is a Xeon Platinum 8163, a modern Xeon that supports SSE and MMX, so I think this is a GDB problem (if I am right).

Why doesn't GDB support showing the MMX registers? They should be just as important as the basic registers and the SSE registers.

Solution

The MMX registers don't have their own separate architectural state; they alias the x87 registers st0..st7. (Intel did this so OSes wouldn't need special support to save/restore the MMX state on context switch: the existing x87 save/restore, later FXSAVE/FXRSTOR, covers it.) That makes them different from all the other registers.
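A minimal sketch of what that aliasing means at the bit level, assuming the behavior Intel documents for MMX writes: the 64-bit value lands in the low 64 bits (the significand field) of the aliased 80-bit x87 register, and the upper 16 bits (sign + exponent) are set to all ones. The function name is made up for illustration; this models the register contents, it doesn't read real hardware.

```python
MASK64 = (1 << 64) - 1

def x87_raw_after_mmx_write(mm_value):
    """Model the 80-bit pattern of the aliased x87 register after an MMX write.

    Assumption: bits 0..63 receive the MMX value and bits 64..79
    (sign + exponent) are forced to all ones.
    """
    return (0xFFFF << 64) | (mm_value & MASK64)

# movd mm0, eax with eax = 231 (0xe7) zero-extends into the 64-bit register
print(f"{x87_raw_after_mmx_write(231):#x}")
```

The all-ones exponent is why a debugger that only knows about the x87 view reports such a register as an invalid float instead of a number.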

But I think this is a GDB bug, not an intentional decision to not expose the MMX state except via the x87 state. info reg mmx tab-completes but prints nothing. (GDB 10.1 on x86-64 Arch GNU/Linux)

Even when running a program with the FPU in MMX state (after executing movd mm0, eax for example), it still doesn't tab complete. In fact, even p $mm0 just prints void (because that GDB variable name isn't recognized as being tied to an MMX register).

You can see the MMX state via i r float

e.g. after mov eax, 231 / movd mm0, eax,

 starti
 stepi
 si

(gdb) p $mm0
$1 = void
(gdb) i r mm0
Invalid register `mm0'
(gdb) i r mmx
(gdb) i r float
st0            <invalid float value> (raw 0xffff00000000000000e7)
st1            0                   (raw 0x00000000000000000000)
...

After another single step, of pshufw mm1, mm0, 0

(gdb) si
0x000000000040100c in ?? ()
(gdb) i r float
st0            <invalid float value> (raw 0xffff00000000000000e7)
st1            <invalid float value> (raw 0xffff00e700e700e700e7)
st2            0                   (raw 0x00000000000000000000)

So if you ignore the high 16 bits of the 80-bit extended precision bit-pattern, you can look at the 64-bit mantissa part as the MMX register value.
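As a rough illustration of that decoding step, here is a small Python sketch that takes the raw value GDB printed for st1 above and masks off the top 16 bits. The helper names are hypothetical, not part of GDB or its Python API; you would paste the raw value in by hand.

```python
MASK64 = (1 << 64) - 1

def mmx_from_x87_raw(raw80):
    """Keep the low 64 bits (significand field) of an 80-bit x87 raw value."""
    return raw80 & MASK64

def mmx_words(mm_value):
    """Split a 64-bit MMX value into four 16-bit words, lowest word first."""
    return [(mm_value >> (16 * i)) & 0xFFFF for i in range(4)]

raw_st1 = 0xffff00e700e700e700e7    # raw value of st1 from i r float
mm1 = mmx_from_x87_raw(raw_st1)
print(f"{mm1:#018x}")               # 0x00e700e700e700e7
print(mmx_words(mm1))               # [231, 231, 231, 231]
```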

I assume this has gone unfixed for so long because SSE2 makes MMX mostly obsolete, providing more and wider registers and not needing a slow emms to leave MMX state before a potential x87 FPU instruction like fld. (And on modern CPUs like Skylake, MMX instructions don't have mov-elimination, and some, like paddd, run on fewer execution ports than their SSE2 equivalents.)

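Of course, some existing code, notably x264 and FFmpeg's H.264 software decoder, still uses hand-written MMX asm instead of the low qword of an XMM register. This is sometimes advantageous, e.g. to allow punpcklbw mm0, [rdi]: the MMX form has no alignment requirement on its memory operand, while the legacy-SSE XMM form needs a 16-byte-aligned one.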

BTW, the test program I single-stepped was assembled + linked from this NASM source into a static executable:

mov    eax, 231         ; __NR_exit_group = 0xe7
movd   mm0, eax
pshufw mm1, mm0, 0      ; broadcast the low word
emms
nop

syscall
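To check the values seen in the GDB session against the source, here is a small Python model of what pshufw does, written from the instruction's documented semantics: each 16-bit destination word i is picked from the source word selected by bits 2i+1:2i of the immediate. It's an illustrative emulation only, and the function name simply mirrors the mnemonic.

```python
def pshufw(src, imm8):
    """Emulate pshufw on a 64-bit value: shuffle four 16-bit words by imm8."""
    words = [(src >> (16 * i)) & 0xFFFF for i in range(4)]
    result = 0
    for i in range(4):
        selector = (imm8 >> (2 * i)) & 0b11   # 2-bit source index for word i
        result |= words[selector] << (16 * i)
    return result

mm0 = 231                     # after movd mm0, eax
mm1 = pshufw(mm0, 0)          # imm8 = 0: every word copies source word 0
print(f"{mm1:#018x}")         # 0x00e700e700e700e7
```

That matches the low 64 bits of the raw st1 value GDB showed after the pshufw step.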