I use gdb's info registers <tab> to see all the registers, but I don't see the MMX registers.
My CPU is a Xeon Platinum 8163, a modern Xeon CPU that supports SSE and MMX, so I think it's a GDB problem (if I am right).
Why doesn't GDB support showing the MMX registers, when they should be just as important as the basic registers and the SSE registers?
The MMX registers don't have their own separate architectural state; they alias the x87 registers st0..st7. (Intel did this so OSes wouldn't need special support to save/restore the MMX state on context switch: the existing x87 save/restore instructions, FSAVE/FRSTOR and later FXSAVE/FXRSTOR, already cover it.) That's different from all the other registers.
But I think this is a GDB bug, not an intentional decision to not expose the MMX state except via the x87 state. info reg mmx tab-completes but prints nothing. (GDB 10.1 on x86-64 Arch GNU/Linux)
Even when running a program with the FPU in MMX state (after executing movd mm0, eax for example), it still doesn't tab complete. In fact, even p $mm0 just prints void (because that GDB variable name isn't recognized as being tied to an MMX register).
You can see the MMX state via i r float, e.g. after mov eax, 231 / movd mm0, eax:
starti
stepi
si
(gdb) p $mm0
$1 = void
(gdb) i r mm0
Invalid register `mm0'
(gdb) i r mmx
(gdb) i r float
st0 <invalid float value> (raw 0xffff00000000000000e7)
st1 0 (raw 0x00000000000000000000)
...
After another single step, which executes pshufw mm1, mm0, 0:
(gdb) si
0x000000000040100c in ?? ()
(gdb) i r float
st0 <invalid float value> (raw 0xffff00000000000000e7)
st1 <invalid float value> (raw 0xffff00e700e700e700e7)
st2 0 (raw 0x00000000000000000000)
So if you ignore the high 16 bits of the 80-bit extended precision bit-pattern, you can look at the 64-bit mantissa part as the MMX register value.
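To make that concrete, here is a minimal Python sketch that takes the raw 80-bit value printed by i r float and keeps only the low 64 bits. This is not something GDB provides; the helper names mmx_from_x87_raw and words are made up for illustration, and the sketch assumes the stN index lines up with the mmN number, as it does in the session above.

```python
MASK64 = (1 << 64) - 1  # low 64 bits: the significand field of the 80-bit pattern

def mmx_from_x87_raw(raw):
    """Return the low 64 bits of an 80-bit x87 bit pattern (hypothetical helper)."""
    return raw & MASK64

def words(value):
    """Split a 64-bit value into four 16-bit words, lowest word first."""
    return [(value >> (16 * i)) & 0xFFFF for i in range(4)]

st1_raw = 0xffff00e700e700e700e7     # raw value GDB showed for st1 after pshufw
mm1 = mmx_from_x87_raw(st1_raw)
print(hex(mm1))                      # 0xe700e700e700e7
print([hex(w) for w in words(mm1)])  # ['0xe7', '0xe7', '0xe7', '0xe7']
```

The same mask applied to the st0 value from the first dump gives 0xe7, i.e. the 231 that movd copied from eax.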
I assume this has gone unfixed for so long because SSE2 makes MMX mostly obsolete, providing more and wider registers and not needing a slow emms to leave MMX state before a potential x87 FPU instruction like fld. (And on modern CPUs like Skylake, MMX instructions don't have mov-elimination, and some, like paddd, run on fewer execution ports than their SSE2 equivalents.)
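Of course, some existing code, notably x264 and FFmpeg's H.264 software decoder, still uses hand-written MMX asm instead of the low qword of an XMM register. This is sometimes advantageous, e.g. to allow punpcklbw mm0, [rdi] with a memory source operand: the MMX form only loads 4 bytes, while the SSE2 form with an XMM destination would do a 16-byte load that has to be aligned.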
BTW, the test program I single-stepped was assembled + linked from this NASM source into a static executable:
mov eax, 231 ; __NR_exit_group = 0xe7
movd mm0, eax
pshufw mm1, mm0, 0 ; broadcast the low word
emms
nop
syscall
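For reference, the effect of the two MMX instructions on the aliased x87 bit pattern can be modelled in a few lines of Python. This is only an illustrative sketch: the function names are invented here, and it models just the behaviour visible in the GDB output above (a write to an MMX register leaves the top 16 bits of the aliased 80-bit register all-ones).

```python
def movd_to_mmx(eax):
    """movd mm, r32: zero-extend the 32-bit value into the 64-bit MMX register."""
    return eax & 0xFFFFFFFF

def pshufw(src, imm8):
    """pshufw: destination word i is the source word selected by bits 2i+1..2i of imm8."""
    out = 0
    for i in range(4):
        sel = (imm8 >> (2 * i)) & 3            # 2-bit selector for this word
        word = (src >> (16 * sel)) & 0xFFFF    # pick that word from the source
        out |= word << (16 * i)
    return out

def x87_raw_after_mmx_write(mm_value):
    """80-bit pattern of the aliased x87 register: top 16 bits all-ones, MMX value below."""
    return (0xFFFF << 64) | mm_value

mm0 = movd_to_mmx(231)
mm1 = pshufw(mm0, 0)                           # imm8 = 0 broadcasts word 0
print(hex(x87_raw_after_mmx_write(mm0)))       # 0xffff00000000000000e7
print(hex(x87_raw_after_mmx_write(mm1)))       # 0xffff00e700e700e700e7
```

The two printed values match the raw st0 and st1 patterns in the i r float output earlier.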