Tags: assembly, x86, masm, instruction-set, mov

Why doesn't MOVZX work when operands have the same size?


With Z2 dword ?, mov eax, Z2 works fine but movzx eax, Z2 gives an "invalid instruction operands" error.

I am a little confused here: even though Z2 is the same size as eax, why couldn't the assembler just accept movzx for this? It seems that movzx specifically requires operands of different sizes.

What could be the reason for designing an instruction like this?

Wouldn't it be easier to code if it was designed to simply allow operands of the same size?


Solution

  • It does work (in machine code), inefficiently, at least on Intel CPUs.
    But it's not documented as a valid form of movzx / movsx (https://www.felixcloutier.com/x86/movzx).
    That's why most assemblers stop you from shooting yourself in the foot.


    What could be the reason for designing an instruction like this?

    To perform zero-extension from narrow source data.
    That's what the ZX in the mnemonic means.

    If you have same-sized operands, you're expected to use mov.
    movzx and movsx are only intended to be used when the destination is wider than the source so there's some actual Zero eXtension or Sign eXtension to do.
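
    As a rough illustration of that split, here is a minimal NASM-syntax sketch (not part of the original test; the value loaded into EDX is an arbitrary choice for illustration). Each documented form pairs a narrower source with a wider destination, and the same-size copy is a plain mov:

    ```nasm
    mov   edx, 0x80808080   ; arbitrary test value with the high bit set at every width

    movzx eax, dl           ; 8 -> 32 bits, zero-extended:  EAX = 0x00000080
    movsx eax, dl           ; 8 -> 32 bits, sign-extended:  EAX = 0xFFFFFF80
    movzx eax, dx           ; 16 -> 32 bits, zero-extended: EAX = 0x00008080
    movsx eax, dx           ; 16 -> 32 bits, sign-extended: EAX = 0xFFFF8080
    mov   eax, edx          ; same size, nothing to extend: EAX = 0x80808080
    ```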


    Just like with MOVSXD, even though it's possible to use the MOVZX opcode to encode an instruction equivalent to mov r16, r/m16, it's not recommended, for efficiency reasons.

    Like Intel says for MOVSXD: The use of MOVSXD without REX.W (which would encode movsxd r32, r/m32) is discouraged. Regular MOV should be used instead of using MOVSXD without REX.W. (I took out the "in 64-bit mode" from the quote because that's redundant; movsxd only exists in 64-bit mode; the opcode means something else in other modes.)
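
    A small sketch of the two recommended forms in 64-bit mode (NASM syntax; the value in ECX is an arbitrary assumption for illustration). The commented results follow from the architectural rule that a write to a 32-bit register zeroes the upper half of the full 64-bit register:

    ```nasm
    mov    ecx, 0x80000000  ; arbitrary test value with bit 31 set

    movsxd rax, ecx         ; 32 -> 64 bits, sign-extended (REX.W form):
                            ;   RAX = 0xFFFFFFFF80000000
    mov    eax, ecx         ; same-size 32-bit copy; the write to EAX zeroes
                            ;   the upper 32 bits: RAX = 0x0000000080000000
    ```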


    Anyway yes, it is possible to movzx ax, bx in x86 machine code, but assemblers save you from yourself and refuse to assemble that undocumented and inefficient instruction. (2-byte opcode instead of 1 for mov; movzx was new in 386 and all the 1-byte opcodes were already used up before that.)

    The manual describes movzx as follows: "Copies the contents of the source operand (register or memory location) to the destination operand (register) and zero extends the value. The size of the converted value depends on the operand-size attribute."
    https://www.felixcloutier.com/x86/movzx

    I tested it on my Skylake CPU with the following NASM source, written to probably assemble with MASM as well. (e.g. db 66h instead of using an o16 NASM prefix on the movzx line.)

    There's no guarantee that this will run the same way on other CPUs: since this combination of opcode and operand-size isn't documented, it could at least hypothetically behave differently on other current CPUs or on future ones.

    mov  edx, -1
    mov  rax, 0x11223344cccccccc
    db   66h             ; operand-size prefix that we're not telling the assembler about
    movzx  eax, dx
    
    mov  ax, dx          ; for comparison
    

    (super minimal, taking advantage of toolchain defaults for this one-off that's never intended to be a proper program.)

    $ nasm -felf64 movzx.asm && ld -o movzx  movzx.o 
    ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000
    $ objdump -drwC -Mintel  ./movzx
    ...
      401000:       ba ff ff ff ff          mov    edx,0xffffffff
      401005:       48 b8 cc cc cc cc 44 33 22 11   movabs rax,0x11223344cccccccc
      40100f:       66 0f b7 c2             movzx  ax,dx
      401013:       66 89 d0                mov    ax,dx       # note it's shorter.  
              # Fun fact: we can see NASM picked the mov r/m16, r form, since the ModRM byte is different.
    

    Interestingly, the disassembler in GNU Binutils (objdump -d and GDB) decodes it as movzx ax, dx, or movzww %dx, %ax in AT&T syntax.

    Using gdb ./movzx on the static executable, I used layout reg and starti / stepi to step through and see registers change:

    66 0f b7 c2 movzx ax,dx executes normally, and changes RAX from 0x11223344cccccccc to 0x11223344ccccffff, proving that it behaved exactly like a 16-bit mov, not touching any upper bytes of RAX. (Including not implicitly zero-extending the upper 32 bits of RAX, like a write to EAX would have.)
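
    For contrast, a sketch using only documented instructions (NASM syntax, 64-bit mode, same starting values as the test above). The commented register values are what the architecture specifies for 16-bit and 32-bit register writes, not additional measurements:

    ```nasm
    mov   edx, -1                   ; EDX = 0xFFFFFFFF, so DX = 0xFFFF
    mov   rax, 0x11223344cccccccc

    mov   ax, dx                    ; 16-bit write merges into RAX:
                                    ;   RAX = 0x11223344ccccffff
    movzx eax, dx                   ; 32-bit write zero-extends into RAX:
                                    ;   RAX = 0x000000000000ffff
    ```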

    (Then quit GDB because I didn't include code to exit, only the code I actually wanted to single-step.)


    This is impossible for movzx al, dl: there is no movzx opcode with an 8-bit destination. 16-bit vs. 32-bit vs. 64-bit operand-size is selected by 66 or REX prefixes that override the mode's default, but 8-bit operand-size is only ever selected by the opcode itself; there's no prefix that can override an instruction to 8-bit operand-size. (If you want to zero-extend a nibble to a byte, copy it with mov and then mask it with and reg, 0x0f.)
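
    The nibble-to-byte case mentioned above can be sketched as follows (NASM syntax; the choice of DL as source, AL as destination and the test value are assumptions for illustration):

    ```nasm
    mov  dl, 0xA7           ; arbitrary test value; its low nibble is 0x7

    mov  al, dl             ; copy the byte:            AL = 0xA7
    and  al, 0x0f           ; keep only the low nibble: AL = 0x07
    ```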


    Assemblers that allow it: just GAS in .intel_syntax mode?

    NASM and YASM reject movzx ax, dx.
    So does clang (with .intel_syntax noprefix),
    but llvm-objdump -d will disassemble it the same way as GNU Binutils.

    But GNU Binutils not only disassembles it (Intel movzx ax,dx, AT&T movzww %dx, %ax), it (GNU as) accepts the Intel-syntax version. GAS:

    .intel_syntax noprefix
        movzx  ax, dx             # works, producing the above machine code.
    
    .att_syntax
        movzw   %dx, %ax         # Error: operand size mismatch for `movzw'
        movzww  %dx, %ax         # Error: invalid instruction suffix for `movzw'
    

    Related: