I am experimenting with XED because I want to write a small emulator of the Intel 8086 and use XED as the decoder. But when I write a small piece of asm (assembled with nasm):
[CPU 8086]
mov al, 0x7F
xor bx, bx
xchg bx, bx
cli
hlt
and try to print some information to check whether I understand how XED works, I get this output:
0x0:0x0 (0x0)
MOV : length = 2
operand0: AL (REG0)
operand1: 7f (IMM0)
0x0:0x2 (0x2)
XOR : length = 3
operand0: BX (REG0)
operand1: BX (REG1)
operand2: (REG2)
0x0:0x5 (0x5)
XCHG : length = 3
operand0: BX (REG0)
operand1: BX (REG1)
0x0:0x8 (0x8)
CLI : length = 1
operand0: EFLAGS (REG0)
0x0:0x9 (0x9)
HLT : length = 1
I don't understand why I get 3 operands for xor and 1 operand for cli. In general, there are many cases where the operands displayed don't match the number of operands specified by Intel. What am I doing wrong?
The code I used is in a gist (I did my best to make it as minimal as possible).
[edit]
Things are a little clearer now: I assembled xor bx, bx with nasm -f bin test.s and my program gives me this:
0x0:0x0 (0x0)
XOR : length = 2
operand0: BX (REG0)
operand1: BX (REG1)
operand2: FLAGS (REG2)
- The length of xor is 2: that's right, we are in 16-bit mode.
- There are 2 explicit operands, bx and bx: that's right.
- There is one implicit suppressed operand, flags (as @Peter Cordes said).
Everything looks good now.
CLI clears the IF bit in EFLAGS, so that makes sense.
It looks like XED includes implicit operands, not just the ones that are explicit in the machine code, i.e. it reports all changes to the architectural state.
XOR writes flags, but XCHG doesn't, so REG2 is probably EFLAGS. Your code only has case XED_OPERAND_REG0 and ...REG1 in its switch statement, so the third operand probably had a name (probably EFLAGS) that your code chose not to print.
I was curious, so I read the XED docs for you: XED classifies operands according to their visibility: explicit (like bx in xor bx,bx), implicit, or "IMPLICIT SUPPRESSED (SUPP)". SUPP operands are:
- not used in picking an encoding (this is the difference from plain implicit),
- not printed in disassembly,
- not represented using operand bits in the encoding.
So you should check each operand's visibility (xed_operand_visibility_enum_t) and only print the explicit operands.
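To illustrate the filtering, here is a minimal self-contained C sketch. It deliberately doesn't link against XED: vis_t, operand_t and explicit_operands are hypothetical stand-ins that only model the three visibility classes. In a real XED program you would get each operand's visibility from xed_operand_operand_visibility() and compare it against XED_OPVIS_EXPLICIT instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for XED's visibility classes. XED itself names
   them XED_OPVIS_EXPLICIT, XED_OPVIS_IMPLICIT and XED_OPVIS_SUPPRESSED. */
typedef enum { VIS_EXPLICIT, VIS_IMPLICIT, VIS_SUPPRESSED } vis_t;

/* Hypothetical operand record: a printable name plus its visibility. */
typedef struct {
    const char *name;
    vis_t vis;
} operand_t;

/* Store pointers to the explicit operands of ops[0..n) in out and return
   how many were kept. out must have room for n pointers. */
size_t explicit_operands(const operand_t *ops, size_t n, const operand_t **out)
{
    assert(ops != NULL && out != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (ops[i].vis == VIS_EXPLICIT)
            out[kept++] = &ops[i];   /* implicit and suppressed are skipped */
    }
    return kept;
}
```

With the xor bx, bx operands from the question (BX, BX, FLAGS), explicit_operands keeps the two BX entries and drops FLAGS; for cli, whose only operand is the suppressed EFLAGS, it keeps nothing.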
BTW, you seem to have assembled your code in 32-bit or 64-bit mode, because your 16-bit instructions like xor bx,bx are 3 bytes long. In 16-bit mode it would just be opcode + modrm. An operand-size prefix (66) added by the assembler (and correctly decoded by the disassembler) would explain it.
[CPU 8086] doesn't mean [BITS 16]. Unless you really want 16-bit mode for some reason, you should probably keep using 32-bit mode. (Your disassembler was already decoding it in the same mode your assembler was assembling for. Using BITS 16 would let you put 16-bit machine code in a 32-bit object file, which would just make it decode wrong.)
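To make the operand-size prefix point concrete, here is a small C sketch that is not tied to XED. The function names are made up for illustration, and it only handles the simple register-to-register form (an optional 66 prefix, one opcode byte, one ModRM byte with mod = 11). That assumption fits xor bx,bx and xchg bx,bx, but not instructions in general.

```c
#include <assert.h>
#include <stddef.h>

#define OPSIZE_PREFIX 0x66

/* Effective operand size in bits for a word/dword instruction.
   mode_bits is the default operand size of the code (16 or 32);
   the 66 prefix switches to the other size. */
int effective_operand_size(int mode_bits, int has_prefix)
{
    assert(mode_bits == 16 || mode_bits == 32);
    if (!has_prefix)
        return mode_bits;
    return mode_bits == 16 ? 32 : 16;
}

/* Length in bytes of a register-to-register instruction:
   optional 66 prefix + one opcode byte + one ModRM byte. */
size_t reg_reg_length(const unsigned char *code)
{
    assert(code != NULL);
    size_t len = 0;
    if (code[len] == OPSIZE_PREFIX)
        len++;                               /* operand-size prefix */
    len++;                                   /* opcode byte */
    assert((code[len] & 0xC0) == 0xC0);      /* ModRM mod == 11: register-direct */
    len++;                                   /* ModRM byte */
    return len;
}
```

For example, the byte sequence 31 DB (an encoding of xor bx,bx in 16-bit code) gives a length of 2, while 66 31 DB (the same instruction in 32-bit code) gives 3, which matches the two outputs in the question.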