The classical explanation of Intel opcodes using octal says this:
As an example to see how this works, the mov instructions in octal are:
210 xrm mov Eb, Rb
211 xrm mov Ew, Rw
212 xrm mov Rb, Eb
213 xrm mov Rw, Ew
214 xsm mov Ew, SR
216 xsm mov SR, Ew
The meanings of the octal digits (x, m, r, s) and their correspondence to the
operands (Eb, Ew, Rb, Rw, SR) are the following:
The digit r (0-7) encodes the register operand as follows:
REGISTER (r): 0 1 2 3 4 5 6 7
Rb = Byte-sized register AL CL DL BL AH CH DL BH
Rw = Word-sized register AX CX DX BX SP BP SI DI
Why is the 6th digit for Rb DL instead of DH, breaking the high byte pattern?
While I'm asking this question, is there a more up-to-date octal explanation of the 8086 Intel opcodes that was not written in the 90s?
That's a typo; DL appears twice, DH appears nowhere in that table.
You're right, it follows the pattern of 4 low then 4 high half registers, as you can see by assembling mov dl, 0 and mov dh, 0, where the destination register number is the low 3 bits of the opcode. Pick any popular non-buggy assembler; they all get this right. (NASM is good; clang and the GNU assembler are also decent choices, although GAS has less nice error messages.)
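If you don't have an assembler handy, here is a minimal Python sketch of the same check. It assumes the short-form encoding of mov r8, imm8 (opcode 0xB0 plus the register number, followed by the immediate byte); the function name decode_mov_r8_imm8 is made up for this illustration.

```python
# Byte-sized register names indexed by 3-bit register number (no REX prefix).
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def decode_mov_r8_imm8(code: bytes) -> str:
    """Decode a 2-byte `mov r8, imm8` instruction (opcodes 0xB0..0xB7)."""
    opcode, imm = code[0], code[1]
    if opcode & 0xF8 != 0xB0:
        raise ValueError("not a mov r8, imm8 opcode")
    reg = opcode & 0b111  # low 3 bits select the destination register
    return f"mov {REG8[reg]}, {imm}"


# B2 00 and B6 00 are what an assembler emits for these two instructions;
# in octal the opcodes are 262 and 266, so the last octal digit is the register.
print(decode_mov_r8_imm8(bytes([0xB2, 0x00])))  # mov dl, 0
print(decode_mov_r8_imm8(bytes([0xB6, 0x00])))  # mov dh, 0
```

Register number 2 comes out as DL and register number 6 as DH, which is the low-then-high pattern the table should have shown.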
is there a more up-to-date octal explanation of the 8086 Intel opcodes that was not written in the 90s?
Intel's manual is up to date, but aims for precision over clarity and readability. It sometimes doesn't mention patterns that exist in the encodings (like how the low 2 bits of most opcodes distinguish width and direction; 8 vs. 16/32/64-bit and memory source vs. destination).
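As a rough illustration of that width/direction pattern, this Python sketch decodes the four mov opcodes 0x88 to 0x8B (octal 210 to 213) from the question's table. It assumes bit 0 selects byte vs. word operands and bit 1 selects whether the register field is the destination; describe_mov is a hypothetical helper name, and the Eb/Rb/Ew/Rw notation is borrowed from the quoted octal document.

```python
def describe_mov(opcode: int) -> str:
    """Describe one of the mov opcodes 0x88..0x8B using the E/R notation."""
    if opcode & 0xFC != 0x88:
        raise ValueError("expected an opcode in 0x88..0x8B")
    wide = opcode & 1                # bit 0: 0 = byte operands, 1 = word operands
    reg_is_dest = (opcode >> 1) & 1  # bit 1: 1 = register field is the destination
    size = "w" if wide else "b"
    e_operand, r_operand = "E" + size, "R" + size
    if reg_is_dest:
        operands = f"{r_operand}, {e_operand}"
    else:
        operands = f"{e_operand}, {r_operand}"
    return f"{opcode:03o} mov {operands}"


for op in range(0x88, 0x8C):
    print(describe_mov(op))
# 210 mov Eb, Rb
# 211 mov Ew, Rw
# 212 mov Rb, Eb
# 213 mov Rw, Ew
```

The output reproduces the first four rows of the octal table, which is the kind of regularity the manual's per-instruction listings don't spell out.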
https://wiki.osdev.org/X86-64_Instruction_Encoding#Registers is quite good, and does have a correct table of register numbers. It's for x86-64, so it includes the extra bit which a REX prefix can supply. (Also, the mere presence of a REX prefix changes the meaning of the 8-bit register numbers 4 through 7 from AH-BH to SPL-DIL, the low 8 bits of RSP through RDI. So you can't do mov ah, r8b because that would need a REX prefix for R8, but that makes AH inaccessible.)
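The REX interaction can be sketched as a lookup in Python. This is only an illustration of the naming rule, not a real decoder: reg8_name is a hypothetical helper, and the 4-bit register number is assumed to already combine the 3-bit field with the extra bit from the REX prefix.

```python
# 8-bit register names without a REX prefix: 4 low halves, then 4 high halves.
LEGACY8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]

# 8-bit register names when any REX prefix is present: numbers 4..7 become
# the low bytes of RSP..RDI, and 8..15 become available as r8b..r15b.
REX8 = ["al", "cl", "dl", "bl", "spl", "bpl", "sil", "dil"] + [
    f"r{n}b" for n in range(8, 16)
]


def reg8_name(number: int, rex_present: bool) -> str:
    """Name the 8-bit register for a register number, with or without REX."""
    if not rex_present:
        if not 0 <= number <= 7:
            raise ValueError("registers 8..15 need a REX prefix")
        return LEGACY8[number]
    return REX8[number]


print(reg8_name(4, rex_present=False))  # ah
print(reg8_name(4, rex_present=True))   # spl
print(reg8_name(8, rex_present=True))   # r8b
```

No single value of rex_present yields both ah and r8b, which is why mov ah, r8b has no encoding.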
Most documentation uses hex or decimal, or groups of binary digits, because with REX, VEX, and EVEX prefixes supplying additional register-number bits, it's no longer always groups of 3 bits. (And because octal isn't widely used anymore.)