On the 8086, this instruction is valid:
mov bh,[bx]
but this one is not:
mov bh,[cx]
I don't understand why. I thought that the general-purpose registers (AX, BX, CX, DX, SP, BP, SI and DI) can be used for any purpose, and that saying BX is for a base address or CX is for a counter is just a convention, so the registers don't really differ. It seems I'm wrong. Can you explain the reason? What exactly is the difference between these registers? (For example, why can't I keep a base address in the cx register?)
On the 8086 (and with 16-bit addressing on x86 in general), only addressing modes of the form
[bp|bx] + [si|di] + disp0/8/16
are available. Listing them all:
[bx] [bx + foo]
[foo] [bp + foo]
[si] [si + foo]
[di] [di + foo]
[bx + si] [bx + si + foo]
[bx + di] [bx + di + foo]
[bp + si] [bp + si + foo]
[bp + di] [bp + di + foo]
where foo is some constant value, e.g. 123, or the offset of a symbol within a segment, e.g. a literal foo to reference a foo: label somewhere.
(Fun fact: the only way to encode [bp] is actually as [bp+0], and assemblers will do this for you. Notice that in the table [foo] is where [bp] would otherwise be; this reflects how x86 machine code special-cases that encoding to mean a displacement with no registers.)
Using bp as the base implies the SS (stack) segment; the other addressing modes imply the DS (data) segment. This can be overridden with a segment-override prefix if necessary.
Note that no addressing mode involving cx exists, so [cx] is not a valid memory operand.
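To make the restriction concrete, here is a minimal Python sketch that models the r/m field of a 16-bit ModR/M byte as a lookup table. The function name encode_rm and the tuple it returns are made up for this illustration; only the table itself reflects the machine encoding. It is a simplified model, not an assembler.

```python
# The eight r/m values of a 16-bit ModR/M byte and the registers each one names.
RM_REGISTERS = [
    ("bx", "si"),  # r/m = 000
    ("bx", "di"),  # r/m = 001
    ("bp", "si"),  # r/m = 010
    ("bp", "di"),  # r/m = 011
    ("si",),       # r/m = 100
    ("di",),       # r/m = 101
    ("bp",),       # r/m = 110 (with mod = 00 this slot means [disp16] instead)
    ("bx",),       # r/m = 111
]


def encode_rm(registers, disp=0):
    """Return (mod, rm, displacement_bytes), or None if no encoding exists.

    `registers` is a collection of register names such as {"bx", "si"};
    an empty collection means a bare [disp16] operand.
    (Hypothetical helper for illustration only.)
    """
    regs = frozenset(registers)
    if not regs:
        return (0b00, 0b110, 2)      # special case: displacement, no registers
    for rm, combo in enumerate(RM_REGISTERS):
        if frozenset(combo) == regs:
            break
    else:
        return None                  # e.g. [cx]: no r/m value names cx
    if disp == 0 and regs != {"bp"}:
        return (0b00, rm, 0)         # no displacement bytes needed
    if -128 <= disp <= 127:
        return (0b01, rm, 1)         # sign-extended 8-bit displacement
    return (0b10, rm, 2)             # 16-bit displacement


print(encode_rm({"bx"}))   # a valid operand
print(encode_rm({"cx"}))   # None: there is simply no bit pattern for [cx]
print(encode_rm({"bp"}))   # [bp] has to be encoded as [bp+0] with a disp8
```

The usual workaround in real code is to copy the address into one of the four addressable registers first, for example moving cx into bx and then using [bx].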
The registers ax, cx, dx, bx, sp, bp, si, and di are called general-purpose registers because they are accessible as operands in all general-purpose instructions. This is in contrast to special-purpose registers like es, cs, ss, ds (segment registers), ip (the instruction pointer) or the flags register, which are only accessible with special instructions made just for this purpose.
As you see, not all general-purpose registers can be used as base or index registers for memory operands; only bx, bp, si, and di can. This has to be kept in mind when allocating registers in your code.
In addition to this restriction, there are some instructions that implicitly operate on fixed registers. For example, the loop instruction always uses cx as its counter, and a 16-bit imul r/m16 always multiplies ax by its operand and leaves the result in dx:ax. If you want to make effective use of these instructions, it is useful to keep each general-purpose register's suggested purpose in mind.
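As a rough illustration of what "implicit" means here, the following Python sketch models the register effects of the one-operand 16-bit imul and of loop. The function names are made up for this example, and flags are ignored; it is a simplified model of the documented behaviour, not an emulator.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def imul16(ax, operand):
    """Model `imul r/m16`: signed ax * operand, 32-bit result split into (dx, ax)."""
    product = (to_signed16(ax) * to_signed16(operand)) & 0xFFFFFFFF
    return (product >> 16) & 0xFFFF, product & 0xFFFF   # (high word, low word)


def loop_iterations(cx):
    """Count how often a loop body runs when it ends with `loop label`."""
    count = 0
    while True:
        count += 1                  # the body runs once
        cx = (cx - 1) & 0xFFFF      # `loop` decrements cx ...
        if cx == 0:                 # ... and falls through once it hits zero
            return count


dx, ax = imul16(0x4000, 0x0004)
print(hex(dx), hex(ax))             # the product does not fit in ax alone
print(loop_iterations(3))
```

Note that the caller never chooses the destination of imul or the counter of loop; that is why whatever was in dx, ax, or cx has to be saved elsewhere first.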
Notably, the string instructions lods, stos, scas, movs, and cmps implicitly use DS:SI and/or ES:DI, and also cx as the repeat count when used with a rep or repz / repnz prefix, so using those registers when looping a pointer over an array allows code-size optimizations.