Using 8-bit registers in x86-64 indexed addressing modes

Is it possible to use the 8-bit registers (al, ah, bl, bh, r8b) in indexed addressing modes in x86-64? For example:

mov ecx, [rsi + bl]
mov edx, [rdx + dh * 2]

In particular, this would let you use the bottom 8 bits of a register as a 0-255 offset, which could be useful for some kernels.

I pored over the Intel manuals and they aren't explicit on the matter, but all the examples they give only have 32-bit or 64-bit base and index registers. In 32-bit code I only saw 16-bit or 32-bit registers. Looking at the details of the ModR/M and SIB byte encoding also seems to point towards "no", but the encoding is complex enough, with enough corner cases, that I'm not sure I got it right.

I'm mostly interested in the x86-64 behavior, but of course if it's possible in 32-bit mode only I'd like to know.

As an add-on question, too small and too closely related to deserve its own post: can 16-bit registers be used for base or index? E.g., mov rax, [rbx + cx]. My investigation pointed towards basically the same answer as above: probably not.

Solution

No, you cannot use 8-bit or 16-bit registers in addressing calculations in 64-bit mode, nor can you use 8-bit registers in 32-bit mode. You can use 16-bit registers in 32-bit mode, and 32-bit registers in 64-bit mode, via use of the 0x67 address size prefix byte.

(But using a narrower register makes the whole address narrow; you don't get a 16-bit index added to a 32-bit array address. Every register in an addressing mode must be the same width as the address size, which you normally want to match the mode you're in, unless your data lives in the low 16 or low 32 bits of the address space.)
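
The usual workaround is to zero-extend the narrow value into a full-width register first and index with that. Here is a minimal sketch in NASM syntax (an assumption about your assembler); the choice of rax as a scratch register is arbitrary and not something the question specified:

```nasm
bits 64

; Intended: mov ecx, [rsi + bl]
movzx eax, bl             ; eax = bl zero-extended; writing eax also clears bits 63:32 of rax
mov   ecx, [rsi + rax]    ; index is now a 0-255 offset held in a 64-bit register

; Intended: mov edx, [rdx + dh * 2]
movzx eax, dh             ; ah/bh/ch/dh are usable here because this encoding needs no REX prefix
mov   edx, [rdx + rax*2]

; Intended: mov rax, [rbx + cx]
movzx ecx, cx             ; zero-extend the 16-bit index in place
mov   rax, [rbx + rcx]
```

This costs one extra instruction per access. If the narrow value should be treated as signed, movsx into a 64-bit register (for example movsx rax, bl) would be the analogous choice.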

This table summarizes the various options for operand and address sizes well. The general pattern is that the default address size is the same as the current mode (i.e., 32 bits in 32-bit mode, 64 bits in 64-bit mode)¹, and if the 0x67 prefix is included, the address size changes to half the usual size (i.e., 16 bits in 32-bit mode, 32 bits in 64-bit mode).

Here's an excerpt of the full table linked above showing 64-bit long-mode behavior only, for various values of the REX.W, 0x66 operand and 0x67 address size prefixes:

REX.W	0x66 prefix (operand)	0x67 prefix (address)	Operand size²	Address size
0	No	No	32-bit	64-bit
0	No	Yes	32-bit	32-bit
0	Yes	No	16-bit	64-bit
0	Yes	Yes	16-bit	32-bit
1	Ignored	No	64-bit	64-bit
1	Ignored	Yes	64-bit	32-bit
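
To make the table rows concrete, here is a small sketch in NASM syntax (assumed) of one load written with different operand and address sizes. The byte sequences in the comments are the encodings I would expect for these forms; it is worth confirming them against your own assembler's listing output:

```nasm
bits 64

mov eax, [rsi]     ; 8B 06        no prefixes: 32-bit operand, 64-bit address
mov eax, [esi]     ; 67 8B 06     0x67: 32-bit operand, 32-bit address
mov ax,  [rsi]     ; 66 8B 06     0x66: 16-bit operand, 64-bit address
mov rax, [rsi]     ; 48 8B 06     REX.W: 64-bit operand, 64-bit address
mov rax, [esi]     ; 67 48 8B 06  0x67 + REX.W: 64-bit operand, 32-bit address

; There is no prefix that selects an 8-bit or 16-bit address size in 64-bit mode,
; so forms like [rsi + bl] or [rbx + cx] have no encoding and the assembler rejects them.
```

With the 0x67 prefix in 64-bit mode, the effective address is computed as a 32-bit value and zero-extended, so [esi] only reaches the low 4 GiB of the address space.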

¹ That might seem obvious, but it's the opposite of the way operand sizes work in 64-bit mode: most default to 32 bits, even in 64-bit mode, and a REX.W prefix is needed to promote them to 64 bits.

² Some instructions default to 64-bit operand size without any REX prefix, notably push, pop, call and conditional jumps. As Peter points out below, this leads to the odd situation where at least some of these instructions (push and pop included) can't be encoded with 32-bit operands, but can use 16-bit operands (with the 0x66 prefix).
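
A short sketch of that push/pop oddity, again assuming NASM syntax; the encodings in the comments are what I would expect, not something taken from the original answer:

```nasm
bits 64

push rax           ; 50      64-bit operand by default, no REX.W needed
push ax            ; 66 50   0x66 selects a 16-bit push
pop  rax           ; 58
pop  ax            ; 66 58

; push eax / pop eax have no encoding in 64-bit mode, so an assembler
; should reject them rather than silently pick another size.
```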