The add x1, sp, x2, lsl #1 instruction is supposed to be an "Add (shifted register)" but I have problems with the encoding to differentiate when SP and XZR is used. I'm astonished by the results.
This is the encoding for the "ADD (shifted register)":
31:1, 30:0, 29:0, 28:0, 27:1, 26:0, 25:1, 24:1, 23..22:shift, 21:0, 20..16:Rm, 15..10:imm6, 9..5:Rn, 4..0:Rd
If I assemble add x1, xzr, x2, lsl #1, it matches the above encoding. The important bits are:
E1 07 02 8B add x1, xzr, x2, lsl #1
bit 21 = 0;
shift = 0 (lsl);
imm6 = 1
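Those field values can be reproduced with a small Python sketch. It only applies the shifted-register layout listed above to the four bytes shown; the helper name decode_shifted is made up for illustration and is not part of any tool.

```python
def decode_shifted(word: int) -> dict:
    """Split a 32-bit word using the ADD (shifted register) field layout."""
    return {
        "sf": (word >> 31) & 0x1,      # bit 31: 1 = 64-bit
        "shift": (word >> 22) & 0x3,   # bits 23..22: 0 = lsl
        "bit21": (word >> 21) & 0x1,   # bit 21: 0 in this encoding
        "Rm": (word >> 16) & 0x1F,     # bits 20..16
        "imm6": (word >> 10) & 0x3F,   # bits 15..10: shift amount
        "Rn": (word >> 5) & 0x1F,      # bits 9..5
        "Rd": word & 0x1F,             # bits 4..0
    }


# A64 instructions are stored little-endian, so the bytes
# E1 07 02 8B form the word 0x8B0207E1.
word = int.from_bytes(bytes.fromhex("E107028B"), "little")
fields = decode_shifted(word)
print(hex(word), fields)
```

Register number 31 shows up in Rn here; the field layout alone does not say whether that means XZR or SP.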
But if I assemble add x1, sp, x2, lsl #1, the encoding changes in ways I did not expect:
E1 67 22 8B add x1, sp, x2, lsl #1
bit 21 = 1;
shift = 0;
imm6 = 011001 (lsl 25)
So, I guess that bit 21 means SP instead of XZR. But why imm6 = 25? That's a "lsl 25"!
Am I looking at the wrong instruction encoding?
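The instruction encoded changes from “ADD (shifted register)” (§ C6.2.5) to “ADD (extended register)” (§ C6.2.3) if you use SP in an operand. This is necessary as only the latter supports using SP or WSP in its first or second operands. In the extended-register encoding, bit 21 is 1 and bits 15..10 are not imm6: they hold a 3-bit option field (bits 15..13) followed by a 3-bit imm3 shift amount (bits 12..10). So 011001 reads as option = 011 (UXTX, which is written as LSL when SP is involved) and imm3 = 1, which is the lsl #1 you asked for.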
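To check this against the bytes from the question, here is a minimal Python sketch. It assumes the extended-register field layout as given in the Arm manual (positions are in the comments); decode_extended and EXTEND_NAMES are hypothetical names used only for this example.

```python
# Extend names indexed by the 3-bit option field.
EXTEND_NAMES = ["UXTB", "UXTH", "UXTW", "UXTX", "SXTB", "SXTH", "SXTW", "SXTX"]


def decode_extended(word: int) -> dict:
    """Split a 32-bit word using the ADD (extended register) field layout."""
    option = (word >> 13) & 0x7        # bits 15..13: extend type
    return {
        "bit21": (word >> 21) & 0x1,   # bit 21: 1 in this encoding
        "Rm": (word >> 16) & 0x1F,     # bits 20..16
        "option": option,
        "extend": EXTEND_NAMES[option],
        "imm3": (word >> 10) & 0x7,    # bits 12..10: left shift amount
        "Rn": (word >> 5) & 0x1F,      # bits 9..5
        "Rd": word & 0x1F,             # bits 4..0
    }


# Bytes E1 67 22 8B, little-endian, form the word 0x8B2267E1.
word = int.from_bytes(bytes.fromhex("E167228B"), "little")
fields = decode_extended(word)

# Reading bits 15..10 as a single imm6 field gives the misleading value.
misread_imm6 = (word >> 10) & 0x3F
print(fields, misread_imm6)
```

Read as one 6-bit field, the same bits give 25; read as option and imm3, they give UXTX with a shift of 1.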
I have problems with the encoding to differentiate when SP and XZR is used.
The instruction description says which of the two is used. If SP is used, the field says something like <Xn|SP> and the text says “general-purpose register or stack pointer.” When XZR is used, the field says something like <Xn> and the text just says “general-purpose register.” You need to know which instruction you are decoding to know whether to decode XZR or SP. And note that there are sometimes multiple instructions sharing the same mnemonic. For example, the ADD mnemonic is shared by these (ignoring different operand sizes available for the same instruction):
ADD <Xd|SP>, <Xn|SP>, <R><m>{, <extend> {#<amount>}}
ADD <Xd|SP>, <Xn|SP>, #<imm>{, <shift>}
ADD <Xd>, <Xn>, <Xm>{, <shift> #<amount>}
ADD <V><d>, <V><n>, <V><m>
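As a rough illustration of "know which instruction you are decoding first", the Python sketch below picks the name for register number 31 in the Rn field only after deciding between the shifted-register and extended-register forms. It assumes a 64-bit add/subtract register instruction (bits 28..24 = 01011) and uses bit 21 to tell the two forms apart; rn_name and reg_name are hypothetical helpers, not a real disassembler API.

```python
def reg_name(num: int, sp_allowed: bool) -> str:
    """Name a 64-bit register; number 31 is SP or XZR depending on the form."""
    if num == 31:
        return "sp" if sp_allowed else "xzr"
    return f"x{num}"


def rn_name(word: int) -> str:
    """Name the Rn operand of a 64-bit add/subtract (register) instruction."""
    if ((word >> 24) & 0x1F) != 0b01011:
        raise ValueError("not an add/subtract (shifted or extended register) word")
    # Bit 21 set selects the extended-register form, where Rn may be SP.
    is_extended = ((word >> 21) & 0x1) == 1
    return reg_name((word >> 5) & 0x1F, sp_allowed=is_extended)


print(rn_name(0x8B0207E1))  # add x1, xzr, x2, lsl #1
print(rn_name(0x8B2267E1))  # add x1, sp, x2, lsl #1
```

The same register number, 31, is named differently for the two words because the instruction form is identified before the field is interpreted.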