arm disassembly cortex-m opcode thumb

Mysterious ARM Opcode


While disassembling a hex file for a Texas Instruments CC2652RB (ARM Cortex-M4F, Thumb-2), I came across an opcode that I can't figure out. What does "90 FF FF 00" do (maybe the context below helps)? Is Texas Instruments allowed to use custom opcodes if they aren't claimed in the ARM standard?

19 46          mov     r1, r3
06 4A          ldr     r2, [pc, #0x18]
00 28          cmp     r0, #0
11 60          str     r1, [r2]
18 BF          it      ne
01 20          movne   r0, #1
BC BD          pop     {r2, r3, r4, r5, r7, pc}
90 FF FF 00    ?
14 20          movs    r0, #0x14
02 40          ands    r2, r0
04 04          lsls    r4, r0, #0x10
00 20          movs    r0, #0
30 72          strb    r0, [r6, #8]
05 00          movs    r5, r0
08 04          lsls    r0, r1, #0x10
00 20          movs    r0, #0
F0 B5          push    {r4, r5, r6, r7, lr}
40 F6 FF 7C    movw    ip, #0xfff
10 F8 01 3B    ldrb    r3, [r0], #1
00 24          movs    r4, #0
08 2C          cmp     r4, #8

EDIT: Nate Eldredge pointed out that this line probably starts literal data used by the preceding function, so here is more of what precedes the mysterious line (the addresses start at an arbitrary base):

0x40:  18 B1          cbz    r0, #    0x4a
0x42:  00 F0 1F FA    bl     #    0x484
0x46:  00 20          movs   r0, #0
0x48:  BC BD          pop    {r2, r3, r4, r5, r7, pc}
0x4a:  00 F0 07 F8    bl     #    0x5c
0x4e:  BC BD          pop    {r2, r3, r4, r5, r7, pc}
0x50:  00 09          lsrs   r0, r0, #4
0x52:  3D 00          movs   r5, r7
0x54:  B4 01          lsls   r4, r6, #6
0x56:  00 10          asrs   r0, r0, #    0x20
0x58:  14 20          movs   r0, #    0x14
0x5a:  02 40          ands   r2, r0
0x5c:  BC B5          push   {r2, r3, r4, r5, r7, lr}
0x5e:  16 48          ldr    r0, [pc, #    0x58]
0x60:  16 4D          ldr    r5, [pc, #    0x58]
0x62:  04 21          movs   r1, #4
0x64:  01 90          str    r0, [sp, #4]
0x66:  00 20          movs   r0, #0
0x68:  28 70          strb   r0, [r5]
0x6a:  01 A8          add    r0, sp, #4
0x6c:  00 F0 8A FA    bl     #    0x584
0x70:  10 B1          cbz    r0, #    0x78
0x72:  01 20          movs   r0, #1
0x74:  28 70          strb   r0, [r5]
0x76:  07 E0          b      #    0x88
0x78:  11 4C          ldr    r4, [pc, #    0x44]
0x7a:  02 21          movs   r1, #2
0x7c:  20 46          mov    r0, r4
0x7e:  00 F0 AF F9    bl     #    0x3e0
0x82:  01 21          movs   r1, #1
0x84:  29 70          strb   r1, [r5]
0x86:  08 B1          cbz    r0, #    0x8c
0x88:  00 20          movs   r0, #0
0x8a:  BC BD          pop    {r2, r3, r4, r5, r7, pc}
0x8c:  0D 4B          ldr    r3, [pc, #    0x34]
0x8e:  22 78          ldrb   r2, [r4]
0x90:  64 78          ldrb   r4, [r4, #1]
0x92:  03 F1 20 01    add.w  r1, r3, #    0x20
0x96:  18 68          ldr    r0, [r3]
0x98:  40 B1          cbz    r0, #    0xac
0x9a:  1D 79          ldrb   r5, [r3, #4]
0x9c:  AA 42          cmp    r2, r5
0x9e:  02 D1          bne    #    0xa6
0xa0:  5D 79          ldrb   r5, [r3, #5]
0xa2:  AC 42          cmp    r4, r5
0xa4:  01 D0          beq    #    0xaa
0xa6:  08 33          adds   r3, #8
0xa8:  F5 E7          b      #    0x96
0xaa:  19 46          mov    r1, r3
0xac:  06 4A          ldr    r2, [pc, #    0x18]
0xae:  00 28          cmp    r0, #0
0xb0:  11 60          str    r1, [r2]
0xb2:  18 BF          it     ne
0xb4:  01 20          movne  r0, #1
0xb6:  BC BD          pop    {r2, r3, r4, r5, r7, pc}

Solution

  • My guess is that it's not an instruction at all, but rather data. The preceding instruction is a pop {..., pc}, which would normally be the return at the end of a function, so this "instruction" isn't reachable by straight-line execution. You could only execute it by a branch from somewhere else, and I bet you won't find one.

    But the space right after the end of a function is a natural place to find a literal pool. In fact, I suspect everything from the mystery word up to the push {..., lr} several lines further down (which would be the natural first instruction of the next function) is a literal pool. Note for instance that the ldr r2, [pc, #0x18] above, which is most definitely a load from a literal pool, would also load from within this region.

    The disassembly of the bytes following the mystery word looks like reasonable code at first glance, but on further inspection it is questionable. It would clobber registers that would normally be call-preserved (r4, r5), and further down there's an lsls r0, ... whose result is immediately overwritten by movs r0, #0.

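    And as mentioned, falling through into a more plausible function prologue would be strange. In fact, it would then execute the ldrb r3, [r0], #1 with r0 = 0, which reads through a null pointer and is surely not what real code intends (whether that actually faults depends on what is mapped at address 0).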

    I suspect if you disassemble the rest of the preceding function, you will find somewhere a pc-relative load that loads the mystery word as data.
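
The literal-pool guess can be checked against the longer listing in the question. The following is a minimal Python sketch, not part of the original answer: it assumes the listing's arbitrary base address has the same alignment modulo 4 as the real load address, and the helper names `thumb_ldr_literal_target` and `pool_word` are made up for illustration. It computes where each 16-bit Thumb `ldr rX, [pc, #imm]` reads from (the PC value is the instruction address plus 4, rounded down to a multiple of 4) and decodes the little-endian word stored there.

```python
import struct


def thumb_ldr_literal_target(insn_addr, imm):
    """Address read by a 16-bit Thumb `ldr rt, [pc, #imm]` located at insn_addr."""
    # In Thumb state the PC reads as the instruction address + 4, and the
    # literal load uses that value rounded down to a multiple of 4.
    return ((insn_addr + 4) & ~3) + imm


# Bytes from 0xb8 (right after the final pop at 0xb6) up to the next
# push {r4, r5, r6, r7, lr}, copied from the two listings in the question.
POOL_BASE = 0xB8
pool = bytes.fromhex("90FFFF00" "14200240" "04040020" "30720500" "08040020")


def pool_word(addr):
    """Little-endian 32-bit word stored at addr inside the suspected pool."""
    return struct.unpack_from("<I", pool, addr - POOL_BASE)[0]


# (instruction address, immediate) of every pc-relative ldr in the function
# that starts at 0x5c in the question's listing.
loads = [(0x5E, 0x58), (0x60, 0x58), (0x78, 0x44), (0x8C, 0x34), (0xAC, 0x18)]

for insn_addr, imm in loads:
    target = thumb_ldr_literal_target(insn_addr, imm)
    print(f"ldr at {insn_addr:#04x} reads {target:#04x} -> {pool_word(target):#010x}")
```

Under that alignment assumption, the `ldr r0, [pc, #0x58]` at 0x5e resolves to 0xb8, the mystery word itself, read as the 32-bit value 0x00ffff90. The other four loads land on the remaining four words, so every word between the final pop and the next push is accounted for as data.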