assembly arm gnu-assembler thumb

Why do forward reference ADR instructions assemble with even offsets in Thumb code?


To bx to a Thumb function, the least significant bit of the address needs to be set. The GNU as documentation states how this works when the address is generated from an adr pseudo-instruction:

adr <register> <label>

This instruction will load the address of label into the indicated register. [...]

If label is a thumb function symbol, and thumb interworking has been enabled via the -mthumb-interwork option then the bottom bit of the value stored into register will be set. This allows the following sequence to work as expected:

adr r0, thumb_function
blx r0

So it sounds like things should just work. However, looking at some disassembly, it seems like certain addresses do not have that bottom bit set.

For example, assembling and linking:

.syntax unified
.thumb

.align 2
table:
    .4byte f1
    .4byte f2
    .4byte f3

.align 2
.type f1, %function
.thumb_func
f1:
    adr r1, f1
    adr r2, f2
    adr r3, f3
    bx r1

.align 2
.type f2, %function
.thumb_func
f2:
    adr r1, f1
    adr r2, f2
    adr r3, f3
    bx r2

.align 2
.type f3, %function
.thumb_func
f3:
    adr r1, f1
    adr r2, f2
    adr r3, f3
    bx r3

With:

arm-none-eabi-as adr_test.s -mthumb -mthumb-interwork -o adr_test.o
arm-none-eabi-ld adr_test.o

And checking with arm-none-eabi-objdump -D a.out, I get:

00008000 <table>:
    8000:   0000800d    .word   0x0000800d
    8004:   00008019    .word   0x00008019
    8008:   00008025    .word   0x00008025

0000800c <f1>:
    800c:   f2af 0103   subw    r1, pc, #3
    8010:   a201        add     r2, pc, #4      ; (adr r2, 8018 <f2>)
    8012:   a304        add     r3, pc, #16     ; (adr r3, 8024 <f3>)
    8014:   4708        bx      r1
    8016:   46c0        nop                     ; (mov r8, r8)

00008018 <f2>:
    8018:   f2af 010f   subw    r1, pc, #15
    801c:   f2af 0207   subw    r2, pc, #7
    8020:   a300        add     r3, pc, #0      ; (adr r3, 8024 <f3>)
    8022:   4710        bx      r2

00008024 <f3>:
    8024:   f2af 011b   subw    r1, pc, #27
    8028:   f2af 0213   subw    r2, pc, #19
    802c:   f2af 030b   subw    r3, pc, #11
    8030:   4718        bx      r3
    8032:   46c0        nop                     ; (mov r8, r8)

There are a few things to note:

  1. In table, the absolute addresses of f1, f2, and f3 are all odd, as expected, so the assembler and linker clearly know that all three functions are Thumb.
  2. For backward references, where the adr pseudo-instruction assembles to a subw, the offset is odd, as expected.
  3. For forward references, where the adr pseudo-instruction assembles to an add, the offset is even, so the bottom bit of the resulting address is clear.
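As a sanity check on these observations, the value each adr leaves in its register can be recomputed from the disassembly. This is only a minimal sketch in Python, assuming the architectural rule that in Thumb state the PC reads as the instruction address plus 4 and that adr uses that value rounded down to a multiple of 4 as its base. The names adr_result and listing are made up for this illustration; the addresses and offsets are copied from the objdump output above.

```python
# Recompute the value each ADR in the listing above loads into its register.
# Assumption: Thumb ADR base = Align(PC, 4), with PC = instruction address + 4.

def adr_result(insn_addr, offset):
    """Return the address produced by a Thumb ADR at insn_addr."""
    base = (insn_addr + 4) & ~3  # PC as seen by the instruction, word-aligned
    return base + offset         # offset is negative for subw, positive for add


# (instruction address, signed offset from the disassembly, target label)
listing = [
    (0x800C, -3, "f1"),   # subw r1, pc, #3
    (0x8010, +4, "f2"),   # add  r2, pc, #4
    (0x8012, +16, "f3"),  # add  r3, pc, #16
    (0x8018, -15, "f1"),  # subw r1, pc, #15
    (0x801C, -7, "f2"),   # subw r2, pc, #7
    (0x8020, 0, "f3"),    # add  r3, pc, #0
    (0x8024, -27, "f1"),  # subw r1, pc, #27
    (0x8028, -19, "f2"),  # subw r2, pc, #19
    (0x802C, -11, "f3"),  # subw r3, pc, #11
]

for addr, off, label in listing:
    value = adr_result(addr, off)
    state = "bottom bit set" if value & 1 else "bottom bit clear"
    print(f"{addr:#06x}: adr {label} -> {value:#06x} ({state})")
```

Under that assumption, every backward reference lands on the odd address that also appears in table (0x800d, 0x8019, 0x8025), while the three forward references land on 0x8018 and 0x8024, the plain label addresses with the bottom bit clear.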

What am I missing?


Solution

  • This was a bug in the GNU assembler (gas): it failed to set the bottom bit when adr made a forward reference to a Thumb function label. It should be fixed in v2.37.