assembly, linker, mips, relocation

How does a linker relocate branch instructions in MIPS?


Background

I'm working on a project from the 2015 CS61C (Berkeley) course: writing a linker for object files assembled from the following subset of the MIPS instruction set.

Add Unsigned: addu $rd, $rs, $rt
Or: or $rd, $rs, $rt
Set Less Than: slt $rd, $rs, $rt
Set Less Than Unsigned: sltu $rd, $rs, $rt
Jump Register: jr $rs
Shift Left Logical: sll $rd, $rt, shamt
Add Immediate Unsigned: addiu $rt, $rs, immediate
Or Immediate: ori $rt, $rs, immediate
Load Upper Immediate: lui $rt, immediate
Load Byte: lb $rt, offset($rs)
Load Byte Unsigned: lbu $rt, offset($rs)
Load Word: lw $rt, offset($rs)
Store Byte: sb $rt, offset($rs)
Store Word: sw $rt, offset($rs)
Branch on Equal: beq $rs, $rt, label
Branch on Not Equal: bne $rs, $rt, label
Jump: j label
Jump and Link: jal label
Load Immediate: li $rt, immediate
Branch on Less Than: blt $rs, $rt, label

From this subset, I think the instructions that need relocation are j, beq, and bne (blt is a pseudo-instruction that expands into slt followed by bne), with the latter two needing relocation only if the label is not defined in the same file.

The comment block on the MIPS function that relocates an instruction reads:

#------------------------------------------------------------------------------
# function relocate_inst()
#------------------------------------------------------------------------------
# Given an instruction that needs relocation, relocates the instruction based
# on the given symbol and relocation table.
#
# You should return error if 1) the addr is not in the relocation table or
# 2) the symbol name is not in the symbol table. You may assume otherwise the 
# relocation will happen successfully.
#
# Arguments:
#  $a0 = an instruction that needs relocating
#  $a1 = the byte offset of the instruction in the current file
#  $a2 = the symbol table
#  $a3 = the relocation table
#
# Returns: the relocated instruction, or -1 if error

Note that the relocation table contains addresses relative to the start of the object file being linked, while the symbol table is an aggregate of the symbol tables of all the object files being linked and contains absolute addresses.
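For reference, the j case that those online solutions handle can be sketched as below. This is only a rough model in Python, not the project's MIPS code: the dictionaries standing in for the symbol and relocation tables are assumptions made for illustration, and the instruction is assumed to be a J-type instruction (j or jal) whose low 26 bits hold the target address divided by 4.

```python
ADDR_MASK = 0x03FFFFFF  # low 26 bits of a J-type instruction hold the target field


def relocate_inst(inst, offset, symtbl, reltbl):
    """Patch the 26-bit target field of a j/jal instruction.

    Assumed (hypothetical) table layouts:
      reltbl: byte offset within the current file -> symbol name
      symtbl: symbol name -> absolute address
    Returns the relocated instruction, or -1 on error.
    """
    name = reltbl.get(offset)
    if name is None:          # error 1: offset not in the relocation table
        return -1
    addr = symtbl.get(name)
    if addr is None:          # error 2: symbol not in the symbol table
        return -1
    opcode_bits = inst & ~ADDR_MASK & 0xFFFFFFFF   # keep the top 6 bits
    target_field = (addr >> 2) & ADDR_MASK         # word address, 26 bits
    return opcode_bits | target_field


# A "jal printf" at byte offset 8, with printf placed at 0x00400020.
patched = relocate_inst(0x0C000000, 8, {"printf": 0x00400020}, {8: "printf"})
print(hex(patched))
```

Nothing in this sketch knows how to patch the 16-bit immediate of beq or bne, which is the gap I am asking about.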

Problem

Looking at various solutions online, it seems that only j relocations are handled.

Am I missing something?

EDIT: We are only considering the text segment.


Solution

  • My guess is that this linker does not handle branch instructions (bne or beq) to external labels.

    This precludes writing beq label where label is external (global and defined in another object file), but a branch like that is only realistically written by hand in assembly.

    Compiler output, for example, keeps both the branch instruction and its target inside a single function, which goes into a single chunk of code (modulo certain tail-call optimizations).

    Under that limitation, every bne and beq instruction is already fixed up by the compiler or assembler using PC-relative addressing, so these instructions need no entry in the relocation table.

    Further, the range of the branch instructions (beq/bne), roughly +/-128 KB, is much shorter than that of j, so a linker that really intended to support branches to external labels might also have to insert branch islands for the branches whose targets are too far away.


    To expand on your example:

    if (a1 == a0)
        printf("hello");
    

    would be

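        bne $a1, $a0, endIf1    # skip the call when a1 != a0
        la $a0, Lhello          # address of the "hello" string
        jal printf              # external call: this one needs relocation
    endIf1: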
    

    Some compilers don't know which function lives in which DLL, so even if printf were in a DLL, the compiler output could still look the same.
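The two points above about branches (PC-relative encoding and limited range) can be checked with a little arithmetic. The sketch below is illustrative Python, not part of the project; the helper names and the example addresses are made up. It relies only on the standard MIPS encoding, where beq/bne store a signed 16-bit word offset relative to the address of the instruction after the branch.

```python
def branch_offset(branch_addr, target_addr):
    """Word offset stored in the 16-bit immediate of beq/bne."""
    return (target_addr - (branch_addr + 4)) >> 2


def fits_branch(branch_addr, target_addr):
    """True if the target is reachable with a signed 16-bit word offset."""
    off = branch_offset(branch_addr, target_addr)
    return -(1 << 15) <= off <= (1 << 15) - 1


# Hypothetical layout: the branch sits at offset 0x10 in its object file and
# the label at offset 0x20; the linker then places the file at 0x00400000.
base = 0x00400000
local = branch_offset(0x10, 0x20)                # as encoded by the assembler
moved = branch_offset(base + 0x10, base + 0x20)  # after the file is placed

print(local, moved)  # both offsets are the same, so nothing needs patching

# Range limit: the furthest forward target is 32767 words past PC + 4.
print(fits_branch(0, 4 + (32767 << 2)))
print(fits_branch(0, 4 + (32768 << 2)))
```

Because the branch and its label move together, the encoded offset is unchanged by linking. That would stop being true for a label in a different object file, and the target could also fall outside the range that `fits_branch` checks.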