assemblyriscv

What are "Myriad sequences"? (What does li get expanded to?)


The RISC-V unprivileged manual lists a pseudoinstruction called li:

li rd, immediate     | Myriad sequences               | Load immediate

But it only gives the base instruction as "Myriad sequences", and googling that turns up no promising explanation of what it means.

Does anyone know what "Myriad sequences" are and what li gets expanded to?

Also, after playing with godbolt for a while, I see that the following function:

int func(void) {
    return 0x12345LL;
}

produces the following asm output (with the -O2 flag):

func:
        li      a0,73728
        addi    a0,a0,837
        ret

But if I tick the "compile to binary" option, godbolt gives me:

main:
        lui     a0,0x12
        addi    a0,a0,837 # 12345 <__BSS_END__+0x30d>
        ret

Does this mean that li gets expanded to addi? Or was it the linker doing some optimizations?


Solution

  • If you're an assembly programmer or a compiler writer generating assembly code, you can use the li pseudoinstruction or avoid it, as you like.

    For 32-bit integers, it will generate some combination of lui + addi, though either one may be omitted depending on the constant's value.
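As a rough illustration of "either one may be omitted", here is a minimal C sketch that classifies a 32-bit constant by which instructions a lui + addi expansion would need. It assumes a 32-bit register (RV32); the names li32_shape and li32_shape_of are made up for this example and do not come from any assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification, for illustration only. */
enum li32_shape { LI32_ADDI_ONLY, LI32_LUI_ONLY, LI32_LUI_ADDI };

static enum li32_shape li32_shape_of(int32_t value)
{
    /* Fits addi's signed 12-bit immediate: addi rd, x0, value */
    if (value >= -2048 && value <= 2047)
        return LI32_ADDI_ONLY;

    /* Low 12 bits are all zero: a lone lui is enough. */
    if (((uint32_t)value & 0xFFFu) == 0)
        return LI32_LUI_ONLY;

    /* Otherwise both instructions are needed. */
    return LI32_LUI_ADDI;
}

void li32_shape_examples(void)
{
    assert(li32_shape_of(837) == LI32_ADDI_ONLY);     /* addi only      */
    assert(li32_shape_of(0x12000) == LI32_LUI_ONLY);  /* lui 0x12 only  */
    assert(li32_shape_of(0x12345) == LI32_LUI_ADDI);  /* lui, then addi */
}
```

The last case is the constant from the question: 0x12345 needs both a lui and an addi.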

    For constants larger than 32 bits, the sequence can take three or more instructions. We have seen compilers generate working but suboptimal code sequences for certain large constants.

    There is a trade-off here between the number of registers used to construct a large immediate and the number of instructions needed to do so: sometimes using an extra register to hold a reusable intermediate value can shorten the code sequence. Other optimizations are possible as well, such as sharing one intermediate value between two separate larger immediates (as in common subexpression elimination or loop-invariant code motion).
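One simple case of such reuse, sketched in C under the assumption of 32-bit constants built with lui + addi: if two constants need the same lui operand, a compiler could issue the lui once into a spare register and follow it with two addi instructions, for three instructions instead of four. The helpers lui_part and can_share_lui are hypothetical names used only here. The 0x800 added before shifting accounts for addi sign-extending its 12-bit immediate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Upper 20 bits as lui would need them, rounded so that a
 * sign-extended 12-bit addi immediate completes the value. */
static uint32_t lui_part(uint32_t value)
{
    return ((value + 0x800u) >> 12) & 0xFFFFFu;
}

/* True when one lui result could serve as the base for both constants. */
static bool can_share_lui(uint32_t a, uint32_t b)
{
    return lui_part(a) == lui_part(b);
}

void share_lui_examples(void)
{
    /* Both constants need lui 0x12, so the lui can be shared. */
    assert(can_share_lui(0x12345u, 0x12678u));

    /* 0x12800 has bit 11 set and needs lui 0x13, so it cannot share. */
    assert(!can_share_lui(0x12345u, 0x12800u));
}
```

An assembler expanding a single li cannot make this choice, since it sees only one constant and one destination register at a time.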

    However, while compilers and assembly programmers can choose to use more registers in order to use fewer instructions (since they know what other registers are in use), this trade-off is not available to assemblers themselves, as they don't track register usage within or between functions. Assemblers are thus limited to one scratch register, namely the target of the li, as a temporary for constructing the constant.

    Many clever sequences are possible to construct an immediate. An addi followed by a shift left may be appropriate for constants that have 12 or fewer significant bits followed by zero bits in the least significant positions (even though lui + addi, at the same instruction count, would work as well). These sequence variations become more relevant for larger (>32-bit) constant values.
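To make the addi + shift idea concrete, here is a small C sketch that tests whether a 64-bit constant is a signed 12-bit value shifted left by some amount. The function addi_slli_form is a hypothetical helper for this answer, not something taken from an assembler, and it only checks this one form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reports whether `value` can be built as
 *   addi rd, x0, imm
 *   slli rd, rd, shift
 * and, if so, stores imm and shift through the pointers. */
static bool addi_slli_form(int64_t value, int32_t *imm, unsigned *shift)
{
    /* A lone addi already covers the signed 12-bit range (including zero). */
    if (value >= -2048 && value <= 2047)
        return false;

    unsigned s = 0;
    /* Strip trailing zero bits; dividing an even value by 2 is exact. */
    while ((value & 1) == 0) {
        value /= 2;
        s++;
    }
    if (value < -2048 || value > 2047)
        return false;   /* more than 12 significant bits remain */

    *imm = (int32_t)value;
    *shift = s;
    return true;
}

void addi_slli_examples(void)
{
    int32_t imm;
    unsigned shift;

    /* 0x7FF00000000 == 2047 << 32: addi rd, x0, 2047 ; slli rd, rd, 32 */
    assert(addi_slli_form(0x7FF00000000LL, &imm, &shift));
    assert(imm == 2047 && shift == 32);

    /* 0x12345 is odd and too wide for addi, so this form does not apply. */
    assert(!addi_slli_form(0x12345, &imm, &shift));
}
```

For the 64-bit example, two instructions suffice where a sequence that starts from lui would need more.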

    There's no consumer need for the assembler to use a well-defined sequence to generate the immediate, so the RISC-V specification does not prescribe one and defers to the assembler implementation. That is what "Myriad sequences" refers to: any of many possible expansions.

    In fact, the RISC-V specification goes beyond most ISA specs in rolling in (and standardizing) many useful pseudoinstructions such as ret and call, which are not literally part of the ISA but help both with assembly programming and with hardware optimizations such as a shadow call stack, by helping to cement certain conventions.


    The lui + addi sequence has one oddity: addi takes a 12-bit signed immediate, so to construct a 32-bit constant that has its 12th bit set (bit 11, counting from zero), the value given to lui must be biased by +1. This is because addi's 12-bit immediate is then sign-extended and becomes negative, and the +1 in the lui compensates for it.
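The biasing can be written out as a short C sketch. It assumes a 32-bit register, so arithmetic wraps modulo 2^32; the names li32_parts, li32_split and li32_join are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

struct li32_parts {
    uint32_t hi20;  /* operand for lui (20 bits) */
    int32_t  lo12;  /* operand for addi, in [-2048, 2047] */
};

static struct li32_parts li32_split(uint32_t value)
{
    struct li32_parts p;

    /* Adding 0x800 carries +1 into the upper part exactly when bit 11 is set. */
    p.hi20 = ((value + 0x800u) >> 12) & 0xFFFFFu;

    /* Low 12 bits, interpreted as a signed 12-bit quantity. */
    p.lo12 = (int32_t)(value & 0xFFFu);
    if (p.lo12 >= 0x800)
        p.lo12 -= 0x1000;

    return p;
}

/* What lui followed by addi computes, modulo 2^32. */
static uint32_t li32_join(struct li32_parts p)
{
    return (p.hi20 << 12) + (uint32_t)p.lo12;
}

void li32_split_examples(void)
{
    /* The question's constant: lui 0x12 (73728) ; addi 837, no bias needed. */
    struct li32_parts a = li32_split(0x12345u);
    assert(a.hi20 == 0x12u && a.lo12 == 837);

    /* Bit 11 set: lui gets 0x13 rather than 0x12, and addi gets -2048. */
    struct li32_parts b = li32_split(0x12800u);
    assert(b.hi20 == 0x13u && b.lo12 == -2048);
    assert(li32_join(b) == 0x12800u);
}
```

The first case matches the godbolt output in the question: 0x12 << 12 is 73728, and 73728 + 837 is 0x12345.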

    One might ask why the RISC-V designers chose an addi that sign-extends.

    For background, note that the MIPS designers chose two forms of extension: sign extension for addi and zero extension for ori. So for certain sequences MIPS could avoid the +1/-1 issue. However, they also wanted to be able to put the lower part of the immediate into the offset available in lw and sw, and these have sign-extending immediates (which is desirable for other reasons). So all the machinery for handling immediates in lui + lw is needed anyway, including the +1 bias of the lui in certain cases.

    RISC-V embraces this, even eliminating the zero-extending immediates altogether to simplify decoding, so it must contend with the +1/-1 biasing for all lui + ... combinations anyway.