compiler-constructioncilssastack-machine

Converting SSA to stack machine


It is well known how to convert code from SSA representation to a register machine. (Basically, graph coloring register allocation is the core of such conversion.)

But what's the general method for converting from SSA to a stack machine? (CIL byte code, in the case I'm looking at.) I would expect it to be simpler, given the lack of need for register allocation?


Solution

  • It's been over 15 years since I was involved in compiler construction, so I might not remember all the details.

    Basically when going out of SSA, you need to generate load/store instructions into virtual registers at the end of all blocks leading to phi-nodes in successor blocks. This will lead to generating a number of virtual registers, which is usually higher than the available registers on the actual machine. So you apply register allocation on the local variables to come up with the real registers, spilling on the stack those values that don't fit.

    For a stack-based machine, just don't do the last step. You end up with roughly the same number of virtual registers as phi-nodes in the compiled function (the algorithm is actually not trivial, a good starting point is the paper Efficiently Computing Single Static Assignment Form and the Control Dependence Graph by Ron Cytron, Jeane Ferrante, et. al.) These virtual registers will be your local variables.

    When reading a value from a virtual register (local variable) to be used by an operation, first use an instruction to push it on the stack. The Java VM iload index instruction is such an example: it loads the local variable at index and pushed its value on the stack. (see https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.iload) When writing a value into a local variable, you pop it from the stack. See Java VM istore index instruction (see https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.istore).

    For example, if after going out of SSA you need to encode

    local 5 = MUL local[2], local[4]

    then you need to generate something like this:

    ILOAD 4 ILOAD 2 MUL ISTORE 5

    For CIL bytecode, you have the equivalent ldarg and starg operations.

    Of course, there is a lot of room for optimizations, to avoid redundant load/stores.