I can't quite figure out why the register allocator considers the frame pointer a good candidate for node coalescing - it should interfere with every other temporary and therefore any move involving the frame pointer should be constrained. For example, here is the generated assembly for the function isdigit
from merge.tig
(including the coalesced moves):
L2_isdigit:
push rbp
mov rbp, rsp
push rbx
push r12
push r13
mov r10, rdi
L69:
mov r10, rbp
add r10, 16
mov r10, qword [r10]
mov r10, qword [r10+16]
add r10, -8
mov r10, qword [r10]
mov rdi, r10
call ord
mov r10, rax
mov r10, r10
mov r12, r10
mov r10, L4
mov rdi, r10
call ord
mov r10, rax
mov r10, r10
cmp r12, r10
jge L8
L9:
mov r10, 0
L10:
mov rax, r10
jmp L68
L8:
mov r13, 1
mov rbp, rbp
add rbp, 16
mov r10, qword [rbp]
mov r10, qword [r10+16]
add r10, -8
mov r10, qword [r10]
mov rdi, r10
call ord
mov r10, rax
mov r10, r10
mov r12, r10
mov r10, L5
mov rdi, r10
call ord
mov r10, rax
mov r10, r10
cmp r12, r10
jle L6
L7:
mov r13, 0
L6:
mov r10, r13
jmp L10
L68:
pop r13
pop r12
pop rbx
leave
ret 8
Specifically, this part:
L8:
mov r13, 1
mov rbp, rbp
add rbp, 16
mov r10, qword [rbp]
This is going to mess up the stack when the leave
instruction is executed. Without coalescing, it looks something like this:
L8:
mov r13, 1
mov r10, rbp
add r10, 16
mov r10, qword [r10]
Of course, you could hardcode it so that no move instructions related to the frame pointer are ever considered for coalescing but that's an inelegant solution. How can I make the register allocator realize that it's not safe to coalesce in this case? Normally, it does not attempt to use the frame pointer since I use the sink instruction trick taught in the book, which I assumed would work for this too.
I implemented the algorithm exactly as written in the book. It's possible I made a mistake but I don't think so. Here is the code:
Thanks very much in advance.
I've discovered the problem, it was just a silly mistake. Have accidentally been using a directed graph for the interference graph rather than an undirected one, so the mem_edge
function from the Ocamlgraph library wasn't reporting correct interferences.