ccompiler-constructioncompiler-optimizationtail-recursiondecompiler

What is the exact compiler optimization applied here expect for tail recursion elimination?


I'm compiling a simple C program, implementing an inorder tree traversal function:

void inorderTraversal(struct TreeNode* root) {
    if (root == NULL) {
        return;
    }
    inorderTraversal(root->left);  
    printf("%d ", root->val);  
    inorderTraversal(root->right);  
}

Then I compile the .c file using: gcc inorder.c -O2 -o inorder_O2

When I try to decompile the inorder_O2 file with BinaryNinja, I get the High Level IL version of this function:


void inorderTraversal(int32_t* arg1){
    int32_t* i_1 = arg1
    if (arg1 != 0)
        int32_t* i
        do
            int32_t* j_2 = *(i_1 + 8)
            int32_t* j_1 = j_2
            if (j_2 != 0)
                int32_t* j
                do
                    int32_t* k_2 = *(j_1 + 8)
                    int32_t* k_1 = k_2
                    if (k_2 != 0)
                        int32_t* k
                        do
                            int32_t* r15_1 = *(k_1 + 8)
                            if (r15_1 != 0)
                                do
                                    int32_t* rbx_1 = *(r15_1 + 8)
                                    if (rbx_1 != 0)
                                        do
                                            int32_t* r13_1 = *(rbx_1 + 8)
                                            if (r13_1 != 0)
                                                do
                                                    int32_t* r12_1 = *(r13_1 + 8)
                                                    if (r12_1 != 0)
                                                        do
                                                            int32_t* r14_1 = *(r12_1 + 8)
                                                            if (r14_1 != 0)
                                                                do
                                                                    int32_t* r9_1 = *(r14_1 + 8)
                                                                    if (r9_1 != 0)
                                                                        do
                                                                            inorderTraversal(*(r9_1 + 8))
                                                                            __printf_chk(flag: 1, format: &data_2004, zx.q(*r9_1))
                                                                            r9_1 = *(r9_1 + 0x10)
                                                                        while (r9_1 != 0)
                                                                    __printf_chk(flag: 1, format: &data_2004, zx.q(*r14_1))
                                                                    r14_1 = *(r14_1 + 0x10)
                                                                while (r14_1 != 0)
                                                            __printf_chk(flag: 1, format: &data_2004, zx.q(*r12_1))
                                                            r12_1 = *(r12_1 + 0x10)
                                                        while (r12_1 != 0)
                                                    __printf_chk(flag: 1, format: &data_2004, zx.q(*r13_1))
                                                    r13_1 = *(r13_1 + 0x10)
                                                while (r13_1 != 0)
                                            __printf_chk(flag: 1, format: &data_2004, zx.q(*rbx_1))
                                            rbx_1 = *(rbx_1 + 0x10)
                                        while (rbx_1 != 0)
                                    __printf_chk(flag: 1, format: &data_2004, zx.q(*r15_1))
                                    r15_1 = *(r15_1 + 0x10)
                                while (r15_1 != 0)
                            __printf_chk(flag: 1, format: &data_2004, zx.q(*k_1))
                            k = *(k_1 + 0x10)
                            k_1 = k
                        while (k != 0)
                    __printf_chk(flag: 1, format: &data_2004, zx.q(*j_1))
                    j = *(j_1 + 0x10)
                    j_1 = j
                while (j != 0)
            __printf_chk(flag: 1, format: &data_2004, zx.q(*i_1))
            i = *(i_1 + 0x10)
            i_1 = i
        while (i != 0)
}

I guess a tail recursioin elimination is conducted since there is only one recursion call left. What I do not know is what make the function a big bunch of nested loops? Is there an exact compiler optimization name for it?

That is, I'm mainly want to know the name or a description of the technique, rather than a compiler option to turn it on or off.


Solution

  • The recursion on root->right is a tail call, so it can be eliminated. All the do-while loops are these eliminated tail calls.

    The recursion on root->left is not in tail position, so it can't be eliminated like this. Instead, the compiler open-codes the first 8 levels of recursion using nested if statements.

    If the tree is deeper than 8 levels, it makes an actual call back to the function. You can see that in the innermost do block.

    Now that I think about this a little more, I might have it backward which optimizations are the tail calls and which are the non-tail calls.