c  compiler-optimization  calling-convention  abi  compiler-theory

How do C compilers implement functions that return large structures?


A small return value is usually passed back in a register, but a large structure has to live in memory, typically on the stack. How much copying does a real compiler do for code like this? Or is the copy optimized away?

For example:

struct Data {
    unsigned values[256];
};

Data createData() 
{
    Data data;
    // initialize data values...
    return data;
}

(Assuming the function cannot be inlined.)


Solution

  • None; no copies are done.

    The address of the caller's Data return value is actually passed as a hidden argument to the function, and the createData function simply writes into the caller's stack frame.

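    Passing the hidden pointer is part of the calling convention. Building the local data object directly in that caller-provided storage, so that no copy is needed on return, is known as the named return value optimisation (NRVO). Also see the C++ FAQ on this topic: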

    "commercial-grade C++ compilers implement return-by-value in a way that lets them eliminate the overhead, at least in simple cases

    ...

    When yourCode() calls rbv(), the compiler secretly passes a pointer to the location where rbv() is supposed to construct the "returned" object."

    You can demonstrate that this has been done by adding a destructor with a printf to your struct. The destructor should only be called once if this return-by-value optimisation is in operation, otherwise twice.
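
    A minimal sketch of that experiment follows. The counter g_destructorCalls and the helper countDestructorCalls() are hypothetical names added for illustration, and data is zero-initialised here so that a copy, if one happens, reads well-defined values. The optimisation is permitted but not required for a named local, so the count may be 1 or 2 depending on compiler and flags:

    ```cpp
    #include <cassert>
    #include <cstdio>

    // Hypothetical counter (not in the original example): counts destructor runs.
    static int g_destructorCalls = 0;

    struct Data {
        unsigned values[256];
        ~Data() {
            ++g_destructorCalls;
            std::printf("~Data() called\n");
        }
    };

    Data createData()
    {
        Data data = {};        // zero-initialise so any copy reads defined values
        data.values[5] = 6;
        return data;           // candidate for the named return value optimisation
    }

    // Hypothetical helper: returns how many Data objects were destroyed
    // as a result of a single call to createData().
    int countDestructorCalls()
    {
        g_destructorCalls = 0;
        {
            Data result = createData();
            assert(result.values[5] == 6);
        }                      // result is destroyed here
        return g_destructorCalls;
    }
    ```

    Calling countDestructorCalls() from main and printing the result gives 1 when the local was built directly in the caller's storage, and 2 when a separate local was copied out and then destroyed.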

    Also you can check the assembly to see that this happens:

    Data createData() 
    {
        Data data;
        // initialize data values...
        data.values[5] = 6;
        return data;
    }
    

    Here's the assembly (32-bit x86, AT&T syntax):

    __Z10createDatav:
    LFB2:
            pushl   %ebp
    LCFI0:
            movl    %esp, %ebp
    LCFI1:
            subl    $1032, %esp
    LCFI2:
            movl    8(%ebp), %eax
            movl    $6, 20(%eax)
            leave
            ret     $4
    LFE2:
    

    Curiously, it still allocates enough stack space for a Data object (subl $1032, %esp), but note that it takes the first argument on the stack, 8(%ebp), as the base address of the object and then stores the value 6 into element 5 of it (offset 20 = 5 × 4 bytes). Since we didn't declare any arguments for createData, this looks odd until you realise that it is the hidden pointer to the caller's Data object. The ret $4 at the end pops that hidden pointer off the stack on return.
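
    To make the hidden pointer concrete, here is a rough hand-written equivalent of what the assembly above does. It is only a sketch: createData_lowered and loweredMatchesOriginal are hypothetical names, the real compiler output has no such source form, and the zero-initialisation is added here so that both versions produce fully defined objects:

    ```cpp
    #include <cassert>

    struct Data {
        unsigned values[256];
    };

    // What the programmer writes.
    Data createData()
    {
        Data data = {};
        data.values[5] = 6;
        return data;
    }

    // Roughly what the compiler emits (hypothetical name): the caller passes
    // the address of its own Data object and the function writes through it.
    void createData_lowered(Data* result)
    {
        for (unsigned& v : result->values) {
            v = 0;
        }
        result->values[5] = 6;   // corresponds to: movl $6, 20(%eax)
    }

    // Hypothetical helper: checks that both forms leave 6 in element 5.
    bool loweredMatchesOriginal()
    {
        Data a = createData();   // compiler passes &a behind the scenes
        Data b;
        createData_lowered(&b);  // same thing with the pointer made explicit
        return a.values[5] == 6 && b.values[5] == 6;
    }
    ```

    The explicit-pointer version is the pattern C programmers often write by hand for large structures; the calling convention simply does the same thing automatically for return-by-value.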