C/C++ returning struct by value under the hood

(This question is specific to my machine's architecture and calling conventions, Windows x86_64)

I don't exactly remember where I had read this, or if I had recalled it correctly, but I had heard that, when a function should return some struct or object by value, it will either stuff it in rax (if the object can fit in the register width of 64 bits) or be passed a pointer to where the resulting object would be (I'm guessing allocated in the calling function's stack frame) in rcx, where it would do all the usual initialization, and then a mov rax, rcx for the return trip. That is, something like

extern some_struct create_it(); // implemented in assembly

would really have a secret parameter like

extern some_struct create_it(some_struct* secret_param_pointing_to_where_i_will_be);

Did my memory serve me right, or am I incorrect? How are large objects (i.e. wider than the register width) returned by value from functions?

Solution

Here's a simple disassembling of a code exampling what you're saying

typedef struct 
{
    int b;
    int c;
    int d;
    int e;
    int f;
    int g;
    char x;
} A;

A foo(int b, int c)
{
    A myA = {b, c, 5, 6, 7, 8, 10};
    return myA; 
}

int main()
{   
    A myA = foo(5,9);   
    return 0;
}

and here's the disassembly of the foo function, and the main function calling it

main:

push    ebp
mov     ebp, esp
and     esp, 0FFFFFFF0h
sub     esp, 30h
call    ___main
lea     eax, [esp+20]        ; placing the addr of myA in eax
mov     dword ptr [esp+8], 9 ; param passing 
mov     dword ptr [esp+4], 5 ; param passing
mov     [esp], eax           ; passing myA addr as a param
call    _foo
mov     eax, 0
leave
retn

foo:

push    ebp
mov     ebp, esp
sub     esp, 20h
mov     eax, [ebp+12]  
mov     [ebp-28], eax
mov     eax, [ebp+16]
mov     [ebp-24], eax
mov     dword ptr [ebp-20], 5
mov     dword ptr [ebp-16], 6
mov     dword ptr [ebp-12], 7
mov     dword ptr [ebp-8], 9
mov     byte ptr [ebp-4], 0Ah
mov     eax, [ebp+8]
mov     edx, [ebp-28]
mov     [eax], edx     
mov     edx, [ebp-24]
mov     [eax+4], edx
mov     edx, [ebp-20]
mov     [eax+8], edx
mov     edx, [ebp-16]
mov     [eax+0Ch], edx
mov     edx, [ebp-12]
mov     [eax+10h], edx
mov     edx, [ebp-8]
mov     [eax+14h], edx
mov     edx, [ebp-4]
mov     [eax+18h], edx
mov     eax, [ebp+8]
leave
retn

now let's go through what just happened, so when calling foo the paramaters were passed in the following way, 9 was at highest address, then 5 then the address the myA in main begins

lea     eax, [esp+20]        ; placing the addr of myA in eax
mov     dword ptr [esp+8], 9 ; param passing 
mov     dword ptr [esp+4], 5 ; param passing
mov     [esp], eax           ; passing myA addr as a param

within foo there is some local myA which is stored on the stack frame, since the stack is going downwards, the lowest address of myA begins in [ebp - 28], the -28 offset could be caused by struct alignments so I'm guessing the size of the struct should be 28 bytes here and not 25 as expected. and as we can see in foo after the local myA of foo was created and filled with parameters and immediate values, it is copied and re-written to the address of myA passed from main ( this is the actual meaning of return by value )

mov     eax, [ebp+8]
mov     edx, [ebp-28]

[ebp + 8] is where the address of main::myA was stored ( memory address go upwards hence ebp + old ebp ( 4 bytes ) + return address ( 4 bytes )) at overall ebp + 8 to get to the first byte of main::myA, as said earlier foo::myA is stored within [ebp-28] as stack goes downwards

mov     [eax], edx

place foo::myA.b in the address of the first data member of main::myA which is main::myA.b

mov     edx, [ebp-24]
mov     [eax+4], edx

place the value that resides in the address of foo::myA.c in edx, and place that value within the address of main::myA.b + 4 bytes which is main::myA.c

as you can see this process repeats itself through out the function

mov     edx, [ebp-20]
mov     [eax+8], edx
mov     edx, [ebp-16]
mov     [eax+0Ch], edx
mov     edx, [ebp-12]
mov     [eax+10h], edx
mov     edx, [ebp-8]
mov     [eax+14h], edx
mov     edx, [ebp-4]
mov     [eax+18h], edx
mov     eax, [ebp+8]

which basically proves that when returning a struct by val, that could not be placed in as a param, what happens is that the address of where the return value should reside in is passed as a param to the function and within the function being called the values of the returned struct are copied into the address passed as a parameter...

hope this exampled helped you visualize what happens under the hood a little bit better :)

EDIT

I hope that you've noticed that my example was using 32 bit assembler and I KNOW you've asked regarding x86-64, but I'm currently unable to disassemble code on a 64 bit machine so I hope you take my word on it that the concept is exactly the same both for 64 bit and 32 bit, and that the calling convention is nearly the same