
C++ on x86-64: when are structs/classes passed and returned in registers?


Assuming the x86-64 ABI on Linux, under what conditions in C++ are structs passed to functions in registers vs. on the stack? Under what conditions are they returned in registers? And does the answer change for classes?

If it helps simplify the answer, you can assume a single argument/return value and no floating point values.


Solution

  • The ABI specification is defined here.
    A newer version is available here.

    I assume the reader is accustomed to the terminology of the document and that they can classify the primitive types.


    If the object is larger than two eight-bytes (16 bytes), it is passed in memory, that is, on the stack:

    struct foo
    {
        unsigned long long a;
        unsigned long long b;
        unsigned long long c;               // Commenting this gives mov rax, rdi
    };
    
    unsigned long long foo(struct foo f)
    { 
      return f.a;                           // mov     rax, QWORD PTR [rsp+8]
    } 
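
    The size rule can be checked at compile time by counting eight-bytes. This is only an illustrative sketch: the helper eightbytes and the struct names two_fields and three_fields are made up for this example and are not part of the ABI.

```cpp
#include <cstddef>

// Hypothetical helper: number of eight-bytes occupied by T, rounded up.
template <typename T>
constexpr std::size_t eightbytes()
{
    return (sizeof(T) + 7) / 8;
}

struct two_fields   { unsigned long long a, b; };     // 16 bytes
struct three_fields { unsigned long long a, b, c; };  // 24 bytes

// Size is only the first test: two_fields is still a candidate for registers,
// while three_fields is already too large and goes to memory.
static_assert(eightbytes<two_fields>() == 2, "fits in two eight-bytes");
static_assert(eightbytes<three_fields>() == 3, "needs three eight-bytes");
```

    Passing the size test does not by itself guarantee registers; the remaining rules still apply.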
    

    If it is non-POD (more precisely, if it has a non-trivial copy constructor or a non-trivial destructor), it is passed in memory:

    struct foo
    {
        unsigned long long a;
        foo(const struct foo& rhs){}            // Commenting this gives mov rax, rdi
    };
    
    unsigned long long foo(struct foo f)
    {
      return f.a;                               // mov     rax, QWORD PTR [rdi]
    }
    

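    The object is passed by invisible reference here: the caller materializes the argument in its own frame and passes its address in rdi, which is why the callee reads it through [rdi].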
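
    Whether a type falls under this rule can be probed, approximately, with the standard type traits. The sketch below only looks at the copy constructor; the struct names trivial_copy and user_copy are hypothetical, and the real ABI criterion also takes the destructor into account.

```cpp
#include <type_traits>

struct trivial_copy
{
    unsigned long long a;
};

struct user_copy
{
    unsigned long long a;
    user_copy() = default;
    user_copy(const user_copy&) {}   // user-provided, hence non-trivial
};

// trivial_copy keeps its implicit copy constructor, so it may travel in rdi.
static_assert(std::is_trivially_copy_constructible<trivial_copy>::value,
              "implicit copy constructor is trivial");

// user_copy has a user-provided copy constructor, so it is passed in memory.
static_assert(!std::is_trivially_copy_constructible<user_copy>::value,
              "user-provided copy constructor is not trivial");
```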

    If it contains unaligned fields, it is passed in memory:

    struct __attribute__((packed)) foo         // Removing packed gives mov rax, rsi
    {
        char b;
        unsigned long long a;
    };
    
    unsigned long long foo(struct foo f)
    {
      return f.a;                             // mov     rax, QWORD PTR [rsp+9]
    }
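
    The misalignment can be made visible with offsetof. This is a sketch assuming GCC or Clang (for the packed attribute) on x86-64; the struct names packed_foo and plain_foo are made up for the example.

```cpp
#include <cstddef>

struct __attribute__((packed)) packed_foo
{
    char b;
    unsigned long long a;   // lands at offset 1, so it is unaligned
};

struct plain_foo
{
    char b;
    unsigned long long a;   // padded to offset 8, so it is naturally aligned
};

static_assert(offsetof(packed_foo, a) == 1, "a straddles the first two eight-bytes");
static_assert(sizeof(packed_foo) == 9, "no padding at all");
static_assert(offsetof(plain_foo, a) == 8, "a fills the second eight-byte");
static_assert(sizeof(plain_foo) == 16, "two full eight-bytes");
```

    The offset of 1 matches the [rsp+9] load above: the object starts at [rsp+8], just past the return address.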
    

    If none of the above is true, the fields of the object are considered.
    If one of the fields is itself a struct/class, the procedure is applied recursively.
    The goal is to classify each of the two eight-bytes (8B) in the object.

    For each 8B, the classes of the fields it contains are considered.
    Note that each 8B is occupied by a whole number of fields, thanks to the alignment requirement above.

    Let C be the class of the 8B and D be the class of the field under consideration.
    Let new_class be pseudo-defined as

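    cls new_class(cls D, cls C)
    {
       if (D == C)
          return C;                  // equal classes merge to themselves
    
       if (D == NO_CLASS)
          return C;
    
       if (C == NO_CLASS)
          return D;                  // the first field seen decides the class
    
       if (D == MEMORY || C == MEMORY)
          return MEMORY;
    
       if (D == INTEGER || C == INTEGER)
          return INTEGER;
    
       if (D == X87 || C == X87 || D == X87UP || C == X87UP)
          return MEMORY;             // x87 mixed with anything else
    
       return SSE;
    }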
    

    then the class of the 8B is computed as follows

    C = NO_CLASS;
    
    for (field f : fields)
    {
        D = get_field_class(f);        // Note this may recursively call this proc
        C = new_class(D, C);
    }
    

    Once we have the class of each 8B, say C1 and C2, then

    if (C1 == MEMORY || C2 == MEMORY)
        C1 = C2 = MEMORY;
    
    if (C2 == X87UP && C1 != X87)
        C1 = C2 = MEMORY;
    
    if (C2 == SSEUP && C1 != SSE)
        C2 = SSE;
    

    Note: This is my interpretation of the algorithm given in the ABI document.
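
    The procedure above can be sketched as runnable C++. This is an illustration under stated assumptions, not the ABI's wording: the names Cls, Field, merge and classify are invented here, the caller is assumed to supply already flattened, naturally aligned primitive fields, and a 16-byte long double is described as two entries (X87 and X87Up).

```cpp
#include <array>
#include <cstddef>
#include <vector>

enum class Cls { NoClass, Integer, Sse, SseUp, X87, X87Up, Memory };

// Hypothetical descriptor: byte offset of a primitive field and its class.
struct Field { std::size_t offset; Cls cls; };

// Merge the class of one field into the class of its eight-byte.
constexpr Cls merge(Cls c, Cls d)
{
    return c == d ? c
         : d == Cls::NoClass ? c
         : c == Cls::NoClass ? d
         : (c == Cls::Memory || d == Cls::Memory) ? Cls::Memory
         : (c == Cls::Integer || d == Cls::Integer) ? Cls::Integer
         : (c == Cls::X87 || d == Cls::X87 ||
            c == Cls::X87Up || d == Cls::X87Up) ? Cls::Memory
         : Cls::Sse;
}

// Classify an object of the given size; offsets must be smaller than size.
inline std::array<Cls, 2> classify(std::size_t size, const std::vector<Field>& fields)
{
    if (size > 16)                                  // larger than two eight-bytes
        return {Cls::Memory, Cls::Memory};

    std::array<Cls, 2> c{Cls::NoClass, Cls::NoClass};
    for (const Field& f : fields) {
        Cls& slot = c[f.offset / 8];                // eight-byte holding the field
        slot = merge(slot, f.cls);
    }

    // Post-merge clean-up, as in the rules above.
    if (c[0] == Cls::Memory || c[1] == Cls::Memory)
        c = {Cls::Memory, Cls::Memory};
    if (c[1] == Cls::X87Up && c[0] != Cls::X87)
        c = {Cls::Memory, Cls::Memory};
    if (c[1] == Cls::SseUp && c[0] != Cls::Sse)
        c[1] = Cls::Sse;
    return c;
}
```

    For instance, classify(16, {{0, Cls::Sse}, {8, Cls::Integer}}) describes a struct holding a double followed by a long long and yields Sse for the first 8B and Integer for the second.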


    Example

    struct foo
    {
        unsigned long long a;
        long double b;
    };
    
    unsigned long long foo(struct foo f)
    {
      return f.a;
    }
    

    The 8Bs and their fields

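    First 8B: a Second 8B: padding Third and fourth 8Bs: b

    long double is 16 bytes wide and 16-byte aligned, so the object occupies four 8Bs (32 bytes). a is INTEGER; the two halves of b are X87 and X87UP. Since the object is larger than two 8Bs, the final class is MEMORY for all of them.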
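
    The layout of this struct can be confirmed at compile time. The sketch assumes an x86-64 System V target, where long double is the 80-bit x87 format stored in 16 bytes; on other targets the assertions may not hold. The name with_long_double is made up for the example.

```cpp
#include <cstddef>

struct with_long_double
{
    unsigned long long a;   // bytes 0..7
    long double b;          // bytes 16..31, after 8 bytes of padding
};

static_assert(alignof(long double) == 16, "long double is 16-byte aligned");
static_assert(offsetof(with_long_double, b) == 16, "b starts at the third eight-byte");
static_assert(sizeof(with_long_double) == 32, "four eight-bytes in total");
```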


    Example

    struct foo
    {
        double a;
        long long b;
    };
    
    long long foo(struct foo f)
    {
      return f.b;                     // mov rax, rdi
    }
    

    The 8Bs and their fields

    First 8B: a Second 8B: b

    a is SSE, so the first 8B is SSE.
    b is INTEGER so the second 8B is INTEGER.

    The final classes are the ones computed above: a is passed in xmm0 and b in rdi, hence the mov rax, rdi.


    Return values

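    The values are returned according to their classes:

       • MEMORY: the caller allocates space for the result and passes its address as a hidden first argument in rdi; the callee returns that address in rax.
       • INTEGER: the next available register of rax, rdx.
       • SSE: the next available register of xmm0, xmm1.
       • SSEUP: the next available upper part of the vector register that was last used.
       • X87 and X87UP: st0.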
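
    As a sketch of the INTEGER and MEMORY cases for return values, consider the two functions below. The names pair16, triple24, make_pair16 and make_triple24 are hypothetical, and the register comments describe what the ABI prescribes rather than the output of a specific compiler.

```cpp
struct pair16   { long lo; long hi; };         // two INTEGER eight-bytes
struct triple24 { long a; long b; long c; };   // larger than two eight-bytes: MEMORY

// Returned in registers: lo in rax, hi in rdx.
pair16 make_pair16(long x)
{
    return pair16{x, x + 1};
}

// Returned in memory: the caller passes the address of the result buffer as a
// hidden first argument in rdi (so x arrives in rsi), and the same address is
// handed back in rax.
triple24 make_triple24(long x)
{
    return triple24{x, x + 1, x + 2};
}
```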


    PODs

    The technical definition is here.

    The definition from the ABI is reported below.

    A de/constructor is trivial if it is an implicitly-declared default de/constructor and if:

       • its class has no virtual functions and no virtual base classes, and
       • all the direct base classes of its class have trivial de/constructors, and
       • for all the nonstatic data members of its class that are of class type (or array thereof), each such class has a trivial de/constructor.
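
    The three conditions above can be observed through the standard type traits. The helper trivial_for_calls and all struct names are hypothetical, and the helper is only an approximation of the ABI rule: it combines copy constructor and destructor triviality and ignores other special members.

```cpp
#include <type_traits>

// Hypothetical approximation: true when neither copying nor destroying T
// runs any non-trivial code.
template <typename T>
struct trivial_for_calls
    : std::integral_constant<bool,
          std::is_trivially_copy_constructible<T>::value &&
          std::is_trivially_destructible<T>::value>
{};

struct plain       { long a; };
struct has_virtual { virtual void f() {} long a; };   // virtual function
struct user_dtor   { ~user_dtor() {} };               // user-provided destructor
struct derived     : user_dtor { long a; };           // non-trivial base class
struct holder      { user_dtor m; long a; };          // non-trivial data member

static_assert(trivial_for_calls<plain>::value, "only implicit, trivial members");
static_assert(!trivial_for_calls<has_virtual>::value, "virtual function");
static_assert(!trivial_for_calls<derived>::value, "base class destructor");
static_assert(!trivial_for_calls<holder>::value, "data member destructor");
```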


    Note that each 8B is classified independently, so each one can be passed in the kind of register that suits its class.
    However, if there are not enough free registers for every 8B of an object, the whole object is passed on the stack.
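
    Register exhaustion can be illustrated with a function whose leading arguments use up five of the six integer parameter registers. The names two_words and last_arg_on_stack are made up, and the instruction in the comment is what a compiler would typically emit, not a guaranteed output.

```cpp
struct two_words { long lo; long hi; };   // two INTEGER eight-bytes

// a..e take rdi, rsi, rdx, rcx and r8, leaving only r9 free.
// s needs two INTEGER registers, so all of s is passed on the stack.
long last_arg_on_stack(long a, long b, long c, long d, long e, two_words s)
{
    // s starts at [rsp+8] on entry, just past the return address.
    return s.hi;    // typically: mov rax, QWORD PTR [rsp+16]
}
```

    With one fewer leading argument, both r8 and r9 would be free and s would travel in registers instead.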