Different instructions when compiling C into ARM64 on Linux vs Mac

If I compile a C file into ARM64 assembly, I get different instructions (not just different syntax and directives, e.g. .cfi_def_cfa vs .cfi_def_cfa_offset) depending on whether I compile on Linux or Mac. Why is this, if the ISA is the same? I know the target binary formats differ (ELF vs Mach-O), but I was expecting identical instructions that would then be assembled into different object files. Is this because Apple uses Apple Clang, which may process things differently from the internals of the aarch64 GCC toolchain?

Is there any way to enforce using the same instructions?

Input file (fib.c):

int fib(int n)
{
    if (n <= 1)
        return 1;

    return fib(n - 1) + fib(n - 2);
}

Mac (compiled on a mac mini with gcc -S fib.c -o fib.s):

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 11, 0 sdk_version 12, 1
    .globl  _fib                            ; -- Begin function fib
    .p2align    2
_fib:                                   ; @fib
    .cfi_startproc
; %bb.0:
    sub sp, sp, #32                     ; =32
    stp x29, x30, [sp, #16]             ; 16-byte Folded Spill
    add x29, sp, #16                    ; =16
    .cfi_def_cfa w29, 16
    .cfi_offset w30, -8
    .cfi_offset w29, -16
    str w0, [sp, #8]
    ldr w8, [sp, #8]
    subs    w8, w8, #1                      ; =1
    b.gt    LBB0_2
; %bb.1:
    mov w8, #1
    stur    w8, [x29, #-4]
    b   LBB0_3
LBB0_2:
    ldr w8, [sp, #8]
    subs    w0, w8, #1                      ; =1
    bl  _fib
    str w0, [sp, #4]                    ; 4-byte Folded Spill
    ldr w8, [sp, #8]
    subs    w0, w8, #2                      ; =2
    bl  _fib
    mov x8, x0
    ldr w0, [sp, #4]                    ; 4-byte Folded Reload
    add w8, w0, w8
    stur    w8, [x29, #-4]
LBB0_3:
    ldur    w0, [x29, #-4]
    ldp x29, x30, [sp, #16]             ; 16-byte Folded Reload
    add sp, sp, #32                     ; =32
    ret
    .cfi_endproc
                                        ; -- End function
.subsections_via_symbols

Linux (compiled on ubuntu with aarch64-linux-gnu-gcc -S fib.c -o fib.s):

    .arch armv8-a
    .file   "fib.c"
    .text
    .align  2
    .global fib
    .type   fib, %function
fib:
.LFB0:
    .cfi_startproc
    stp x29, x30, [sp, -48]!
    .cfi_def_cfa_offset 48
    .cfi_offset 29, -48
    .cfi_offset 30, -40
    mov x29, sp
    str x19, [sp, 16]
    .cfi_offset 19, -32
    str w0, [sp, 44]
    ldr w0, [sp, 44]
    cmp w0, 1
    bgt .L2
    mov w0, 1
    b   .L3
.L2:
    ldr w0, [sp, 44]
    sub w0, w0, #1
    bl  fib
    mov w19, w0
    ldr w0, [sp, 44]
    sub w0, w0, #2
    bl  fib
    add w0, w19, w0
.L3:
    ldr x19, [sp, 16]
    ldp x29, x30, [sp], 48
    .cfi_restore 30
    .cfi_restore 29
    .cfi_restore 19
    .cfi_def_cfa_offset 0
    ret
    .cfi_endproc
.LFE0:
    .size   fib, .-fib
    .ident  "GCC: (Ubuntu 11.2.0-17ubuntu1) 11.2.0"
    .section    .note.GNU-stack,"",@progbits

Solution

gcc on a Mac is actually clang, unless you install an actual GCC package.

(They do this because some Makefiles use CC=gcc instead of CC=cc, and it does accept almost all the same options. Use gcc -v to find out.)

Or we can tell from looking at the asm output:

The LBB0_3 label naming is indicative of LLVM, numbering by function and then basic-block within function.
True GCC just auto-numbers labels throughout the compilation unit.

It's not surprising that two totally different compilers make different un-optimized asm. They have different internals and transformations they go through to make asm from C source, e.g. clang going through LLVM-IR, GCC going through GIMPLE and RTL.

With optimization enabled, for simple enough code they tend to agree if there's only one good choice (other than register numbers), but once things get complex enough to be non-trivial (like optimizing a doubly-recursive function into iterative) there's room for each one to make different choices.

The same is true across different versions of the same compiler, or the same compiler with different options.