I'm trying to write a program exactly like this. The only difference is that I'm using the stack memory instead of the .bss section to hold the value I'm getting from the calling function. After this change, I'm getting Bus Error when returning from the assembly function.
Any ideas why?
C-program:
#include <stdio.h>
extern double func(double d);
int main(){
    double d_1 = 1.22;
    double d_2 = func(d_1);
    printf("%lf\n", d_2);
    return 0;
}
Assembly:
section .text
global func
func:
    enter 0,0
    sub rsp, 8
    movq qword[rbp],xmm0    ; Store current value in memory
    fld qword[rbp]          ; Load current value from memory
    fld qword[rbp]          ; Load current value from memory again
    fadd                    ; Add top two stack items
    leave
    ret
movq [rbp],xmm0 overwrites the saved RBP value that enter pushed. This would be more obvious if you hadn't used enter, but [rbp+0] is not an address you can use in a function with a stack frame. ([rbp-8] is the highest address you can use for locals. [rsp] would have worked, because you decremented RSP after enter set RBP=RSP, but you used RBP.)
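To make the overwrite concrete, here is a toy model in C of what the stack slots hold while func runs. It is only a sketch: toy_cpu, rbp_after_func, the slot-index addressing and the made-up return address are all assumptions for illustration, and the x87 instructions are left out because they don't write to the stack.

```c
#include <stdint.h>
#include <string.h>

/* Toy model (an assumption for illustration, not real hardware state):
   the stack is an array of 8-byte slots and rsp/rbp are slot indices,
   so "rsp - 8" in the real code is "rsp - 1" here. */
enum { SLOTS = 16 };

typedef struct {
    uint64_t mem[SLOTS];
    int rsp;        /* index of the top-of-stack slot */
    uint64_t rbp;   /* register value; a slot index while the frame is live */
} toy_cpu;

/* Run the stack-related part of func and return the RBP value the
   caller sees afterwards. store_below_rbp selects [rbp-8] over [rbp]. */
uint64_t rbp_after_func(uint64_t caller_rbp, double arg, int store_below_rbp)
{
    toy_cpu c = { .rsp = SLOTS, .rbp = caller_rbp };
    uint64_t bits;
    memcpy(&bits, &arg, sizeof bits);   /* the bits movq would store from xmm0 */

    c.mem[--c.rsp] = 0x401136;          /* call: push a (made-up) return address */
    c.mem[--c.rsp] = c.rbp;             /* enter 0,0: push rbp ... */
    c.rbp = (uint64_t)c.rsp;            /* ... then mov rbp, rsp */
    c.rsp -= 1;                         /* sub rsp, 8 */

    /* movq [rbp], xmm0 (the question) or movq [rbp-8], xmm0 (safe) */
    c.mem[c.rbp - (store_below_rbp ? 1 : 0)] = bits;

    c.rsp = (int)c.rbp;                 /* leave: mov rsp, rbp ... */
    c.rbp = c.mem[c.rsp++];             /* ... then pop rbp */
    c.rsp++;                            /* ret: pop the return address */
    return c.rbp;
}
```

With store_below_rbp = 0 (the [rbp] store from the question), what comes back in RBP is the bit pattern of the argument. With store_below_rbp = 1 (a [rbp-8] store), the caller's RBP survives.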
When execution returns to main, gcc -O0 (anti-optimized for debugging) runs these instructions to store the function return value from xmm0 into stack space for d_2 instead of just passing it directly to printf while it's still in a register:
    movq rax,xmm0
    mov QWORD PTR [rbp-0x8],rax    # Using RBP after you clobbered it.
Un-optimized gcc output is really silly: copying FP data to an integer register instead of storing directly with movsd makes no sense. But that's not the issue.
RBP holds the IEEE double-precision bit pattern for 1.22 (0x3ff3851eb851eb85) because that's what your func clobbered it with.
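If you want to check that constant yourself, a small C helper can dump the bits of a double. This is a minimal sketch that assumes double is IEEE 754 binary64 (which it is on x86-64); double_bits is a name made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of d.
   Assumption: double is a 64-bit IEEE 754 binary64 value.
   memcpy is the well-defined way to reinterpret the bytes in C. */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

Printing double_bits(1.22) with a "%llx"-style format (after a cast to unsigned long long) should show the same value that ends up in RBP.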
The address rbp-8 is not canonical: the high 16 bits don't match bit 47, so it's not a sign-extended 48-bit virtual address. (See this ASCII-art diagram).
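Using a non-canonical address on current x86-64 hardware raises an exception instead of a page fault. According to Intel's manual entry for mov, that is #GP(0) for most memory operands, but #SS(0) when the access goes through the stack segment, which is the default for addressing modes with RBP or RSP as the base, like [rbp-8] here. Linux maps #SS to SIGBUS (and #GP to SIGSEGV).
This is why you get a bus error instead of the usual segmentation fault for trying to access unmapped memory with a bogus pointer.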
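The canonical-address rule is easy to express in C. This is a sketch assuming 48-bit virtual addresses (4-level paging); is_canonical48 is a hypothetical helper name, not a kernel or libc API.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is canonical for 48-bit virtual addressing:
   bits 63..47 must be all zeros or all ones, i.e. the address is
   the sign extension of its low 48 bits. */
bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47;            /* the upper 17 bits */
    return top == 0 || top == 0x1ffff;
}
```

Feeding it 0x3ff3851eb851eb85 - 8 returns false, while a typical user-space stack address such as 0x00007fffffffe000 returns true.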
Your code is over-complicated and wrong
In both mainstream x86-64 calling conventions (Linux/OS X use x86-64 System V), double is returned in xmm0. Use addsd xmm0,xmm0 / ret like a normal person, like the answer on the question you linked shows.
func:
    addsd xmm0,xmm0    ; first FP arg in (low 64 bits of) xmm0
    ret                ; return value in (low 64 bits of) xmm0
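For comparison, the same function in C is a one-liner, which can be handy as a reference to test the asm against. func_ref is a hypothetical name chosen here so it doesn't clash with the asm symbol func. What the compiler emits for it depends on the compiler and options, though x86-64 compilers normally use SSE2 instructions for double math.

```c
/* Hypothetical C reference for the asm func: returns d doubled.
   Doubling is exact in binary floating point (barring overflow),
   so the result can be compared with == against the asm version. */
double func_ref(double d)
{
    return d + d;
}
```

Linking both into the test program and comparing func(x) with func_ref(x) for a few inputs is a quick sanity check.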
Or if you insist on x87, then look how much code you have to write:
func:
    movsd [rsp-8], xmm0    ; double arg in xmm0
    fld qword [rsp-8]
    fadd st0, st0          ; use x87 regs instead of uselessly loading twice.
    fstp qword [rsp-8]     ; empty the x87 stack
    movsd xmm0, [rsp-8]    ; return value in xmm0
    ret
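That uses the 8 bytes below RSP (in the red zone) as scratch space for a store/reload, because going through memory is the only way to move data between xmm registers and x87. The round trip is needed because the x86-64 calling conventions are designed around SSE2 and pass double in xmm registers. The red zone only exists in x86-64 System V, not in Windows x64; use sub rsp, 8 / add rsp, 8 around the code if you can't or don't want to use the red zone.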
Don't use x87 in x86-64 unless you need 80-bit floating-point precision.
(enter is slow and not recommended; make a stack frame with push rbp / mov rbp,rsp if you want one. leave is fine, though. Making a stack frame is optional; I left that out.)
printf doesn't need "%lf" to print a double; only scanf needs lf. You can't pass a single-precision float to printf, because C default promotion rules apply to args of variadic functions, and thus any float is promoted to double.
In most C implementations (including glibc), "%lf" works anyway, ignoring the meaningless l modifier on the %f conversion; C99 and later explicitly define l as having no effect there.
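Both points can be seen from C. The sketch below formats a float with "%f" and the same value as a double with "%lf" and compares the text; float_prints_like_double is a name invented for this example, and snprintf is used instead of printf only so the two results can be compared.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if formatting f with "%f" gives the same text as formatting
   (double)f with "%lf". The float argument is promoted to double by the
   default argument promotions before snprintf receives it. */
int float_prints_like_double(float f)
{
    char a[64], b[64];
    snprintf(a, sizeof a, "%f", f);            /* f is promoted to double here */
    snprintf(b, sizeof b, "%lf", (double)f);   /* l has no effect on %f */
    return strcmp(a, b) == 0;
}
```

From asm there is no compiler to do the promotion for you, so you have to convert to double yourself before the call.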
I mention this in case you later try to print a float from asm with call printf and a "%f" format string, and run into How to print a single-precision float with printf.