I am in the following situation:
kernel_fpu_begin/kernel_fpu_end is used around every floating-point calculation block.
When I run make to compile the kernel code, I get this error: "SSE register return with SSE disabled". The corresponding line is input[3] = (float)util / (float)max;
Here are my questions:
With -mno-sse -mno-sse2 in the Makefile, what can I do to enable SSE?
With float x[10], should I use kernel_fpu_begin/kernel_fpu_end?
Thanks!
You need to stop the compiler from using SSE (e.g. to copy a 16-byte struct) before kernel_fpu_begin or after kernel_fpu_end. So you can't just use a block inside a function in a file compiled with -msse2.
You might be able to put your FP code in a separate function and use __attribute__((target("sse2"))) or "avx" on that function, to enable it without command-line options, for x86 specifically.
Obviously that function can't return a float by value, because the standard calling convention returns it in XMM0, and you need the caller compiled without SSE.
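A minimal sketch of that idea, untested inside a real kernel build. The helper names compute_ratio and fill_input and the FP_TARGET macro are made up for illustration, and the kernel_fpu_begin/kernel_fpu_end definitions here are user-space stand-ins so the snippet compiles on its own; in a module you would use the real ones from <asm/fpu/api.h> instead. The FP helper hands its result back through a pointer rather than returning a float:

```c
/* Stand-ins so this sketch builds outside the kernel (assumption for
 * illustration only). In a module, drop these and include
 * <asm/fpu/api.h> for the real kernel_fpu_begin()/kernel_fpu_end(). */
static int fpu_depth;
static void kernel_fpu_begin(void) { fpu_depth++; }
static void kernel_fpu_end(void)   { fpu_depth--; }

/* Enable SSE2 for one function only, on x86. noinline keeps the FP code
 * from being merged into a caller that is built without SSE. */
#if defined(__i386__) || defined(__x86_64__)
#define FP_TARGET __attribute__((target("sse2"), noinline))
#else
#define FP_TARGET __attribute__((noinline))
#endif

/* Hypothetical helper: all float arithmetic lives here. The result is
 * stored through a pointer, so nothing is returned in XMM0. */
static FP_TARGET void compute_ratio(unsigned int util, unsigned int max,
                                    float *out)
{
    *out = (float)util / (float)max;
}

/* Caller: does no float arithmetic itself, it only brackets the call. */
static void fill_input(float *input, unsigned int util, unsigned int max)
{
    kernel_fpu_begin();
    compute_ratio(util, max, &input[3]);
    kernel_fpu_end();
}
```

Calling fill_input(input, util, max) then replaces the original input[3] = (float)util / (float)max; line. The caller only passes the address of a float and never loads or computes one, which is the point of the split.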