[SOLVED] How to compile a linux kernel with SSE enabled?

I am in the following situation:

I am adding a floating-point based algorithm to the Linux kernel. I know I shouldn't do that, but I want to give it a try and see how bad it can be.
Every block of floating-point calculations is wrapped in kernel_fpu_begin/kernel_fpu_end.
When I run make to compile the kernel code, I get this error: "SSE register return with SSE disabled". The offending line is input[3] = (float)util / (float)max;

Here are my questions:

I didn't find -mno-sse -mno-sse2 in the Makefile. What can I do to enable SSE?
When I declare floating-point variables, for example float x[10], should I use kernel_fpu_begin/kernel_fpu_end?

Thanks!

Solution

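You need to stop the compiler from using SSE instructions (e.g. to copy a 16-byte struct) before kernel_fpu_begin or after kernel_fpu_end. So you can't just put an FP block inside a function in a file compiled with -msse2: the compiler is then free to use SSE anywhere in that file, including outside the begin/end pair.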

You might be able to put your FP code in a separate function and use __attribute__((target("sse2"))) (or "avx") on that function. On x86 specifically, this enables SSE for that one function without any command-line options.

Obviously that function can't return a float by value, because the standard x86-64 calling convention returns it in XMM0, and the caller has to be compiled without SSE.
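
The approach above could look roughly like the following sketch. It is written so that it builds in user space: the kernel_fpu_begin/kernel_fpu_end definitions here are stand-ins (in kernel code they come from <asm/fpu/api.h>), and the names compute_ratio, fill_ratio, and FP_FUNC are hypothetical. The FP helper hands its result back through a pointer, so nothing is returned in XMM0. This has not been tested inside a kernel build.

```c
/*
 * User-space stand-ins so this sketch builds outside the kernel.
 * In real kernel code, drop these and include <asm/fpu/api.h> instead.
 */
int fpu_depth = 0;
void kernel_fpu_begin(void) { fpu_depth++; }
void kernel_fpu_end(void)   { fpu_depth--; }

/* Enable SSE2 for one function only (x86); noinline keeps the FP code
 * from being merged into a caller that must stay SSE-free. */
#if defined(__x86_64__) || defined(__i386__)
#define FP_FUNC __attribute__((target("sse2"), noinline))
#else
#define FP_FUNC __attribute__((noinline))
#endif

/* Hypothetical helper: the only place that does FP arithmetic.
 * The result is stored through a pointer instead of being returned
 * by value, so the caller never receives a float in XMM0. */
FP_FUNC void compute_ratio(unsigned long util, unsigned long max, float *out)
{
    *out = (float)util / (float)max;
}

/* Caller: in the kernel this would be compiled without SSE. It passes
 * only integers and a pointer, and brackets the call itself with
 * kernel_fpu_begin/kernel_fpu_end. */
void fill_ratio(unsigned long util, unsigned long max, float *out)
{
    kernel_fpu_begin();
    compute_ratio(util, max, out);
    kernel_fpu_end();
}
```

With this layout, the line from the question would become something like fill_ratio(util, max, &input[3]); the caller only takes the address of the float and never loads or computes with it.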