I understand that GCC's -ffast-math flag can greatly increase speed for floating-point operations and that it goes outside the IEEE standard, but I can't find information on what actually happens when it is on. Can anyone explain some of the details and give a clear example of how something would change depending on whether the flag is on or off?
I did try digging through Stack Overflow for similar questions but couldn't find anything explaining how -ffast-math works.
As you mentioned, it allows optimizations that do not preserve strict IEEE compliance.
For example, it lets the compiler rewrite this:
x = x*x*x*x*x*x*x*x;
into this:
x *= x;
x *= x;
x *= x;
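Because floating-point arithmetic is not associative, the ordering and factoring of the operations affect the result through round-off. The first form performs seven multiplications, each with its own rounding step, while the second performs only three, so the two can produce slightly different values. For that reason, this optimization is not done under strict FP behavior.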
I haven't checked whether GCC actually performs this particular optimization, but the idea is the same.
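To see the effect without relying on what the compiler decides to do, here is a minimal sketch that writes both forms out by hand. The function names pow8_chain and pow8_square are made up for illustration, and it assumes you compile without -ffast-math so that each function keeps the operation order written in the source.

```c
#include <stdio.h>

/* Hypothetical helper: x^8 as a left-to-right chain.
   Seven multiplications, so up to seven rounding steps. */
double pow8_chain(double x) {
    return x * x * x * x * x * x * x * x;
}

/* Hypothetical helper: x^8 by repeated squaring.
   Three multiplications, so up to three rounding steps. */
double pow8_square(double x) {
    x *= x; /* x^2 */
    x *= x; /* x^4 */
    x *= x; /* x^8 */
    return x;
}

/* Print both results with enough digits to expose a difference
   in the last bits, if there is one for this input. */
void compare_pow8(double x) {
    double a = pow8_chain(x);
    double b = pow8_square(x);
    printf("x = %.17g\n", x);
    printf("  chain  : %.17g\n", a);
    printf("  square : %.17g\n", b);
    printf("  %s\n", a == b ? "identical" : "differ");
}
```

Calling compare_pow8 from main with a few inputs shows where the two forms agree and where they do not. Inputs such as 2.0 or 0.5 give identical results because every intermediate product is exactly representable; values like 1.1 may or may not differ in the last digits depending on how the roundings fall. With -ffast-math the compiler is free to treat the two functions as interchangeable, which is exactly the latitude the flag grants.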