I was "playing" with the fast inverse sqrt function by trying to optimize certain things (fewer variables etc). Here is the final code:
float Q_rsqrt(float x) { //IS SMALLER
    const float x2 = x * 0.5F;
    uint_fast32_t i;
    memcpy(&i, &x, sizeof(float));
    i = 0x5f3759df - ( i >> 1 );
    memcpy(&x, &i, sizeof(float));
    return x * ( 1.5F - ( x2 * x * x ) );
}
First of all, it is useful to know that on my architecture uint_fast32_t is represented on 64 bits and float on 32 bits. It can therefore be surprising to call memcpy() on variables of different sizes.
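To check the sizes mentioned above on a given platform, something like the following could be used. This is only a minimal sketch: BIT_WIDTH and print_type_widths are hypothetical names that are not part of my code, and the printed values depend on the compiler and target.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Width of a type in bits, computed from its size in bytes. */
#define BIT_WIDTH(T) (sizeof(T) * CHAR_BIT)

/* memcpy(&i, &x, sizeof(float)) is only valid if a float fits inside i. */
static_assert(sizeof(float) <= sizeof(uint_fast32_t),
              "float must fit in uint_fast32_t");

/* Hypothetical helper: prints the widths of the types used in Q_rsqrt. */
void print_type_widths(void) {
    printf("float:         %zu bits\n", BIT_WIDTH(float));
    printf("uint32_t:      %zu bits\n", BIT_WIDTH(uint32_t));
    printf("uint_fast32_t: %zu bits\n", BIT_WIDTH(uint_fast32_t));
}
```

Calling print_type_widths() from main shows whether uint_fast32_t is wider than float on the machine at hand.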
The problem I have is that, once the code is compiled, calling this function with the same argument does not always give the same return value: sometimes it is negative, other times positive (but always with the same absolute value).
The purpose of memcpy() is to avoid the warning (dereferencing type-punned pointer will break strict-aliasing rules) produced by the following code (which works exactly as desired):
float Q_rsqrt_test(float x) {
    const float xHalf = x * 0.5F;
    const uint_fast32_t i = 0x5f3759df - ( (* ( uint_fast32_t * ) &x) >> 1 );
    x = * ( float * ) &i;
    return x * ( 1.5F - ( xHalf * x * x ) );
}
For this code, I would say that there is no type-size problem, because the original source (visible at the link just above) already uses a long type, which is represented on 64 bits on my architecture (instead of float on 32 bits).
I really can't understand why I can get negative answers with the first function...
Thanks for the help.
As Barmar said, initializing i to 0 works. I also think it's better to avoid dereferencing through a pointer to a type of a different size (for example *(uint64_t *) &x where x is a float). It is probably better to dereference as uint32_t and then properly cast the result to a 64-bit variable.
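For reference, here is a minimal sketch of that idea, assuming a 32-bit float. Q_rsqrt_fixed uses uint32_t, so memcpy() overwrites every byte of the integer. Q_rsqrt_with_upper_bits is a hypothetical helper (not part of my original code) that mimics a 64-bit uint_fast32_t on a little-endian machine by setting the otherwise uninitialized upper 32 bits explicitly, to show how they can flip the sign.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* The integer has exactly the size of a float, so memcpy fills all of it. */
float Q_rsqrt_fixed(float x) {
    const float x2 = x * 0.5F;
    uint32_t i;
    memcpy(&i, &x, sizeof i);
    i = 0x5f3759df - (i >> 1);
    memcpy(&x, &i, sizeof x);
    return x * (1.5F - (x2 * x * x));
}

/* Hypothetical helper: same computation in a 64-bit integer whose upper
   32 bits are set to `upper` instead of being left uninitialized. */
float Q_rsqrt_with_upper_bits(float x, uint32_t upper) {
    const float x2 = x * 0.5F;
    uint32_t low;
    memcpy(&low, &x, sizeof low);
    uint64_t i = ((uint64_t)upper << 32) | low; /* contents of the 64-bit variable */
    i = 0x5f3759df - (i >> 1);  /* the shift moves bit 32 into bit 31 */
    low = (uint32_t)i;          /* the 4 bytes copied back on little-endian */
    memcpy(&x, &low, sizeof x);
    return x * (1.5F - (x2 * x * x));
}
```

With upper set to 0 the helper matches Q_rsqrt_fixed. With upper set to 1, bit 32 lands in the sign bit of the float after the shift, so the result has the same absolute value but the opposite sign, which is consistent with what I observed.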