math · floating-point · bit-manipulation · simd

Fast bithacked log2 approximation


To estimate the entropy of a sequence, I came up with something like this:

#include <bit>  // std::bit_cast (C++20)

float log2_dirty(float x) {
    // Shift part of the exponent into the mantissa field, re-bias, subtract the offset.
    return std::bit_cast<float>((std::bit_cast<int>(x) >> 4) + 0x3D800000) - 15.0f;
}

This is loosely based on the approximation log2(1+x) ~= x, which over the interval x = [0..1] has a maximum error of about 8%.
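
(Quick throwaway check of that error figure, scanning the interval numerically; the maximum comes out near 0.086:)

#include <algorithm>
#include <cmath>
#include <cstdio>

int main() {
    float worst = 0.f;
    for (float x = 0.f; x <= 1.f; x += 1e-4f)
        worst = std::max(worst, std::log2(1.f + x) - x);
    std::printf("max error of log2(1+x) ~= x on [0,1]: %.4f\n", worst);  // ~0.086
}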

Also, using log2(2^e * (1+a)) ~= e + a, the idea is to move part of the exponent into the linear (mantissa) part; after adding a bias, the result matches the true value reasonably closely over some range of the original value x, as in

            appr    std::log2
x = 2.0        1            1
x = 3.0      1.5      1.58496
x = 4.0        2            2
x = 5.0     2.25      2.32193
x = 6.0      2.5      2.58496
...
x = 29.0  4.8125      4.85798
x = 30.0   4.875      4.90689
x = 31.0  4.9375       4.9542
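
(A small harness along these lines reproduces the comparison, using log2_dirty from above:)

#include <cmath>
#include <cstdio>

int main() {
    const float xs[] = {2.f, 3.f, 4.f, 5.f, 6.f, 29.f, 30.f, 31.f};
    for (float x : xs)
        std::printf("x = %4.1f  %8g  %10g\n", x, log2_dirty(x), std::log2(x));
}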

Another parametrisation would be (x >> 5) + 0x40000000 in the integer domain, then subtracting 31.0f in the floating-point domain.
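
Written out (a sketch, using std::bit_cast as above; the shift is parenthesised explicitly since >> binds looser than +):

// Alternative parametrisation: shift by 5, bias 0x40000000, subtract 31 as a float.
float log2_dirty_v2(float x) {   // name is just for illustration
    return std::bit_cast<float>((std::bit_cast<int>(x) >> 5) + 0x40000000) - 31.0f;
}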

The ranges where this works are 2 <= x < 32.f or 2.0 <= x < 64.f, but I'm wondering if and how I could also include the range 1..2, or even 1/16 <= x < 32, with as little tinkering as possible.

The best I could come up with was return log2_dirty(x * 16) - 4.0f for the 1/16 <= x < 2 range, but maybe something better is available (I'm also targeting SIMD and branchless code).
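
As a branchless sketch (reusing log2_dirty from above; the helper name is mine):

// Scale the input by 16 (adds 4 to the exponent) to shift the valid window down,
// then take the 4 back out in the float domain. Intended for the 1/16 <= x < 2 range.
float log2_dirty_low(float x) {
    return log2_dirty(x * 16.0f) - 4.0f;
}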

One additional target is to compute n log2 n, as that is used for entropy computation. (Here I'm actually not using p log2 p with p <= 1.0, but positive values n, which I can clamp to 1.0 if necessary and which in my case are also limited to n < 31.)
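
(For the entropy term itself, a minimal sketch along those lines, again using log2_dirty from above and a placeholder name of my own:)

#include <algorithm>

// n * log2(n), with n clamped to >= 1 so small counts contribute zero.
float n_log2_n_dirty(float n) {
    n = std::max(n, 1.0f);
    return n * log2_dirty(n);
}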

TL;DR: how best to extend the applicable domain of the algorithm -- or maybe there's another reasonable variant, similar to the traditional fast_exp algorithm as in https://stackoverflow.com/a/78509807/1716339


Solution

  • On x86 architectures with native FMA support, I guess the fastest way is

    return static_cast<float>(std::bit_cast<int>(x)) * (1.f/(1 << 23)) - 127.f;
    

    You get slightly less bit loss (i.e., more monotonic behavior) if you first subtract the offset in the integer domain and scale afterwards:

    return static_cast<float>(std::bit_cast<int>(x) - 0x3f800000) * (1.f/(1 << 23));
    
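    Wrapped into complete functions, a sketch of both variants could look like this (assuming a 32-bit int, IEEE-754 binary32 floats and C++20 std::bit_cast; the function names are mine). The multiply and subtract in the first version is exactly the shape an FMA instruction can fuse:

    #include <bit>

    // Reinterpret the bits as an integer, convert to float, scale by 2^-23 and
    // remove the exponent bias of 127.
    float log2_linear(float x) {
        return static_cast<float>(std::bit_cast<int>(x)) * (1.f / (1 << 23)) - 127.f;
    }

    // Same idea, but with the bias removed in the integer domain first.
    float log2_linear_v2(float x) {
        return static_cast<float>(std::bit_cast<int>(x) - 0x3f800000) * (1.f / (1 << 23));
    }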

    If you want to avoid the static_cast, your idea is generally sound, and by scaling the input value you can move the window of valid values. Notice though that scaling by a power of two just adds to or subtracts from the exponent (as long as you don't leave the range of normalized numbers). Thus by subtracting (in the integer domain) the bit pattern of your lowest possible input value, you get an exponent in the range [0, Emax-Emin]; if Emax-Emin is 2^N, you can shift right by N to move the reduced exponent into the significand of the floating-point representation, after which you add (still in the integer domain) the bit pattern of the corresponding power of two 2^N, bit-cast back to float, and subtract (as a float) (2^N) - Emin.

    Example, N=3, Emin=-3, Emax=5 (i.e., valid between 1.f/8 and 32.f):

    return std::bit_cast<float>(
               ((std::bit_cast<int32_t>(x) - std::bit_cast<int32_t>(1.f/8)) >> 3)
               + std::bit_cast<int32_t>(8.f))
           - (8.f - -3.f);
    

    Now by moving the subtraction after the shift, you get an expression exactly in your desired form:

    return std::bit_cast<float>(
               (std::bit_cast<int32_t>(x) >> 3)
               + (std::bit_cast<int32_t>(8.f) - (std::bit_cast<int32_t>(1.f/8) >> 3)))
           - (8.f - -3.f);
    

    If you shift by 4 instead of 3, you can change the constants to get a range of 16 in the exponent, e.g., -8 to 8, i.e. 1/256 to 256.
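
    For instance, with N=4, Emin=-8 and Emax=8, a sketch of that variant (in the same moved-subtraction form as above, valid roughly for 1/256 <= x < 256) would be:

    return std::bit_cast<float>(
               (std::bit_cast<int32_t>(x) >> 4)
               + (std::bit_cast<int32_t>(16.f) - (std::bit_cast<int32_t>(1.f/256) >> 4)))
           - (16.f - -8.f);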

    N.B.: Since the approximation is always too small, except for powers of two, you could add something around 0.043 to reduce the maximal absolute error.
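
    For example, folding that offset into the float-domain constant of the last expression above (a sketch; 0.043 is just the ballpark figure mentioned):

    return std::bit_cast<float>(
               (std::bit_cast<int32_t>(x) >> 3)
               + (std::bit_cast<int32_t>(8.f) - (std::bit_cast<int32_t>(1.f/8) >> 3)))
           - (8.f - -3.f - 0.043f);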