
Exponent bias or the subtraction "Exponent - Bias" for IEEE 754 Floating Point Format


Currently reviewing some material and I see that in IEEE 754 floating point format it says:

x = (-1)^S × (1 + Fraction) × 2^(Exponent - Bias)

  1. I am unclear on whether 2^(Exponent - Bias) means 2 raised to the "exponent bias", or 2 raised to the result of subtracting the bias from the exponent, i.e. exponent minus bias.

I was looking at some examples and saw that for the number 0.75, when converting it into single-precision floating-point encoding, the exponent would be -1 + Bias, or -1 + 0111 1111. This is odd because the exponent = power + bias.

  1. However, this video: https://youtu.be/K1XgRO4pvFs?t=352 somehow has the exponent part be 2 ^ (e - 127), where 127 is being subtracted. In this source: https://class.ece.iastate.edu/arun/CprE281_F05/ieee754/ie3.html#:~:text=For%20single%2Dprecision%20floating%2Dpoint,the%20exponent%20%3D%20power%20%2B%20bias, however, I read: "How do you find the bias exponent? For single-precision floating-point, the bias=127. For double-precision, the bias=1023. The sum of the bias and the power of 2 is the exponent that actually goes into the IEEE 754 string. Remember, the exponent = power + bias." What am I missing?

To clarify: in x = (-1)^S × (1 + Fraction) × 2^(Exponent - Bias), is it the exponent bias or exponent - bias?

And why, in this video https://youtu.be/K1XgRO4pvFs?t=352, is it 2 ^ (e - 127), where 127 is being subtracted?


Solution

  • This is easier to understand if we distinguish the mathematical values from the encodings of the values.

    Consider some number we have put into a normal binary floating-point form: (−1)^s • f • 2^e. Here s indicates the sign, f is the fraction portion (called the significand), and e is the exponent. s is 0 or 1 according to whether the number is positive or negative, 1 ≤ f < 2, and e is an integer. For this answer, I assume f fits in 24 bits, so no rounding is needed when representing it in the binary32 format, a.k.a “single precision.”

    We are going to encode this number as a string of 32 bits: 1 bit called S, 8 bits called E, and 23 bits called F. While these are bit strings, we will also identify them with binary numerals and speak of them as having the value of those numerals. So, for our purposes, the bit string 01111111 is the number 127.

    The sign is easy; we simply set S = s.

    For the exponent, if e is in the normal exponent range of the format, [−126, 127], we encode E using E = e + 127. (What happens if e is outside this range is not covered in this answer.)

    For the fraction, we set F = (f−1)•2^23. This is equivalent to writing f as a 24-bit binary numeral in the form 1.bbbbbbbbbbbbbbbbbbbbbbb, where each b is a bit, and then setting F to those trailing 23 bits.

    Concatenating those bit strings S, E, and F gives us 32 bits that encode the number (−1)^s • f • 2^e in the binary32 format.

    The numbers used above, [−126, 127] for the exponent range and 127 for the bias, are fixed parameters of the binary32 type. They do not change according to the number being encoded.

    Again, to keep this clear, remember that f and e are the fraction portion and the exponent of the number in its mathematical form.

    F is not the fraction portion (or the significand) and E is not the exponent. F is an encoding of the fraction portion, and E is an encoding of the exponent. They are numbers or bit strings that tell you about the fraction and the exponent. They are not the actual fraction or the exponent. (Also, F is incomplete by itself. To know f in general, you have to know E so you can determine whether the number is normal or subnormal.)
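As a rough illustration of the steps above, here is a minimal Python sketch for normal binary32 numbers only. The function names `encode_normal` and `decode_normal` are made up for this example, and it assumes, as above, that f fits in 24 bits so no rounding is needed. Encoding uses E = e + 127; decoding simply reverses that with e = E − 127, which is where a subtraction of 127 comes from when reading a value back out of its bits.

```python
import struct

BIAS = 127           # fixed parameter of binary32
FRACTION_BITS = 23   # width of the F field


def encode_normal(s, f, e):
    """Encode mathematical s, f, e into a 32-bit integer (normal numbers only)."""
    assert s in (0, 1) and 1 <= f < 2 and -126 <= e <= 127
    S = s                                   # S = s
    E = e + BIAS                            # E = e + 127
    F = int((f - 1) * 2**FRACTION_BITS)     # F = (f - 1) * 2^23; assumes f fits in 24 bits
    return (S << 31) | (E << FRACTION_BITS) | F


def decode_normal(bits):
    """Recover mathematical s, f, e from a 32-bit integer (normal numbers only)."""
    S = bits >> 31
    E = (bits >> FRACTION_BITS) & 0xFF
    F = bits & (2**FRACTION_BITS - 1)
    assert 1 <= E <= 254                    # E = 0 and E = 255 are not normal encodings
    return S, 1 + F / 2**FRACTION_BITS, E - BIAS   # s, f = 1 + F/2^23, e = E - 127


# 0.75 = (-1)^0 * 1.5 * 2^(-1), so s = 0, f = 1.5, e = -1
bits = encode_normal(0, 1.5, -1)
print(hex(bits))                            # 0x3f400000
print(format((bits >> 23) & 0xFF, "08b"))   # 01111110, i.e. E = -1 + 127 = 126

# Cross-check against Python's own binary32 packing
assert bits == int.from_bytes(struct.pack(">f", 0.75), "big")
print(decode_normal(bits))                  # (0, 1.5, -1)
```

For 0.75 the stored field E is 126 (the "-1 + Bias" from the question), while the mathematical exponent recovered on decoding is 126 − 127 = −1.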