floating-point-conversion6502

6502 Wozniak and Rankin's Floating Point Routines


I am having some trouble figuring out how to convert a given floating point number, say 3.14, into the FP representation chosen by the authors, and I just need some help figuring it out.

I'm going through some old FP routines written by Wozniak and Rankin and I'm having some trouble understanding the FP representation they chose, specifically how to convert a given FP number to that representation and vice versa.

For instance, I want to convert 3.14 to the representation needed for those subroutines to work, i.e. into this format: SEEEEEEE SM.MMMMMM MMMMMMMM MMMMMMMM.

The preamble claims that the mantissa is in the range of 1. to 2., and I can't seem to figure out how they came to that range. Can someone explain that? If the second bit of the mantissa is a 1, 01000000, then would this be +1.00000? Now if the binary mantissa is 11000000, would this be -2+1+0... which is -1 in two's complement?


Solution

  • The preamble claims that the mantissa is in the range of 1. to 2., and I can't seem to figure out how they came to that range.

    The range is derived so there is always exactly one digit to the left of the decimal point and that one digit is not zero (for positive numbers). 2 is represented as 10 in binary which is two digits, so it and any bigger numbers are not allowed.

    Thinking about positive numbers only, if the mantissa is not in the range [1, 10) [1], you can always shift it until it is and compensate by adjusting the exponent. This is true no matter what number base you are working in.

    In base 10, 31.4 can be represented as 3.14 x 10^1 and 0.314 can be represented as 3.14 x 10^-1. In base 10, a nonzero floating point number can always be represented with a mantissa whose first digit is in the range 1 to 9 inclusive (equivalently, the mantissa is in [1, 10) ).

    In binary floating point, it is the same except that the exponent denotes a power of 2 and not a power of 10, and the mantissa is a binary number, not a decimal number. In binary, the range [1, 10) is the range [1, 2) in decimal, and the only whole number in it is 1, so the digit to the left of the point is always 1.

    In binary, 3.14 is 11.0010001... x 10^0. It doesn't actually have a finite exact representation in binary, but we only need the first few digits for demonstration (note that the exponent is a binary exponent, so 10^0 here is the same as 2^0 in decimal). 11 is not in the range [1, 10), so we divide it by 10 ( = 2 in decimal) and add 1 to the exponent to compensate. This gives us 1.10010001... x 10^1. This is the process of normalisation.
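
The shifting described above can be sketched in a few lines of Python. This is only an illustration of the idea for positive numbers, not the way the 6502 routines do it; the function name `normalise_positive` is made up for this example.

```python
def normalise_positive(x):
    """Shift a positive number until it lies in [1, 2), tracking the power of 2."""
    if x <= 0:
        raise ValueError("this sketch only handles positive numbers")
    exponent = 0
    while x >= 2:       # too big: divide by binary 10 and add 1 to the exponent
        x /= 2
        exponent += 1
    while x < 1:        # too small: multiply by binary 10 and subtract 1
        x *= 2
        exponent -= 1
    return x, exponent


mantissa, exponent = normalise_positive(3.14)
print(mantissa, exponent)   # 1.57 1, i.e. 3.14 == 1.57 * 2**1
```

The mantissa 1.57 is the decimal reading of the binary 1.10010001... from the paragraph above.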

    For positive numbers, the mantissa in your scheme will always start 01, where 0 is the sign bit and the 1 is the first and only non-fractional digit.

    Now if the binary mantissa is 11000000, would this be -2+1+0... which is -1 in two's complement?

    This is where it gets a bit more confusing. The comments on the listing say that the mantissa is in 2's complement and there are two binary digits to the left of the decimal point, including the sign. So there are four possible values (ignoring the fraction digits)

    01 = +1
    00 =  0
    11 = -1
    10 = -2
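
To check those four values, here is a small Python sketch that reads three mantissa bytes as a two's complement number with the binary point after the second bit. It assumes the SM.MMMMMM MMMMMMMM MMMMMMMM layout from the question, which leaves 22 fraction bits; `mantissa_value` is a hypothetical helper name.

```python
FRACTION_BITS = 22  # S M . then 6 + 8 + 8 fraction bits (layout from the question)


def mantissa_value(mantissa_bytes):
    """Interpret 3 big-endian bytes as two's complement with 22 fraction bits."""
    n = int.from_bytes(mantissa_bytes, "big", signed=True)
    return n / (1 << FRACTION_BITS)


for first_byte in (0x40, 0x00, 0xC0, 0x80):
    value = mantissa_value(bytes([first_byte, 0, 0]))
    print(f"{first_byte:08b} -> {value}")
# 01000000 -> 1.0
# 00000000 -> 0.0
# 11000000 -> -1.0
# 10000000 -> -2.0
```

So 11000000 does read as -1, exactly as the -2+1 arithmetic in the question suggests.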
    

    For negative numbers, the range allowed is [-2, -1). This is because, in two's complement, 11.something represents -1 + 0.something, which lies in [-1, 0). Just like a positive 00.something, it can always be multiplied by a power of 2 to bring it into the range where the mantissa starts 10, with the exponent adjusted to compensate.

    So for this scheme of floating point, the mantissa will always start 01 or 10 if it is properly normalised.
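
Putting the pieces together, the following Python sketch produces the three mantissa bytes and the power of 2 for a given number. It is a rough illustration under stated assumptions, not the authors' routine: it assumes 22 fraction bits as in the layout from the question, the names `encode_mantissa` and `FRACTION_BITS` are invented here, and it returns the exponent as a plain integer because how the exponent byte is packed is not covered in this answer.

```python
import math

FRACTION_BITS = 22  # assumed from the SM.MMMMMM MMMMMMMM MMMMMMMM layout


def encode_mantissa(x):
    """Return (3 mantissa bytes, exponent) with x ~= mantissa * 2**exponent."""
    if x == 0:
        return bytes(3), 0
    m, e = math.frexp(x)            # x == m * 2**e with 0.5 <= |m| < 1
    m, e = m * 2, e - 1             # now 1 <= |m| < 2
    n = round(m * (1 << FRACTION_BITS))
    if n == 1 << (FRACTION_BITS + 1):   # rounded up to +2.0: renormalise
        n, e = n >> 1, e + 1
    if n == -(1 << FRACTION_BITS):      # exactly -1.0: store as -2.0 * 2**-1
        n, e = n * 2, e - 1
    return (n & 0xFFFFFF).to_bytes(3, "big"), e


for x in (3.14, -3.14, 1.0, -1.0):
    mantissa, exponent = encode_mantissa(x)
    print(x, mantissa.hex(), exponent)
# 3.14 647ae1 1
# -3.14 9b851f 1
# 1.0 400000 0
# -1.0 800000 -1
```

Every first byte here starts with 01 or 10, which is the normalisation rule stated above.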

    You'll note that, if you know the sign of the mantissa, you know what the bit to the left of the decimal point is going to be. Several modern formats therefore omit the bit to the left of the decimal point saying that it is implied. This means the fraction part can have an extra bit of precision for the same length of mantissa.
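
As an illustration of the implied bit, this sketch pulls apart the IEEE 754 single precision encoding of 3.14 using Python's `struct` module. IEEE 754 is a different format from the one in the question (sign and magnitude rather than two's complement), and `ieee_single_fields` is just a name chosen for this example; it only makes sense for normal numbers.

```python
import struct


def ieee_single_fields(x):
    """Split an IEEE 754 single into (sign, unbiased exponent, 23 fraction bits)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127   # remove the bias of 127
    fraction = bits & 0x7FFFFF               # leading 1 is implied, not stored
    return sign, exponent, fraction


sign, exponent, fraction = ieee_single_fields(3.14)
print(sign, exponent, f"{fraction:023b}")
# 0 1 10010001111010111000011
```

The stored fraction is the part after the point in 1.10010001...; the leading 1 never appears in the bits.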


    [1] This is a mathematical notation denoting a half-open range. The "[" means the lower bound is included in the range, and the ")" means all the numbers between the lower bound and the upper bound are included but not the upper bound itself.