
Floating-point numbers on a VAX machine

I have the source code, written in the RTL/2 programming language, of an old application running on a VAX machine. Unfortunately, I have no way to recompile the application. I need to change some coefficients (real numbers that are hard-coded).

So I had an idea: I could change these numbers directly in the executables (some .cm8 files; these are big files in which every line starts with a ":", followed by some kind of address and the hex data).

Unfortunately, if I take, for instance, one of the coefficients (e.g. 3.8262619e-09) and represent it in binary, I obtain:

3.8262619e-09 in binary is 00110001100000110111100000101000, in hex 0x31837828, and in hex with reversed endianness 0x28788331.

But if I search for those hex values in the executable files, I find no matches. If I could find these numbers in the executable, I would change them directly. The problem, I presume, is that the VAX machine does not represent floating-point numbers using the IEEE 754 standard. I found this link, https://nssdc.gsfc.nasa.gov/nssdc/formats/VAXFloatingPoint.htm, which explains the floating-point representation on a VAX machine, but I do not understand how to represent my real numbers (e.g. the 0.38262619E-08 I found directly in the source code) in VAX floating-point format.

Any help?


  • This answer assumes that the format used for the floating-point data is the 32-bit VAX F_floating format. This is similar to IEEE-754 binary32: both are normalized binary floating-point formats, which allows the most significant bit of the significand (mantissa) to be assumed to be 1 and not stored, and both use an 8-bit biased exponent.

    The binary32 format has a significand range of [1, 2) while F_floating has a significand range of [0.5, 1). The exponent bias used by the binary32 format is 127 while the exponent bias of the F_floating format is 128. In combination, this means that identical encodings in the two formats are numerically offset by a factor of four: the binary32 value of a given bit pattern is four times its F_floating value. The F_floating format does not support signed zero, subnormals, infinities, or NaNs.
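
    The factor of four can be written out as a short derivation. The symbols here are notation introduced for illustration only: s is the sign bit, e the stored 8-bit exponent field, and f the stored fraction bits, so that 1.f and 0.1f denote binary significands. The relation is meant for encodings that are ordinary normalized numbers in both formats.

    ```latex
    \begin{align*}
    x_{\text{binary32}}     &= (-1)^{s} \cdot 1.f \cdot 2^{e-127} \\
    x_{\text{F\_floating}}  &= (-1)^{s} \cdot 0.1f \cdot 2^{e-128}
                             = (-1)^{s} \cdot 1.f \cdot 2^{e-129} \\
    \frac{x_{\text{binary32}}}{x_{\text{F\_floating}}}
                            &= 2^{(e-127)-(e-129)} = 2^{2} = 4
    \end{align*}
    ```

    Read the other way around, the F_floating bit pattern of a value x is the binary32 bit pattern of 4x. For the coefficient in the question, the binary32 pattern 0x31837828 becomes 0x32837828 once the exponent field is increased by 2, before any byte reordering is applied.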

    Because of compatibility with the 16-bit PDP-11, F_floating uses a non-intuitive byte storage ordering. When examining the memory image in ascending address order, the four bytes of an F_floating operand occur in the order 2, 3, 0, 1, where byte 0 is the least significant byte of the 32-bit pattern and byte 3 the most significant.

    For the following ISO-C99 program, I assume that the code is executing on a system that utilizes IEEE-754 floating-point arithmetic.

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <limits.h>
    #include <string.h>
    #include <math.h>

    /* Reinterpret the bits of a 'float' as a 32-bit unsigned integer */
    uint32_t float_as_uint32 (float a)
    {
        uint32_t r;
        memcpy (&r, &a, sizeof r);
        return r;
    }

    /* Convert an IEEE-754 'binary32' operand into VAX F-float format, represented
       by a sequence of four bytes in ascending address order. Underflow is handled
       with a flush to zero. Overflow is clamped to the maximum magnitude encoding.
    */
    void float_to_vaxf (float a, uint8_t *b)
    {
        const float TWO_TO_M128 = 2.93873588e-39f; // 2**(-128)
        const float TWO_TO_127  = 1.70141184e+38f; // 2**127
        const float TWO_TO_126  = 8.50705917e+37f; // 2**126
        const float SCAL = 4; // factor between IEEE-754 'binary32' and VAX F-float
        uint32_t t;

        // format underflow: flush to zero
        if (fabsf (a) < TWO_TO_M128) {
            t = 0;
        }
        // format overflow: clamp to maximum magnitude
        else if (fabsf (a) >= TWO_TO_127) {
            t = (a < 0) ? 0xffffffff : 0x7fffffff;
        }
        // large: scale by exponent manipulation to avoid overflow in intermediates
        else if (fabsf (a) >= TWO_TO_126) {
            t = float_as_uint32 (a);
            t = t + (2 << 23); // increment exponent by 2; equivalent multiply by 4
        }
        // common case: scale by multiplication
        else {
            a = a * SCAL;
            t = float_as_uint32 (a);
        }

        // adjust to VAX F-float byte ordering
        b[0] = (uint8_t)(t >> 2 * CHAR_BIT);
        b[1] = (uint8_t)(t >> 3 * CHAR_BIT);
        b[2] = (uint8_t)(t >> 0 * CHAR_BIT);
        b[3] = (uint8_t)(t >> 1 * CHAR_BIT);
    }

    int main (void)
    {
        float a = 3.8262619e-9f;
        uint8_t vf[4];
        float_to_vaxf (a, vf);
        printf ("% 15.8e as VAX F-float bytes: 0x%02x,0x%02x,0x%02x,0x%02x\n",
                a, vf[0], vf[1], vf[2], vf[3]);
        return EXIT_SUCCESS;
    }
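
Before patching an executable, it may help to decode a candidate four-byte sequence back into a number and confirm that it really is the coefficient being looked for. The following is a minimal sketch of the inverse conversion, under the same assumptions as above (the data is VAX F_floating and the host uses IEEE-754 binary32). The function name `vaxf_to_float` is hypothetical and not part of the program above. Encodings with a zero exponent field are simply mapped to zero here, including the reserved operand (sign bit set, exponent zero).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert four VAX F-float bytes, given in ascending
   address order, into an IEEE-754 'binary32' value. */
float vaxf_to_float (const uint8_t *b)
{
    /* undo the 2, 3, 0, 1 byte ordering to recover the 32-bit pattern */
    uint32_t t = ((uint32_t)b[1] << 24) | ((uint32_t)b[0] << 16) |
                 ((uint32_t)b[3] <<  8) | ((uint32_t)b[2] <<  0);
    uint32_t expo = (t >> 23) & 0xff; /* stored 8-bit exponent field */
    float r;

    /* exponent field 0: zero (or reserved operand), simplified to 0 here */
    if (expo == 0) {
        return 0.0f;
    }
    /* tiny: result is subnormal in 'binary32', so scale by multiplication;
       the lowest fraction bits may be rounded away */
    if (expo < 3) {
        memcpy (&r, &t, sizeof r);
        return r * 0.25f;
    }
    /* common case: decrement exponent by 2; equivalent to dividing by 4.
       This also covers exponent field 255, which is an ordinary number
       in F-float but would be infinity/NaN if reinterpreted directly. */
    t = t - ((uint32_t)2 << 23);
    memcpy (&r, &t, sizeof r);
    return r;
}
```

For example, passing the bytes 0x83, 0x32, 0x28, 0x78 should give back the binary32 value with bit pattern 0x31837828, which is the coefficient from the question, and the bytes 0x80, 0x40, 0x00, 0x00 should decode to 1.0.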