c network-programming serialization floating-point ieee-754

Is Serializing Floats Necessary for Cross-Platform Network Code?

I'm reading this guide about network programming, which I'm liking a lot: https://beej.us/guide/bgnet/html/split/slightly-advanced-techniques.html#serialization

I'm confused about something though. In this section about serialization, he talks about serializing ints for byte-ordering reasons, which makes sense to me, but he also includes these two functions pack754 and unpack754 for serializing floats in IEEE-754 format.

uint64_t pack754(long double f, unsigned bits, unsigned expbits)
{
    long double fnorm;
    int shift;
    long long sign, exp, significand;
    unsigned significandbits = bits - expbits - 1; // -1 for sign bit

    if (f == 0.0) return 0; // get this special case out of the way

    // check sign and begin normalization
    if (f < 0) { sign = 1; fnorm = -f; }
    else { sign = 0; fnorm = f; }

    // get the normalized form of f and track the exponent
    shift = 0;
    while(fnorm >= 2.0) { fnorm /= 2.0; shift++; }
    while(fnorm < 1.0) { fnorm *= 2.0; shift--; }
    fnorm = fnorm - 1.0;

    // calculate the binary form (non-float) of the significand data
    significand = fnorm * ((1LL<<significandbits) + 0.5f);

    // get the biased exponent
    exp = shift + ((1<<(expbits-1)) - 1); // shift + bias

    // return the final answer
    return (sign<<(bits-1)) | (exp<<(bits-expbits-1)) | significand;
}

long double unpack754(uint64_t i, unsigned bits, unsigned expbits)
{
    long double result;
    long long shift;
    unsigned bias;
    unsigned significandbits = bits - expbits - 1; // -1 for sign bit

    if (i == 0) return 0.0;

    // pull the significand
    result = (i&((1LL<<significandbits)-1)); // mask
    result /= (1LL<<significandbits); // convert back to float
    result += 1.0f; // add the one back on

    // deal with the exponent
    bias = (1<<(expbits-1)) - 1;
    shift = ((i>>significandbits)&((1LL<<expbits)-1)) - bias;
    while(shift > 0) { result *= 2.0; shift--; }
    while(shift < 0) { result /= 2.0; shift++; }

    // sign it
    result *= (i>>(bits-1))&1? -1.0: 1.0;

    return result;
}

What I'm confused about is that these functions work by looking at the first bit for the sign, then the next X bits for the exponent, then the next Y bits for the mantissa. So doesn't that mean the float has to already be in IEEE-754 format on the host machine for this to work?

Is this just here to explain the format, or is this something you would actually do in real life?

Solution

Is Serializing Floats Necessary for Cross-Platform Network Code?

Yes. Floating-point (FP) encodings vary across implementations in size, endianness, precision, exponent range, subnormal support (and possibly even base).
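To see which of these characteristics apply on a given host, the standard <float.h> macros can be inspected. The following is only a minimal sketch: the function names are hypothetical, and a matching parameter set does not prove that the in-memory layout or byte order matches IEEE-754 binary64.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 when the host double has the same
 * parameters as IEEE-754 binary64 (radix 2, 53-bit significand, binary64
 * exponent range, 64 bits wide). It says nothing about byte order. */
int double_matches_binary64(void)
{
    return FLT_RADIX == 2
        && DBL_MANT_DIG == 53
        && DBL_MIN_EXP == -1021
        && DBL_MAX_EXP == 1024
        && sizeof(double) * CHAR_BIT == 64;
}

/* Hypothetical helper: prints the characteristics that differ between
 * implementations. Digits are counted in base FLT_RADIX. */
void print_fp_characteristics(void)
{
    printf("radix: %d\n", FLT_RADIX);
    printf("float:       %d digits, exponent %d..%d, %zu bytes\n",
           FLT_MANT_DIG, FLT_MIN_EXP, FLT_MAX_EXP, sizeof(float));
    printf("double:      %d digits, exponent %d..%d, %zu bytes\n",
           DBL_MANT_DIG, DBL_MIN_EXP, DBL_MAX_EXP, sizeof(double));
    printf("long double: %d digits, exponent %d..%d, %zu bytes\n",
           LDBL_MANT_DIG, LDBL_MIN_EXP, LDBL_MAX_EXP, sizeof(long double));
}

/* Hypothetical startup check for code that assumes a binary64 double. */
void require_binary64_double(void)
{
    assert(double_matches_binary64());
}
```

The long double line is usually the one that differs between platforms, which is why exchanging raw bytes of that type is especially fragile.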

So doesn't that mean the float has to already be in IEEE-754 format on the host machine for this to work?

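No, the pack/unpack will "work" (see following problems) even if long double is not IEEE. The functions never inspect the host's bit pattern; they derive the sign, exponent and significand arithmetically, by comparing, dividing and multiplying the value.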

Is this just here to explain the format, or is this something you would actually do in real life?

It looks like learner code. I would not use the provided pack/unpack code, given its weaknesses (below), especially the two very inefficient while loops, which may iterate thousands of times when long double is binary128.

The code is a hole-riddled attempt to pack an arbitrarily encoded long double into an IEEE binary64. It does not handle values near 0.0, rounding, overflow, or infinity/NaN well.

pack754() has at least these shortcomings:

if (f == 0.0) return 0; loses information during serialization, as it returns 0 for both +0.0 and -0.0. To test the FP sign bit, do not use if (f < 0) but if (signbit(f)), which reliably extracts the sign bit even if f is zero or NaN.
long double may be wider than 64 bits, so uint64_t pack754(long double f, unsigned bits, unsigned expbits) loses information when packing into 64 bits. I suppose OP is tolerating this loss.
1LL<<significandbits is UB on overflow (significandbits >= 63). 1ULL<<significandbits has some advantage, yet overflow (significandbits >= 64) remains a problem.
Mixing float constants (0.5f, 1.0f) into the surrounding long double math is short-sighted. ((1LL<<significandbits) + 0.5L) makes a little more sense.
Rather than loops like while(fnorm >= 2.0), use long double frexpl(long double value, int *p) to extract a normalized value and exponent, and long double ldexpl(long double x, int p) to recombine them. while(fnorm >= 2.0) { fnorm /= 2.0; shift++; } risks an infinite loop when fnorm is infinity.
+ 0.5f for rounding has many corner-case issues. Better to use lround() and friends.
...

For simple cross-platform exchange of FP values, I'd consider sprintf(buf, "%La", x) as a first step to pack and strtold() to unpack.
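A minimal sketch of that text-based round trip follows. The helper names fp_to_text and fp_from_text are hypothetical, and snprintf() is used in place of sprintf() so the buffer size is checked. It assumes both ends run in the "C" locale (the decimal-point character is locale-dependent) and that the receiver's long double can hold the sender's value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write x as a hexadecimal floating-point string.
 * Returns the string length, or -1 if buf is too small. With no precision
 * given, "%La" prints enough digits to be exact when FLT_RADIX is a
 * power of 2. */
int fp_to_text(long double x, char *buf, size_t size)
{
    assert(buf != NULL);
    int n = snprintf(buf, size, "%La", x);
    return (n >= 0 && (size_t)n < size) ? n : -1;
}

/* Hypothetical helper: parse a string produced by fp_to_text().
 * Returns 0 on success, -1 if the text is not entirely a number. */
int fp_from_text(const char *text, long double *out)
{
    char *end;
    long double v = strtold(text, &end);

    if (end == text || *end != '\0')
        return -1;
    *out = v;
    return 0;
}
```

The text form costs more bytes than a packed integer, but it carries the sign of zero, infinities and the full precision of the sender without either side knowing the other's FP layout.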

Packing an FP value into a tight intN_t and maintaining precision/range faithfulness across many computer implementations are competing goals.
Which is more important: faithful conversions or small packet size?
Most systems I've worked with prize faithful conversions over small packet size.

Packing a long double, for portability, into 64 bits is simply an unwise design.