cstdio

Is printf's %a formatting for floating-points not unique?


C23 defines the %a conversion specifier in § 7.23.6.1.8 (page 333) as:

A double argument representing a floating-point number is converted in the style [-]0xh.hhhhp±d, where there is one hexadecimal digit (which is nonzero if the argument is a normalized floating-point number and is otherwise unspecified) before the decimal-point character and the number of hexadecimal digits after it is equal to the precision; if the precision is missing and FLT_RADIX is a power of 2, then the precision is sufficient for an exact representation of the value

[…]

The letters abcdef are used for a conversion and the letters ABCDEF for A conversion. […] The exponent always contains at least one digit, and only as many more digits as necessary to represent the decimal exponent of 2. If the value is zero, the exponent is zero.

(emph. mine)

Does this mean that the representation need not be unique?

For instance, could printf("%a", 1.0) output 0x1p+0 as well as 0x2p-1, 0x4p-2 or 0x8p-3? They all have one digit in the exponent, so they should all be equally valid under the requirement above.


Solution

  • You are correct: the representation need not be unique. The first hexadecimal digit, the one before the ., is only required to be nonzero for normalized numbers and is otherwise unspecified, so it can carry anywhere from 1 to 4 bits of the mantissa. This means the number 1.0 can be represented as 0x1p+0, 0x2p-1, 0x4p-2 or 0x8p-3.

    The highlighted phrase, only as many more digits as necessary to represent the decimal exponent of 2, means the exponent cannot have extra leading zeroes. This excludes representations of 1.0 such as 0x2p-01 or 0x2p-001, and likewise 0x1p+00. The next phrase, If the value is zero, the exponent is zero, covers a zero argument, whose exponent would otherwise be arbitrary. Note that the specification could have been more explicit about the sign of a zero exponent and required +0, excluding 0x1p-0.

    Note also that the C Standard does not say whether the '.' must be omitted when the precision is missing and no hexadecimal digits are required after it to represent the number. Hence 0x1.p+0 and 0x2.p-1 seem as compliant as 0x1p+0 and 0x2p-1. The C Standard does specify that if the precision is zero and the # flag is not specified, no decimal-point character appears, but this does not cover the case where the precision is missing and no digits are necessary. Omitting the . unless # is specified seems consistent and is indeed the observed behavior on various POSIX systems.

    For illustration, printf("%a", 1.0) with the default C library produces 0x1p+0 on macOS and FreeBSD, but 0x8p-3 on Debian Linux and OpenBSD.

    The case of printf("%a", 3.0) follows the same pattern: 0x1.8p+1 and 0xcp-2 respectively, and these two representations do not even have the same length.

    The rationale for macOS and FreeBSD seems to be to always have 1 as the initial digit, whereas the default libc on Debian Linux (the GNU libc) and that of OpenBSD pack 4 bits of the mantissa into the initial digit. This minimizes the total number of digits in 75% of cases and, more importantly, crams more precision into the requested number of places when a precision is specified in the format, which is valuable and IMHO better.