c variadic-functions wchar-t integer-promotion

Effect of default argument promotions on wchar_t


I am a bit confused about how default argument promotions affect wchar_t.

I understand that char is promoted to int, and therefore I have to supply int as the second parameter of va_arg, otherwise I may (GCC) or may not (MSVC) get an error, as demonstrated by the "%c" example below.

So, I thought that - analogically - I should take into account some similar promotion in case of wchar_t, and read the definitions of the relevant types in the C99 standard:

7.17 wchar_t ... is an integer type whose range of values can represent distinct codes for all members of the largest extended character set specified among the supported locales; the null character shall have the code value zero. Each member of the basic character set shall have a code value equal to its value when used as the lone character in an integer character constant if an implementation does not define __STDC_MB_MIGHT_NEQ_WC__.

7.24.1 wint_t ... is an integer type unchanged by default argument promotions that can hold any value corresponding to members of the extended character set, as well as at least one value that does not correspond to any member of the extended character set (see WEOF below).

It is clear to me that wint_t is not promoted to anything, and I suspect but do not know for sure that wchar_t is not promoted either.

I tried fetching the arguments with va_arg as wchar_t, wint_t and int, and all of these worked. However, that may just have been luck:

#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

void print( char const* format, ... );

int main()
{
    printf( "char == %zu, int == %zu, wchar_t == %zu, wint_t == %zu.\n",
        sizeof( char ), sizeof( int ), sizeof( wchar_t ), sizeof( wint_t ) );
        // MSVC x86: char == 1, int == 4, wchar_t == 2, wint_t == 2.
        // MSVC x64: char == 1, int == 4, wchar_t == 2, wint_t == 2.
        // GCC  x64: char == 1, int == 4, wchar_t == 4, wint_t == 4.

    char charA = 'A';
    print( "%c", charA );

    wchar_t wchar_tA = L'A';
    print( "%lc", wchar_tA );

    printf( "\n" );
}

void print( char const* format, ... )
{
    va_list arguments;
    va_start( arguments, format );
    if( strcmp( format, "%c" ) == 0 ) {
        // char c = va_arg( arguments, char );     // -> Bad (1)
        char c = va_arg( arguments, int );         // -> Good
        putchar( ( int ) c );
    } else if( strcmp( format, "%lc" ) == 0 ) {
        wchar_t w = va_arg( arguments, wchar_t );  // -> Good
        // wint_t w = va_arg( arguments, wint_t ); // -> Good
        // int w = va_arg( arguments, int );       // -> Good
        putwchar( ( wchar_t ) w );
    }
    va_end( arguments );
}

// (1) GCC prints:
//       warning: 'char' is promoted to 'int' when passed through '...'
//       note: (so you should pass 'int' not 'char' to 'va_arg')
//     Running the program prints:
//       Illegal instruction

The question: Which of the three lines containing va_arg in the else if block is the correct, standard-compliant one?


Solution

  • I thought that - analogically - I should take into account some similar promotion in case of wchar_t,

    Yes, the type you specify to the va_arg() macro must be compatible with the type of the corresponding actual argument, as promoted according to the default argument promotions, except that you can interchange signed and unsigned versions of the same type as long as both can represent the actual value, and you can interchange void * and char *.
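To illustrate the rule above, here is a minimal sketch in which each variadic argument is read back with its promoted type. The function name sum_three and its unused fixed parameter are hypothetical, chosen only for this example.

```c
#include <stdarg.h>

/* Hypothetical demo: expects a short, a float and an unsigned int,
   in that order, after the fixed parameter. */
static double sum_three(int unused, ...)
{
    va_list ap;
    va_start(ap, unused);
    int    s = va_arg(ap, int);     /* short is promoted to int              */
    double f = va_arg(ap, double);  /* float is promoted to double           */
    int    u = va_arg(ap, int);     /* unsigned int read as int: permitted   */
                                    /* only while the value fits both types  */
    va_end(ap);
    return s + f + u;
}
```

Called as sum_three(0, (short)1, 2.5f, 3u), this reads the short as int, the float as double, and the unsigned int as int, which is allowed here because 3 is representable in both types.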

    It is clear to me that wint_t is not promoted to anything,

    Yes, assuming you mean that it is not changed by the default argument promotions: the specification says that explicitly.

    and I suspect but do not know for sure that wchar_t is not promoted either.

    It is not safe to assume that wchar_t is unchanged by the default argument promotions. If it is neither int nor unsigned int but its integer conversion rank is less than or equal to that of int, then it is affected. Otherwise not. C does not specify which case applies, and that may vary from implementation to implementation.

    Note also that although integer conversion rank is related to the size of a representation of the type (narrow is, generally, lower), it is a distinct concept, and in principle, you cannot reliably judge based on size.
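One way to find out which case applies on a given implementation, rather than guessing from sizeof, is to ask the compiler for the promoted type. The sketch below assumes a C11 compiler (it relies on _Generic); the macro PROMOTED_TYPE_NAME and the function wchar_promotion are hypothetical names introduced for illustration. It uses the fact that unary plus applies the integer promotions to its operand, and that for integer arguments the default argument promotions are those same conversions.

```c
#include <wchar.h>

/* Names the type an integer expression has after the integer promotions.
   "unchanged" means the type is neither int nor unsigned int after
   promotion, so it was not promoted at all. */
#define PROMOTED_TYPE_NAME(x)              \
    _Generic(+(x),                         \
        int:          "int",               \
        unsigned int: "unsigned int",      \
        default:      "unchanged")

/* Reports what a wchar_t argument becomes when passed through "...". */
static const char *wchar_promotion(void)
{
    return PROMOTED_TYPE_NAME((wchar_t)0);
}
```

A result of "int" or "unsigned int" is the type to name in va_arg; "unchanged" means wchar_t itself is the right type on that implementation.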


    Which of the three lines containing va_arg in the else if block is the correct, standard-compliant one?

    It depends on your implementation. There is no available alternative that is certain to be correct for every conforming implementation, because the spec does not constrain wchar_t sufficiently for that. But of the three, this one is your best bet:

            // int w = va_arg( arguments, int );       // -> Good
    

    That covers you in all remotely likely variations of wchar_t being promoted to a different type via the default argument promotions. It is definitely correct when wchar_t is int. And it's fine if wchar_t is unsigned int, as long as the actual value of the argument does not exceed INT_MAX.

    It would be incorrect for an implementation in which wchar_t had greater integer conversion rank than int, but I do not know any implementation with that characteristic, and I don't expect ever to see one.

    In contrast, this one is unsafe:

            wchar_t w = va_arg( arguments, wchar_t );  // -> Good
    

    It is OK if wchar_t is int or unsigned int, but the int alternative is also good in that case (except if wchar_t is unsigned int and the argument value exceeds INT_MAX). This would be the only correct alternative for an implementation where wchar_t had greater integer conversion rank than int and was not the same as wint_t, but again, I don't expect ever to see such an implementation. Otherwise, that is, when wchar_t is a type other than int or unsigned int whose integer conversion rank is less than or equal to that of int, it is wrong, and that is a characteristic of some real-world implementations.

    And this one is less safe than the int alternative:

            // wint_t w = va_arg( arguments, wint_t ); // -> Good
    

    If wint_t happens to be int, then it is equivalent to the int variation, of course. But if it happens to be neither int nor unsigned int nor wchar_t, then it is definitely wrong, whether or not wchar_t is affected by the default argument promotions.
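If a C11 compiler is available, the choice among the three alternatives can be left to the compiler instead of being hard-coded. The following is only a sketch under that assumption: the helpers fetch_as_int, fetch_as_uint and fetch_as_wchar, the macro NEXT_WCHAR, and the demo function first_wchar are all hypothetical names, not part of any library. The macro selects the helper whose va_arg type matches what wchar_t becomes after promotion.

```c
#include <stdarg.h>
#include <wchar.h>

/* Each helper reads the next argument with one candidate type. The va_list
   is passed by pointer so the caller can keep using it afterwards. */
static inline wchar_t fetch_as_int(va_list *ap)   { return (wchar_t)va_arg(*ap, int); }
static inline wchar_t fetch_as_uint(va_list *ap)  { return (wchar_t)va_arg(*ap, unsigned int); }
static inline wchar_t fetch_as_wchar(va_list *ap) { return va_arg(*ap, wchar_t); }

/* Unary plus yields the promoted type of wchar_t; _Generic picks the
   helper that names exactly that type in va_arg. */
#define NEXT_WCHAR(ap_ptr)                 \
    _Generic(+(wchar_t)0,                  \
        int:          fetch_as_int,        \
        unsigned int: fetch_as_uint,       \
        default:      fetch_as_wchar)(ap_ptr)

/* Hypothetical demo: returns the first variadic argument, which the
   caller must pass as a wchar_t. */
static wchar_t first_wchar(int unused, ...)
{
    va_list ap;
    va_start(ap, unused);
    wchar_t w = NEXT_WCHAR(&ap);
    va_end(ap);
    return w;
}
```

For example, first_wchar(0, L'A') should return L'A' whichever branch is selected. On an implementation where wchar_t is promoted, a compiler may still warn about the va_arg in fetch_as_wchar even though that helper is never selected there.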