
typecasting to unsigned in C


int a = -534;
unsigned int b = (unsigned int)a;
printf("%d, %d", a, b);

prints -534, -534

Why is the typecast not taking place?

I expected it to be -534, 534


If I modify the code to

int a = -534;
unsigned int b = (unsigned int)a;
if(a < b)
  printf("%d, %d", a, b);

it's not printing anything... after all, a is less than b??


Solution

  • First, you don't need the cast: the value of a is implicitly converted to unsigned int with the assignment to b. So your statement is equivalent to:

    unsigned int b = a;
    

    Now, an important property of unsigned integral types in C and C++ is that their values always lie in the range [0, max], where max for unsigned int is UINT_MAX (defined in limits.h). If you assign a value that is not in that range, it is converted into that range: if the value is negative, UINT_MAX + 1 is added repeatedly until the result lies in [0, UINT_MAX]. For your code above, it is as if we wrote: unsigned int b = (UINT_MAX + a) + 1. This is not equal to -a (534).
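
    A minimal, self-contained sketch of that rule, using your values of a and b (the assert just re-checks the arithmetic above):

    #include <assert.h>
    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        int a = -534;
        unsigned int b = a;            /* out of range, so UINT_MAX + 1 is added */
        assert(b == UINT_MAX - 533u);  /* i.e. (UINT_MAX + a) + 1 */
        printf("%u\n", b);
        return 0;
    }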

    Note that the above is true whether the underlying representation is two's complement, ones' complement, or sign-magnitude (or any other exotic encoding): the conversion is defined in terms of values, not bit patterns. One can see that with something like:

    signed char c = -1;
    unsigned int u = c;
    printf("%u\n", u);
    assert(u == UINT_MAX);
    

    On a typical two's complement machine with a 4-byte int, c holds the bit pattern 0xff, and u ends up as 0xffffffff. The compiler has to make sure that when the value -1 is assigned to u, it is converted to a value equal to UINT_MAX.

    Now, going back to your code, the printf format string is wrong for b: passing an unsigned int for a %d specifier is undefined behavior, and on your machine it just happens to reinterpret the bits and print -534. You should use %u. When you do, you will find that it prints the value UINT_MAX - 534 + 1 instead of 534.
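
    For instance, here is a corrected version of your program; the exact number printed depends on the width of unsigned int (the comment assumes 32 bits):

    #include <stdio.h>

    int main(void)
    {
        int a = -534;
        unsigned int b = (unsigned int)a;
        printf("%d, %u\n", a, b);   /* -534, 4294966762 with a 32-bit unsigned int */
        return 0;
    }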

    When a and b are used with the comparison operator <, a is converted to unsigned int because b is unsigned int. This, combined with b = a; earlier, means that a < b is false: converted to unsigned int, a is equal to b.
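
    A small sketch of that comparison (the message strings are just illustrative):

    #include <stdio.h>

    int main(void)
    {
        int a = -534;
        unsigned int b = a;
        if (a < b)                        /* a is converted to unsigned int, so this is false */
            puts("never printed");
        if ((unsigned int)a == b)         /* the comparison that actually happens */
            puts("equal as unsigned");    /* this line is printed */
        return 0;
    }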

    Let's say you have a ones' complement machine, and you do:

    signed char c = -1;
    unsigned char uc = c;
    

    Let's say a char (signed or unsigned) is 8 bits on that machine. Then c and uc will store the following values and bit patterns:

    +----+-------+-------------+
    |    | value | bit pattern |
    +----+-------+-------------+
    | c  |  -1   | 11111110    |
    +----+-------+-------------+
    | uc |  255  | 11111111    |
    +----+-------+-------------+
    

    Note that the bit patterns of c and uc are not the same. The compiler must make sure that c has the value -1, and uc has the value UCHAR_MAX, which is 255 on this machine.

    There are more details in my answer to a question here on SO.