
C++ casting char into short


Pardon me for this newbie question. I recently noticed something strange when casting a char to a short. If the char has overflowed, the binary value is prepended with 11111111 when it is cast to short. If the char has not overflowed, it is prepended with 00000000.

For example,

char a = 130;
short b = (short)a;
printf("%hhx\n", a);
printf("%hx\n", b);

prints

82
ff82

While

char a = 125;
short b = (short)a;
printf("%hhx\n", a);
printf("%hx\n", b);

prints

7d
7d

So when a cast is performed, are the variable's type and value checked before deciding which binary pattern it is cast into (that is, choosing between prepending 0xFF and 0x00)? Is there a reason behind this? It seems that always doing (short)a & 0x00FF would be good practice?


Solution

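  • Read up on two's complement, which is how negative numbers are encoded in binary.

    Assuming an 8-bit char and a two's complement architecture, a signed char can hold values from -128 to +127. (Whether plain char is signed is implementation-defined; your output shows that it is signed on your platform.)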

    When you say:

    char a = 130;
    

    That's out of range.

    130 as integer in 32-bit binary is: 00000000 00000000 00000000 10000010

    In Hex, it's: 00 00 00 82. That's where your 82 value is coming from.

    When int(130) is converted to char, it basically just chops off all but the last byte of bits: 10000010.

    Hence char a = <binary:10000010> is -126 in 2's complement arithmetic.

    So when you assign short b = a, you're just assigning -126 to a short.

    No check of the value happens at run time: the conversion simply preserves the numeric value. In a two's complement architecture, when a negative number gets promoted to a larger type, it gets "sign extended". That is, if the most significant bit of the signed char is 1, then when it gets converted to short, the extra byte is filled with 1s as well. So -126 as a 16-bit binary is: 11111111 10000010, or 0xff82.

    Try declaring a as unsigned char and you should get different results: 130 then fits in range, so the short holds 130, which is 0x0082.
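
The difference between the two conversions, and the masking idea from the question, can be sketched as below. This is only an illustration, not a definitive implementation: the helper names (widen_signed, widen_unsigned, low_byte) are made up for this example, and it assumes an 8-bit char, a 16-bit short, and two's complement representation.

```cpp
#include <climits>

// Assumption for this sketch: char is 8 bits and short is 16 bits.
static_assert(CHAR_BIT == 8, "sketch assumes an 8-bit char");
static_assert(sizeof(short) * CHAR_BIT == 16, "sketch assumes a 16-bit short");

// Hypothetical helper: widen a signed 8-bit value to short.
// The numeric value is preserved, so a negative input ends up with
// its upper byte filled with 1s (sign extension).
unsigned short widen_signed(signed char c) {
    short s = c;                             // -126 stays -126
    return static_cast<unsigned short>(s);   // expose the 16-bit pattern
}

// Hypothetical helper: widen an unsigned 8-bit value to short.
// The value is never negative, so the upper byte is filled with 0s.
unsigned short widen_unsigned(unsigned char c) {
    short s = c;                             // 130 stays 130
    return static_cast<unsigned short>(s);
}

// Hypothetical helper: the masking idea from the question.
// Converting to unsigned char first keeps only the raw low byte,
// whether plain char is signed or unsigned on the platform.
unsigned short low_byte(char c) {
    return static_cast<unsigned short>(static_cast<unsigned char>(c));
}
```

Under those assumptions, widen_signed(-126) gives 0xff82, widen_unsigned(130) gives 0x0082, and low_byte applied to a char holding the bit pattern 10000010 gives 0x0082. Masking is only needed when you want the raw byte; if you want the numeric value, sign extension is already doing the right thing.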