I'm learning C by reading books about it. One listing demonstrates some concepts, in this case type casts. After I copied the listing and ran the program, I got output which I don't understand:
    #include <stdio.h>

    int main(){
        float value = 65.78;
        float *flt_ptr = &value;
        char *ch_ptr = (char*) &value;
        printf("type-cast (char*) flt_ptr: %c\n", (char*) flt_ptr); // line 1
        printf("type-cast (char) *flt_ptr: %c\n", (char) *flt_ptr); // line 2
        printf("type-cast (char*) ch_ptr: %c\n", (char*) ch_ptr); // line 3
        printf("type-cast (char) *ch_ptr: %c\n", (char) *ch_ptr); // line 4
        return 0;
    }
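As output I get the following. Lines 1 and 3 print characters that make no sense, because memory addresses in char are senseless and unknown by the ASCII table. Line 2 prints A, which has the converted value 65. So far so good. Line 4 prints a character which has the converted value 92.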
When I convert these values to binary, the first 8 bits (the size of what the pointer is pointing at) of 65 are identical to the first 8 bits of my floating number 65.78. But the first 8 bits of the binary value of 92 are different from the floating number. I don't understand which part of my memory I am reading, and why I'm reading another part of my memory in this printf() line.
I thought every pointer in this program is pointing at the same address, and only the size they're pointing at is different. But then the converted binaries would have to be identical, and they aren't.

(char''') &value; is nonsense and invalid C so the code will not even compile on a standard C compiler.
(char*) flt_ptr and (char*) ch_ptr are nonsense since you cannot pass a pointer to printf where it expects a character through %c. You invoke undefined behavior so anything can happen - the strange output isn't necessarily caused by "memory addresses in char are senseless", but rather because you lied to printf and told it you'd pass a character, then passed something else entirely.
(char*) ch_ptr is nonsense since it casts from a char* to char*.
(char) *ch_ptr is nonsense since it casts from a char to a char.
So you have quite a few conceptual problems here.
That aside, we have to realize the difference between (char)some_float and *(char*)&some_float. Either is valid C but the two examples do very different things.
With (char)some_float, char is treated just like any of the integer types, so it is pretty much the same thing as writing (int)some_float. What happens here is that we tell the compiler to convert our float number to an integer, which is done by discarding the decimals. In this case you will end up with 65. Any integer 65 printed with %c will indeed result in 'A', given the most common symbol tables (ASCII/UTF8). That's what you got when you did (char) *flt_ptr.
But in case of *(char*)&some_float we tell the compiler to take the float address and treat what's stored there as an array of characters, which we can access through a character pointer. The left-most * in my example is dereferencing the first byte of the float. Here you stumbled upon a very special feature of C which only works when we go from pointer-to-anything to pointer-to-character. The purpose of this is to allow us to do hardware-related programming and inspect the raw binary contents of any object, byte by byte. Hence the character type: a character is C's type used for representing a byte.
So we end up with the first byte of the raw binary representation of the float, which will just be gibberish if printed as %c, because on most systems the raw binary will be IEEE 754 floating point with sign bit, exponent and fraction parts. How float numbers are stored in memory is a whole chapter of its own.
This is now turning into a more advanced topic:
In case we do want to print the binary representation of a float with printf, we better use unsigned char since the char type may or may not be a signed integer type depending on compiler.
We may try this code:

    unsigned char *ch_ptr = (unsigned char*) &value;
    for(size_t i=0; i<sizeof(float); i++)
    {
        printf("%.2X ", ch_ptr[i]);
    }

This gives us the output 5C 8F 83 42. 0x5C is the 92 you spotted. The raw binary representation of 65.78 can be conveniently obtained from a site like https://www.h-schmidt.net/FloatConverter/IEEE754.html. The actual number stored isn't 100% accurate since we are dealing with float numbers. But we should be expecting 42838f5c when translated to hex. And those are the numbers we just got, but backwards. Backwards because I used an Intel x86 computer, which uses little endian, meaning the least significant byte is stored at the lowest address. That is why it appears backwards when printed byte by byte starting at the lowest address.
We should note, however, that casting a pointer to a different type in C and then dereferencing it is not OK in most cases. That's an advanced topic where things like alignment and C's underlying type system come into play, as well as issues regarding const correctness etc.