Often in C++, one has a parameter void* user_data that one can use to pass an arbitrary type.
I used this to pass an array of booleans. However, I had a bug where I cast from bool* --> void* --> int* and I got weird results. Here is an example.
#include <iostream>

int main() {
    bool test[2] = { };
    void *ptr = static_cast<void*>(test);
    std::cout << static_cast<bool*>(ptr)[0] << '\n';
    std::cout << static_cast<int*>(ptr)[0] << '\n';
    std::cout << static_cast<int>(test[0]) << '\n';
}
Output:
$ g++ int_bool.cpp
$ ./a.out
0
-620756992
0
Can someone explain to me what the problem is? Normally when I cast from bool to int, there is no problem: false maps to 0 and true maps to 1. Clearly, that's not the case here.
static_cast<int*>(ptr)[0] casts ptr to int* and reads the first element. Since the original array is only 2 bytes long, reading a 4-byte int reads past its end, which invokes undefined behavior (unless int is a 2-byte type on your system). You're also violating the strict aliasing rule by accessing an object of one type through a pointer to a different type, which is UB as well. On top of that, you get UB if the bool array isn't suitably aligned for int. On x86 this usually goes unnoticed because x86 allows unaligned access by default, but on architectures that require aligned access it can crash.
static_cast<int>(test[0]), OTOH, converts test[0] (which is a bool) to int and is a completely valid value conversion.
Update:
The type int* refers to a pointer whose object is 4-bytes long, whereas bool* refers to a pointer whose object is 2-bytes long
No. When you dereference a pointer p, sizeof(*p) bytes are read from memory starting at the address it holds and treated as a value of the pointed-to type. So *bool_ptr will read 1 byte and *int_ptr will read 4 bytes from memory (if bool and int are 1- and 4-byte types respectively).
In your case the bool array contains 2 bytes, so when 4 bytes are read through static_cast<int*>(ptr), 2 bytes inside the array and 2 bytes outside the array are read. If you declared bool test[4] = {}; (or more elements), you'd likely see the int* dereference produce 0, because all 4 bytes it reads belong to your array. It is still UB, though, because the alignment and strict aliasing problems remain.
Now try changing the bool values to nonzero and see what gets printed:
bool test[4] = { true, false, true, false };
You'll quickly realize that casting a pointer to a different pointer type doesn't read the value as the old type and then convert it to the new type, the way a value conversion (i.e. an ordinary cast) does. Instead, the same memory is treated as if it held an object of the new type. This is essentially just a reinterpret_cast, which you can read up on to understand more about this problem.
I don't understand what you are saying about char*. You're saying casting from any type to char* is valid?
Casting from any other object pointer type to char* and reading the bytes through it is valid. Read the question about the strict aliasing rule above:
You can use char* for aliasing instead of your system's word. The rules allow an exception for char* (including signed char and unsigned char). It's always assumed that char* aliases other types.
It's used for things like memcpy, where you copy the bytes representing an object to a different destination:
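#include <cstring>  // for memcpy
bool test[4] = { true, true, true, true };
int v;
// memcpy(destination, source, count): copy the bytes of test into v
memcpy((char*)&v, (char*)test, sizeof v);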
Technically memcpy receives void*; the cast to char* is just used for demonstration.