There are regularly questions on SO about the validity of type-punning operations. For instance, I contributed to a recent one: Is reinterpret_cast from char* to uint32_t* undefined behaviour in CPP?
A question occurred to me while answering it, which I can illustrate with this snippet:
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
struct S {
    alignas(std::uint64_t) alignas(std::uint32_t)
    std::array<std::byte, sizeof(std::uint64_t)> storage = {/* some init values */};
};
int main()
{
    S s;
    std::uint64_t* pi64 = std::start_lifetime_as<std::uint64_t>(s.storage.data());
    // implicitly creating a uint32_t at the same location, ending the lifetime of the uint64_t there
    std::uint32_t* pi32 = std::launder(reinterpret_cast<std::uint32_t*>(s.storage.data()));
    // is there any guarantee that the value representation is unchanged?
    pi64 = std::start_lifetime_as<std::uint64_t>(s.storage.data());
    // is s.storage guaranteed to be unchanged?
}
This is, so far, theoretical code, as I know of no compiler that supports std::start_lifetime_as. But I find it useful for illustrating what I fail to understand.
When defining pi64, I explicitly start the lifetime of a std::uint64_t object whose value representation is, at the byte level, the one I stored in the byte array.
But then I'm implicitly creating an overlapping std::uint32_t which, AFAIU, ends the lifetime of that std::uint64_t object.
What happens to the underlying bytes? What do the standard rules say (I know that unsigned char/std::byte arrays have special rules)?
I'm leaving the original question above, but in light of https://stackoverflow.com/a/61442225/21691539, this snippet is meaningless because std::uint32_t* pi32 = std::launder(reinterpret_cast<std::uint32_t*>(s.storage.data())); is not, contrary to my claim above, a storage reuse.
The meaningful snippet would be:
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
struct S {
    alignas(std::uint64_t) alignas(std::uint32_t)
    std::array<std::byte, sizeof(std::uint64_t)> storage = {/* some init values */};
};
int main() {
    S s;
    // explicitly creating a uint64_t at the storage location
    std::uint64_t* pi64 =
        std::start_lifetime_as<std::uint64_t>(s.storage.data());
    // explicitly reusing storage, thus ending the uint64_t lifetime
    new (s.storage.data()) std::byte[sizeof(std::uint64_t)];
    // implicitly creating a uint32_t at the same location, ending the lifetime
    // of the uint64_t there
    std::uint32_t* pi32 =
        std::launder(reinterpret_cast<std::uint32_t*>(s.storage.data()));
    // explicitly re-creating a uint64_t at the storage location
    // is it sufficient to end the uint32_t lifetime or do I need an explicit
    // storage reuse as above?
    pi64 = std::start_lifetime_as<std::uint64_t>(s.storage.data());
    // is s.storage guaranteed to be unchanged?
}
So the question remains the same. Put another way: is *pi64 guaranteed to be the same before and after the storage reuse?
NB: this snippet may still be incorrect; see the comment before the second std::start_lifetime_as. Tell me in a comment if I need an extra line to end the std::uint32_t lifetime, and I will fix the snippet accordingly.
But then I'm implicitly creating an overlapping std::uint32_t
No, you are not.
You wrote std::launder(reinterpret_cast<std::uint32_t*>(s.storage.data())), which is undefined behavior if there isn't a std::uint32_t object at the pointer fed to std::launder (see Precondition for std::launder).
Neither reinterpret_cast nor std::launder implicitly creates objects, so the only object within the storage is one of type std::uint64_t, making this undefined behavior.
Your example would make sense if you instead wrote:
std::uint32_t* pi32 = std::start_lifetime_as<std::uint32_t>(s.storage.data());
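If the goal is only to read a std::uint32_t value out of the bytes, rather than to have a std::uint32_t object living in the storage, copying the bytes sidesteps the lifetime question entirely. A minimal sketch, assuming hypothetical helpers read_u32 and write_u32 and a Storage alias that are not part of the question's code:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative alias for the byte buffer used in the question.
using Storage = std::array<std::byte, sizeof(std::uint64_t)>;

// Hypothetical helper: copies sizeof(std::uint32_t) bytes starting at
// `offset` into a fresh std::uint32_t. No std::uint32_t object is assumed
// to exist inside `storage`, so std::launder is not needed.
inline std::uint32_t read_u32(const Storage& storage, std::size_t offset = 0)
{
    assert(offset + sizeof(std::uint32_t) <= storage.size());
    std::uint32_t value;
    std::memcpy(&value, storage.data() + offset, sizeof value);
    return value;
}

// Hypothetical helper: copies the bytes of `value` into the storage.
inline void write_u32(Storage& storage, std::uint32_t value, std::size_t offset = 0)
{
    assert(offset + sizeof(std::uint32_t) <= storage.size());
    std::memcpy(storage.data() + offset, &value, sizeof value);
}
```

Accessing the bytes as std::byte is always permitted, which is why this works regardless of which object currently lives in the storage.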
What happens to the underlying bytes?
They remain untouched. [obj.lifetime] p3 explains that for std::start_lifetime_as:
The object representation of [the new object] is the contents of the storage prior to the call to start_lifetime_as. [...] except that the storage is not accessed.
std::start_lifetime_as does not access any of the bytes belonging to the storage; it can be seen purely as an instruction to the compiler.
Therefore, you can chain as many std::start_lifetime_as calls on the same storage as you want, and you will always get the same value for the same type.
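This can be modeled today without library support for std::start_lifetime_as. The sketch below is only an analogy: copy_as is a hypothetical helper that copies bytes with std::memcpy (much like std::bit_cast in C++20) instead of creating objects in place, and reinterpret_chain is likewise an illustrative name. It shows that viewing the same untouched bytes as different types and back yields the same value:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

using Storage = std::array<std::byte, sizeof(std::uint64_t)>;

// Hypothetical helper: byte-wise copy between trivially copyable types of
// equal size. It copies; it does not start any lifetime in place.
template <class To, class From>
To copy_as(const From& from)
{
    static_assert(sizeof(To) == sizeof(From));
    static_assert(std::is_trivially_copyable_v<To> &&
                  std::is_trivially_copyable_v<From>);
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Views the bytes as one 64-bit value, then as two 32-bit values, then as a
// 64-bit value again. The bytes are never modified along the way.
inline std::uint64_t reinterpret_chain(const Storage& storage)
{
    const auto first  = copy_as<std::uint64_t>(storage);
    const auto halves = copy_as<std::array<std::uint32_t, 2>>(storage);
    const auto again  = copy_as<std::uint64_t>(halves);
    assert(first == again);
    return again;
}
```

Because no step writes to the storage, the final 64-bit value has exactly the representation the buffer started with.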
new wipes out the bytes, meaning that the value of the std::uint64_t is not preserved, and the subsequent call to std::start_lifetime_as is UB.
This happens because objects created by new have dynamic storage duration ([expr.new] p10), which means:
When storage for an object with automatic or dynamic storage duration is obtained, the bytes comprising the storage for the object have the following initial value:
- If the object has dynamic storage duration, or [...], the bytes have indeterminate values; [...]
In practice, the compiler will not forcefully scramble the bytes; it just means that from then on the compiler considers the bytes to be indeterminate, even if they had some value prior to the new expression and the memory was not overwritten.
The call to std::start_lifetime_as is UB because it determines the value of the result as if through std::bit_cast ([obj.lifetime] p3), and std::bit_cast is UB when producing an indeterminate result ([bit.cast] p2).
Note: The call to std::start_lifetime_as being UB seems like an unintentional consequence of forwarding the semantics to std::bit_cast; I've reported this defect at LWG4168.
Regardless, the use of new within storage where a std::uint64_t lived would kill the std::uint64_t, and that's the core of this Q&A.
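If the value must survive such a storage reuse, one option is to carry the bytes across it by hand. A minimal sketch, assuming hypothetical helpers reuse_keeping_bytes and load_u64 and a buffer of exactly sizeof(std::uint64_t) bytes; this is one possible workaround, not something the standard prescribes:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

constexpr std::size_t kSize = sizeof(std::uint64_t);

// Hypothetical helper: reuses `storage` (which must point to kSize bytes)
// through a placement new-expression while preserving the old contents.
inline std::byte* reuse_keeping_bytes(std::byte* storage)
{
    assert(storage != nullptr);
    std::byte saved[kSize];
    std::memcpy(saved, storage, kSize);                // copy the representation out first
    std::byte* fresh = new (storage) std::byte[kSize]; // bytes are now indeterminate per the standard
    std::memcpy(fresh, saved, kSize);                  // give them determinate values again
    return fresh;
}

// Hypothetical helper: reads a std::uint64_t value out of kSize bytes.
inline std::uint64_t load_u64(const std::byte* bytes)
{
    std::uint64_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

After the second std::memcpy the bytes hold determinate values again, so reading them back yields the value that was stored before the reuse.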