In the C++ standard the meaning of memcpy is taken from C (I will quote C++23 here but I don't believe the wording has changed from earlier versions):
The contents and meaning of the header <cstring> are the same as the C standard library header <string.h>.
And in the C23 standard, we find in 7.26.2.1:
The memcpy function copies n characters from the object pointed to by s2 [second argument] into the object pointed to by s1 [first argument].
However, frequently, in C++, we see uses such as:
constexpr std::size_t N = sizeof(T);
char buf[N];
T obj;
std::memcpy(buf, &obj, N);
std::memcpy(&obj, buf, N);
This is taken from an example from the C++ standard.
In this code, the "buf"-pointer passed to memcpy technically points to the first element of buf, as it is produced by array-to-pointer decay. Strictly speaking, the object from and to which memcpy copies should therefore be the first element, not the entire array. Might it be necessary to use &buf here instead to be correct and completely avoid UB? Is this a standard defect?
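To make the distinction concrete, here is a minimal sketch of the two candidate arguments and their types. The aliases Decayed and WholeArray are illustrative names, and N is fixed to 8 as a stand-in for sizeof(T). It only shows what each expression points to; it does not settle which one the wording requires.

```cpp
#include <cstddef>
#include <type_traits>

constexpr std::size_t N = 8;  // illustrative stand-in for sizeof(T)
char buf[N];

// Passing `buf` triggers array-to-pointer conversion: a pointer to buf[0].
using Decayed = std::decay_t<decltype(buf)>;
// Passing `&buf` yields a pointer to the whole array object.
using WholeArray = decltype(&buf);

static_assert(std::is_same_v<Decayed, char*>);
static_assert(std::is_same_v<WholeArray, char (*)[N]>);

// Both convert implicitly to void*, so std::memcpy accepts either one.
static_assert(std::is_convertible_v<Decayed, void*>);
static_assert(std::is_convertible_v<WholeArray, void*>);
```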
Note: This is a language-lawyer-tagged question. It's primarily about the wording of the standard and the behavior of the "abstract machine" rather than whether this works in the real world of C++ programs as produced by real compilers.
You're right that in a pedantic reading of the standard, we would need to take the address of the array instead of passing buf because array-to-pointer conversion obtains a pointer to the first element. std::memcpy(buf, &obj, N); would therefore write past the end of the first char object in buf, which is not okay if it writes into the object pointed to by s1.
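For illustration, the pedantic variant might look like the sketch below, where both calls pass a pointer to the whole array. The struct T and the function name round_trip_via_array_address are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

struct T { int a; double b; };  // hypothetical trivially copyable type
static_assert(std::is_trivially_copyable_v<T>);

T round_trip_via_array_address(T obj) {
    constexpr std::size_t N = sizeof(T);
    char buf[N];

    // The array and its first element start at the same address;
    // only the pointed-to object differs between &buf and buf.
    assert(static_cast<void*>(&buf) == static_cast<void*>(buf));

    std::memcpy(&buf, &obj, N);  // s1 points to the whole array object
    obj = T{};                   // overwrite obj in between
    std::memcpy(&obj, &buf, N);  // s2 points to the whole array object
    return obj;                  // holds the original value again
}
```

On real implementations this behaves the same as the version that passes buf; the difference exists only in the wording of the standard.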
Another bit of broken wording in this area is that we never clearly specify that you can use an unsigned char*/char* to "navigate" the bytes of an object, and P1839R7 Accessing object representations deals with that problem.
Currently, memcpy just "magically" copies the object even though the user couldn't implement that.
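As a sketch of what such a user implementation would look like, consider the hand-written byte copy below. The name byte_copy is hypothetical. The code works on real compilers, but the pointer arithmetic over the object representation is exactly the part that the current wording does not clearly define.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical user-written counterpart of memcpy for non-overlapping ranges.
void* byte_copy(void* dst, const void* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);

    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);

    for (std::size_t i = 0; i < n; ++i) {
        // d[i] and s[i] step through the bytes of the objects. Under the
        // current wording it is unclear that d and s point into an array
        // of unsigned char at all, so this arithmetic is questionable.
        d[i] = s[i];
    }
    return dst;
}
```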
In general, the wording in these areas of both the C and C++ standard is pretty defective.
However, this doesn't really matter in practice.
std::memcpy(buf, &obj, N); is expected to work by users, compilers consider it to be valid, and disallowing it would break immeasurable amounts of code.
The C++ standard generally only specifies additional semantics on top of C functions, like in [cstring.syn] p3; it doesn't fully redefine them.
Therefore, this wording would have to be fixed on the C side, and the process is explained at https://www.open-std.org/jtc1/sc22/wg14/www/contributing.html
What we probably want to say is that memcpy copies bytes of storage from the source into the destination, and that this can access bytes not just from a single object but also from the surrounding array, if any.
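The "surrounding array" case is the everyday pattern sketched below, where the pointers designate single elements but the copy spans several of them. The function name copy_ints is hypothetical, and the sketch assumes non-overlapping ranges with at least count elements available on each side.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies `count` ints from the array containing *src into the array
// containing *dst. Each pointer designates one element, yet memcpy is
// expected to read and write the bytes of the following elements too.
void copy_ints(int* dst, const int* src, std::size_t count) {
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, count * sizeof(int));
}
```

A call such as copy_ints(b, a, 4) with two int[4] arrays passes pointers to a[0] and b[0], so a strict "object pointed to" reading would cover only the first element of each.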