Imagine that I am implementing inserting and deleting within a small vector. (If this is C++ then assume further that the vector elements are trivially copyable.)
When inserting into the middle of this vector (assuming that I have ascertained that no reallocation is necessary), I know that the copy to make space for a new element must move bytes to higher addresses. Similarly, when implementing erase in the middle of this vector, I know that the copy to eliminate the erased object must move bytes to lower addresses.
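For concreteness, here is a minimal sketch of the two operations written with memmove. The Bucket type, its capacity of 8, the int element type, and the function names are all assumptions made for illustration; they are not taken from any real hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity bucket holding trivially copyable elements.
struct Bucket {
    static constexpr std::size_t kCapacity = 8;
    int data[kCapacity] = {};
    std::size_t size = 0;
};

// Insert value at index pos. The tail [pos, size) moves one slot towards
// higher addresses, so destination > source.
inline void bucket_insert(Bucket& b, std::size_t pos, int value) {
    assert(b.size < Bucket::kCapacity && pos <= b.size);
    std::memmove(b.data + pos + 1, b.data + pos, (b.size - pos) * sizeof(int));
    b.data[pos] = value;
    ++b.size;
}

// Erase the element at index pos. The tail [pos + 1, size) moves one slot
// towards lower addresses, so destination < source.
inline void bucket_erase(Bucket& b, std::size_t pos) {
    assert(pos < b.size);
    std::memmove(b.data + pos, b.data + pos + 1,
                 (b.size - pos - 1) * sizeof(int));
    --b.size;
}
```

In both functions the relative order of source and destination is fixed by the call site, which is why a runtime direction check inside memmove looks redundant here.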
memmove will sort this out, but it will spend time comparing the supplied addresses so as to choose a 'move up' or 'move down' loop. I expect my vectors to be quite small. (In reality they are the buckets in an open addressing, linear probing, Robin Hood hash table.) Thus I am interested in optimizing the entire data move operation. My question is, can I eliminate that initial memmove start-up overhead? Ideally, I would like to achieve such an optimization across the big three platforms (Windows, Mac and Linux).
No. There is no function in the C Standard Library that lets you specify the copy direction.
But (C++ version)
If you have access to C++, you can use the template functions std::copy and std::copy_backward to specify direction at compile time.
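As a rough sketch of how that might look for a small vector, the helpers below open and close a one-element gap. The names shift_up and shift_down and their signatures are made up for this example. std::copy_backward is the right choice when the destination lies above the source (insert), and std::copy when it lies below (erase).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Open a one-element gap at index pos: the tail [pos, size) moves up by one.
// The caller must guarantee room for size + 1 elements.
// Destination is above source, so elements are copied last-to-first.
template <typename T>
void shift_up(T* data, std::size_t size, std::size_t pos) {
    assert(pos <= size);
    std::copy_backward(data + pos, data + size, data + size + 1);
}

// Close the gap at index pos: the tail [pos + 1, size) moves down by one.
// Destination is below source, so elements are copied first-to-last.
template <typename T>
void shift_down(T* data, std::size_t size, std::size_t pos) {
    assert(pos < size);
    std::copy(data + pos + 1, data + size, data + pos);
}
```

One caveat: for trivially copyable element types, some standard library implementations lower these calls to memmove internally, so it is worth inspecting the generated code to confirm that the direction check has actually gone away.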
But (possible compiler magic)
You may be able to copy/paste each platform's implementation into your own code and rely on the compiler to optimize out the direction checks when the compiler can reason about them at compile time.
But (since you're copying anyway)
If you decide to copy each platform's implementation, you might as well split memmove into separate my_memcpy_forward and my_memcpy_backward functions that omit the runtime check.
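A minimal, portable sketch of that split is shown below. It uses plain byte loops rather than any platform's real implementation, so it only illustrates the shape of the two functions; a production version would copy in wider units. The signatures are an assumption modelled on memcpy.

```cpp
#include <cstddef>

// Copies n bytes in ascending address order.
// Safe for overlapping ranges only when dst is below src (the erase case).
inline void* my_memcpy_forward(void* dst, const void* src, std::size_t n) {
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
    return dst;
}

// Copies n bytes in descending address order.
// Safe for overlapping ranges only when dst is above src (the insert case).
inline void* my_memcpy_backward(void* dst, const void* src, std::size_t n) {
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = n; i > 0; --i) {
        d[i - 1] = s[i - 1];
    }
    return dst;
}
```

Neither function checks the direction, so calling the wrong one on overlapping ranges silently corrupts the data. Note also that an optimizer may recognize loops like these and substitute a library call, so check the generated code before relying on them.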
As always
Premature optimization is the root of all evil, so profile your code to make sure this optimization even matters for your market needs.