Are there optimized versions of memmove for when I know the direction?

Imagine that I am implementing inserting and deleting within a small vector. (If this is C++ then assume further that the vector elements are trivially copyable.)

When inserting into the middle of this vector (assuming that I have ascertained that no reallocation is necessary), I know that the copy to make space for a new element must move bytes to higher addresses. Similarly, when implementing erase in the middle of this vector, I know that the copy to eliminate the erased object must move bytes to lower addresses.

memmove will sort this out, but it will spend time comparing the supplied addresses so as to choose a 'move up' or 'move down' loop. I expect my vectors to be quite small. (In reality they are the buckets in an open addressing, linear probing, Robin Hood hash table.) Thus I am interested in optimizing the entire data move operation. My question is, can I eliminate that initial memmove start-up overhead? Ideally, I would like to achieve such an optimization across the big three platforms (Windows, Mac and Linux).

Solution

No. There is no function in the C Standard Library that copies overlapping memory ranges, when you know the copy direction at compile time, without runtime checks for the correct copy direction.

But (C++ version)

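If you have access to C++, you can use the template functions std::copy and std::copy_backward to specify the direction at compile time. std::copy walks from the first element to the last, which is the safe order when the destination lies below the source (erase). std::copy_backward walks from the last element to the first, which is the safe order when the destination lies above the source (insert).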
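As a minimal sketch of how that could look for the small-vector case in the question: the helper names insert_at and erase_at, their signatures, and the raw pointer-plus-size representation are all assumptions made for illustration, not part of any library. Note also that many standard library implementations forward std::copy and std::copy_backward to memmove for trivially copyable types, so whether the runtime check really disappears is something to confirm in the generated code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helpers for a fixed-capacity bucket of trivially copyable T.
// 'data' points at storage with room for at least size + 1 elements.

// Insert 'value' at index 'pos'. The tail moves to HIGHER addresses,
// so it must be copied starting from its last element.
template <typename T>
void insert_at(T* data, std::size_t size, std::size_t pos, T value) {
    assert(pos <= size);
    std::copy_backward(data + pos, data + size, data + size + 1);
    data[pos] = value;
}

// Erase the element at index 'pos'. The tail moves to LOWER addresses,
// so it must be copied starting from its first element.
template <typename T>
void erase_at(T* data, std::size_t size, std::size_t pos) {
    assert(pos < size);
    std::copy(data + pos + 1, data + size, data + pos);
}
```

The caller is assumed to track the element count itself and to adjust it after each call. Taking value by value in insert_at avoids trouble if the argument refers to an element that is about to be shifted.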

But (possible compiler magic)

You may be able to copy/paste each platform's implementation into your own code and rely on the compiler to optimize out the direction checks when the compiler can reason about them at compile time.

But (since you're copying anyway)

If you decide to copy each platform's implementation, you might as well split memmove into two functions, my_memcpy_forward and my_memcpy_backward, that omit the runtime check.
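To make the idea concrete, here is a rough sketch of what such a pair might look like. These are plain byte loops written for illustration, not any platform's tuned implementation. The direction is only checked by an assert, which compiles away when NDEBUG is defined. An optimizing compiler may still recognize loops like these and replace them with a library call, so it is worth inspecting the generated code.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative only: fixed-direction copies for overlapping ranges.

// Copies low-to-high. Safe when dst is at or below src (e.g. erase).
inline void* my_memcpy_forward(void* dst, const void* src, std::size_t n) {
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    assert(d <= s);  // debug-only sanity check on the caller's promise
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
    return dst;
}

// Copies high-to-low. Safe when dst is at or above src (e.g. insert).
inline void* my_memcpy_backward(void* dst, const void* src, std::size_t n) {
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    assert(d >= s);  // debug-only sanity check on the caller's promise
    for (std::size_t i = n; i > 0; --i) {
        d[i - 1] = s[i - 1];
    }
    return dst;
}
```

For very small buckets a simple loop like this may already be competitive, but that is an assumption to measure rather than rely on.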

As always

Premature optimization is the root of all evil, so profile your code to make sure this optimization even matters for your needs.