c++ undefined-behavior copy-elision c++-coroutine

Unexpected memcpy on uncopyable & unmovable type when using co_await


Preamble

This section describes what I am trying to do with the code; skip to the next section to see the actual issue.

I want to use coroutines in an embedded system, where I can't afford too many dynamic allocations. Therefore, I am trying the following: I have non-copyable, non-movable awaitable types for the various queries to peripherals. When querying a peripheral, I use something like auto result = co_await Awaitable{params}. The constructor of the awaitable prepares the request to the peripheral, registers its internal buffer to receive the reply, and registers its ready flag in the promise. The coroutine is then suspended.

Later, the buffer will be filled, and the ready flag will be set to true. After this, the coroutine knows that it can be resumed, which causes the awaitable to copy the result out of the buffer before being destroyed.

The awaitable is non-copyable and non-movable to force guaranteed copy elision everywhere, so that I can be sure that the pointers to buffer and ready remain valid until the awaitable has been awaited (at least that was the plan...)

The issue

I am encountering an issue with ARM GCC 11.3 in the following code:

#include <cstring>
#include <coroutine>

struct AwaitableBase {
    AwaitableBase() = default;
    AwaitableBase(const AwaitableBase&) = delete;
    AwaitableBase(AwaitableBase&&) = delete;

    AwaitableBase& operator=(const AwaitableBase&) = delete;
    AwaitableBase& operator=(AwaitableBase&&) = delete;

    
    char buffer[65];
};

struct task {
    struct promise_type {
        bool* ready_ptr;

        task get_return_object() { return {}; }
        std::suspend_never initial_suspend() noexcept { return {}; }
        std::suspend_always final_suspend() noexcept { return {}; }
        void return_void() {}
        void unhandled_exception() {}
    };
};

struct Awaitable{
    AwaitableBase base;
    bool ready{false};

    bool await_ready() {return false;}
    void await_suspend(std::coroutine_handle<task::promise_type> handle)
    {
        handle.promise().ready_ptr = &ready;
    }
    int await_resume() { return 2; }
};

AwaitableBase make_awaitable_base()
{
    return AwaitableBase{};
}


task example()
{
    co_await Awaitable{make_awaitable_base()};
}

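When compiling this with ARM GCC 11.3 without any optimizations, the generated code contains a memcpy call that copies the AwaitableBase object. In the excerpt below (from Godbolt), make_awaitable_base() constructs the object at offset 87 from the base pointer loaded from [r7, #4] (presumably the coroutine frame), and memcpy then copies its 65 bytes to offset 21, which is the address that await_ready() is called on: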

ldr     r3, [r7, #4]
adds    r3, r3, #87
mov     r0, r3
bl      make_awaitable_base()
ldr     r2, [r7, #4]
ldr     r3, [r7, #4]
add     r0, r2, #21
adds    r3, r3, #87
movs    r2, #65
mov     r1, r3
bl      memcpy
ldr     r3, [r7, #4]
movs    r2, #0
strb    r2, [r3, #86]
ldr     r3, [r7, #4]
adds    r3, r3, #21
mov     r0, r3
bl      Awaitable::await_ready()

This breaks my code, as I am relying on the fact that the object cannot be moved or copied. It was my understanding that making a type non-copyable and non-movable should prevent its objects from being copied with memcpy.
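For comparison, the expectation can be checked outside of a coroutine, where C++17 guaranteed copy elision applies to both the return statement and the aggregate initialization. The following is only a minimal sketch: Pinned, Holder, make_pinned and address_is_stable are hypothetical names that mirror AwaitableBase, Awaitable and make_awaitable_base, with an added self pointer that records the address seen by the constructor.

```cpp
#include <cassert>

// Hypothetical stand-in for AwaitableBase: non-copyable, non-movable,
// and it remembers the address it was constructed at.
struct Pinned {
    Pinned() : self(this) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
    Pinned& operator=(const Pinned&) = delete;
    Pinned& operator=(Pinned&&) = delete;

    const Pinned* self;   // address seen by the constructor
    char buffer[65] = {};
};

// Hypothetical stand-in for Awaitable: an aggregate holding a Pinned member.
struct Holder {
    Pinned base;
    bool ready{false};
};

// Returns a prvalue; since C++17 no copy or move constructor is needed here.
Pinned make_pinned() { return Pinned{}; }

// True if the object was constructed directly in its final location.
bool address_is_stable() {
    Holder h{make_pinned()};          // prvalue initializes h.base in place
    return h.base.self == &h.base;
}
```

If address_is_stable() returns true in plain code like this, the object was never relocated, which suggests that the relocation seen above is specific to how the co_await operand is handled.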

Question(s)

How can I work around this? Is it a bug in the compiler, or am I misunderstanding something about guaranteed copy elision? Is it undefined behavior to rely on the address of the temporary not changing for the duration of the co_await expression?


Solution

  • As pointed out in the comments, this is a GCC bug: a prvalue created by constructing an object inside a co_await expression is erroneously treated as a trivially copyable aggregate, so GCC creates a temporary first and then copies it into place with memcpy.

    The workaround is to never construct a non-trivial object directly in a co_await expression. For example, co_await Class{ ... }, co_await function_call(Class{ ... }) and co_await Class{ ... }.member_function() are all prone to this bug.

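    You can replace these with co_await [&]{ return ...; }();. The object is then constructed inside the lambda body, and the only object created directly in the co_await expression is the closure itself (in effect co_await lambda_type(captured_references...)()), which holds nothing but references and can safely be copied with memcpy.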

    You might want to wrap this in a macro, #define CO_AWAIT(...) co_await [&]() -> decltype(auto) { return __VA_ARGS__ ; }(), so that any remaining use of lowercase co_await in your code base can be found with a simple text search and replaced, eliminating this bug completely.