c++ functional-programming c++20 c++-coroutine

Is there any guidance on how to choose what each component of a coroutine should do?


(... because there are so many degrees of freedom that I feel disoriented!)

For the purpose of understanding coroutines, I've implemented a generator which, given a unary function f of type T(T) and initial value x of type T, yields the infinite sequence of applications of f to x, i.e. the range [x, f(x), f(f(x)), f(f(f(x))), ...], via a range interface. (In fact, I was inspired by Haskell's iterate, hence the name I chose and the tag.)

It wasn't that hard, as there are examples of generators everywhere, from cppreference to Josuttis' book.

However, then I started to play with it by moving things around, and I've realized that the degrees of freedom in implementing a coroutine are far more than an example like the one I mentioned needs. And since the functionality that the coroutine in my example provides is actually not a toy¹, I feel like even for production code it is a bit hard to decide how to "distribute work" across the various participants, namely the promise, the interface, the awaiters, and the body of the coroutine. Therefore I wanted to know if there are some guidelines, or if C++23 will bring a bit of clarity to this topic.

To give another hint as to why the whole matter confuses me, it's not clear to me how the "persons" that would write each of those participants would relate to each other. Would they all be the same person?² For instance, I don't see how the writer of the coroutine's interface could be anyone other than the writer of the coroutine's promise.

Going to the concrete example, here is the version I originally came up with, live on Compiler Explorer. The main bits are these:

But then I thought: yield_value is a non-const member function of promise_type, and it stores stuff in it, but anything else having access to the handle can retrieve the promise and do the same; and one thing that has access to the handle is the await_suspend method of the Awaiter returned by yield_value, so an alternative approach is to construct an Awaiter with the value to be set, and let its await_suspend do the job. That's how I came up with the second solution, where

I see that the two solutions do look a lot alike, but I wouldn't be sure that they are indeed the same thing (the generated code is not identical down to the bit, after all).

But then I tried one more change, along the following line of reasoning. The body of the coroutine is just doing some work after every suspension; but what runs right after a suspension? await_suspend of the Awaiter! So why not move the work there? The Awaiter doesn't have access to the parameters of the coroutine, so how can it do the work of the coroutine without f and x? The solution is suggested by the standard: we can make the promise_type accept the same arguments as the coroutine and store them locally, so that the Awaiter can retrieve them. That's how I came up with the third solution, where

To my inexperienced eyes, this does look fairly different from where I started and even from where I got with the first re-elaboration (and the assembly is visibly, even if not substantially, shorter).


(¹) It might look like a toy example at first (and maybe it needs to be made more generic and robust, and maybe I should see what happens with a move-only argument; whatever), but it isn't. Think of how easy it makes it to traverse a tree while looking for something:

auto nameOfOldestForeFather = *(iterate(getFather, me) | take_while(alive) | transform(name)).end();

or more generally

for_each(iterate(getParent, leafNode) | take_while(someCond), someWork)

(²) I say "person", but I'm referring to any group of people which work together on the same thing, if needed. As in, if I write a function/class/whatever for doing xyz and somebody helps me to any extent, we are still the same "collective brain" that works on that thing.


Solution

  • All of this seems to be confusing two distinct concepts: the coroutine function and what I will call the "coroutine machinery". A coroutine function is just any function that uses one of the co_* keywords.

    The coroutine machinery is the set of support objects, and their implementation, used to mediate between a coroutine function and code trying to interact with that function. The coroutine machinery includes, but is not limited to, the promise type, the corresponding future type returned by a coroutine function, and any helper types those two rely on (awaiters, for example).

    Note that I say "a coroutine function", not "the coroutine function". This is because coroutine machinery is not intended to be coupled with any specific coroutine function. Coroutine machinery does not define what a coroutine does; it defines how that coroutine talks to and interacts with the outside world.

    A coroutine function defines which machinery it is associated with by its function signature. Your coroutine uses the IterCoro machinery because it uses that as its return type.

    This is not an accident; it is an intentional part of the design. A function signature describes how to interact with a function, but not what it does. Knowing that a function takes an int and returns an int tells you how to talk to it (passing an int and receiving one), but it tells you nothing about what that function will do.

    The same goes for coroutine machinery; it defines how you can interact with any coroutine which uses that machinery. But it does not say what that coroutine will do. Coroutine machinery is meant to apply to a family of coroutine functions that all share a similar coroutine interface.

    IterCoro represents a particular interface, a specific way of talking to coroutine functions, but only to functions which conform to the requirements it places on the coroutine.

    Specifically, it expects the coroutine function to:

    1. Be a yielding generator;
    2. Yield only values of type int (or types convertible to it); and
    3. Never terminate by flowing off the end of its function.

    A coroutine function which conforms to this interface may use the IterCoro machinery. Things which your IterCoro does not support include:

    1. Using co_await within the coroutine. That is, pausing execution of the coroutine and scheduling its resumption with some external code.
    2. Using co_return within the coroutine.
    3. Supplying a value to the coroutine when it uses co_yield. Yes, that's a thing you can do.

    And that's fine; no coroutine machinery should be expected to work with every possible coroutine. The machinery has an expected interface, and a coroutine that does not conform to that interface should not be using that machinery.

    But the coroutine machinery should not define what that coroutine function is actually doing. An IterCoro is a coroutine that infinitely generates values; how it does so is not IterCoro's business. If you put the business logic of a coroutine function within the coroutine machinery it happens to use, you are using the feature wrong.