Understanding the point of supply blocks (on-demand supplies)


I'm having trouble getting my head around the purpose of supply {…} blocks/the on-demand supplies that they create.

Live supplies (that is, the types that come from a Supplier and get new values whenever that Supplier emits a value) make sense to me – they're a version of asynchronous streams that I can use to broadcast a message from one or more senders to one or more receivers. It's easy to see use cases for responding to a live stream of messages: I might want to take an action every time I get a UI event from a GUI interface, or every time a chat application broadcasts that it has received a new message.

But on-demand supplies don't make a similar amount of sense. The docs say that

An on-demand broadcast is like Netflix: everyone who starts streaming a movie (taps a supply), always starts it from the beginning (gets all the values), regardless of how many people are watching it right now.

Ok, fair enough. But why/when would I want those semantics?

The examples also leave me scratching my head a bit. The Concurrency page currently provides three examples of a supply block, but two of them just emit the values from a for loop. The third is a bit more detailed:

my $bread-supplier = Supplier.new;
my $vegetable-supplier = Supplier.new;
 
my $supply = supply {
    whenever $bread-supplier.Supply {
        emit("We've got bread: " ~ $_);
    };
    whenever $vegetable-supplier.Supply {
        emit("We've got a vegetable: " ~ $_);
    };
}
$supply.tap( -> $v { say "$v" });
 
$vegetable-supplier.emit("Radish");   # OUTPUT: «We've got a vegetable: Radish␤» 
$bread-supplier.emit("Thick sliced"); # OUTPUT: «We've got bread: Thick sliced␤» 
$vegetable-supplier.emit("Lettuce");  # OUTPUT: «We've got a vegetable: Lettuce␤» 

There, the supply block is doing something. Specifically, it's reacting to the input of two different (live) Suppliers and then merging them into a single Supply. That does seem fairly useful.

… except that if I want to transform the output of two Suppliers and merge their output into a single combined stream, I can just use

my $supply = Supply.merge: 
                 $bread-supplier.Supply.map(    { "We've got bread: $_" }),
                 $vegetable-supplier.Supply.map({ "We've got a vegetable: $_" });

And, indeed, if I replace the supply block in that example with the map/merge above, I get exactly the same output. Further, neither the supply block version nor the map/merge version produces any output if the tap is moved below the calls to .emit, which shows that the "on-demand" aspect of supply blocks doesn't really come into play here.

At a more general level, I don't believe the Raku (or Cro) docs provide any examples of a supply block that isn't either in some way transforming the output of a live Supply or emitting values based on a for loop or Supply.interval. None of those seem like especially compelling use cases, other than as a different way to transform Supplys.

Given all of the above, I'm tempted to mostly write off the supply block as a construct that isn't all that useful, other than as a possible alternate syntax for certain Supply combinators. However, I have it on fairly good authority that

while Supplier is often reached for, many times one would be better off writing a supply block that emits the values.

Given that, I'm willing to hazard a pretty confident guess that I'm missing something about supply blocks. I'd appreciate any insight into what that might be.


Solution

  • Given you mentioned Supply.merge, let's start with that. Imagine it wasn't in the Raku standard library, and we had to implement it. What would we have to take care of in order to reach a correct implementation? At least:

    1. Produce a Supply result that, when tapped, will...
    2. Tap (that is, subscribe to) all of the input supplies.
    3. When one of the input supplies emits a value, emit it to our tapper...
    4. ...but make sure we follow the serial supply rule, which is that we only emit one message at a time; it's possible that two of our input supplies will emit values at the same time from different threads, so this isn't an automatic property.
    5. When all of our supplies have sent their done event, send the done event also.
    6. If any of the input supplies we tapped sends a quit event, relay it, and also close the taps of all of the other input supplies.
    7. Make very sure we don't have any odd races that will lead to breaking the supply grammar emit* [done|quit].
    8. When a tap on the resulting Supply we produce is closed, be sure to close the tap on all (still active) input supplies we tapped.

    Good luck!

    So how does the standard library do it? Like this:

    method merge(*@s) {
        @s.unshift(self) if self.DEFINITE;  # add if instance method
        # [I elided optimizations for when there are 0 or 1 things to merge]
        supply {
            for @s {
                whenever $_ -> \value { emit(value) }
            }
        }
    }
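
    As a rough illustration of point 5 in the list above, the sketch below reuses the same supply-block shape in a standalone sub and drives it with two plain Suppliers. The names my-merge, $left, $right and @events are hypothetical and only there for the demonstration. It assumes everything runs on a single thread, so each emit is delivered before the next statement runs.

```raku
# Hypothetical standalone version of the merge shown above.
sub my-merge(*@supplies) {
    supply {
        for @supplies {
            whenever $_ -> \value { emit(value) }
        }
    }
}

my $left  = Supplier.new;
my $right = Supplier.new;
my @events;

my $tap = my-merge($left.Supply, $right.Supply).tap:
    { @events.push("value $_") },
    done => { @events.push('done') };

$left.emit(1);
$right.emit(2);
$left.done;     # only one input has finished, so no done is relayed yet
$right.emit(3);
$right.done;    # every input has finished, so done is relayed now

say @events;    # [value 1 value 2 value 3 done]
```

    None of the bookkeeping for "have all inputs finished?" appears in the code; it falls out of how whenever works inside a supply block.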
    

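    The point of supply blocks is to greatly ease correctly implementing reusable operations over one or more Supplys. The key risks it aims to remove are:

    1. Not correctly handling concurrently arriving messages, potentially leading to data corruption.
    2. Not correctly managing subscriptions, potentially leading to resource leaks.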

    The second is easy to overlook, especially when working in a garbage-collected language like Raku. Indeed, if I start iterating some Seq and then stop doing so before reaching the end of it, the iterator becomes unreachable and the GC eats it in a while. If I'm iterating over lines of a file and there's an implicit file handle there, I risk the file not being closed in a very timely way and might run out of handles if I'm unlucky, but at least there's some path to it getting closed and the resources released.

    Not so with reactive programming: the references point from producer to consumer, so if a consumer "stops caring" but hasn't closed the tap, then the producer will retain its reference to the consumer (thus causing a memory leak) and keep sending it messages (thus doing throwaway work). This can eventually bring down an application. The Cro chat example that was linked illustrates this:

    my $chat = Supplier.new;
    
    get -> 'chat' {
        web-socket -> $incoming {
            supply {
                whenever $incoming -> $message {
                    $chat.emit(await $message.body-text);
                }
                whenever $chat -> $text {
                    emit $text;
                }
            }
        }
    }
    

    What happens when a WebSocket client disconnects? The tap on the Supply we returned using the supply block is closed, causing an implicit close of the taps of the incoming WebSocket messages and also of $chat. Without this, the subscriber list of the $chat Supplier would grow without bound, and in turn keep alive an object graph of some size for each previous connection too.

    Thus, even in this case where a live Supply is very directly involved, we'll often have subscriptions to it that come and go over time. On-demand supplies are primarily about resource acquisition and release; sometimes, that resource will be a subscription to a live Supply.
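
    To make that acquire/release cycle concrete, here is a minimal sketch that replaces the WebSocket with a hypothetical chat-feed sub, so each call stands in for one connected client. The names chat-feed, @alice and @bob are made up for illustration, and the sketch assumes single-threaded use, where a Supplier delivers each emit before returning.

```raku
my $chat = Supplier.new;   # the live, shared source

# Hypothetical stand-in for one client connection: every tap of the returned
# Supply acquires its own subscription to $chat.
sub chat-feed() {
    supply {
        whenever $chat.Supply -> $text {
            emit $text;
        }
    }
}

my (@alice, @bob);
my $alice-tap = chat-feed().tap: { @alice.push($_) };
my $bob-tap   = chat-feed().tap: { @bob.push($_) };

$chat.emit('hello');
$alice-tap.close;              # Alice "disconnects": her tap on $chat is closed too
$chat.emit('anyone there?');

say @alice;   # [hello]
say @bob;     # [hello anyone there?]
```

    Closing the outer tap is the only cleanup the caller performs; the subscription to $chat is released as a consequence.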

    A fair question is whether we could have written this example without a supply block. And yes, we can; this probably works:

    my $chat = Supplier.new;
    
    get -> 'chat' {
        web-socket -> $incoming {
            my $emit-and-discard = $incoming.map(-> $message {
                    $chat.emit(await $message.body-text);
                    Supply.from-list()
                }).flat;
            Supply.merge($chat, $emit-and-discard)
        }
    }
    

    Note that it takes some effort in Supply-space to map into nothing. I personally find that less readable - and this didn't even avoid a supply block, it's just hidden inside the implementation of merge. Trickier still are cases where the number of supplies that are tapped changes over time, such as in recursive file watching, where new directories to watch may appear. I don't really know how I'd express that in terms of combinators that appear in the standard library.
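
    By contrast, a supply block can pick up new things to tap as it goes, because a whenever may be set up from inside another whenever. The sketch below shows only that shape; it uses plain Suppliers rather than real file watching, and follow-all, $sources, $first and $second are hypothetical names. It again assumes single-threaded use.

```raku
# Hypothetical helper: $sources emits Supply objects over time, and each one
# is tapped as soon as it shows up.
sub follow-all(Supply $sources) {
    supply {
        whenever $sources -> $inner {
            # This whenever belongs to the same supply block, so it is
            # closed together with all the others.
            whenever $inner -> $value {
                emit $value;
            }
        }
    }
}

my $sources = Supplier.new;
my $first   = Supplier.new;
my $second  = Supplier.new;
my @got;

my $tap = follow-all($sources.Supply).tap: { @got.push($_) };

$sources.emit($first.Supply);    # a new stream appears...
$first.emit('first: 1');         # ...and is already being listened to
$sources.emit($second.Supply);
$second.emit('second: 1');
$first.emit('first: 2');

$tap.close;                      # closes the taps on all three supplies
$first.emit('first: 3');         # no longer delivered

say @got;   # [first: 1 second: 1 first: 2]
```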

    I spent some time teaching reactive programming (not with Raku, but with .NET). Things were easy with one asynchronous stream, but got more difficult when we started getting to cases with several of them. Some things fit naturally into combinators like "merge" or "zip" or "combine latest". Others can be bashed into those kinds of shapes with enough creativity - but it often felt contorted to me rather than expressive. And what happens when the problem can't be expressed in the combinators? In Raku terms, one creates output Suppliers, taps input supplies, writes logic that emits things from the inputs into the outputs, and so forth. Subscription management, error propagation, completion propagation, and concurrency control have to be taken care of each time - and it's oh so easy to mess it up.

    Of course, the existence of supply blocks doesn't stop people taking the fragile path in Raku too. This is what I meant when I said:

    while Supplier is often reached for, many times one would be better off writing a supply block that emits the values

    I wasn't thinking here about the publish/subscribe case, where we really do want to broadcast values and are at the entrypoint to a reactive chain. I was thinking about the cases where we tap one or more Supply, take the values, do something, and then emit things into another Supplier. Here is an example where I migrated such code towards a supply block; here is another example that came a little later on in the same codebase. Hopefully these examples clear up what I had in mind.
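
    For readers without access to those examples, the following is a small sketch of the kind of migration described, not the code from that codebase. Both subs and their names (doubled-by-hand, doubled) are hypothetical; the first wires a Supplier by hand, the second expresses the same transformation as a supply block.

```raku
# Fragile: a hand-wired Supplier. done and quit must be relayed by hand, the
# tap on $in is never closed when the consumer goes away, and $in is tapped
# immediately, so values can be emitted before anyone is listening.
sub doubled-by-hand(Supply $in) {
    my $out = Supplier.new;
    $in.tap:
        { $out.emit($_ * 2) },
        done => { $out.done },
        quit => { $out.quit($_) };
    $out.Supply
}

# Sturdier: the supply block taps $in only once it is itself tapped, relays
# done and quit, and closes the tap on $in when its own tap is closed.
sub doubled(Supply $in) {
    supply {
        whenever $in { emit $_ * 2 }
    }
}

my @doubled;
react {
    whenever doubled(Supply.from-list(1, 2, 3)) { @doubled.push($_) }
}
say @doubled;   # [2 4 6]

# With the hand-wired version, the on-demand input has already run to
# completion by the time the result is tapped, so nothing arrives.
my @late;
doubled-by-hand(Supply.from-list(1, 2, 3)).tap: { @late.push($_) };
say @late;      # []
```

    The hand-wired version can be made correct, but each of the points in its comment needs explicit code, whereas the supply block gets them by construction.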