raku method-resolution-order multimethod

How does a Perl 6 object find a multi method that might be in a parent class or role?

Consider this example where a subclass has a multi method with no signature and one with a slurpy parameter:

class Foo {
    multi method do-it { put "Default" }
    multi method do-it ( Int $n ) { put "Int method" }
    multi method do-it ( Str $s ) { put "Str method" }
    multi method do-it ( Rat $r ) { put "Rat method" }
    }

class Bar is Foo {
    multi method do-it { put "Bar method" }
    multi method do-it (*@a) { put "Bar slurpy method" }
    }

Foo.new.do-it: 1;
Foo.new.do-it: 'Perl 6';
Foo.new.do-it: <1/137>;
Foo.new.do-it;

put '-' x 10;

Bar.new.do-it: 1;
Bar.new.do-it: 'Perl 6';
Bar.new.do-it: <1/137>;
Bar.new.do-it: 5+3i;
Bar.new.do-it;

How is the method lookup structured? I'm looking more for a way to explain it and specifically not complaining about it.

Int method
Str method
Rat method
Default
----------
Int method
Str method
Rat method
Bar slurpy method
Bar method

There's a call to Bar's do-it with 1 for instance. Some reasonable people might think that it looks for a matching signature in Bar first and that slurpy would never let anything get past it. Yet, the call finds the right multi in the inheritance chain.

Does Bar already know all the signatures? Does it search or is all of that stuff already resolved when it is composed?

And, is there a way to find out at run time which class provided the method? Maybe with some call into HOW? This would be a handy debugging tool when I have a multi I've incorrectly specified and is being handled elsewhere.

Solution

The key thing to keep in mind with multiple dispatch is that it happens after sub or method resolution has taken place. So all multiple dispatch is actually a two step process. The two steps are also independent of each other.

When writing something like:

multi sub foo($x) { }
multi sub foo($x, $y) { }

The compiler will generate a:

proto sub foo(|) {*}

That is, unless you wrote a proto sub by yourself. The proto is what actually gets installed into the lexpad; a multi sub is never installed directly into the lexpad, but instead installed into the candidates list of the proto.

So, when calling a multi sub, the process is:

Find the sub to call using a lexical lookup, which resolves to the proto
Call the proto, which picks the best multi candidate and calls it

When there are multi candidates in nested scopes, the proto from an outer scope will be cloned and installed into the inner scope, and the candidate added to the clone.

A very similar process happens with multi methods, except:

Multi methods are just stored up in a todo list until the closing } of the class, role, or grammar
A proto may be provided by a role or a class, so composing a role with multi candidates just adds them to the todo list also
Finally, if there is multi methods with no proto, but a parent class has such a proto, that will be cloned; otherwise an empty proto will be made

Meaning that a call to a multi-method is:

Find the method using the usual method dispatch algorithm (which just searches classes using the C3 Method Resolution Order), which resolves to the proto
Call the proto, which picks the best multi candidate and calls it

The exact same sorting and selection algorithm are used for both multi subs and multi methods. The invocant, so far as the multiple dispatch algorithm cares, is just the first parameter. Furthermore, the Perl 6 multiple dispatch algorithm does not weight earlier arguments more heavily than later ones, so just as:

class A { }
class B is A { }
multi sub f(A, B) { }
multi sub f(B, A) { }

Would be considered tied, and give an ambiguous dispatch error if called with f(B, B), so would defining:

class B { ... }
class A {
    multi method m(B) { }
}
class B is A {
    multi method m(A) { }
}

And then calling B.m(B), since again the multi-dipsatcher just sees the type tuples (A, B) and (B, A).

Multiple dispatch itself is concerned with the concept of narrowness. A candidate C1 is narrower than C2 if at least one argument of C1 is a narrower type than the argument in the same position in C2, and all other arguments are tied (that is, not narrower, not wider). If the inverse is true then it is wider. Otherwise, it is tied. Some examples:

(Int) is narrower than (Any)
(Int) is tied with (Num)
(Int) is tied with (Int)
(Int, Int) is narrower than (Any, Any)
(Any, Int) is narrower than (Any, Any)
(Int, Any) is narrower than (Any, Any)
(Int, Int) is narrower than (Int, Any)
(Int, Int) is narrower than (Any, Int)
(Int, Any) is tied with (Any, Int)
(Int, Int) is tied with (Int, Int)

The multi-dipsatcher builds a directed graph of the candidates, where there is an edge from C1 to C2 whenever C1 is narrower than C2. It then finds all of the candidates with no incoming edges, and removes them. These are the first group of candidates. The removal will produce a new set of candidates with no incoming edges, which are then removed and become the second group of candidates. This continues until all candidates are taken from the graph, or if we reach a state where we can take nothing from the graph (a very rare situation, but this will be reported to the programmer as a circularity). This process happens once, not per dispatch, and it produces a set of groups of candidates. (Yes, it is just a topological sort, but the grouping detail is significant for what comes next.)

When a call happens, the groups are searched in order for a matching candidate. If two candidates in the same group match, and there are no tie-breakers (named parameters, where clauses or implied where clauses from subset types, unpacks, or is default) then an ambiguous dispatch will be reported. If all the groups are searched without a result being found, then the dispatch fails.

There are also some narrowness considerations with regard to arity (required parameter beats optional parameter or slurpy) and is rw (it's narrower than an otherwise equal candidate without is rw).

Once one or more candidates in a group have been found to match, then tie-breakers are considered. These include the presence of named parameters, where clauses, and unpacks, and work on a first-match-wins basis.

multi f($i where $i < 3) { } # C1
multi f($i where $i > 1) { } # C2
f(2) # C1 and C2 tied; C1 wins by textual ordering due to where

Note that this textual ordering is only applicable to the tie-breaking; so far as types go, the order of candidates in the source code is not important. (That named parameters also act only as tie-breakers is sometimes a source of surprise.)

Finally, I'll note that while the results of a multiple dispatch will always match the 2-step process I've described, in reality a good amount of runtime optimization takes place. While all lookups are initially resolved exactly as described, the outcome is placed into a dispatch cache, which provides much faster lookups than searching the groups delivered by the topological sort could. This is installed in such a way that the call of the proto can be entirely bypassed, saving a callframe. You can see artifacts of this behavior if you --profile; the auto-generated proto for any type-based dispatch (without tie-breakers) will receive a tiny number of calls compared to the multi candidates. This doesn't apply if you write custom logic in your proto, of course.

Beyond that, if you're running on MoarVM, the dynamic optimizer can go a bunch further. It can use collected and inferred type information both to resolve the method/sub dispatch and the multi dispatch, turning a 2-step process into a 0-step process. Small candidates can be inlined into the caller also (again, the profiler can tell you that the inlining has happened), which arguably turns a multi-dispatch into a -1 step process. :-)