recursiongroovytrampolines

Groovy's trampoline() makes recursive execution much slower - why?


I'm experimenting with recursion:

def fac
//fac = { int curr, res = 1G -> 1 >= curr ? res : fac( curr - 1, res * curr ) }
fac = { int curr, res = 1G -> 1 >= curr ? res : fac.trampoline( curr - 1, res * curr ) }
fac = fac.trampoline()

def rnd = new Random()

long s = System.currentTimeMillis()

100000.times{ fac rnd.nextInt( 40 ) }

println "done in ${System.currentTimeMillis() - s} ms / ${fac(40)}"

If I use it like this, I'm getting this:

done in 691 ms

If I uncomment line #2 and comment lines #3-4 to remove trampoline() and run it, I'm getting significantly lower numbers:

done in 335 ms

So, with trampoline the recursion works 2 times slower.

What am I missing?

P.S.

If I run the same example in Scala 2.12:

def fac( curr:Int, acc:BigInt = 1 ):BigInt = if( 1 >= curr ) acc else fac( curr - 1, curr * acc )
val s = System.currentTimeMillis
for( ix <- 0 until 100000 ) fac( scala.util.Random.nextInt(40).toInt )

println( s"done in ${System.currentTimeMillis - s} ms" )

it executes a bit faster:

done in 178 ms

UPDATE

Rewriting the closure to a method with the annotation:

@groovy.transform.TailRecursive
def fac( int curr, res = 1G ) { 1 >= curr ? res : fac( curr - 1, res * curr ) }
// the rest

gives

done in 164 ms

and is super-coll. Nevertheless, I still want to know about trampoline() :)


Solution

  • As stated in the documentation, Closure.trampoline() prevents from overflowing the call stack.

    Recursive algorithms are often restricted by a physical limit: the maximum stack height. For example, if you call a method that recursively calls itself too deep, you will eventually receive a StackOverflowException.

    An approach that helps in those situations is by using Closure and its trampoline capability.

    Closures are wrapped in a TrampolineClosure. Upon calling, a trampolined Closure will call the original Closure waiting for its result. If the outcome of the call is another instance of a TrampolineClosure, created perhaps as a result to a call to the trampoline() method, the Closure will again be invoked. This repetitive invocation of returned trampolined Closures instances will continue until a value other than a trampolined Closure is returned. That value will become the final result of the trampoline. That way, calls are made serially, rather than filling the stack.


    Source: http://groovy-lang.org/closures.html#_trampoline

    However, using trampoline comes with a cost. Let's take a look at the JVisualVM samples.

    Non-trampoline use case

    Running an example without trampoline() we get a result in ~441 ms

    done in 441 ms / 815915283247897734345611269596115894272000000000
    

    This execution allocates ~2,927,550 objects and consumes around 100 MB of memory.

    enter image description here

    The CPU has a little to do, and except spending time on main() and run() methods, it spends some cycles on coercing arguments.

    enter image description here

    The trampoline() use case

    Introducing the trampoline does change a lot. Firstly, it makes execution time almost two times slower compared to the previous attempt.

    done in 856 ms / 815915283247897734345611269596115894272000000000
    

    Secondly, it allocates ~5,931,470 (!!!) objects and consumes ~221 MB of memory. The main difference is that in the previous case a single of $_main_closure1 was used across all executions, and in case of using trampoline - every call to trampoline() method creates:

    Only this allocates more than 1,200,000 objects.

    enter image description here

    If it comes to the CPU, it also has much more things to do. Just look at the numbers:

    enter image description here

    And this is exactly why introducing trampoline in your case makes the code execution much slower.

    So why @TailRecursive does much better?

    In short - @TailRecursive annotation replaces all closures and recursive calls with good old while-loop. The factorial function with @TailRecursive looks something like this at the bytecode level:

    //
    // Source code recreated from a .class file by IntelliJ IDEA
    // (powered by Fernflower decompiler)
    //
    
    package factorial;
    
    import groovy.lang.GroovyObject;
    import groovy.lang.MetaClass;
    import java.math.BigInteger;
    import org.codehaus.groovy.runtime.ScriptBytecodeAdapter;
    import org.codehaus.groovy.runtime.dgmimpl.NumberNumberMultiply;
    import org.codehaus.groovy.transform.tailrec.GotoRecurHereException;
    
    public class Groovy implements GroovyObject {
        public Groovy() {
            MetaClass var1 = this.$getStaticMetaClass();
            this.metaClass = var1;
        }
    
        public static BigInteger factorial(int number, BigInteger acc) {
            BigInteger _acc_ = acc;
            int _number_ = number;
    
            try {
                while(true) {
                    try {
                        while(_number_ != 1) {
                            int __number__ = _number_;
                            int var7 = _number_ - 1;
                            _number_ = var7;
                            Number var8 = NumberNumberMultiply.multiply(__number__, _acc_);
                            _acc_ = (BigInteger)ScriptBytecodeAdapter.castToType(var8, BigInteger.class);
                        }
    
                        BigInteger var4 = _acc_;
                        return var4;
                    } catch (GotoRecurHereException var13) {
                        ;
                    }
                }
            } finally {
                ;
            }
        }
    
        public static BigInteger factorial(int number) {
            return factorial(number, (BigInteger)ScriptBytecodeAdapter.castToType(1, BigInteger.class));
        }
    }
    

    I have documented this use case on my blog some time ago. You can read the blog post if you want to get more information:

    https://e.printstacktrace.blog/tail-recursive-methods-in-groovy/