[SOLVED] Groovy Gpars poor parallel performance compared to serial

I'm seeing worse-than-expected performance from Groovy GPars while experimenting with multithreading on an i7-2960xm (4 cores, hyperthreaded). In my test I've been using a recursive Fibonacci calculator to simulate a workload:

def fibRecursive(int index) {
    if (index == 0 || index == 1) {
        return index
    }
    else {
        return fibRecursive(index - 2) + fibRecursive(index - 1)
    }
}

To test Gpars I am currently using the following code:

def nums = [36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36]

GParsPool.withPool(4) {
    nums.eachParallel {
        SplitTimer internalTimer = new SplitTimer()
        println("fibRecursive(${it}): ${fibRecursive(it)}")
        internalTimer.split("fibRecursive(${it})")

        for (instance in internalTimer.splitTimes) {
            println(instance)
        }
    }
}

Calculating fib(36) in parallel takes around 1.9 seconds with withPool(4) and around 1.4 seconds with withPool(1). I assumed withPool(1) would be roughly similar to calling the function outside of GPars, but that takes only 0.4 seconds, e.g.:

nums.each {
    SplitTimer internalTimer = new SplitTimer()
    println("fibRecursive(${it}): ${fibRecursive(it)}")
    internalTimer.split("fibRecursive(${it})")

    for (instance in internalTimer.splitTimes) {
        println(instance)
    }
}

Could someone explain why I might be experiencing this kind of performance hit? Thanks!

Here is my SplitTimer just in case:

class SplitTimer {
    long initialTime
    int instances = 0

    class Instance {
        int index
        String name
        long time

        def elapsed() {
            return time - initialTime
        }

        def Instance(String instanceName) {
            this.index = this.instances++
            this.name = instanceName
            this.time = System.nanoTime()
        }

        String toString() {
            return "[Instance ${this.index}: \"${this.name}\" (${Formatter.elapsed(this.elapsed())} elapsed)]"
        }
    }

    def splitTimes = []

    def SplitTimer() {
        def initialInstance = new Instance("Start")
        this.initialTime = initialInstance.time
        splitTimes.add(initialInstance)
    }

    def split(String instanceName) {
        splitTimes.add(new Instance(instanceName))
    }
}

class Formatter {
    static int hours
    static int minutes
    static int seconds
    static int nanoseconds

    static setValues(time) {
        nanoseconds = time % 10**9

        seconds = time / 10**9
        minutes = seconds / 60
        hours = minutes / 60

        seconds %= 60
        minutes %= 60
    }

    static elapsed(time) {
        setValues(time)
        return "${hours}:" + "${minutes}:".padLeft(3, "0") + "${seconds}.".padLeft(3, "0") + "${nanoseconds}".padLeft(9,"0")
    }

    static absolute(time) {
        setValues(time)
        hours %= 24
        return "${hours}:".padLeft(3, "0") + "${minutes}:".padLeft(3, "0") + "${seconds}.".padLeft(3, "0") + "${nanoseconds}".padLeft(9,"0")
    }
}

Solution

The overhead of parallelizing a small number of function calls can exceed the time it takes to run them sequentially. However, this could change if each call did more work, for example if you called the Fibonacci function with much larger numbers.
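
One way to check whether parallelization overhead dominates is to time the whole batch rather than each call. The script below is a minimal sketch of such a comparison, not the asker's setup: it swaps GPars for a plain java.util.concurrent fixed thread pool so that it runs without extra dependencies, timeMillis is a hypothetical helper, and the input is reduced from 36 to 30 so the script finishes quickly. Actual timings will vary by machine and JVM.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Same recursive Fibonacci as in the question, used as a CPU-bound workload.
def fibRecursive(int index) {
    if (index == 0 || index == 1) {
        return index
    }
    return fibRecursive(index - 2) + fibRecursive(index - 1)
}

// Hypothetical helper: runs the closure once and returns elapsed wall-clock milliseconds.
long timeMillis(Closure work) {
    long start = System.nanoTime()
    work()
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start)
}

// Assumption: 30 instead of 36 to keep the run short.
def nums = [30] * 12

// Warm-up pass so JIT compilation is less likely to distort the first measurement.
nums.each { fibRecursive(it) }

// Sequential baseline: one timing for the whole batch.
def sequentialResults = []
long sequentialMs = timeMillis {
    sequentialResults = nums.collect { fibRecursive(it) }
}

// Parallel run on a fixed pool of 4 threads (standing in for withPool(4)).
def pool = Executors.newFixedThreadPool(4)
def parallelResults = []
long parallelMs = 0
try {
    parallelMs = timeMillis {
        def tasks = nums.collect { n -> ({ fibRecursive(n) } as Callable) }
        // invokeAll blocks until every task has completed.
        parallelResults = pool.invokeAll(tasks).collect { it.get() }
    }
} finally {
    pool.shutdown()
}

println "sequential: ${sequentialMs} ms, parallel (4 threads): ${parallelMs} ms"
```

Raising the input value or the number of items shows how the ratio between the two timings changes as each task does more work.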