multithreading groovy process gpars

How to make capturing output from an external process thread-safe?


I've written a small method to execute the git command line tool and capture its output:

def git(String command) {
    command = "git ${command}"

    def outputStream = new StringBuilder()
    def errorStream = new StringBuilder()
    def process = command.execute()
    process.waitForProcessOutput(outputStream, errorStream)

    return [process.exitValue(), outputStream, errorStream, command]
}

I'm using it with GPars to clone multiple repositories simultaneously, like this:

GParsPool.withPool(10) {
    repos.eachParallel { cloneUrl, cloneDir->
        (exit, out, err, cmd) = git("clone ${cloneUrl} ${cloneDir}")
        if (exit != 0) {
            println "Error: ${cmd} failed with '${err}'."
        }
    }
}

However, I believe my git method is not thread-safe. For example, a second thread could modify command in the first line of the method before the first thread has reached command.execute() in the fifth line of the method.

I could solve this by making the whole git method synchronized, but that would defeat the purpose of running it in different threads as I want clones to happen in parallel.

So I was thinking of synchronizing only part of the method, like this:

def git(String command) {
    def outputStream
    def errorStream
    def process

    synchronized (this) {
        command = "git ${command}"

        outputStream = new StringBuilder()
        errorStream = new StringBuilder()
        process = command.execute()
    }

    process.waitForProcessOutput(outputStream, errorStream)

    return [process.exitValue(), outputStream, errorStream, command]
}

But I guess that is not safe either, because waitForProcessOutput() might return earlier in thread two than in thread one, screwing up the outputStream / errorStream variables.

What is the correct way to get this thread-safe?


Solution

  • Declare the variables with def in the multiple assignment inside the closure passed to eachParallel:

            def (exit, out, err, cmd) = git("clone ${cloneUrl} ${cloneDir}")
    

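    With def, the four variables are local to each invocation of the closure, so every thread works on its own copies. Without def, the names are not declared anywhere, and in a Groovy script such assignments end up in the script binding, which all threads share. The git() method is fine as is: its command parameter and the variables it declares with def exist separately for each call, so one thread cannot overwrite another thread's values.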
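
    The point about local versus shared variables can be checked with a small script. This is only a sketch under a few assumptions: fakeGit is a hypothetical stand-in for git() that spawns no process, a plain java.util.concurrent thread pool is used instead of GPars so the script has no external dependencies, and the names shared, local and results are made up for the illustration.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical stand-in for git(): same shape, but spawns no process.
def fakeGit(String command) {
    command = "git ${command}"   // rebinds this call's own parameter only
    def output = new StringBuilder().append("ran: ").append(command)
    return [0, output, command]
}

// Without 'def', a script-level assignment is stored in the shared binding.
shared = 'visible to every thread'
def local = 'private to this scope'
assert binding.hasVariable('shared')
assert !binding.hasVariable('local')

def pool = Executors.newFixedThreadPool(8)
def results
try {
    def futures = (1..200).collect { n ->
        pool.submit({
            // 'def' makes exit, out and cmd locals of this closure call
            def (exit, out, cmd) = fakeGit("clone repo${n}")
            Thread.sleep(1)   // give other tasks a chance to interleave
            [n, exit, cmd.toString(), out.toString()]
        } as Callable)
    }
    results = futures*.get()
} finally {
    pool.shutdown()
}

// Every task still sees the values that belong to its own call.
assert results.every { it[2] == 'git clone repo' + it[0] }
```

    Dropping the def inside the closure would turn exit, out and cmd into binding variables like shared, and the final check could then fail intermittently, because one task may read values written by another.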