Tags: ipython, ipython-parallel

Real-time output from engines in IPython parallel?


I am running a bunch of long-running tasks with IPython's great parallelization functionality.

How can I get real-time output from the ipengines' stdout in my IPython client?

E.g., I'm running dview.map_async(fun, lots_of_args) and fun prints to stdout. I would like to see the outputs as they are happening.

I know about AsyncResult.display_outputs(), but it's only available after all tasks have finished.


Solution

  • You can see stdout in the meantime by accessing AsyncResult.stdout, which returns a list of strings: the stdout captured so far from each engine.

    The simplest case being:

    print(ar.stdout)
    

    You can wrap this in a simple function that prints stdout while you wait for the AsyncResult to complete:

    import sys
    import time
    from IPython.display import clear_output

    def wait_watching_stdout(ar, dt=1, truncate=1000):
        while not ar.ready():
            # sleep first, so an empty-output `continue` doesn't busy-wait
            time.sleep(dt)
            stdouts = ar.stdout
            if not any(stdouts):
                continue
            # clear_output doesn't do much in terminal environments
            clear_output()
            print('-' * 30)
            print("%.3fs elapsed" % ar.elapsed)
            print("")
            for eid, stdout in zip(ar._targets, stdouts):
                if stdout:
                    print("[ stdout %2i ]\n%s" % (eid, stdout[-truncate:]))
            sys.stdout.flush()
    

    An example notebook illustrates this function.
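The polling loop above depends only on the `ready()` / `stdout` / `_targets` interface of the result object, so its mechanics can be demonstrated without a running cluster. Below is a minimal sketch using a stand-in result object (`FakeResult` and `watch` are hypothetical names for illustration, not part of IPython) whose per-engine output "grows" on each poll:

```python
import time

class FakeResult:
    """Stand-in for AsyncResult (hypothetical, for illustration only)."""
    def __init__(self, outputs):
        self._outputs = outputs                    # final stdout per engine
        self._targets = list(range(len(outputs)))  # fake engine ids
        self._polls = 0

    def ready(self):
        self._polls += 1
        return self._polls > 2        # "finish" after a couple of polls

    @property
    def stdout(self):
        # reveal a growing prefix of each engine's output on every poll
        frac = min(self._polls, 3) / 3
        return [s[:int(len(s) * frac)] for s in self._outputs]

def watch(ar, dt=0.01, truncate=1000):
    """Collect one snapshot of per-engine stdout per poll until done."""
    snapshots = []
    while not ar.ready():
        time.sleep(dt)
        stdouts = ar.stdout
        if not any(stdouts):
            continue
        snapshots.append(
            ["[ stdout %2i ] %s" % (eid, s[-truncate:])
             for eid, s in zip(ar._targets, stdouts) if s]
        )
    return snapshots

snaps = watch(FakeResult(["engine zero output", "engine one output"]))
```

Against a real cluster you would instead pass the `AsyncResult` from `dview.map_async(fun, lots_of_args)` straight to `wait_watching_stdout`.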

    Now, if you are using an older IPython, you may run into an artificial restriction on access to the stdout attribute ('Result not ready' errors). The information is still available in the metadata, so you can get at it before the task is done:

    rc.spin()
    stdout = [ rc.metadata[msg_id]['stdout'] for msg_id in ar.msg_ids ]
    

    This is essentially the same thing that accessing the ar.stdout attribute does.
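Combining the two access paths, a small helper can prefer `ar.stdout` and fall back to the metadata dict on older IPython versions. This is a sketch with assumed names (`get_stdouts` and the stub classes are hypothetical, included only so it runs without a cluster):

```python
def get_stdouts(rc, ar):
    """Per-engine stdout for an in-flight AsyncResult; falls back to
    the client's metadata dict where ar.stdout raises before completion."""
    try:
        return ar.stdout
    except Exception:
        rc.spin()  # process any pending messages first
        return [rc.metadata[msg_id]['stdout'] for msg_id in ar.msg_ids]

# --- stand-ins so the helper can be demonstrated without a cluster ---
class OldStyleResult:
    msg_ids = ['a', 'b']
    @property
    def stdout(self):
        raise RuntimeError('Result not ready')  # older-IPython behavior

class StubClient:
    metadata = {'a': {'stdout': 'from engine 0'},
                'b': {'stdout': 'from engine 1'}}
    def spin(self):
        pass

print(get_stdouts(StubClient(), OldStyleResult()))
# ['from engine 0', 'from engine 1']
```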