Tags: python, ipython, ipython-parallel

Import custom modules on IPython.parallel engines with sync_imports()


I've been playing around with IPython.parallel and wanted to use some custom modules of my own, but I haven't been able to do it with dview.sync_imports() as explained in the cookbook. The only thing that has worked for me is something like

def my_parallel_func(args):
    import sys
    sys.path.append('/path/to/my/module')
    import my_module
    # ... and all the rest

and then, in the main script, just calling

if __name__ == '__main__':
    # set up dview...
    dview.map(my_parallel_func, my_args)

In my opinion, the correct way to do this would be something like

with dview.sync_imports():
    import sys
    sys.path.append('/path/to/my/module')
    import my_module

but this throws an error saying that there is no module named my_module.

So, what is the right way to do this using dview.sync_imports()?


Solution

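  • The problem is that you're changing sys.path only in the local process running the Client, not in the remote engine processes started by ipcluster. sync_imports() repeats only the import statements on the engines; any other statement inside the with block, such as sys.path.append(...), runs locally only.

    You can observe this behaviour by running the following code: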

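    # On IPython >= 4 this package lives in ipyparallel:
    #     from ipyparallel import Client
    from IPython.parallel import Client

    rc = Client()
    dview = rc[:]

    with dview.sync_imports():
        import sys
        # Runs only in the local process (and overwrites the local sys.path).
        sys.path[:] = ['something']

    def parallel(x):
        import sys
        return sys.path

    # The local path is now ['something']; the engines' paths are unchanged.
    print('Local:  %s' % sys.path)
    print('Remote: %s' % dview.map_sync(parallel, range(1)))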
    

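    In short, every module that you import inside sync_imports() must already be importable on the engines, i.e. its directory must already be on each engine's sys.path (for example via the PYTHONPATH environment variable).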
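As a rough illustration of why a local sys.path change is not enough, the sketch below uses a fresh Python interpreter as a stand-in for an engine. It does not use IPython.parallel at all; the helper child_can_import and the module my_demo_module are hypothetical names made up for this example, and it assumes Python 3.

```python
import os
import subprocess
import sys
import tempfile


def child_can_import(module_name, extra_env=None):
    """Start a fresh interpreter and report whether it can import module_name.

    Hypothetical helper: the child process plays the role of an engine.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    code = "import {0}".format(module_name)
    status = subprocess.call(
        [sys.executable, "-c", code],
        env=env,
        stderr=subprocess.DEVNULL,  # hide the child's ImportError traceback
    )
    return status == 0


# Write a throwaway module into a temporary directory for the demo.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "my_demo_module.py"), "w") as fh:
    fh.write("VALUE = 42\n")

# This changes the search path of the current process only.
sys.path.append(module_dir)
import my_demo_module  # works locally

# A separate process does not inherit the in-memory sys.path change...
without_env = child_can_import("my_demo_module")
# ...but it does honour PYTHONPATH set in its environment at startup.
with_env = child_can_import("my_demo_module", {"PYTHONPATH": module_dir})

print("child without PYTHONPATH:", without_env)
print("child with PYTHONPATH:   ", with_env)
```

The same reasoning applies to engines: their search path is fixed by their own environment when they start, not by what the client process does to its sys.path afterwards.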

    If the module is not on the engines' path, you must append its directory to sys.path inside the function that you execute remotely, and then import the module inside that same function.
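A minimal sketch of such a function follows. The names call_with_module, my_helper_module and double are hypothetical, and the function is called locally here so that the example runs without a cluster; it assumes Python 3.

```python
import os
import tempfile


def call_with_module(module_dir, module_name, x):
    """Function intended to be shipped to an engine.

    Everything it needs is imported inside the body, so the imports
    happen in whichever process actually runs the function.
    """
    import importlib
    import sys

    # Extend the search path of the process executing this function.
    if module_dir not in sys.path:
        sys.path.append(module_dir)

    module = importlib.import_module(module_name)
    return module.double(x)  # 'double' is a made-up function for the demo


# Create a throwaway module so the example is self-contained.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "my_helper_module.py"), "w") as fh:
    fh.write("def double(x):\n    return 2 * x\n")

# Local stand-in for a remote map over three inputs.
results = [call_with_module(module_dir, "my_helper_module", x) for x in range(3)]
print(results)
```

With a running cluster, the call would presumably look something like dview.map_sync(call_with_module, [module_dir] * 3, ['my_helper_module'] * 3, range(3)), on the assumption that module_dir exists at the same location on every engine's machine.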