python, ipython, starcluster

IPython.parallel namespaces


I want to parallelize a function using IPython.parallel, and when I define it in the IPython shell it works flawlessly:

Type:       function
Base Class: <type 'function'>
String Form:<function gradient at 0x3ae0398>
Namespace:  Interactive
File:       /root/<ipython-input-30-cf7eabdfef84>
Definition: gradient(w) 
Source:
def gradient(w):
    s = (1.0 + exp(y * (X * w)))**-1
    return C*X.T*((1 - s) * y)

rc = Client() 
rc[:].apply_sync(gradient, w)
...

However, when I define it in a module and use import:

Type:       function
Base Class: <type 'function'>
String Form:<function gradient at 0x3933d70>
Namespace:  Interactive
File:       /root/mv.py
Definition: mv.gradient(w)
Source:
def gradient(w):
    s = (1.0 + exp(y * (X * w)))**-1
    return C*X.T*((1 - s) * y)

import mv 
rc = Client()
rc[:].apply_sync(mv.gradient, w)

CompositeError: one or more exceptions from call to method: gradient
[0:apply]: NameError: global name 'y' is not defined
[1:apply]: NameError: global name 'y' is not defined

Furthermore, it works fine on my local system running Python 2.7.2/IPython 0.12, while it crashes on Python 2.7.2+/IPython 0.12 using the newest StarCluster Ubuntu AMI.

What is going on here?

UPDATE: I installed the IPython 0.13.dev version from github and now it works.


Solution

  • The difference is in which globals the function sees. When a function is defined in a module, its global namespace is that module's namespace, so y is looked up as mv.y. When the module is __main__, as it is for an interactively defined function, the global namespace is your user_ns on the Engine, so it is affected by calls such as execute("y=5").

    If you want functions defined in a module to behave as if they were defined interactively (with the user namespace as their globals), IPython provides a decorator:

    # mymod
    
    from IPython.parallel.util import interactive
    
    @interactive
    def by_a(b):
        """multiply a by b"""
        return a*b
    

    And interactively, you can do:

    from mymod import by_a
    
    e0 = rc[0]
    e0.execute("a=5")
    print e0.apply_sync(by_a, 10) # 50
    e0.execute("a=10")
    print e0.apply_sync(by_a, 10) # 100
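
To see the namespace difference without a cluster, here is a minimal plain-Python sketch. It is an illustration of how function globals behave, not a description of how IPython implements the decorator. The names gradient_like, user_ns and call_with_globals are hypothetical, the mv module is built in memory as a stand-in for mv.py, and user_ns stands in for the Engine's namespace after execute("y=5").

```python
import types

# Hypothetical stand-in for mv.py: the function reads a global `y`
# that the module itself never defines.
mv = types.ModuleType("mv")
exec("def gradient_like(w):\n    return w * y\n", mv.__dict__)

# Hypothetical stand-in for the Engine's user namespace after execute("y=5").
user_ns = {"y": 5}

def call_with_globals(func, namespace, *args):
    """Rebuild func so its global lookups go to `namespace`, then call it."""
    rebound = types.FunctionType(func.__code__, namespace, func.__name__,
                                 func.__defaults__, func.__closure__)
    return rebound(*args)

# Called as-is, the function looks for `y` in mv's namespace and fails,
# which mirrors the NameError reported in the question.
try:
    mv.gradient_like(2)
    module_lookup_failed = False
except NameError:
    module_lookup_failed = True

# Bound to the user namespace instead, the same code object finds y = 5.
rebound_result = call_with_globals(mv.gradient_like, user_ns, 2)
```

The function body is identical in both calls; only the dictionary used for global lookups changes, which is the distinction drawn above between mv.y and user_ns.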