I have a complex unpicklable object whose properties (defined via getters and setters) are of complex, unpicklable types as well. I want to create a multiprocessing proxy for the object so that I can execute some tasks in parallel.
The problem: while I have succeeded in making the getter methods available on the proxy object, I cannot make the getters return proxies for the unpicklable return objects.
My setup resembles the following:
from multiprocessing.managers import BaseManager, NamespaceProxy

class A():
    @property
    def a(self):
        return B()
    @property
    def b(self):
        return 2

# unpicklable class
class B():
    def __init__(self, *args):
        self.f = lambda: 1

class ProxyBase(NamespaceProxy):
    _exposed_ = ('__getattribute__', '__setattr__', '__delattr__')

class AProxy(ProxyBase): pass
class BProxy(ProxyBase): pass
class MyManager(BaseManager): pass

MyManager.register('A', A, AProxy)

if __name__ == '__main__':
    with MyManager() as manager:
        myA = manager.A()
        print(myA.b) # works great
        print(myA.a) # raises error, because the object B is not picklable
I know that I can specify the result type of a method when registering it with the manager. That is, I can do
MyManager.register('A', A, AProxy, method_to_typeid={'__getattribute__':'B'})
MyManager.register('B', B, BProxy)
if __name__ == '__main__':
    with MyManager() as manager:
        myA = manager.A()
        print(myA.a) # works great!
        print(myA.b) # returns the same as myA.a ?!
It is clear to me that my solution does not work, since the `__getattribute__` method applies to all properties, whereas I only want it to return a proxy for `B` when property `a` is accessed. How could I achieve this?
As a side question: if I remove the `*args` argument from the `__init__` method of `B`, I get an error that it is called with the wrong number of arguments. Why? How could I resolve this?
I don't think this is possible without some hacks, since the choice to return a value or a proxy is made based on the method name alone, not on the type of the return value (from `Server.serve_client`):
try:
    res = function(*args, **kwds)
except Exception as e:
    msg = ('#ERROR', e)
else:
    typeid = gettypeid and gettypeid.get(methodname, None)
    if typeid:
        rident, rexposed = self.create(conn, typeid, res)
        token = Token(typeid, self.address, rident)
        msg = ('#PROXY', (rexposed, token))
    else:
        msg = ('#RETURN', res)
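To make the consequence concrete, here is a toy model of that branch. It is only a sketch, not the real server code: `toy_dispatch`, `Sample`, and the tuple results are hypothetical stand-ins, and only the `gettypeid` lookup line is taken from the excerpt above.

```python
def toy_dispatch(obj, methodname, args, gettypeid):
    """Hypothetical, simplified stand-in for the branch in Server.serve_client."""
    res = getattr(obj, methodname)(*args)
    # Same lookup as in the excerpt: keyed by the method name only,
    # the type of `res` is never consulted.
    typeid = gettypeid and gettypeid.get(methodname, None)
    if typeid:
        return ('#PROXY', typeid)
    return ('#RETURN', res)


class Sample:
    a = object()  # stands in for an unpicklable value
    b = 2         # a plain, picklable value


# Mirrors method_to_typeid={'__getattribute__': 'B'} from the question.
mapping = {'__getattribute__': 'B'}

kind_a, _ = toy_dispatch(Sample(), '__getattribute__', ('a',), mapping)
kind_b, _ = toy_dispatch(Sample(), '__getattribute__', ('b',), mapping)
print(kind_a, kind_b)
```

Both lookups take the proxy branch, because both go through the same method name. That matches the behaviour in the question, where `myA.b` comes back as a proxy just like `myA.a`.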
Also keep in mind that exposing `__getattribute__` in an unpicklable class's proxy basically breaks the proxy functionality when calling methods.
But if you're willing to hack it and just need attribute access, here is a working solution. Note that calling `myA.a.f()` still won't work: the lambda is an attribute and is not proxied, only methods are, but that's a different problem.
import os
from multiprocessing.managers import BaseManager, NamespaceProxy, Server

class A():
    @property
    def a(self):
        return B()
    @property
    def b(self):
        return 2

# unpicklable class
class B():
    def __init__(self, *args):
        self.f = lambda: 1
        self.pid = os.getpid()

class HackedObj:
    def __init__(self, obj, gettypeid):
        self.obj = obj
        self.gettypeid = gettypeid

    def __getattribute__(self, attr):
        # The server fetches the bound __getattribute__ first, then calls it.
        if attr == '__getattribute__':
            return object.__getattribute__(self, attr)
        obj = object.__getattribute__(self, 'obj')
        result = object.__getattribute__(obj, attr)
        if isinstance(result, B):
            gettypeid = object.__getattribute__(self, 'gettypeid')
            # This tells the server that the return value of this method is
            # B, for which we've registered a proxy.
            gettypeid['__getattribute__'] = 'B'
        return result

class HackedDict:
    def __init__(self, data):
        self.data = data

    def __setitem__(self, key, value):
        self.data[key] = value

    def __delitem__(self, key):
        # The server deletes entries once an object's refcount drops to zero.
        del self.data[key]

    def __getitem__(self, key):
        obj, exposed, gettypeid = self.data[key]
        if isinstance(obj, A):
            # Work on a per-lookup copy so the registered mapping stays untouched.
            gettypeid = gettypeid.copy() if gettypeid else {}
            # Now we need getattr to update gettypeid based on the result;
            # luckily the server queries the typeid info after the function
            # has been invoked.
            obj = HackedObj(obj, gettypeid)
        return (obj, exposed, gettypeid)

class HackedServer(Server):
    def __init__(self, registry, address, authkey, serializer):
        super().__init__(registry, address, authkey, serializer)
        self.id_to_obj = HackedDict(self.id_to_obj)

class MyManager(BaseManager):
    _Server = HackedServer

class ProxyBase(NamespaceProxy):
    _exposed_ = ('__getattribute__', '__setattr__', '__delattr__')

class AProxy(ProxyBase): pass
class BProxy(ProxyBase): pass

MyManager.register('A', callable=A, proxytype=AProxy)
MyManager.register('B', callable=B, proxytype=BProxy)

if __name__ == '__main__':
    print("This process: ", os.getpid())
    with MyManager() as manager:
        myB = manager.B()
        print("Proxy process, using B directly: ", myB.pid)
        myA = manager.A()
        print('myA.b', myA.b)
        print("Proxy process, via A: ", myA.a.pid)
The key to the solution is to replace the `_Server` in our manager, and then wrap the `id_to_obj` dict with one that performs the hack for the specific method we need.
The hack consists of populating the `gettypeid` dict for the method, but only after the method has been evaluated and we know the return type is one that needs a proxy. We're lucky with the order of evaluation: `gettypeid` is accessed after the method has been called.
Also luckily, `gettypeid` is used as a local in the `serve_client` method, so we can return a copy of it and modify that copy without introducing any concurrency issues.
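The ordering trick can be shown in isolation, without a manager. The following is only a sketch of the idea under simplified assumptions: `TypeAwareWrapper`, `toy_serve`, `Inner`, and `Outer` are hypothetical names, and `toy_serve` merely imitates the "call first, look up the typeid afterwards" order of `serve_client`.

```python
class Inner:
    """Stands in for a type that must be returned as a proxy."""


class Outer:
    @property
    def a(self):
        return Inner()

    @property
    def b(self):
        return 2


class TypeAwareWrapper:
    """Hypothetical wrapper: records a typeid only once the result type is known."""

    def __init__(self, obj, gettypeid):
        self._obj = obj
        self._gettypeid = gettypeid

    def get(self, name):
        result = getattr(self._obj, name)
        if isinstance(result, Inner):
            # Mutate the mapping after evaluation, based on the actual result.
            self._gettypeid['get'] = 'Inner'
        return result


def toy_serve(obj, registered, methodname, args):
    """Imitates the call-then-lookup order; not the real server code."""
    gettypeid = dict(registered or {})  # per-request copy, never shared
    wrapped = TypeAwareWrapper(obj, gettypeid)
    res = getattr(wrapped, methodname)(*args)
    typeid = gettypeid and gettypeid.get(methodname, None)  # read after the call
    return ('#PROXY', typeid) if typeid else ('#RETURN', res)


shared = {}
print(toy_serve(Outer(), shared, 'get', ('a',)))
print(toy_serve(Outer(), shared, 'get', ('b',)))
```

Because each request works on its own copy, the shared mapping is never modified, so a proxy decision made for one call cannot leak into another.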
While this was a fun exercise, I really advise against this solution. If you're dealing with external code that you cannot modify, simply create your own wrapper class with explicit methods instead of `@property` accessors, proxy that class instead, and use `method_to_typeid`.
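A minimal sketch of that wrapper approach might look as follows. The names `AWrapper`, `get_a`, `get_b`, and `call_f` are hypothetical, and `B` is reduced to a stand-in. One assumption worth flagging: `B` is registered with `callable=None` and `create_method=False`, so that the server proxies the object returned by `get_a` itself. As far as I can tell from `Server.create`, a registered callable would instead be invoked with the return value as its argument.

```python
from multiprocessing.managers import BaseManager


class B:
    """Stand-in for the unpicklable class from the question."""

    def __init__(self):
        self.f = lambda: 1  # the lambda attribute makes instances unpicklable

    def call_f(self):
        # Explicit method, so the default proxy can forward the call.
        return self.f()


class A:
    @property
    def a(self):
        return B()

    @property
    def b(self):
        return 2


class AWrapper:
    """Hypothetical wrapper that exposes A's properties as explicit methods."""

    def __init__(self):
        self._a = A()

    def get_a(self):
        return self._a.a

    def get_b(self):
        return self._a.b


class MyManager(BaseManager):
    pass


# No callable and no creation method: 'B' only names the proxy type that
# wraps values returned by get_a (an assumption of this sketch).
MyManager.register('B', callable=None, exposed=('call_f',), create_method=False)
MyManager.register('AWrapper', AWrapper, exposed=('get_a', 'get_b'),
                   method_to_typeid={'get_a': 'B'})


def main():
    with MyManager() as manager:
        wrapper = manager.AWrapper()
        print('b:', wrapper.get_b())           # plain value, sent back pickled
        b_proxy = wrapper.get_a()              # proxy, B never leaves the server
        print('f() via proxy:', b_proxy.call_f())


if __name__ == '__main__':
    main()
```

Since each accessor now has its own method name, `method_to_typeid` can target `get_a` alone and `get_b` keeps returning a plain value.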