I have a luigi pipeline. We have lots of external files that change on a regular basis, and we want to be able to build the pipeline from metadata.
I create classes dynamically, and have found two ways to do it:
Using exec:
exec("""
class {system}(DeliverySystem):
    pass
""".format(system='ClassUsingExec'))
Using type:
name = 'ClassUsingType'
globals()[name] = type(name, (DeliverySystem,),{})
Both of these work fine in a single-threaded environment. When I run luigi with many workers that spawn child processes, the exec version is still fine, but the type version gives errors as described in this post and this post (see them for more complete stack traces):
PicklingError: Can't pickle <class 'abc.ClassUsingType'>: attribute lookup abc.ClassUsingType failed.
The only difference I can find between the two is the module:
print(ClassUsingExec.__dict__) #=>
mappingproxy({'__module__': '__main__',
'__doc__': None,
'__abstractmethods__': frozenset(),
'_abc_impl': <_abc_data at 0x15b5063c120>,
'_namespace_at_class_time': ''})
print(ClassUsingType.__dict__) #=>
mappingproxy({'__module__': 'abc',
'__doc__': None,
'__abstractmethods__': frozenset(),
'_abc_impl': <_abc_data at 0x15b3f870450>,
'_namespace_at_class_time': ''})
It seems the module is different, and that might be the source of the difference.
Using Python 3.6, Windows 10, luigi 2.8.9.
Questions:
Is there a way to use type to create a class so that its module is the module in which it is defined, and not abc?
Is there some other difference I am missing between the methods? According to this post there should be no difference, but I am not finding that to be the case.
The problem arises because:
When a class is created with type and its base uses ABC (Abstract Base Class) as a metaclass, the module of that class will be abc, not the name of the module where the class was defined.
When a worker is assigned a Task, it goes to that module to load the task. Since the module is set to abc and not the module where the class was dynamically created, the lookup fails.
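The difference can be reproduced without luigi. The sketch below is only an illustration: DeliverySystem here is a stand-in that derives from abc.ABC (an assumption, since the real base class is a luigi task), and ClassUsingStatement is a hypothetical name added for comparison. On CPython the first print should show abc, while the second shows the name of the defining module.

```python
import abc


class DeliverySystem(abc.ABC):
    """Stand-in for the question's base class (assumed to use ABCMeta)."""


# Created with type(): no '__module__' is supplied, so it is filled in while
# ABCMeta.__new__ (which lives in abc.py) is running, and ends up as 'abc'.
ClassUsingType = type('ClassUsingType', (DeliverySystem,), {})


# Created with a class statement: the class body records the defining
# module's __name__ as '__module__' before the metaclass is called.
class ClassUsingStatement(DeliverySystem):
    pass


print(ClassUsingType.__module__)
print(ClassUsingStatement.__module__)
```

This is the same mismatch that shows up in the two __dict__ dumps in the question.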
To make it work, all that is needed is to set the module explicitly when creating the class:
type(name, (DeliverySystem,),{})
becomes
type(name, (DeliverySystem,), {'__module__': __name__})
Now when the worker gets assigned the task, it will go into the right module and re-create the class, and things will work!