Inspired by an answer about a plugin architecture, I was playing around with PEP-487's subclass registration and found that it led to surprising results when changing the code slightly.
The first step was to split the code from the answer linked above into two files:
$ cat a.py
class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        print(f'__init_subclass__({cls!r}, **{kwargs!r})')
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)

if __name__ == '__main__':
    from b import Plugin1, Plugin2
    print('a:', PluginBase.subclasses)
$ cat b.py
from a import PluginBase
class Plugin1(PluginBase):
    pass

class Plugin2(PluginBase):
    pass

print('b:', PluginBase.subclasses)
$ python a.py
__init_subclass__(<class 'b.Plugin1'>, **{})
__init_subclass__(<class 'b.Plugin2'>, **{})
b: [<class 'b.Plugin1'>, <class 'b.Plugin2'>]
a: []
I found this output surprising: why is PluginBase's subclasses list empty when printed from a.py, but not from b.py?
Intuitively, I would've written the subclass registration line in a.py as
PluginBase.subclasses.append(cls)
instead of
cls.subclasses.append(cls)
because I want to operate on PluginBase's subclasses field rather than that of the respective Plugin* class, but that alone didn't give the expected result either.
Then I found that the behaviour could be fixed by simply replacing a.py's line
from b import Plugin1, Plugin2
with
from b import *
which, when executing a.py, leads to the output I expected, namely
$ python a.py
__init_subclass__(<class 'b.Plugin1'>, **{})
__init_subclass__(<class 'b.Plugin2'>, **{})
b: [<class 'b.Plugin1'>, <class 'b.Plugin2'>]
a: [<class 'b.Plugin1'>, <class 'b.Plugin2'>]
Can someone enlighten me as to
1. why we write cls.subclasses.append rather than PluginBase.subclasses.append, and
2. what the difference is between from b import * and from b import Plugin1, Plugin2
in this context?

1. Why we write cls.subclasses.append rather than PluginBase.subclasses.append
These two mean different things, and which one you want depends on your intention:
cls.subclasses.append means "run .subclasses.append looked up from the current class (cls)", so it executes b.Plugin1.subclasses.append and b.Plugin2.subclasses.append upon subclassing PluginBase. If, as in your example, you don't override the attribute subclasses on b.Plugin1 and b.Plugin2, then the lookup implicitly finds the attribute on PluginBase, so it executes PluginBase.subclasses.append. If you do override the attribute, for example with a new list, then the list printed from PluginBase will not include that class:
# b.py
from a import PluginBase

class Plugin1(PluginBase):
    subclasses = []

class Plugin2(PluginBase):
    pass

>>> print("b:", PluginBase.subclasses)
b: [<class 'b.Plugin2'>]
PluginBase.subclasses.append means "run .subclasses.append on PluginBase". It doesn't care what you do with the attribute subclasses in the current class (cls, that is, your b.Plugin1 and b.Plugin2): you can empty the list, set it to None, and so on, and you'll always get b: [<class 'b.Plugin1'>, <class 'b.Plugin2'>].
# b.py
from a import PluginBase

class Plugin1(PluginBase):
    subclasses = []

class Plugin2(PluginBase):
    pass

>>> print("b:", PluginBase.subclasses)
b: [<class 'b.Plugin1'>, <class 'b.Plugin2'>]
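The two spellings can also be compared side by side in a single file. The following is only a minimal sketch: the class names ViaCls, ViaBase, A1, A2, B1 and B2 are hypothetical stand-ins and do not appear in the original a.py or b.py.

```python
class ViaCls:
    """Registers through cls, so normal attribute lookup picks the list."""
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)  # whichever list lookup on cls finds


class ViaBase:
    """Registers on the base class explicitly, ignoring any override."""
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        ViaBase.subclasses.append(cls)  # always the base class's list


class A1(ViaCls):
    subclasses = []  # shadows ViaCls.subclasses, so A1 lands in its own list


class A2(ViaCls):
    pass


class B1(ViaBase):
    subclasses = []  # shadowing has no effect on where B1 is registered


class B2(ViaBase):
    pass


print([c.__name__ for c in ViaCls.subclasses])   # ['A2']
print([c.__name__ for c in A1.subclasses])       # ['A1']
print([c.__name__ for c in ViaBase.subclasses])  # ['B1', 'B2']
```

If the registry is meant to be one global list, the explicit ViaBase-style spelling is the more robust choice; the cls-based spelling is useful when intermediate classes should be able to keep their own registries.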
2. What the difference is between from b import * and from b import Plugin1, Plugin2 in this context
The examples you've given demonstrate a workflow bug.
If you're in a directory containing a.py and you run python a.py, you are implicitly creating a module called __main__ and executing the code in a.py inside this module. This is why the code under if __name__ == "__main__" actually runs: __main__ is the name of the module whose code is being executed, not a.
Hence, running python a.py with the following code:
# a.py
class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        print(f'__init_subclass__({cls!r}, **{kwargs!r})')
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)

if __name__ == '__main__':
    from b import Plugin1, Plugin2
    print('a:', PluginBase.subclasses)
does the following:
First, it defines PluginBase inside the module __main__, so you'll have a class called __main__.PluginBase.
Then it imports Plugin1 and Plugin2 from b, which (due to from a import PluginBase) executes the code in a.py a second time, now in a module called a. a.PluginBase is the class that Plugin1 and Plugin2 actually subclass, and it is the one providing the output b: [<class 'b.Plugin1'>, <class 'b.Plugin2'>].
Finally, it prints a: [] at the line print('a:', PluginBase.subclasses), because PluginBase.subclasses is actually __main__.PluginBase.subclasses, and you haven't subclassed __main__.PluginBase anywhere.
When you do from b import *, you override __main__.PluginBase with a.PluginBase due to how star imports work*, hence print('a:', PluginBase.subclasses) gives a: [<class 'b.Plugin1'>, <class 'b.Plugin2'>].
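The "one file, two modules" effect can be reproduced without a second script. The sketch below is an illustration under assumptions, not the exact mechanism of the interpreter's startup: it writes a trimmed copy of the class to a temporary file with the hypothetical name demo_a.py, runs it once as __main__ via runpy.run_path, and imports it once as demo_a.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Trimmed copy of the PluginBase class from a.py (print call omitted).
SOURCE = '''
class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo_a.py"  # hypothetical file name
    path.write_text(SOURCE)
    sys.path.insert(0, tmp)
    try:
        # Roughly what "python demo_a.py" does: execute the file as __main__.
        as_main = runpy.run_path(str(path), run_name="__main__")
        # What "from demo_a import PluginBase" triggers: a regular import.
        as_module = importlib.import_module("demo_a")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_a", None)

main_base = as_main["PluginBase"]
module_base = as_module.PluginBase


class Plugin(module_base):  # subclasses the imported copy only
    pass


print(main_base is module_base)                      # False
print(main_base.__module__, module_base.__module__)  # __main__ demo_a
print(main_base.subclasses)                          # []
print(module_base.subclasses == [Plugin])            # True
```

The same source produced two unrelated class objects, each with its own subclasses list, which is why registering on one copy leaves the other empty.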
*from b import * will import all names in b which don't start with an underscore if b doesn't define a list or tuple of strings named __all__; otherwise it imports all names listed in __all__. Since b imports the name PluginBase from a, b has no object called __all__, and the name PluginBase doesn't start with an underscore, from b import * will implicitly include from b import PluginBase, overriding __main__.PluginBase.
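The star-import rule described in the footnote can be checked with an in-memory module. This is a minimal sketch: demo_b is a hypothetical stand-in for b, built with types.ModuleType rather than loaded from a file, and the namespaces ns_star and ns_all play the role of the importing module.

```python
import sys
import types

# Build a stand-in for module b in memory (hypothetical name demo_b).
demo_b = types.ModuleType("demo_b")
exec(
    "class PluginBase: pass\n"
    "class Plugin1(PluginBase): pass\n"
    "_private = 1\n",
    demo_b.__dict__,
)
sys.modules["demo_b"] = demo_b

# Without __all__: every name not starting with an underscore is copied,
# including PluginBase, which demo_b merely happens to contain.
ns_star = {}
exec("from demo_b import *", ns_star)
star_names = sorted(n for n in ns_star if n != "__builtins__")

# With __all__: only the listed names are copied.
demo_b.__all__ = ["Plugin1"]
ns_all = {}
exec("from demo_b import *", ns_all)
all_names = sorted(n for n in ns_all if n != "__builtins__")

del sys.modules["demo_b"]

print(star_names)  # ['Plugin1', 'PluginBase']
print(all_names)   # ['Plugin1']
```

By the same rule, defining __all__ = ['Plugin1', 'Plugin2'] in b.py would stop the star import from rebinding PluginBase, and the a: [] output would come back.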