Strange behaviour using __init_subclass__ with multiple modules

Inspired by an answer about a plugin architecture, I was playing around with PEP-487's subclass registration and found that it led to surprising results when changing the code slightly.

The first step was to split the code from the answer linked above into two files:

$ cat a.py 
class PluginBase:
    subclasses = []
    def __init_subclass__(cls, **kwargs):
        print(f'__init_subclass__({cls!r}, **{kwargs!r})')
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)

if __name__ == '__main__':
    from b import Plugin1, Plugin2
    print('a:', PluginBase.subclasses)

$ cat b.py 
from a import PluginBase
class Plugin1(PluginBase):
    pass

class Plugin2(PluginBase):
    pass

print('b:', PluginBase.subclasses)

$ python a.py
__init_subclass__(<class 'b.Plugin1'>, **{})
__init_subclass__(<class 'b.Plugin2'>, **{})
b: [<class 'b.Plugin1'>, <class 'b.Plugin2'>]
a: []

I found this output surprising: why is PluginBase's subclasses list empty when printed from a.py, but not from b.py?

Intuitively, I would've written the subclass registration line in a.py as

        PluginBase.subclasses.append(cls)

instead of

        cls.subclasses.append(cls)

because I want to operate on PluginBase's subclasses attribute rather than that of the respective Plugin* class, but that change alone didn't give the expected result either.

Then I found that the behaviour could be fixed by simply replacing a.py's line

    from b import Plugin1, Plugin2

with

    from b import *

which, when executing a.py, leads to the output I expected, namely

$ python a.py
__init_subclass__(<class 'b.Plugin1'>, **{})
__init_subclass__(<class 'b.Plugin2'>, **{})
b: [<class 'b.Plugin1'>, <class 'b.Plugin2'>]
a: [<class 'b.Plugin1'>, <class 'b.Plugin2'>]

Can someone enlighten me as to

1. why we write cls.subclasses.append rather than PluginBase.subclasses.append, and
2. what the difference is between from b import * and from b import Plugin1, Plugin2 in this context?

Solution

1. Why we write cls.subclasses.append rather than PluginBase.subclasses.append

These two mean different things, and which one you want depends on your intention:

cls.subclasses.append looks up the attribute subclasses starting from the current class (cls), so upon subclassing PluginBase it executes b.Plugin1.subclasses.append and b.Plugin2.subclasses.append. If, as in your example, you don't override the attribute subclasses on b.Plugin1 and b.Plugin2, then the lookup falls through to PluginBase, so it effectively executes PluginBase.subclasses.append. If you do override the attribute, for example with a new list, then the list printed from PluginBase will not include that class:
```
# b.py
from a import PluginBase

class Plugin1(PluginBase):
    subclasses = []

class Plugin2(PluginBase):
    pass
```
```
>>> print("b:", PluginBase.subclasses)  
b: [<class 'b.Plugin2'>]
```
PluginBase.subclasses.append always appends to the list on PluginBase. It doesn't care what you do with the attribute subclasses in the current class (cls, that is, your b.Plugin1 and b.Plugin2): you can replace it with an empty list, set it to None, and so on. With PluginBase.subclasses.append(cls) in a.py and the same b.py as before, you'll always get b: [<class 'b.Plugin1'>, <class 'b.Plugin2'>].
```
# b.py
from a import PluginBase

class Plugin1(PluginBase):
    subclasses = []

class Plugin2(PluginBase):
    pass
```
```
>>> print("b:", PluginBase.subclasses)  
b: [<class 'b.Plugin1'>, <class 'b.Plugin2'>]
```
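
As a single-file illustration of the cls.subclasses.append case, here is a minimal sketch (the variable names at the bottom are only for demonstration). It shows that a subclass which defines its own subclasses list is appended to that list rather than to the one on PluginBase:

```python
class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Attribute lookup on cls follows the MRO, so this finds the
        # nearest `subclasses` list, which is not necessarily PluginBase's.
        cls.subclasses.append(cls)


class Plugin1(PluginBase):
    subclasses = []  # shadows PluginBase.subclasses for Plugin1 only


class Plugin2(PluginBase):
    pass  # no override, so lookup falls through to PluginBase


base_names = [c.__name__ for c in PluginBase.subclasses]
plugin1_names = [c.__name__ for c in Plugin1.subclasses]
shares_base_list = Plugin2.subclasses is PluginBase.subclasses

print(base_names)        # ['Plugin2']
print(plugin1_names)     # ['Plugin1']
print(shares_base_list)  # True
```

Replacing cls.subclasses.append(cls) with PluginBase.subclasses.append(cls) in this sketch would put both plugins in PluginBase.subclasses and leave Plugin1's own list empty.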

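2. What the difference is between from b import * and from b import Plugin1, Plugin2 in this context?

The behaviour you're seeing comes from how a.py is run rather than from __init_subclass__ itself: the code in a.py ends up being executed twice, under two different module names.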

If you're in a directory containing a.py and you run python a.py, you are implicitly creating a module called __main__ and executing the code in a.py inside this module. This is why the code under if __name__ == "__main__" actually runs - __main__ is the name of the module whose code is being executed, not a.

Hence, running python a.py with the following code:

```
# a.py
class PluginBase:
    subclasses = []
    def __init_subclass__(cls, **kwargs):
        print(f'__init_subclass__({cls!r}, **{kwargs!r})')
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)

if __name__ == '__main__':
    from b import Plugin1, Plugin2
    print('a:', PluginBase.subclasses)
```

1. Executes all of this under a module called __main__, so you'll have a class called __main__.PluginBase.
2. Imports Plugin1 and Plugin2 from b, which (due to from a import PluginBase) executes the code in a.py a second time, now in a module called a. a.PluginBase is the class that registers the plugins and provides the output b: [<class 'b.Plugin1'>, <class 'b.Plugin2'>].
3. Prints an empty list at print('a:', PluginBase.subclasses), because PluginBase there is actually __main__.PluginBase, and you haven't subclassed __main__.PluginBase anywhere.
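
The effect of executing the same source under two module names can be reproduced in a single file. The following is only a sketch: load_as, fake_main and fake_a are hypothetical stand-ins for "a.py run as a script" and "a.py imported as a", not what the import system does internally.

```python
import types

SOURCE = '''
class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)
'''


def load_as(module_name):
    """Execute SOURCE in a fresh module object with the given name."""
    module = types.ModuleType(module_name)
    exec(SOURCE, module.__dict__)
    return module


# The same source executed twice yields two unrelated PluginBase classes.
as_main = load_as("fake_main")
as_a = load_as("fake_a")


class Plugin1(as_a.PluginBase):  # like b.py, which subclasses a.PluginBase
    pass


same_class = as_main.PluginBase is as_a.PluginBase
print(same_class)                             # False
print(len(as_a.PluginBase.subclasses))        # 1
print(len(as_main.PluginBase.subclasses))     # 0
```

Each execution of the class statement creates a new class object with its own subclasses list, which is why registering a plugin on one PluginBase leaves the other one untouched.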

When you do from b import *, the name PluginBase in __main__ is rebound to a.PluginBase due to how star imports work*, hence print('a:', PluginBase.subclasses) gives a: [<class 'b.Plugin1'>, <class 'b.Plugin2'>].

* from b import * imports all names in b that don't start with an underscore, unless b defines a list or tuple of strings named __all__, in which case it imports exactly the names listed in __all__. Since b imports the name PluginBase from a, b has no object called __all__, and the name PluginBase doesn't start with an underscore, from b import * implicitly includes from b import PluginBase, which replaces __main__.PluginBase.
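
The star-import rule can be checked with a small sketch. Here fake_b is a hypothetical in-memory stand-in for b.py, the strings are placeholders for the real classes, and the dictionaries play the role of the __main__ namespace:

```python
import sys
import types

# Hypothetical stand-in for b.py: no __all__, one public and one private name.
fake_b = types.ModuleType("fake_b")
fake_b.PluginBase = "a.PluginBase (re-exported by b)"
fake_b._private = "hidden"
sys.modules["fake_b"] = fake_b

# A namespace that already has its own PluginBase, like __main__ does.
namespace = {"PluginBase": "__main__.PluginBase"}
exec("from fake_b import *", namespace)

rebound = namespace["PluginBase"]
has_private = "_private" in namespace
print(rebound)      # a.PluginBase (re-exported by b)
print(has_private)  # False

# Once __all__ is defined, only the names it lists are imported.
fake_b.__all__ = ["_private"]
restricted = {"PluginBase": "__main__.PluginBase"}
exec("from fake_b import *", restricted)
kept_original = restricted["PluginBase"]
print(kept_original)             # __main__.PluginBase
print("_private" in restricted)  # True

del sys.modules["fake_b"]  # clean up the temporary module
```

This also suggests why the star import only appears to fix the problem: it works because b happens to re-export PluginBase, so the result depends on what b imports at module level.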