TL;DR: How do I read a module's `__all__` definition and dynamically re-export its contents from the package-level `__init__.py`, without actually running any slow code in the module itself?
I am writing a library and have a package structure not unlike this:
```
library/
    package1/
        __init__.py     # sub-package __init__
        _module_a.py
        _module_b.py
    package2/
        __init__.py     # package-level __init__
        subpackage/
            __init__.py # sub-package __init__
            _module_d.py
            _module_e.py
            _module_f.py
            _module_g.py
    __init__.py         # library-level __init__
```
I use `_` prefixes on all my modules because I want to tightly control what the user sees when they call something like `dir(library.package1)`. To that end, I make sure every such module defines an `__all__` list.
For example,
"""Inside of _module_e.py"""
import time
__all__ = ["Foo", "Bar"]
# do computationally intensive stuff
time.sleep(5)
class Foo:
pass
class Bar:
pass
and
"""Inside of _module_f.py"""
import time
__all__ = ["Baz"]
# do more stuff that takes a long time
time.sleep(5)
class Baz:
pass
To make sure that time isn't wasted running all of the computationally expensive code, a user wanting the `Baz` class might normally write `from library.package2.subpackage._module_f import Baz`, but I think this is much clunkier than something nice like `from library.package2.subpackage import Baz`. Clearly, then, I have to do something in the sub-package's `__init__.py` file to enable this desired import behaviour.
Without restructuring my files, is it possible to dynamically import modules as and when they are needed? Should I restructure/refactor my files in some way? Is there some other approach I'm missing?
I know I can define a module-level `__getattr__(name)` in the `__init__.py` file (PEP 562) and use `importlib` to dynamically import from a module, but that still requires me to hand-copy the contents of each module's `__all__` list into the `__all__` list of the `__init__.py` file, like below:
"""Inside of subpackage/__init__.py"""
import importlib
# I have to create the below dictionary and maintain it manually!!!
defined_classes = {
"Foo": "_module_e",
"Bar": "_module_e",
"Baz": "_module_f"
}
__all__ = [] + list(defined_classes.keys())
def __dir__():
return __all__
def __getattr__(name):
if name in defined_classes:
file = defined_classes[name]
return getattr(_importlib.import_module(f'library.package2.subpackage.{file}'), name)
else:
try:
return globals()[name]
except KeyError:
raise AttributeError(f"Module 'subpackage' has no attribute '{name}'")
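With this in place, `from library.package2.subpackage import Baz` does work and imports only `_module_f`, on first access, so the other slow modules are never touched.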
I'm sure I could write a quick and dirty function to `with open(filename) as f:` and parse the lines of each module file until I find something that looks like an `__all__` list, to procedurally generate my `defined_classes` mapping, but I don't know the best way of doing this (or whether Python already has a better, native solution).
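For what it's worth, one way to do that parsing without executing any module code is the standard-library `ast` module: parse each file's source, find the top-level `__all__` assignment, and evaluate the list literal with `ast.literal_eval`. A minimal sketch (the `_read_all` helper is my own name, and it assumes the `_module_*.py` naming convention from above):

```python
"""A sketch: build defined_classes by statically parsing each module's __all__."""
import ast
from pathlib import Path

def _read_all(path):
    """Return the __all__ list from a module's source without importing it."""
    tree = ast.parse(Path(path).read_text())
    for node in tree.body:  # only look at top-level statements
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "__all__":
                    # literal_eval evaluates the list literal only; no code runs
                    return ast.literal_eval(node.value)
    return []

# map every exported name to the private module that defines it
defined_classes = {
    name: path.stem
    for path in Path(__file__).parent.glob("_module_*.py")
    for name in _read_all(path)
}
```

This keeps each module's `__all__` as the single source of truth, at the cost of one file parse per module when the package is first imported.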
In the end, I came to the conclusion that it was probably best to refactor my modules instead, so that the slow code only runs inside functions that are called (and cached) as needed; the burden of deciding when to run this code shouldn't fall on `__init__.py`.
Another benefit of not messing with `__init__.py` is that my IDE better understands what's going on and provides the relevant tooltips.
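As a rough sketch of what that refactor looks like (the helper name and return value are illustrative; `functools.cache` needs Python 3.9+, otherwise `functools.lru_cache(maxsize=None)` does the same job):

```python
"""Inside of _module_f.py, after the refactor: importing this module is now cheap."""
import functools
import time

__all__ = ["Baz"]

@functools.cache  # run the slow setup once, on first use, then reuse the result
def _expensive_setup():
    time.sleep(5)  # stands in for the computationally intensive work
    return {"ready": True}

class Baz:
    def __init__(self):
        # the cost is paid only when the first Baz is constructed
        self._state = _expensive_setup()
```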