I am currently packaging my own module for distribution. In general everything is working fine, but fine-tuning/best-practice for structuring sub-modules is giving me some trouble.
Assuming a module structure of:
mdl
├── mdl
│   ├── __init__.py
│   ├── core.py
│   ├── sub_one
│   │   ├── __init__.py
│   │   └── core_sub_one.py
│   └── sub_two
│       ├── __init__.py
│       └── core_sub_two.py
├── README
└── setup.py
With the header of core.py starting with:
import numpy as np
# ... some fairly large module code ...
And the headers of both core_sub_one.py and core_sub_two.py starting with:
import numpy as np
from .. import core as cr
So all submodules require np and cr.
The mdl/__init__.py (core-level) looks like:
from . import sub_one as so
from . import sub_two as st
And the __init__.py of both submodules looks like (replace one with two for the other submodule):
from . import core_sub_one
from .core_sub_one import *
I've "learnt" this structure from numpy, see e.g. numpy/ma/__init__.py.
Now I've got some trouble with submodule access after running setup.py and importing my module with import mdl.
I can now access my submodules with e.g. mdl.so.some_function_in_sub_one(). This is expected and what I want.
But I can also access the top-level module cr and numpy with mdl.so.cr and mdl.so.np, which I want to avoid. Is there any way to avoid this?
If not: Is there any drawback of importing/connecting modules and submodules like this?
And is there any best practice for how to import libraries like numpy in sub-modules, when they are required in all submodules?
Edit:
Since some seem to have trouble with the fact that asking for best practice is opinion based (which I know and which I intended, since imho most design decisions in real life are not clear binary 1-0 decisions), I have to add:
I want to comply with the module packaging style used in the scipy, and more specifically numpy, package environment. So if these packages found a solution for any of the questions I asked, this will be the most welcome solution for me.
First things first:
from .core_sub_one import *
DON'T DO THIS. Yes, even if you've seen it in some "big name" package, read it in some tutorial, or whatever. This is officially considered bad practice (PEP 8 recommends avoiding wildcard imports), and for good reasons: from experience, it's maintenance hell.
If you really, really insist on doing this (but seriously, don't), at least define an explicit __all__ variable in those modules so you keep the exposed names under control (it also helps document what's supposed to be part of the module's API).
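To illustrate what __all__ buys you, here is a minimal sketch. It builds a throwaway in-memory module (the name core_sub_one_demo and the helper function are made up for the demo, and math stands in for numpy so the snippet runs without any third-party install) and then star-imports from it:

```python
import sys
import types

# Hypothetical stand-in for core_sub_one.py, built in memory for the demo.
# "math" plays the role of numpy here so no third-party package is needed.
source = '''
import math as np

__all__ = ["some_function_in_sub_one"]

def some_function_in_sub_one():
    return np.sqrt(16.0)

def helper():
    return "not part of the public API"
'''

module = types.ModuleType("core_sub_one_demo")
exec(source, module.__dict__)
sys.modules["core_sub_one_demo"] = module

# Simulate what "from .core_sub_one import *" does inside a package __init__.py
namespace = {}
exec("from core_sub_one_demo import *", namespace)

# Ignore dunder entries such as __builtins__ that exec() adds on its own
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Only the names listed in __all__ end up in the importing namespace, so neither np nor helper leaks through the star import. They are still reachable on the module object itself, of course.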
But I can also access the top level module cr and numpy with mdl.so.cr and mdl.so.np, which I want to avoid. Is there any way to avoid this?
Not really. If you're really worried about it, you can import those names as "protected" in your submodules:
# core_sub_xxx.py
import numpy as _np
from .. import core as _cr
(Of course you'll have to replace all occurrences of 'np' and 'cr' with '_np' and '_cr', but any half-decent text editor can do this.)
This doesn't prevent access to mysubmodule._cr or mysubmodule._np, but at least it makes it clear that one should NOT access those names.
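As a side effect, underscore-prefixed names are also skipped by a star import when the module defines no __all__. A small sketch of that behaviour, again with a made-up in-memory module (core_sub_demo is a hypothetical name, and math stands in for numpy):

```python
import sys
import types

# Hypothetical stand-in for a core_sub_xxx.py that uses a "protected" alias.
source = '''
import math as _np   # stands in for "import numpy as _np"

def some_function_in_sub_one():
    return _np.floor(2.5)
'''

module = types.ModuleType("core_sub_demo")
exec(source, module.__dict__)
sys.modules["core_sub_demo"] = module

# Simulate "from .core_sub_xxx import *" in the submodule's __init__.py
package_namespace = {}
exec("from core_sub_demo import *", package_namespace)

star_has_np = "_np" in package_namespace   # was the alias re-exported?
module_has_np = hasattr(module, "_np")     # is it still on the module itself?
print(star_has_np, module_has_np)
```

So the alias does not get copied into the package namespace by the star import, but anyone who reaches into the module directly can still get at it. It is a convention, not access control.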
But really, this is not a big issue, as long as your API is clearly documented.