The import statements in the scikit-learn tutorials take the form
from sklearn.decomposition import PCA
Another version that works is
import sklearn.decomposition
pca = sklearn.decomposition.PCA(n_components = 2)
However
import sklearn
pca = sklearn.decomposition.PCA(n_components = 2)
does not work, and raises
AttributeError: module 'sklearn' has no attribute 'decomposition'
Why is this, and how can I predict which forms will work and which won't, so I don't have to find out by trial and error? Ideally the explanation would apply to Python packages in general, not just scikit-learn.
sklearn doesn't automatically import its submodules. If you want to use sklearn.<SUBMODULE>, then you will need to import it explicitly, e.g. import sklearn.<SUBMODULE>. Then you can use it without any further imports, like result = sklearn.<SUBMODULE>.function(...).
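The same behaviour can be reproduced without scikit-learn. The sketch below builds a throwaway package on disk (the names demo_pkg, sub, and answer are made up for illustration) whose __init__.py is empty, then shows that the submodule only becomes an attribute of the package once it has been imported explicitly.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a hypothetical package: demo_pkg/__init__.py (empty) and demo_pkg/sub.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "sub.py").write_text("def answer():\n    return 42\n")

# Make the temporary directory importable
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg

# Only the package itself has been imported, so the submodule is not an attribute yet
before = hasattr(demo_pkg, "sub")

# Importing the submodule loads it and binds it as an attribute of the package
import demo_pkg.sub

after = hasattr(demo_pkg, "sub")
result = demo_pkg.sub.answer()

print(before, after, result)
```

If demo_pkg/__init__.py contained a line such as "from . import sub", the first check would already succeed. That is the difference between packages where "import pkg" is enough and packages like sklearn where it is not.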
Large packages often behave this way: they don't automatically import all of their submodules.
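One way to check rather than guess is to list a package's direct submodules and see which are already loaded after a plain import of the package. This is only a rough sketch: submodule_status is a helper name invented here, and the standard-library json package stands in for sklearn so the example runs anywhere.

```python
import importlib
import pkgutil
import sys


def submodule_status(package_name):
    """Map each direct submodule name to whether it is already loaded."""
    package = importlib.import_module(package_name)
    status = {}
    for info in pkgutil.iter_modules(package.__path__):
        full_name = f"{package_name}.{info.name}"
        # A loaded submodule is registered in sys.modules under its full name
        status[info.name] = full_name in sys.modules
    return status


status = submodule_status("json")
for name, loaded in sorted(status.items()):
    state = "loaded" if loaded else "needs an explicit import"
    print(f"json.{name}: {state}")
```

Submodules reported as loaded were imported by the package's own __init__.py or by other code that ran earlier in the same process, so the result can differ between programs. Reading the package's __init__.py remains the most direct way to see what it imports for you.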
Loading every submodule automatically would cost memory and slow down start-up; requiring explicit imports keeps memory consumption and import time low. I think namespace clutter is another consideration: explicit imports reduce the chance of naming conflicts and make it clear which functionality is being used.