jupyter-notebook google-colaboratory

Need to restart the runtime before importing an installed package in Colab


I am trying to install and use an existing Python package in Google Colab. To do this, I download the code from GitHub into Colab and install the package, but when I try to import the installed package, I get the error ModuleNotFoundError: No module named 'gem'.

However, if I restart the runtime and rerun the import cell, no error appears.

I am wondering why I need to restart the runtime after installing the package and before importing.

Any clever response will be much appreciated.

My code is:

[1] !wget --show-progress --continue -O /content/gem.zip https://github.com/palash1992/GEM/archive/master.zip

[2] !unzip gem.zip

# Installing Dependencies
[3] ! pip install keras==2.0.2

[4] %cd GEM-master
!sudo python3 setup.py install
%cd -

[5] from gem.utils import graph_util, plot_util

And the error that I get is:

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-5-af270a37878a> in <module>()
      1 import matplotlib.pyplot as plt
      2 
----> 3 from gem.utils import graph_util, plot_util
      4 from gem.evaluation import visualize_embedding as viz
      5 from gem.evaluation import evaluate_graph_reconstruction as gr

ModuleNotFoundError: No module named 'gem'

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

However, if I restart the runtime using os.kill(os.getpid(), 9) after installing the package and before importing it, then the above error does not appear.
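For anyone reproducing this, here is a minimal sketch of a check that can be run in a cell before restarting, to see whether the running interpreter can locate the package at all. The helper names is_importable and path_entries_matching are made up for illustration and are not part of Colab or GEM.

```python
import importlib
import importlib.util
import sys


def is_importable(name):
    """Return True if the running interpreter can locate top-level module `name`."""
    # Drop cached directory listings so freshly written files are noticed.
    importlib.invalidate_caches()
    return importlib.util.find_spec(name) is not None


def path_entries_matching(fragment):
    """Return the sys.path entries whose text contains `fragment` (case-insensitive)."""
    return [p for p in sys.path if fragment.lower() in p.lower()]


print("gem importable:", is_importable("gem"))
print("sys.path entries mentioning 'gem':", path_entries_matching("gem"))
```

If the first line reports False and the second list is empty right after the install cell, the install directory is probably not on sys.path of the running process yet, which would be consistent with the error going away after a restart.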


Solution

  • It seems that anything other than a plain !pip install is not picked up by Colab's module registry until the runtime is restarted. Likely, Colab has a fairly naive way of keeping track of available modules. You also have to restart the runtime if you want to import a different version of a package that was already installed.

    Probably there is just a script that appends the metadata of pip-installed packages to a list-like object during runtime, and imports simply search from the top of that list (which is why a restart is required for different versions of a package). In standard Python terms, the closest match to such a list is sys.path, together with the sys.modules cache of modules that have already been imported.

    However, when a new runtime is started, the list-like registry is initialized and populated by searching the relevant directories.
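The two observations above can be reproduced with plain Python, outside Colab. The sketch below is not Colab's actual mechanism. It assumes the install placed the package in an egg-style directory that is only reachable through a .pth file, which setuptools' setup.py install commonly does. The package name gem_demo_pkg, the directory names and the helper functions are all hypothetical.

```python
import importlib
import os
import site
import sys
import tempfile

PKG = "gem_demo_pkg"  # hypothetical package name used only for this demo


def write_package(root, version):
    """Create (or overwrite) a tiny package under `root` exposing __version__."""
    pkg_dir = os.path.join(root, PKG)
    os.makedirs(pkg_dir, exist_ok=True)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(f"__version__ = {version!r}\n")


def can_import(name):
    """Return True if `name` can be imported by the running interpreter."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


# A stand-in for site-packages: the package lives in an egg-style
# subdirectory that is listed in a .pth file, not in site_dir itself.
site_dir = tempfile.mkdtemp()
egg_dir = os.path.join(site_dir, "gem_demo_pkg-1.0.egg")
os.makedirs(egg_dir)
write_package(egg_dir, "1.0")
with open(os.path.join(site_dir, "demo.pth"), "w") as f:
    f.write(egg_dir + "\n")

# 1) .pth files are processed by the `site` module at interpreter startup,
#    so a process that was already running does not see the new entry.
before = can_import(PKG)
site.addsitedir(site_dir)  # process the .pth file now, as a fresh start would
after = can_import(PKG)

# 2) Once imported, the module object is cached in sys.modules, so a newer
#    version on disk is ignored until the module is reloaded or the process
#    restarts. The new version string has a different length on purpose, so
#    the cached bytecode is detected as stale.
module = sys.modules[PKG]
write_package(egg_dir, "2.0.0")
cached_version = importlib.import_module(PKG).__version__
reloaded_version = importlib.reload(module).__version__

print(before, after, cached_version, reloaded_version)
```

In a notebook, calling site.addsitedir on the real site-packages directory (or appending the installed directory to sys.path) may therefore avoid the restart for a first import, while switching versions of an already-imported package generally still needs importlib.reload or a restart. Installing with !pip install . from the project directory instead of setup.py install is another option worth trying, though I have not verified it for this package.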