docker, containers, popper

Unable to use Python modules across separate steps in a Popper workflow


I have a requirements.txt file that I use when executing the following workflow:

steps:
  - id: install-python-modules
    uses: popperized/python-actions@master
    args:
    - pip install -r requirements.txt

  - id: run-script
    uses: popperized/python-actions@master
    args:
    - python my_script.py

The problem I have is that when the run-script step runs, it doesn't have the modules that were installed in the first step.


Solution

  • The problem is that when pip install runs as part of the first step, it installs the Python modules under /usr/local inside the container. Since the second step instantiates a new container, those modules are not available there. This is explained in more detail in the official Popper documentation, which describes the relationship between the namespaces of the container and those of the host machine where the workflow resides.
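    To see this for yourself, a small sketch like the following could be run inside a step's container. It only reports where the current interpreter installs packages; describe_install_location is a hypothetical helper name, not part of Popper or pip. In the official python images the reported path should be somewhere under /usr/local, i.e. inside the container's own filesystem rather than the shared workflow folder.

```python
import sys
import sysconfig


def describe_install_location():
    """Report where this interpreter installs pure-Python packages.

    Hypothetical helper for illustration only.
    """
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        # Directory that `pip install` targets by default for this interpreter.
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in describe_install_location().items():
        print(f"{key}: {value}")
```

    If site_packages is not inside the folder that both steps share, whatever the first step installs is discarded along with its container.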

    Using the official Python Docker image and managing the virtualenv explicitly could help:

    steps:
    - id: install-requirements
      uses: docker://python:3.8-slim-buster
      runs: [bash, -uec]
      args:
      - |
        python -m venv venv/
        source venv/bin/activate
        pip install -r requirements.txt
    
    - id: run-sim
      uses: docker://python:3.8-slim-buster
      runs: [bash, -uec]
      args:
      - |
        source venv/bin/activate
        python my_script.py
    

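    The first step creates a virtual environment (a venv/ folder) in the folder where popper run is invoked and installs the requirements there. Because that folder lives on the host and is shared with each step's container, venv/ is still there when the next step starts. The second step activates the environment before running the script.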
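    If you want the script to fail early when the environment was not activated, a guard along these lines could go at the top of my_script.py. This is only a sketch: in_virtualenv and require_virtualenv are hypothetical names, and the check relies on sys.prefix differing from sys.base_prefix, which holds for environments created with python -m venv.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running from a virtual environment."""
    # Inside a venv, sys.prefix points at the venv folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def require_virtualenv():
    """Raise a clear error instead of a later ModuleNotFoundError."""
    if not in_virtualenv():
        raise RuntimeError(
            "No virtual environment is active; "
            "run 'source venv/bin/activate' before starting the script."
        )
```

    Calling require_virtualenv() before the third-party imports turns a missing activation into an explicit message rather than an import error.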

    You might be wondering how python:3.8-slim-buster was selected. In general, there are three types of official images on Docker Hub, based on 1) full Debian, 2) Debian slim and 3) Alpine. I usually use the latest stable release of Python (3.8 as of today) and start with Alpine because it has a tiny footprint compared to the Debian images. If the requirements rely on system libraries (e.g. numpy), though, Alpine usually doesn't work because there are no pre-compiled binaries (wheels) for it, so I move to the slim variant of the Debian-based images.