My project had been up and running in a Kubernetes container for a while... until I decided to "clean up" the ad-hoc sys.path.append calls I had at the top of my modules. This included describing my dependencies in pyproject.toml and ditching setup.py altogether; the old setup.py imported setuptools and called setup() when run as __main__.
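For context, the setup.py that got dropped was roughly this shape (a sketch of the pattern, not the actual file; the name, version, and excludes here are placeholders):

# setup.py -- roughly what was removed; illustrative only
from setuptools import find_packages, setup

if __name__ == "__main__":
    setup(
        name="tnc",        # placeholder name
        version="0.0.1",   # placeholder version
        packages=find_packages(exclude=["res", "notes"]),
        # install_requires has since moved to the dependencies list in pyproject.toml
    )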
The design intent is not to run anything in /app/tnc as a script, but rather to treat it as a collection of modules, i.e. a package. The only part of the codebase that serves as a __main__ is the api.py file, which initializes and fires up Flask.
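To make that concrete, bin/api is a thin launcher along these lines (a simplified sketch; the real file registers routes and reads config):

# bin/api -- simplified sketch of the launcher; the real file does more
import logging

from flask import Flask
from tnc.www import routes  # absolute import from the tnc package (illustrative)

app = Flask(__name__)

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    app.run()  # the real file pulls host/port from config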
I have a lean deployment setup that consists of the following:
/opt/venv
/app/tnc
/app/bin/api
I kick off the Flask app with: python /app/bin/api.
The build takes place in the python:3.11-slim Docker image. Here I install the recommended gcc and specify the following in the Dockerfile:
# build
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY pyproject.toml pyproject.toml
RUN pip3 install -e .
(Aside: better would be to invoke pip through the interpreter, i.e. python -m pip install -e . instead of calling pip3 directly.)
I then copy the following from the build into my runtime image.
# runtime
ENV PATH "/opt/venv/bin:$PATH"
ENV PYTHONPATH "/opt/venv/bin:/app/tnc"
COPY --chown=appuser:appuser bin bin
COPY --chown=appuser:appuser tnc tnc
COPY --chown=appuser:appuser config.py config.py
COPY --from=builder /opt/venv/ /opt/venv
As I mentioned, in the Kubernetes deployment I fire up the container with:
command: ["python3"]
args: ["bin/api"]
Firing up the container in such a way that I can run the Python REPL: import flask generates the AttributeError ...replace(' -> None', '') shown below. If I remove /app/tnc from the PYTHONPATH, import flask works, but the app itself then fails with ModuleNotFound ... no tnc. Both are shown here:
AttributeError ...replace(' -> None', '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/venv/lib/python3.10/site-packages/werkzeug/__init__.py", line 2, in <module>
    from .test import Client as Client
  File "/opt/venv/lib/python3.10/site-packages/werkzeug/test.py", line 35, in <module>
    from .sansio.multipart import Data
  File "/opt/venv/lib/python3.10/site-packages/werkzeug/sansio/multipart.py", line 19, in <module>
    class Preamble(Event):
  File "/usr/local/lib/python3.10/dataclasses.py", line 1175, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
  File "/usr/local/lib/python3.10/dataclasses.py", line 1093, in _process_class
    str(inspect.signature(cls)).replace(' -> None', ''))
AttributeError: module 'inspect' has no attribute 'signature'
ModuleNotFoundError: No module named 'tnc'
appuser@tnc-py-deployment-set-1:/app$ echo $PYTHONPATH
/opt/venv/bin
appuser@tnc-py-deployment-set-1:/app$ echo $PATH
/opt/venv/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
appuser@tnc-py-deployment-set-1:/app$ python -m /app/bin/api
/opt/venv/bin/python: No module named /app/bin/api
appuser@tnc-py-deployment-set-1:/app$ python /app/bin/api
Traceback (most recent call last):
File "/app/bin/api", line 12, in <module>
from tnc.s3 import S3Session
ModuleNotFoundError: No module named 'tnc'
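(Side note on the python -m attempt above: -m expects a dotted module name that can be found on sys.path, e.g. bin.api, not a filesystem path, which is why it reports "No module named /app/bin/api".)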
The project layout:

├── bin
│   └── api
├── config.py
├── pyproject.toml
└── tnc
    ├── __init__.py
    ├── data
    │   ├── __init__.py
    │   ├── download.py
    │   ├── field_types.py
    │   └── storage_providers
    ├── errors.py
    ├── inspect
    │   ├── __init__.py
    │   └── etl_time_index.py
    ├── test
    │   ├── __init__.py
    │   └── test_end-to-end.py
    ├── utils.py
    └── www
        ├── __init__.py
        └── routes
            ├── __init__.py
            ├── feedback.py
            ├── livez.py
            └── utils.py
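Worth flagging in that tree: tnc contains an inspect subpackage. With /app/tnc on the PYTHONPATH, a bare import inspect can resolve to tnc/inspect instead of the standard library module, which would line up with the AttributeError above (dataclasses reaching for inspect.signature and not finding it). A minimal sketch of that kind of shadowing, using a throwaway directory rather than the real app:

# Demonstrates stdlib shadowing with a scratch package named "inspect" (hypothetical paths)
import os
import pathlib
import subprocess
import sys

demo = pathlib.Path("/tmp/shadow_demo")
(demo / "inspect").mkdir(parents=True, exist_ok=True)
(demo / "inspect" / "__init__.py").write_text("")

# With the scratch dir on PYTHONPATH, "import inspect" finds the empty package
# before the stdlib module, so inspect.signature disappears -- the same failure
# mode dataclasses (and therefore werkzeug/flask) hits in the container.
result = subprocess.run(
    [sys.executable, "-c", "import inspect; print(hasattr(inspect, 'signature'))"],
    env={**os.environ, "PYTHONPATH": str(demo)},
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # -> False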
And the relevant parts of my pyproject.toml:

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[tool.setuptools.packages.find]
where = ["./"]
exclude = [ "res", "notes" ]
dependencies = [ ... with version specs ]
First, I have to shout out to the pyproject.toml + setuptools team: the documentation and implementation have gotten good. It allowed me to get a lot more specific and "deterministic" :)) about my setup, not to mention a bit more aggressive in the build process.
The fix included the following:

I updated pyproject.toml with a package-dir mapping and an entry point:

[tool.setuptools.package-dir]
tnc = "tnc"
bin = "bin"

# entry point (not required, but ergonomic)
[project.scripts]
run-api = "bin.api:main"
I included an __init__.py to mark each subpackage.
I moved the config.py file into the bin directory; this location captures my design intent. Changes to the api.py file:

# instantiate the config object using a string ref to config.py
app.config.from_object("bin.config.DevelopmentConfig")
...
# added a def main() to enable the option of specifying an entry point
if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    app.run(host=app.config['HOST'], port=app.config['PORT'])

def main():
    """ if using entrypoint script """
    logging.basicConfig(level=logging.DEBUG)
    app.run(host=app.config['HOST'], port=app.config['PORT'])
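For that string reference to resolve, bin/config.py needs a class roughly like this (a sketch; HOST and PORT are the only attributes the snippets above rely on, the rest is assumed):

# bin/config.py -- sketch; only HOST/PORT are implied by the api.py snippet
class Config:
    HOST = "0.0.0.0"   # placeholder values
    PORT = 5000

class DevelopmentConfig(Config):
    DEBUG = True

And with the run-api = "bin.api:main" entry from the pyproject.toml above, pip install also drops a run-api console script into the venv's bin/, which calls main() directly; python /app/bin/api.py still goes through the __main__ block.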
I set the PYTHONPATH env value to "/app", the location of the tnc and bin directories. By no means a best practice, but given my determination to keep bin separate from tnc, it was the only option that made sense for this use case. (It also keeps the tnc/inspect subpackage off the import path, where it could shadow the standard library's inspect module, which is what the earlier AttributeError pointed to.)

Finally, while there are a few well-known techniques to maximize the reuse of the cache when building the Docker image, I wanted to call out how easy it was to know precisely what was going on during the build, made possible by the latest setuptools configured with pyproject.toml.
A. It was trivial to first run the build using an empty stub for where the app code would eventually go:
# pyproject.dependencies.toml
packages = ["tnc"]
...paired with the two-phase build below (the base is an official Docker Python image). Because the dependency layers are keyed off pyproject.dependencies.toml alone, they stay cached until the dependency list itself changes:
# Make sure to use the venv from the python base img:
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# phase 1: dependency build using an empty project dir
COPY pyproject.dependencies.toml pyproject.toml
RUN mkdir tnc
RUN pip3 install .
# phase 2: full and final build
COPY bin bin
COPY tnc tnc
COPY pyproject.toml pyproject.toml
RUN pip3 install .
B. It was clear what to copy from the now-consolidated build artifacts into my image used for distribution:
COPY --from=builder --chown=appuser:appuser /app/build/lib/tnc tnc
COPY --from=builder --chown=appuser:appuser /app/build/lib/bin bin
COPY --from=builder --chown=root:root /opt/venv/ /opt/venv
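(/app/build/lib is where setuptools stages the built packages when pip install . runs in the /app project directory, which is why both tnc and bin show up there ready to copy.)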
In the kube deployment, despite being able to call the entry point configured via pyproject.toml, I chose to run api.py as a script:
# in the kube deployment for the image
command: ["python"]
args: ["/app/bin/api.py"]
I now have an improved design that no longer includes "ad-hoc" calls to sys.path, nor resorts to "polluting" the PYTHONPATH with a string of entries. The single entry I now have, /app, conveys an important design choice: wanting the entry point to live in its own root directory, separate from the package.