Hi, I am trying to build live prediction with YOLO. The goal is to stream the data, with some kind of transformation applied, from the inference to a final frontend.
The flow should roughly be: inference → ETL/transformation → backend → frontend.
The idea is to put everything into a microservice structure to have independent and scalable components. I know that at a bigger scale an architecture with Kafka and Spark would be more efficient, but for this project I want to stay with a microservice architecture.
My problem now is that I want to share some utils and also some schemas between the services. My idea is to build a base container and use it as the base image for all containers that need the schemas. Since everything should end up in one product, I also want to keep it in a monorepo.
I also know that schema sharing between microservices is not best practice, but for this use case it would really help.
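For example, one of the shared schemas could be something simple like this (just illustrative; the Pydantic dependency and the fields are placeholders):

# services/shared/schemas/stats.py -- simplified example
from datetime import datetime

from pydantic import BaseModel


class Stats(BaseModel):
    """Aggregated detection stats for a single processed frame."""

    frame_id: int
    timestamp: datetime
    detections: int
    mean_confidence: float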
Here is a simplified idea of the structure (here with videos that get processed live):
.
├── data/
│   ├── weights
│   ├── model_data
│   └── inference_tests
├── model_training/
│   ├── train.py
│   ├── prep.py
│   └── eval.py
├── services/
│   ├── shared/
│   │   ├── Dockerfile
│   │   ├── schemas/
│   │   │   ├── stats.py
│   │   │   └── raw_data.py
│   │   └── db_utils
│   ├── inference/
│   │   ├── Dockerfile
│   │   ├── pyproject.toml
│   │   ├── main.py
│   │   └── src/
│   │       └── all_stuff.py
│   ├── etl_process/
│   │   ├── Dockerfile
│   │   ├── pyproject.toml
│   │   ├── main.py
│   │   └── src/
│   │       └── all_stuff.py
│   ├── backend_for_frontend/
│   │   ├── Dockerfile
│   │   ├── pyproject.toml
│   │   ├── main.py
│   │   └── src/
│   │       └── all_stuff.py
│   └── frontend/
│       ├── Dockerfile
│       ├── pyproject.toml
│       ├── main.py
│       └── src/
│           └── all_stuff.py
└── docker-compose.yaml
In the end I want to combine everything with docker-compose like this:
version: "3.8"
services:
  # Base image for shared code
  shared-base:
    build:
      context: ./services/shared
      dockerfile: Dockerfile.base
    image: shared-base-image

  db:
    image: postgres:13
    volumes:
      - ./.local/postgresql/data:/var/lib/postgresql/data # .local is in the .gitignore
    environment:
      POSTGRES_USER: myuser
      POSTGRES_PASSWORD: mypassword
      POSTGRES_DB: mydb
    ports:
      - "5432:5432"

  inference:
    build:
      context: ./services/inference
      dockerfile: Dockerfile
    depends_on:
      - shared-base
      - db
    volumes:
      - ./data/video:/input_videos # not a stream yet
    environment:
      DB_HOST: db
      DB_USER: myuser
      DB_PASSWORD: mypassword
      DB_NAME: mydb

  etl-process:
    build:
      context: ./services/etl_process
      dockerfile: Dockerfile
    depends_on:
      - shared-base
      - db
    environment:
      DB_HOST: db
      DB_USER: myuser
      DB_PASSWORD: mypassword
      DB_NAME: mydb

  backend:
    build:
      context: ./services/backend_for_frontend
      dockerfile: Dockerfile
    depends_on:
      - shared-base
      - db
    ports:
      - "8000:8000"
    environment:
      DB_HOST: db
      DB_USER: myuser
      DB_PASSWORD: mypassword
      DB_NAME: mydb

  frontend:
    build:
      context: ./services/frontend
      dockerfile: Dockerfile
    ports:
      - "3000:3000"

volumes:
  db_data:
To have the shared modules and schemas I want to build the base container and use it as the base image for all other containers that need the shared schemas and utils.
Here is how I want to implement it:
FROM python:3.9-slim-buster
WORKDIR /app
COPY . /shared
And then in a Dockerfile of a service that depends on it:
FROM shared-base-image
RUN pip install uv
COPY . .
ENTRYPOINT ["uv", "run", "main.py"]
Now my final question: what would be the final structure for this workflow and design? Are there design patterns that would really help here?
With this structure I also face the issue that I cannot easily run the scripts and modules outside the container. Does it make sense to append to sys.path depending on whether the path exists?
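What I mean is roughly this (just a sketch; the Stats import refers to the example schema above):

# e.g. at the top of services/inference/main.py -- rough idea
import sys
from pathlib import Path

for candidate in (
    Path(__file__).resolve().parent.parent / "shared",  # running directly from the monorepo
    Path("/shared"),                                     # running inside the container
):
    if candidate.exists():
        sys.path.append(str(candidate))
        break

from schemas.stats import Stats  # now works in both environments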
I could also have only one big src folder, but then all services would share the same dependencies, which would also be overhead.
Thanks already for your help, and I hope you have some input to improve the structure. It is mainly about the design and useful design patterns.
You should treat the shared library as an ordinary Python library. It does not need a Dockerfile, but it does need its own pyproject.toml, and then your other services can depend on it normally.
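A minimal services/shared/pyproject.toml could look roughly like this (the project name, the Pydantic dependency, and the Hatchling configuration are assumptions chosen to match the layout above):

# services/shared/pyproject.toml -- sketch
[project]
name = "shared"
version = "0.1.0"
dependencies = [
    "pydantic",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

# Package the existing top-level directories as-is; adjust to your actual layout.
[tool.hatch.build.targets.wheel]
packages = ["schemas", "db_utils"]

On the service side you can then reference it, for example as a uv path source: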
# services/inference/pyproject.toml
[project]
name = "inference"
version = "0.1.0"
dependencies = [
    "shared",
    ...
]

[tool.uv.sources]
shared = { path = "../shared" }
This introduces the case where a Dockerfile needs to include content from outside its own directory. In the Compose file you need to change the build: context: to point to some parent directory, and change the dockerfile: to point back into the subdirectory:
services:
  inference:
    build:
      context: services
      dockerfile: inference/Dockerfile
and also change the Dockerfile COPY statements to reference the subdirectory:
FROM python:3.13-slim
# Install uv
# https://docs.astral.sh/uv/guides/integration/docker/#installing-uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
# Copy in the application and its libraries
WORKDIR /app
COPY shared/ shared/
COPY inference/ inference/
# Build it
WORKDIR /app/inference
RUN uv sync --frozen
ENV PATH=/app/inference/.venv/bin:$PATH
# Metadata to run it
CMD ["inference"]
In this setup there is not a "base Dockerfile". This pattern isn't supported well by Compose. Your shared Python code probably isn't large, and so long as the first several lines of the Dockerfile are the same across your various services, the underlying Docker image layers can be shared.
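For example, the etl_process service's Dockerfile could be nearly identical (a sketch; the etl-process entry point name is an assumption), so the first few layers come out the same and can be shared between the images:

# services/etl_process/Dockerfile -- sketch; note the identical first instructions
FROM python:3.13-slim
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

WORKDIR /app
COPY shared/ shared/
COPY etl_process/ etl_process/

WORKDIR /app/etl_process
RUN uv sync --frozen
ENV PATH=/app/etl_process/.venv/bin:$PATH

CMD ["etl-process"]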
I would also explore the merits of using only a single image. In your root directory you could have a pyproject.toml that depends on all of the subprojects, which would also bring in their Python entry point scripts. To the extent that you have large dependencies, this probably requires less disk space: a container shares space with its image, and you'll only have one copy of each dependency regardless of how many projects use them. A commit anywhere in your repo then produces a single new image.
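With uv that could be a workspace rooted at the repository top level, sketched below (the member names must match whatever each service's pyproject.toml declares):

# pyproject.toml at the repo root -- sketch
[project]
name = "app"
version = "0.1.0"
dependencies = [
    "shared",
    "inference",
    "etl-process",
    "backend-for-frontend",
]

[tool.uv.workspace]
members = ["services/*"]

[tool.uv.sources]
shared = { workspace = true }
inference = { workspace = true }
etl-process = { workspace = true }
backend-for-frontend = { workspace = true }

A single uv sync at the root then produces one environment (and, in Docker, one image) containing every service plus the shared library.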
There also may be some value in splitting this setup up into separate repositories. If you can use a Python package repository or upload your library to PyPI, then you can use a simpler Dockerfile. You also won't be forced to rebuild and restart your frontend because the ETL job changed. The downside, such as it is, is that it's harder to make cross-service breaking changes, but this hopefully is a rare event (and proper semantic versioning on your library can mitigate the issues somewhat).
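In that case each service can go back to using its own directory as the build context, with a Dockerfile along these lines (a sketch, assuming the shared library is published as shared to an index that uv is configured to use):

# services/inference/Dockerfile -- sketch for the published-library variant
FROM python:3.13-slim
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

WORKDIR /app
COPY . .
RUN uv sync --frozen

ENV PATH=/app/.venv/bin:$PATH
CMD ["inference"]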