machine-learning docker-compose pipeline mlflow

Combine MLflow projects with docker-compose


I face the following situation:

We train our models inside a Docker container, which is built by running a docker-compose file. I have set up MLflow to work with docker-compose (following an approach similar to this post: https://towardsdatascience.com/deploy-mlflow-with-docker-compose-8059f16b6039), which adds two more containers (one for the MLflow server and one for the PostgreSQL backend).

However, the story doesn't end here. Our goal is to implement a full ML pipeline, which includes data creation, preprocessing steps, and so on. I know that MLflow Projects is meant to help build such pipelines. I have seen that it is designed to work with Docker images (https://www.mlflow.org/docs/latest/projects.html), but I don't understand how to use it with docker-compose.

Could you help me with that by sharing any tips, guidelines, documentation, etc.?

Or, more generally, any advice on how a full machine learning pipeline could be implemented using MLflow?

Thanks a lot!


Solution

  • I would suggest training models in a conda environment and only dockerizing for deployment. That way, you can debug model code from an IDE like PyCharm.

    So, create the environment and install the dependencies:

    conda create -n env_name python pip
    conda run -n env_name pip install -r requirements.txt
    

    Here is how I do it, though it is probably more complicated than you need: https://github.com/bdzyubak/torch-control/blob/main/run_setup_all.py

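    MLflow integrates with model training natively; you just need to import it and call autolog.

    import mlflow

    # Automatically log parameters, metrics, and models for supported libraries
    mlflow.autolog()
    mlflow.set_experiment('Energy Use Forecasting')
    with mlflow.start_run():
        ...  # your training code goes here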
    

    https://github.com/bdzyubak/torch-control/blob/main/projects/MachineLearning/energy_use_time_series_forecasting/time_series_forecasting_energy_use.py

    Then, a single command pulls the model down from MLflow and builds a Docker image from it.

    https://mlflow.org/docs/latest/cli.html

    # Build a Docker image from the model logged under the given run
    mlflow models build-docker --model-uri "runs:/some-run-uuid/my-model" --name "my-image-name"
    # Serve the model
    docker run -p 5001:8080 "my-image-name"
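
Once the container is running, you can send it data for prediction. Below is a minimal sketch of a client, not something taken from the answer's linked repositories. It assumes the image serves MLflow's scoring endpoint at `/invocations` on host port 5001 (matching the `docker run` mapping above) and that the server accepts the `dataframe_split` JSON payload used by MLflow 2.x; other versions may expect a different shape. The helper names `build_scoring_request` and `predict` are hypothetical.

```python
import json
import urllib.request

# Assumption: the container was started with `docker run -p 5001:8080 ...`,
# so the scoring endpoint is reachable on host port 5001.
SCORING_URL = "http://localhost:5001/invocations"


def build_scoring_request(columns, rows, url=SCORING_URL):
    """Build (but do not send) a POST request for the scoring server.

    Assumption: the server accepts the MLflow 2.x `dataframe_split` payload.
    """
    payload = {
        "dataframe_split": {
            "columns": list(columns),
            "data": [list(row) for row in rows],
        }
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(columns, rows, url=SCORING_URL, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_scoring_request(columns, rows, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container up, a call such as `predict(["temperature", "hour"], [[21.5, 14]])` would post one row to the model; the column names here are placeholders and must match whatever features your model was trained on.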