Tags: docker, continuous-integration, jenkins-pipeline, continuous-deployment, dockerhub

Proper way to build a CI/CD pipeline with Docker images and docker-compose


I have a general question about DockerHub and GitHub. I am trying to build a pipeline on Jenkins using AWS instances, and my end goal is to deploy the docker-compose.yml that lives in my GitHub repo:

version: "3"

services:
  db:
    image: postgres
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
    volumes:
      - ./tmp/db:/var/lib/postgresql/data
  web:
    build: .
    command: bash -c "rm -f tmp/pids/server.pid && bundle exec rails s -p 3000 -b '0.0.0.0'"
    volumes:
      - .:/myapp
    ports:
      - "3000:3000"
    depends_on:
      - db
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_HOST: db

I've read that in CI/CD pipelines people build their images and push them to DockerHub, but what is the point of that?

You would just be pushing an individual image. Even if you pull that image later on a different instance, in order to run the app with its other services you still need to run docker-compose, and you wouldn't have the docker-compose.yml unless you pulled it from the GitHub repo again or recreated it in the pipeline, right?

Wouldn't it be simpler and more straightforward to just fetch the repo from GitHub and run docker-compose commands? Is there a "cleaner" or more "proper" way of doing it? Thanks in advance!


Solution

  • The only thing you should need to copy to the remote system is the docker-compose.yml file. And even that is technically optional, since Compose just wraps basic Docker commands; you could manually docker network create and then docker run the two containers without copying anything at all.
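
    As a rough sketch of what that manual approach would look like (assuming the web image has already been built and pushed somewhere; registry.example.com/me/web is just a placeholder name, matching the fragment further down):

    docker network create myapp
    docker run -d --name db --net myapp \
      -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password \
      -v "$PWD/tmp/db:/var/lib/postgresql/data" \
      postgres
    docker run -d --name web --net myapp \
      -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password -e POSTGRES_HOST=db \
      -p 3000:3000 \
      registry.example.com/me/web:latest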

    For this setup it's important to delete the volumes: that overwrite the image's content with a copy of the application code from the host. You also shouldn't need to override command:. For the deployment, you'd replace build: with image:.

    version: "3.8"
    services:
      db: *from-the-question
      web:
        image: registry.example.com/me/web:${WEB_TAG:-latest}
        ports:
          - "3000:3000"
        depends_on:
          - db
        environment: *web-environment-from-the-question
        # no build:, command:, volumes:
    

    In a Compose setup you could put the build: configuration in a parallel docker-compose.override.yml file that wouldn't get copied to the deployment system.
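
    A minimal sketch of what that override file might look like for local development (it just restores the build-and-mount settings from the question):

    # docker-compose.override.yml -- kept in the repo, used only for local development
    version: "3.8"
    services:
      web:
        build: .
        command: bash -c "rm -f tmp/pids/server.pid && bundle exec rails s -p 3000 -b '0.0.0.0'"
        volumes:
          - .:/myapp

    Running a plain docker-compose up --build locally merges the two files automatically, while the deployment system only ever sees the main docker-compose.yml.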


    So what? There are a couple of good reasons to structure things this way.

    A forward-looking answer involves clustered container managers like Kubernetes, Nomad, or Amazon's proprietary ECS. In these, a container runs somewhere in a cluster of indistinguishable machines, and the only way to get the application code onto a node is to pull an image from a registry. In these setups you don't copy any files anywhere; instead you tell the cluster manager that some number of copies of the image should run somewhere, as in the sketch below.
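
    To make that concrete, a minimal (hypothetical) Kubernetes Deployment says nothing about copying files; it only names an image and a replica count, and every node that gets a copy scheduled pulls the image from the registry itself:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 3                     # "some number of copies of the image"
      selector:
        matchLabels: {app: web}
      template:
        metadata:
          labels: {app: web}
        spec:
          containers:
            - name: web
              image: registry.example.com/me/web:20220220   # pulled from the registry by each node
              ports:
                - containerPort: 3000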

    Another good reason is to support rolling back the application. In the Compose fragment above, I refer to an environment variable ${WEB_TAG}. Say you push out one build a day and give each one a date-stamped tag, like registry.example.com/me/web:20220220. But something has gone wrong with today's build! While you figure it out, you can connect to the deployment machine and run

    WEB_TAG=20220219 docker-compose up -d
    

    and instantly roll back, again without trying to check out anything or copy the application.
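
    The producing side of those date-stamped tags would live in the Jenkins job; roughly like this (the registry name and tag scheme are just the assumptions used above):

    # In the CI job: build, tag, and push today's image
    TAG=$(date +%Y%m%d)
    docker build -t registry.example.com/me/web:$TAG .
    docker push registry.example.com/me/web:$TAG
    # ...then, on the deployment machine, point Compose at that tag:
    WEB_TAG=$TAG docker-compose up -d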

    In general, using Docker, you want to make the image as self-contained as it can be, while still acknowledging that there are things, like the database credentials, that can't be "baked in". So make sure to COPY the code in, don't override the code with volumes:, and do set a sensible CMD. You should be able to start from a clean system with only Docker installed and docker run the image with no setup beyond Docker itself. You can imagine writing a shell script to run the docker commands, and the docker-compose.yml file is just a declarative version of that.
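
    As a rough illustration of what "self-contained" means for the web service in the question, a Dockerfile might look something like this (the Ruby version and gem setup are assumptions; the important parts are the COPY and the CMD):

    # Hypothetical Dockerfile for the web service
    FROM ruby:3.1
    WORKDIR /myapp
    # Install gems first so this layer is cached when only application code changes
    COPY Gemfile Gemfile.lock ./
    RUN bundle install
    # COPY the code into the image -- no volumes: needed at runtime
    COPY . .
    EXPOSE 3000
    # A sensible default CMD, so a plain `docker run` starts the server
    CMD ["bundle", "exec", "rails", "server", "-b", "0.0.0.0", "-p", "3000"]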


    Finally, remember that you don't have to use Docker. You can use a general-purpose system-management tool like Ansible, SaltStack, or Chef to install Ruby on the target machine and copy the code across yourself. This is a well-proven deployment approach. I find Docker simpler, but it does assume that the code and all of its dependencies are actually in the image and don't need to be copied separately.