dockervolumes

Populate a volume using multiple containers


I am reading the Docker documentation on how to use named volumes to share data between containers. The section "Populate a volume using a container" states:

If you start a container which creates a new volume, as above, and the container has files or directories in the directory to be mounted (such as /app/ above), the directory’s contents are copied into the volume. The container then mounts and uses the volume, and other containers which use the volume also have access to the pre-populated content.

So I did a simple example where:

So far so good. However, I wanted to see whether it is possible to have pre-populated content from more than one container. What I did was:

  1. Created two simple images that keep their respective configuration files in the same directory
# Dockerfile for test_image_1
FROM alpine:latest

WORKDIR /opt/test

RUN mkdir -p "/opt/test/conf" && \
    echo "container from image 1" > /opt/test/conf/config_1.cfg

# Dockerfile for test_image_2
FROM alpine:latest

WORKDIR /opt/test

RUN mkdir -p "/opt/test/conf" && \
    echo "container from image 2" > /opt/test/conf/config_2.cfg
  2. Created a Docker Compose file that defines a named volume mounted in both services
services:
    test_container_1:
        image: test_image_1
        volumes:
          - test_volume:/opt/test/conf
        tty: true

    test_container_2:
        image: test_image_2
        volumes:
          - test_volume:/opt/test/conf
        tty: true

volumes:
    test_volume:
  3. Started the services.
> docker-compose -p example up
Creating network "example_default" with the default driver
Creating volume "example_test_volume" with default driver
Creating example_test_container_2_1 ... done
Creating example_test_container_1_1 ... done
Attaching to example_test_container_1_1, example_test_container_2_1

According to the logs, container_2 was created first and pre-populated the volume. The volume was then mounted into container_1, and the only file visible on the mount was /opt/test/conf/config_2.cfg. The mount hides the image's own config_1.cfg, so that file is effectively gone.

So my question is whether it is possible to have a volume populated with data from two or more containers.

The reason I want to explore this is so that I can have additional app configuration loaded from different containers to support a multi-tenant scenario, without having to rework the app to read the tenant configuration from different folders.

Thank you in advance


Solution

  • Once there is any content in a named volume at all, Docker will never automatically copy content into it. It will not merge content from two different images, update the volume if one of the images changes, or anything else.

    I'd advise you to ignore the paragraph you quote in the Docker documentation. Assume any volume you mount into the container is initially empty. This matches the behavior you'll get with Docker bind-mounts (host directories), Kubernetes persistent volumes, and basically any other kind of storage besides Docker named volumes proper. Don't mount a volume over the content in your image.

    If you can, restructure your application to avoid sharing files at all. One common use of named volumes I see is trying to republish static assets to a reverse proxy, for example; rather than trying to use a named volume (which will never update itself) you can COPY the static assets into a dedicated Web server image. This avoids the various complexities around trying to use a volume here.


    If you really don't have a choice in the matter, then you can approach this with dedicated code in both of the containers. The basic setup here is:

    1. Have a data directory somewhere outside your application directory, and mount the volume there.
    2. Include the original files in the image somewhere different.
    3. In an entrypoint wrapper script, copy the original files into the data directory (the mounted volume).

    Let's say for the sake of argument that you've installed the application into /opt/test, and the data directory will be /etc/test. The entrypoint wrapper script can be as little as

    #!/bin/sh
    
    # Copy config files from the application tree into the config tree
    # (overwriting anything that's already there)
    cp /opt/test/* "$TEST_CONFIG_DIR"
    
    # Run the main container command
    exec "$@"
    

    In the Dockerfile, you need to make sure that directory exists (and if you'll use a non-root user, that user needs permission to write to it).

    FROM alpine
    
    WORKDIR /opt/test
    COPY ./ ./
    
    ENV TEST_CONFIG_DIR=/etc/test
    RUN mkdir "$TEST_CONFIG_DIR"
    
    ENTRYPOINT ["./entrypoint.sh"]
    CMD ["./my_app"]
    

    Finally, in the Compose setup, mount the volume on that data directory (you can't use the environment variable, but consider the filesystem path part of the image's API):

    version: '3.8'
    volumes:
      test_config:
    services:
      one:
        build: ./one
        volumes:
          - test_config:/etc/test
      two:
        build: ./two
        volumes:
          - test_config:/etc/test
    

    You would be able to run, for example,

    docker-compose run one ls /etc/test
    docker-compose run two ls /etc/test
    

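    to see both sets of files appear there once each service's entrypoint has run at least once (on a fresh volume, the first command alone shows only the files from service one).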

    The entrypoint script is code you control. There's nothing especially magical about it beyond the final exec "$@" line to run the main container command. If you want to ignore files that already exist, for example, or if you have a way to merge in changes, then you can implement something more clever than a simple cp command.
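
    As one possible shape for the "ignore files that already exist" variant, here is a minimal sketch rather than a definitive implementation. It assumes the image keeps its original config files in /opt/test/conf (overridable through SRC_DIR, a variable name invented for this example) and that the volume is mounted at TEST_CONFIG_DIR as above. The copy_missing helper is hypothetical, too. Note that the existence check and the copy are not atomic, so it relies on the two images shipping differently named files, as config_1.cfg and config_2.cfg do in the question.

    ```shell
    #!/bin/sh

    # Copy each regular file from directory $1 into directory $2,
    # skipping any file whose name is already present in $2.
    copy_missing() {
        src=$1
        dest=$2
        for f in "$src"/*; do
            # Skip subdirectories, and the literal pattern if nothing matched
            [ -f "$f" ] || continue
            name=$(basename "$f")
            if [ ! -e "$dest/$name" ]; then
                cp "$f" "$dest/$name"
            fi
        done
    }

    # Assumed locations; both defaults are illustrative only
    SRC_DIR=${SRC_DIR:-/opt/test/conf}
    TEST_CONFIG_DIR=${TEST_CONFIG_DIR:-/etc/test}

    # Seed the volume only when both directories actually exist
    if [ -d "$SRC_DIR" ] && [ -d "$TEST_CONFIG_DIR" ]; then
        copy_missing "$SRC_DIR" "$TEST_CONFIG_DIR"
    fi

    # Run the main container command, if one was given
    if [ "$#" -gt 0 ]; then
        exec "$@"
    fi
    ```

    With this variant, a file that someone has edited inside the volume survives a container restart, at the cost that an updated file in a newer image no longer replaces the copy in the volume.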