I have n Dockerfiles, each corresponding to one image, with layers 1..m common to all of them. The common steps include pulling the base image from a public registry and installing the bare minimum essentials for running the application. Due to the nature of the use case, at a periodic interval I download a different Docker image that has the same base and initial installations but a different application layer.
Question 1: The docker pull time for some of these images is very large because of the layers after layer m (the layers beyond the common ones). Is there a way to cache the first m layers (the common installations) so that I can save some time on subsequent pulls?
Question 2: Is it possible to pull only the subsequent layers, meaning from layer m onwards?
What you are asking for is exactly what Docker already does. You can refer to the storage driver docs, which discuss this exact situation:
When you use docker pull to pull down an image from a repository, or when you create a container from an image that does not yet exist locally, each layer is pulled down separately, and stored in Docker’s local storage area, which is usually /var/lib/docker/ on Linux hosts.
It goes on to describe your exact case:
Now imagine that you have two different Dockerfiles. You use the first one to create an image called acme/my-base-image:1.0.
# syntax=docker/dockerfile:1
FROM alpine
RUN apk add --no-cache bash
The second one is based on acme/my-base-image:1.0, but has some additional layers:
# syntax=docker/dockerfile:1
FROM acme/my-base-image:1.0
COPY . /app
RUN chmod +x /app/hello.sh
CMD /app/hello.sh
The second image contains all the layers from the first image, plus new layers created by the COPY and RUN instructions, and a read-write container layer. Docker already has all the layers from the first image, so it does not need to pull them again. The two images share any layers they have in common.
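You can verify this sharing yourself on a host. Below is a sketch using the image names from the docs example; the build-context paths (`base/` and `app/`) are illustrative assumptions, not from the docs:

```shell
# Build the two images from the Dockerfiles above (contexts assumed
# to live in base/ and app/ -- adjust paths to your layout).
docker build -t acme/my-base-image:1.0 base/
docker build -t acme/my-final-image:1.0 app/

# Compare layer digests: the final image's layer list begins with
# the exact same sha256 digests as the base image's list, because
# those layers are stored once and shared.
docker image inspect --format '{{json .RootFS.Layers}}' acme/my-base-image:1.0
docker image inspect --format '{{json .RootFS.Layers}}' acme/my-final-image:1.0

# On a host that already has the base image, pulling the final image
# reports the shared layers as "Already exists" and transfers only
# the layers after "m" -- which answers Question 2 as well.
docker pull acme/my-final-image:1.0
```

So there is nothing extra to configure: as long as the common layers really are byte-identical (built from the same instructions in the same order on the same base), `docker pull` downloads only the layers it does not already have.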