I know you can specify user and group IDs with the docker run command, and that you can force the same IDs when building an image to make dealing with file permissions easier (e.g. as described here).
But what if you want to reuse an image containing a user definition across different machines / users with different user/group IDs?
E.g. on a machine where your user/group IDs are 1000:1000, you build an image and use those values to create a user. Then you push the image to some registry. On another machine (or just as another user) with IDs 1001:1000, you want to pull and use the image.
AFAIU you would either have to know the IDs used in the image and provide them to docker run, in which case you might get trouble dealing with files created by the container, or use your local IDs, in which case you will experience those issues inside the container instead. Right?
What's the usual approach to this? I'd like to have some way to 'translate' those IDs, i.e. having UID 1001 outside and 1000 inside the container.
Currently the only ways I know of are: 1. ignoring the issue, 2. not sharing images or 3. entering the container with IDs of the user inside the image and rewriting permissions afterwards.
Do not build specific user IDs into your image. As you note, if you depend on the runtime user ID matching host-directory permissions, a built-in ID will be wrong as soon as a different host user runs the container. You must specify the user ID at docker run time.
If at all possible, avoid writing to local files in your container. Store data somewhere like a relational database instead. If you don't need to write files, then it doesn't actually matter what user ID the container runs as. This also makes it easier to scale the application and to run it in clustered environments like Kubernetes.
If your application does write to local files, then limit it to a single directory. Say your application code is in /app; maybe the data goes in /data. Only that one directory needs to be writable. The files in /app should stay owned by root; you do not want the application to be able to overwrite its source code or static assets while it's running.
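A start-up check along these lines can make a misconfigured data directory fail loudly instead of surfacing later as an obscure write error. This is only a sketch: the function name check_data_dir is made up for illustration, and it assumes a POSIX shell is available in the image.

```shell
# check_data_dir DIR
# Hypothetical helper: succeeds only if DIR exists and the current
# user may write to it. Note that when run as root, the -w test
# generally succeeds regardless of the directory's mode.
check_data_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "data directory $dir does not exist" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "data directory $dir is not writable by UID $(id -u)" >&2
        return 1
    fi
    return 0
}
```

It could be called early in a start-up script, for example as check_data_dir /data || exit 1, so the container stops immediately when it is started with a user ID that cannot write to the mounted directory.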
In the Dockerfile, a good practice would be to create a non-root user, with any user ID, but only switch to it at the end of the Dockerfile. Your container should be operable without any special options.
FROM ...
# Create a non-root user, with an arbitrary user ID
RUN adduser --system --no-create-home appuser
# We are still root; do the normal build-and-install steps
# (Do not run `chown` on anything here, leave it all owned by root)
WORKDIR /app
COPY ...
RUN ...
# Create the empty data directory and give the arbitrary user
# permissions on it
RUN mkdir /data && chown appuser /data
# APP_DATA_DIR is recognized by the application
# (Dockerfile comments must be on their own line)
ENV APP_DATA_DIR=/data
# Normal metadata to run the application, as the arbitrary user
USER appuser
CMD ...
If you decide you want the data to be backed by a bind-mounted host directory, then when you run the container you need to provide the corresponding user ID.
# Run as the host user, mounting the current directory on /data
# (comments cannot follow a line-continuation backslash)
docker run \
  -u "$(id -u):$(id -g)" \
  -v "$PWD:/data" \
  ...
You may need an entrypoint wrapper script to set up the /data directory on first use. Anything the image has in /data will be hidden by the bind mount, so if the host directory is initially empty then the image will need to know how to put the required initial data there.
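As a rough sketch of such an entrypoint wrapper, the script below copies initial data into the data directory only when that directory is empty, then hands control to the main command. Everything specific here is an assumption made for illustration: the function name seed_data_dir, the /app/default-data location for the initial files, and the APP_DEFAULTS_DIR variable are not part of the original setup; only APP_DATA_DIR and /data come from the Dockerfile above.

```shell
#!/bin/sh
# Hypothetical entrypoint wrapper: seed the data directory on first use.
set -eu

# seed_data_dir DEFAULTS_DIR DATA_DIR
# Copy the contents of DEFAULTS_DIR into DATA_DIR, but only if
# DATA_DIR currently has no entries (dotfiles included).
seed_data_dir() {
    defaults_dir=$1
    data_dir=$2
    # Nothing to seed from: leave the data directory alone
    [ -d "$defaults_dir" ] || return 0
    if [ -z "$(ls -A "$data_dir")" ]; then
        # The trailing "/." copies the directory's contents,
        # not the directory itself
        cp -R "$defaults_dir"/. "$data_dir"/
    fi
}

# Docker passes the CMD as arguments to the entrypoint. Seed the data
# directory, then replace this shell with the main command so that it
# receives signals directly.
if [ "$#" -gt 0 ]; then
    # /app/default-data is an assumed location for the initial files
    seed_data_dir "${APP_DEFAULTS_DIR:-/app/default-data}" "${APP_DATA_DIR:-/data}"
    exec "$@"
fi
```

To use something like this, the script would be copied into the image, made executable, and named in an ENTRYPOINT line, with CMD left as the main command. Because it runs as whatever user ID was given to docker run, the default files need to be readable by any user, which is normally the case for root-owned files copied in with COPY.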