I usually use an Ubuntu or Arch Linux image, but I recently found out that there is an OS called CoreOS built specifically for Docker containers.
As I am new to Docker, I am not sure which one would be the best base image for my Dockerfile. It may seem like a silly question, but if I run lots of microservices across several containers, then each container should be as light as possible.
This really depends on your requirements:
FROM scratch: if you are able to statically compile your application and don't need any other binaries (libraries, shells, or any other command, period), then you can use the completely empty "scratch" image. You'll see this used as the starting point for the other base images, and it's also common in images that ship a single pre-compiled Go binary.
Distroless: these images are built for a handful of use cases, and ship without a package manager or even a shell (excluding their debug images). If your application fits one of those use cases, these can be very small but, like scratch images, they are difficult to debug.
Busybox: I consider this less of a base image and more of a convenient utility container. You get a lot of common commands in a very small size. Busybox is a single binary with the various command names linked to it, and that binary behaves as whichever command it was invoked as. What you don't get is a general package manager to easily install other components.
Alpine: This is a minimal distribution, based on busybox, but with the apk package manager. The small size comes at a cost: things like glibc are not included, with the musl libc implementation used instead. You will find that many of the official images are based on Alpine, so inside of the container ecosystem, this is a very popular option.
Debian, Ubuntu, and CentOS: These are the less lightweight base images, but what they lose in size they gain in a large collection of packages you can pull from and lots of people who are testing, fixing bugs, and contributing to things upstream. They also come with a collection of libraries that some applications may expect to be preinstalled.
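As a rough illustration of the FROM scratch option, here is a minimal multi-stage sketch for a Go service. The `golang:1.22` tag, the `/src` and `/out` paths, and the `app` binary name are my assumptions for the example, and it presumes a Go module at the root of the build context.

```dockerfile
# Build stage: compile a statically linked Go binary.
# The image tag, paths, and binary name are assumptions for illustration.
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# CGO_ENABLED=0 avoids linking against libc, so the binary can run on scratch.
RUN CGO_ENABLED=0 go build -o /out/app .

# Final stage: an empty image that contains only the compiled binary.
FROM scratch
COPY --from=build /out/app /app
# Exec form is required here because scratch has no shell to interpret a command string.
ENTRYPOINT ["/app"]
```

If you later find you need a shell or a package manager for debugging, changing the final `FROM scratch` to an Alpine or Debian image should be the only edit required, provided the binary stays statically linked.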
While that last option is a bit larger, keep in mind that base images only need to be pushed over the wire and stored on disk once. After that, unless you change them, any images built on top of them only need to send the manifest that references layers in that base image, and the docker engine will see that it already has those layers downloaded. And with the union fs, those layers never need to be copied, even if you run 100 containers all pointing back to that image. They each use the same read-only layers on disk for all the image layers and write their changes to their container-specific RW layer.
If you find yourself installing a set of common tools on many of your images, the better option is to build your own base image, extending an upstream base image with your common tools. That way, those tools only get packaged into a layer once and are reused by multiple images.
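A shared base image along those lines might look something like the sketch below. The `debian:bookworm-slim` parent, the choice of `ca-certificates` and `curl` as the "common tools", and the `mycompany/base:1.0` tag mentioned afterwards are all hypothetical placeholders.

```dockerfile
# Hypothetical shared base image: build and tag it once, then reuse it.
# The parent image and the package list are assumptions for illustration.
FROM debian:bookworm-slim
# Install the common tools in a single layer and clear the apt lists
# in the same step so they are not stored in the image.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*
```

After building this with a tag such as `mycompany/base:1.0`, each service's Dockerfile would start with `FROM mycompany/base:1.0` and add only its own files, so the tools layer is stored and transferred once.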