
How do I COPY multiple dependency libraries in a multi-stage docker build?


Question at hand: If I need to copy hundreds of dependency libraries between build stages in Docker, how do I accomplish this without hitting the max depth?

What I'm ultimately trying to do: build a container image, as small as possible, containing both Python and ffmpeg for use with AWS Lambda.

What I know I can do:

FROM ubuntu:latest AS build

RUN apt-get update && apt-get clean && apt-get install -y ffmpeg

FROM python:3.12-slim-bookworm

COPY --from=build /usr/bin/ffmpeg /usr/bin/ffmpeg
COPY --from=build /usr/lib/aarch64-linux-gnu/* /usr/lib/aarch64-linux-gnu/

WORKDIR /app

CMD ["foo"]

This produces a working container image, but it's huge: over 1 GB.

I'd like to copy over just ffmpeg and its dependencies, but there's the rub. ldd tells me there are over 200 dependencies in /lib, and having that many COPY statements exceeds the max depth during the build. I'm trying to avoid:

COPY --from=build /usr/lib/lib1 /usr/lib/lib1
COPY --from=build /usr/lib/lib2 /usr/lib/lib2
...
COPY --from=build /usr/lib/lib227 /usr/lib/lib227

Is there a way to loop inside the Dockerfile from an external txt file, or some other mechanism to copy hundreds of dependencies without exceeding the max depth?


Solution

  • It looks like ffmpeg is mostly just that big. If I try to install ffmpeg in a clean Ubuntu container:

    $ docker run --rm -it ubuntu:24.04 \
    >   sh -c 'apt-get update && apt-get install ffmpeg'
    ...
    9 upgraded, 294 newly installed, 0 to remove and 13 not upgraded.
    Need to get 208 MB of archives.
    After this operation, 704 MB of additional disk space will be used.
    

    Now, some of that is file-format and device drivers you don't necessarily need. These come in via the Debian "recommends" dependency type, and you can skip them with apt-get install --no-install-recommends. That helps, but only somewhat:

    $ docker run --rm -it ubuntu:24.04 \
    >   sh -c 'apt-get update && apt-get install --no-install-recommends ffmpeg'
    ...
    0 upgraded, 202 newly installed, 0 to remove and 22 not upgraded.
    Need to get 131 MB of archives.
    After this operation, 438 MB of additional disk space will be used.
    

    Unless you have a real reason to believe that even that set of packages is bringing in extra files that aren't necessary to run the tool, you're stuck with an image that's several hundred megabytes larger. (And if you think there is an unnecessary dependency, filing a Debian or Ubuntu bug report might make sense.)


    In principle, rather than doing a bunch of tiny COPYs, you can assemble a temporary tree in the first stage that contains the files you need, and COPY the whole tree across once. That also means you can build the tree programmatically. Maybe something like:

    FROM ... AS build
    RUN mkdir /export \
     && cp $(ldd /usr/bin/ffmpeg | sed -ne 's/.* => \([^ ]*\) .*/\1/p') /export
    
    FROM ...
    COPY --from=build /export /usr/local/lib
    RUN ldconfig
    

    This should collect all of the C shared libraries the program depends on, as reliably as ldd's output can be parsed. That won't necessarily be all of the files that you really need, though; ffmpeg itself includes a handful of files in /usr/share/ffmpeg, for example, and tracking those down is a pretty open-ended problem. If you do it successfully, you've probably recreated the exact files that apt-get install ffmpeg installs.
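
If the one-line sed in the RUN step gets hard to read, the same collection step can be sketched as a small shell helper. This is only a sketch under assumptions: the function names ldd_paths and collect_libs are made up for illustration, the parsing assumes glibc-style ldd output ("name => /path (0xaddr)"), and library paths containing spaces are not handled.

```shell
# Sketch only: gather a binary's shared libraries into one directory so that
# the final stage needs a single COPY --from instead of one per library.

# Read `ldd` output on stdin and print each resolved library path once.
# Lines without "=> /path" (the vDSO, the dynamic loader, "not found")
# are skipped.
ldd_paths() {
  sed -n 's|.* => \(/[^ ]*\) (0x[0-9a-f]*)$|\1|p' | sort -u
}

# collect_libs BINARY DEST (hypothetical helper)
# Copy every library that BINARY links against into DEST.
collect_libs() {
  mkdir -p "$2" || return 1
  ldd "$1" | ldd_paths | while IFS= read -r lib; do
    # -L follows symlinks, so the real file is stored under the name
    # that the dynamic linker looks up.
    cp -L "$lib" "$2/" || return 1
  done
}
```

In the build stage this would be sourced from a script and called as collect_libs /usr/bin/ffmpeg /export, leaving the single COPY --from=build /export /usr/local/lib and the ldconfig step as they are above. Like the sed version, it only sees what ldd reports, so data files such as those in /usr/share/ffmpeg still have to be handled separately.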