Docker - best practices
02 Apr 2020@tech #docker#best practices#howto#build#tagI have been using a docker for some time. Most of the knowledge I have is empirical acquired by numerous trials and supported by even more errors. Recently, I tried to gather and organize everything in one place.
Keep your build context clean
When you run docker build .
command, docker runs in a context of directory you provided. This way every operation you describe in Dockerfile
will be executed in context of that directory but first docker will send build context to Docker daemon. The smaller context will be, the less time and memory will be consumed for build. If, for some reason, you can’t have context directory separated you should have file called .dockerignore
placed in context root. .dockerignore
works just like .gitignore
file. Any file/directory that is not required for our application to run should be specified in .dockerignore
.
Keep your instructions in right order
While building, docker creates layers. You know… onions have layers, ogres have layers… also Docker image has layers. Instructions like RUN
, COPY
and ADD
creates so called intermediate layers (or intermediate images) which are cacheable until something changes for that layer. Let’s take following example into consideration:
FROM php:apache
WORKDIR /app
COPY . .
RUN apt-get -y update
RUN apt-get -y install curl ca-certificates gnupg2 git
RUN apt-get -y install postgresql-client-11 librabbitmq-dev libgmp-dev libpq-dev
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
RUN composer install
EXPOSE 8080
CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]
In above example we are copying whole code into an image at very beginning whitch is a bad practice. Layer that is created by COPY instruction will be invalidated every time we change content of any file. Which happens quite often. When that layer gets invalidated, every layer that was created after will be invalidated too. So putting something that can change more often at the top of file will make our image to be build from scratch every time. What we can do is to put frequently changing stuff as late as possible, e.g. what changes less often than code itself are dependencies. That way we can install them at the start and copy rest of code almost at the end:
FROM php:apache
WORKDIR /app
~~COPY . .~~
RUN apt-get -y update
RUN apt-get -y install curl ca-certificates gnupg2 git
RUN apt-get -y install postgresql-client-11 librabbitmq-dev libgmp-dev libpq-dev
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
COPY composer.json ./
COPY composer.lock ./
RUN composer install
COPY . .
EXPOSE 8080
CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]
Be more specific when copy
Copying our code into image we should specify only those files that are crucial for application to run. This will make final image even smaller.
FROM php:apache
WORKDIR /app
RUN apt-get -y update
RUN apt-get -y install curl ca-certificates gnupg2 git
RUN apt-get -y install postgresql-client-11 librabbitmq-dev libgmp-dev libpq-dev
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
COPY composer.json ./
COPY composer.lock ./
RUN composer install
~~COPY . .~~
COPY src ./
EXPOSE 8080
CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]
Combine your RUN commands
If you have many RUN instructions one under another, you can join them using &&
sign. This will result as single intermediate layer with all those commands. This can be useful if you update the repository before installing dependencies. By doing these operations separately, Docker can cache repository update, but not package installation itself. If for some reason the required package is unavailable during building, the building process will fail.
FROM php:apache
WORKDIR /app
~~RUN apt-get -y update~~
~~RUN apt-get -y install curl ca-certificates gnupg2 git~~
~~RUN apt-get -y install postgresql-client-11 librabbitmq-dev libgmp-dev libpq-dev~~
RUN apt-get -y update && apt-get -y install \
curl \
ca-certificates \
gnupg2 \
git \
postgresql-client-11 \
librabbitmq-dev \
libgmp-dev \
libpq-dev
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
COPY composer.json ./
COPY composer.lock ./
RUN composer install
COPY src ./
EXPOSE 8080
CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]
Stick to alphabetical order
This is simple purist practice. If you have a lot of dependencies to install, keeping them in alphabetical order will make it more readable for other developers and let you avoid repetitions.
FROM php:apache
WORKDIR /app
RUN apt-get -y update && apt-get -y install \
~~curl \ ~~
~~ca-certificates \ ~~
~~gnupg2 \ ~~
~~postgresql-client-11 \ ~~
~~librabbitmq-dev \ ~~
~~libgmp-dev \ ~~
~~libpq-dev~~
ca-certificates \
curl \
git \
gnupg2 \
libgmp-dev \
libpq-dev \
librabbitmq-dev \
postgresql-client-11
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
COPY composer.json ./
COPY composer.lock ./
RUN composer install
COPY src ./
EXPOSE 8080
CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]
Are you realy gonna need those extra dependencies?
Just like you will copy only code that is required for application to run, you should install only those dependencies that are realy required. By default if package has some recommended dependencies, package manager will install them by default. Only required packages should installed. This can be sagnificant for image size and building time.
FROM php:apache
WORKDIR /app
~~RUN apt-get -y update && apt-get -y install \ ~~
RUN apt-get -y update && apt-get -y --no-install-recommends install \
ca-certificates \
curl \
git \
gnupg2 \
libgmp-dev \
libpq-dev \
librabbitmq-dev \
postgresql-client-11
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
COPY composer.json ./
COPY composer.lock ./
RUN composer install
COPY src ./
EXPOSE 8080
CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]
You are your own garbage collector
Installing dependencies will make package manager to cache the local repository of retrieved package files. This cache will stay on image layer. Again to keep final image as small as possible, you can clear unnecessary cache.
FROM php:apache
WORKDIR /app
RUN apt-get -y update && apt-get -y --no-install-recommends install \
ca-certificates \
curl \
git \
gnupg2 \
libgmp-dev \
libpq-dev \
librabbitmq-dev \
postgresql-client-11 \
&& apt-get clean
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
COPY composer.json ./
COPY composer.lock ./
RUN composer install && composer clear-cache
COPY src ./
EXPOSE 8080
CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]
Use specific version of base image
Using latest
version of base image is strongly not recomended. This is because latest
doesn’t tell you exactly which version is required for application to run and most of the time links to latest stable version. Considering that there can be some sagnificant differences between each versions you should always specify which version to use. Just to be safe. Upgrade should always be done deliberately and with caution.
~~FROM php:apache~~
FROM php:7.4.4-apache
WORKDIR /app
RUN apt-get -y update && apt-get -y --no-install-recommends install \
ca-certificates \
curl \
git \
gnupg2 \
libgmp-dev \
libpq-dev \
librabbitmq-dev \
postgresql-client-11 \
&& apt-get clean
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
COPY composer.json ./
COPY composer.lock ./
RUN composer install && composer clear-cache
COPY src ./
EXPOSE 8080
CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]
Choose smallest base image possible
Many base images comes in lighter versions (-alpine
, -slim
, etc.). They defer in preinstalled set of dependencies. It is up to you which version will be used as your base image. Depending on final result, you will end up with image build on top of full-sized image or build one FROM scratch
.
Separate responsibilities
Another good practice is that your container should be as ephemeral as as possible. That means that you should be able to stop, destroy, rebuild and start your container without any unnecessary changes in application. You will accomplish that by defining image as one process only like API, console or some processing worker. This will make horizontal scaling easier and running containers in this stateless manner let you reuse them more efficiency.
Multi-stage build
Multi-stage build involves the use of more complex base images containing dependencies required to build the application, but not necessarily to run it. An application built in this way in a larger, base image will become an artifact used in building the final image. This will reduce the number of layers and dependencies of the final image, which will drastically reduce its final size.
FROM composer:1.9.3 as build
COPY composer.json composer.json
COPY composer.lock composer.lock
RUN composer install --ignore-platform-reqs --prefer-dist --optimize-autoloader --no-suggest --no-scripts
FROM php:7.4.4-apache
WORKDIR /app
RUN apt-get -y update && apt-get -y --no-install-recommends install \
ca-certificates \
curl \
gnupg2 \
libgmp-dev \
libpq-dev \
librabbitmq-dev \
postgresql-client-11 \
&& apt-get clean
COPY --from=build /app/vendor /app/vendor
COPY src ./
EXPOSE 8080
CMD ["/usr/sbin/apache2ctl", "-DFOREGROUND"]
Don’t use latest for your own images
latest
is considered anti-pattern in Docker because it doesn’t tell you which particular version you’re dealing with. Therefore, when building an image that will be shared between team members or the community outside your organization, you should avoid using the latest
tag.
Locally, however, tagging an image as latest
makes it easier and faster to work on it. Just remember to change the tag to a more descriptive one before pushing the image outside the development environment.
A good tag is one that it describes:
- what it concerns
- when it was created
- where someone can find more information about it
You can use the branch name used to build the image to specify what the tag is about. The information when it was created is simply a timestamp. It is good to agree its format with other team members. For additional information, it’s best to use commit or build ID.
The above applies to tagging images in the software development process. If we operate according to a specific publishing cycle, subsequent, planned releases of our application will probably have a form resulting from the adopted versioning rules of the application when it enters production, e.g. Semantic Versioning. However, you may notice additional descriptive parts added to the tags that can tell us about the size of the image (e.g. debian:9-slim
) or additional features (e.g. php:7.4-apache
).
Found this article useful and worthy? Please tweet about it.