Programmer’s Notes: How to Write an Elegant Dockerfile

  container, docker, dockerfile

Introduction

Kubernetes starts with containerization, and containers start with a Dockerfile. This article explains how to write an elegant Dockerfile.

The main contents of the article include:

  • Docker containers
  • Dockerfile
  • Using multi-stage builds

Thanks to the company for providing ample machine resources and time for us to practice, and thanks to the projects and people who continue to work on this topic.

I. Docker Container

1.1 Characteristics of Containers

A container is a standard unit of software with the following characteristics:

  • Run anywhere: a container packages code together with its configuration files and dependent libraries, so it runs consistently in any environment.
  • High resource utilization: containers provide process-level isolation, so CPU and memory limits can be set at a finer granularity and the server’s computing resources are put to better use.
  • Rapid scaling: each container runs as a separate process and shares the resources of the underlying operating system, so containers start and stop quickly.

1.2 Docker Containers

At present, the mainstream container engines on the market include Docker, Rocket/rkt, OpenVZ/Odin, and others. Docker is the most widely used of them and dominates the market.

A Docker container is a set of processes isolated from the rest of the system. All the files needed to run these processes are provided by an image. Linux containers are portable and consistent from development through testing to production. Compared with development pipelines that rely on replicating traditional test environments, containers run much faster, and they can be deployed on a variety of mainstream cloud platforms (PaaS) and on local systems. Docker containers neatly solve the embarrassing problem of “it works in the development environment but crashes as soon as it goes live”.

Docker container features:

  • Lightweight: a container isolates resources at the process level, while a virtual machine isolates them at the operating-system level. Because a Docker container does not need a guest OS, it has less resource overhead than a virtual machine.
  • Fast: creating and starting a container does not require booting a guest OS, so startup takes seconds or even milliseconds.
  • Portable: Docker packages the application together with its dependent libraries and runtime environment into a container image, which can run on different platforms.
  • Automated: container orchestration tools in the container ecosystem (e.g. Kubernetes) help us manage containers automatically.

II. Dockerfile

A Dockerfile is a text document that describes how an image is composed. It contains all the commands a user could call on the command line to assemble an image. With docker build, users can run an automated build that executes these commands in succession.

By generating images from a Dockerfile, we can give the development and testing teams a basically consistent environment. This improves their efficiency, removes worries about inconsistent environments, and makes it easier for operations to manage our images.

The syntax of a Dockerfile is very simple; only 11 instructions are commonly used.

2.1 Writing an Elegant Dockerfile

The following are the main points to note when writing an elegant Dockerfile:

  • The Dockerfile should not be too long; the more layers it produces, the larger the final image will be.
  • The built image should not contain unnecessary content, such as logs and temporary installation files.
  • Use a runtime base image wherever possible, and keep build-time steps out of the runtime Dockerfile.

As long as you remember the above three points, you can write a good Dockerfile.
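One way to act on the second point is to keep logs and temporary files out of the build context in the first place. The sketch below is only an illustration: the helper name write_dockerignore and the patterns it writes are assumptions, not part of the original article, and should be adapted to the project.

```shell
#!/bin/sh
# Hypothetical helper: write a starter .dockerignore into a project directory.
# The patterns are illustrative assumptions; adjust them to your project.
write_dockerignore() {
    dir="${1:-.}"
    cat > "$dir/.dockerignore" <<'EOF'
# Keep logs and temporary files out of the build context
*.log
*.tmp
tmp/
# Version-control metadata is not needed inside the image
.git
EOF
}
```

Calling write_dockerignore . before docker build means that an instruction such as COPY ./ no longer picks up the listed files.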

To make this concrete, here is a simple comparison of two Dockerfiles:

# First Dockerfile: one RUN instruction per step
FROM ubuntu:16.04
RUN apt-get update
RUN apt-get install -y apt-utils libjpeg-dev \
    python-pip
RUN pip install --upgrade pip
RUN easy_install -U setuptools
RUN apt-get clean

# Second Dockerfile: all steps chained in a single RUN instruction
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y apt-utils \
    libjpeg-dev python-pip \
    && pip install --upgrade pip \
    && easy_install -U setuptools \
    && apt-get clean

At first glance, the first Dockerfile looks clear and well structured, which seems fine. The second Dockerfile is compact and harder to read. Why write it that way?

  • The advantage of the first Dockerfile is that when a step fails, you can fix it and build again, and the layers that were already built are reused instead of being executed again. This greatly reduces the time of the next build. Its drawback is that the image takes up more space because of the extra layers.
  • The second Dockerfile handles all components in a single layer, which reduces the size of the image to some extent. However, if one of the components fails to build while the base image is being made, rebuilding after the fix starts from scratch: everything in that layer has to be built again, which takes more time.

The following table shows the sizes of the images built from the two Dockerfiles:

$ docker images | grep ubuntu
REPOSITORY   TAG       IMAGE ID       CREATED      SIZE
ubuntu       16.04     9361ce633ff1   1 days ago   422MB
ubuntu       16.04-1   3f5b979df1a9   1 days ago   412MB

Well, the difference does not look dramatic. Still, if the Dockerfile is very long, consider reducing the number of layers, because an image can have at most 127 layers.
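To get a rough idea of how many layers a Dockerfile will add, the layer-producing instructions can be counted before building. This is a minimal sketch under stated assumptions: the function name count_layer_instructions is hypothetical, only RUN, COPY and ADD are counted, and instruction keywords are assumed to be written in upper case at the start of a line.

```shell
#!/bin/sh
# Hypothetical helper: count the instructions that add a filesystem layer.
# Assumption: only RUN, COPY and ADD are counted, written in upper case.
# Continuation lines are skipped because they do not start with a keyword.
count_layer_instructions() {
    grep -Ec '^[[:space:]]*(RUN|COPY|ADD)[[:space:]]' "$1"
}
```

For an image that has already been built, docker history <image> lists its actual layers and their sizes.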

III. Using Multi-stage Builds

Docker supports multi-stage builds from version 17.05 onward. To make images smaller, we use multi-stage builds to package them. Before multi-stage builds existed, we usually built images with a single Dockerfile or with multiple Dockerfiles.

3.1 Single-file Build

Before multi-stage builds, a single file was used for the build. One Dockerfile contains the whole build process (project dependencies, compilation, testing, and packaging):

FROM golang:1.11.4-alpine3.8 AS build-env
ENV GO111MODULE=off
ENV GO15VENDOREXPERIMENT=1
ENV BUILDPATH=github.com/lattecake/hello
RUN mkdir -p /go/src/${BUILDPATH}
COPY ./ /go/src/${BUILDPATH}
RUN cd /go/src/${BUILDPATH} && CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go install -v

CMD ["/go/bin/hello"]

This approach will bring some problems:

  • The Dockerfile becomes very long, and it gets harder to maintain as more and more steps are added.
  • With too many image layers, the image keeps growing and deployment becomes slower and slower.
  • There is a risk of code leakage.

Take Golang as an example. A Go program does not depend on any environment at runtime; it only needs a build environment for compilation. That build environment serves no purpose at runtime: once compilation is complete, the source code and the compiler are no longer needed and have no reason to stay in the image.

As can be seen from the table above, the single-file build ends up taking 312MB of space.

3.2 Multi-file Build

Was there a good solution before multi-stage builds? Yes: either build with multiple files or install the compilers on the build server. Installing compilers on the build server is not recommended, because it bloats the build server, requires supporting multiple versions and dependencies of various languages, is error-prone, and is costly to maintain. Therefore, we only describe the multi-file approach.

A multi-file build uses several Dockerfiles and combines them with a script. Suppose there are three files: Dockerfile.run, Dockerfile.build, and build.sh.

  • Dockerfile.run is the Dockerfile for the runtime image; it contains only the minimal libraries and components the program needs at runtime.
  • Dockerfile.build is used only for building and is of no use once the build is done.
  • build.sh ties Dockerfile.build and Dockerfile.run together: it extracts what Dockerfile.build produced and then builds Dockerfile.run, acting as the orchestrator.

Dockerfile.build

FROM golang:1.11.4-alpine3.8 AS build-env
ENV GO111MODULE=off
ENV GO15VENDOREXPERIMENT=1
ENV BUILDPATH=github.com/lattecake/hello
RUN mkdir -p /go/src/${BUILDPATH}
COPY ./ /go/src/${BUILDPATH}
RUN cd /go/src/${BUILDPATH} && CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go install -v

Dockerfile.run

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root
ADD hello .
CMD ["./hello"]

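build.sh

#!/bin/sh
# Build the compile-time image from Dockerfile.build
docker build --rm -t hello:build -f Dockerfile.build .
# Create a stopped container so the binary can be copied out of it
docker create --name extract hello:build
docker cp extract:/go/bin/hello ./hello
docker rm -f extract
# Build the runtime image, which adds ./hello
docker build --no-cache --rm -t hello:run -f Dockerfile.run .
rm -rf ./hello

Run build.sh to complete the build of the project.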

As can be seen from the table above, the multi-file build greatly reduces the size of the image, but there are three files to manage and the maintenance cost is higher.

3.3 Multi-stage Build

Finally, let’s look at the much-anticipated multi-stage build.

To do a multi-stage build, we only need to use the FROM instruction several times in the Dockerfile. Each FROM instruction can use a different base image, and each one starts a new build stage. We can copy the build results of one stage into another, so that only the results of the last stage remain in the final image. This easily solves the problems above with a single Dockerfile. Note that the Docker version must be 17.05 or later. Let’s walk through the details.

In the Dockerfile, AS can be used to give a stage an alias, here “build-env”:

FROM golang:1.11.2-alpine3.8 AS build-env

Then copy files from the image of the previous stage, or from any other image:

COPY --from=build-env /go/bin/hello /usr/bin/hello
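Because COPY --from refers to a stage by its alias, it can help to see which aliases a Dockerfile declares. The following is a minimal sketch; the function name list_stages is hypothetical and not part of Docker.

```shell
#!/bin/sh
# Hypothetical helper: print the stage aliases declared with
# "FROM <image> AS <name>" in a Dockerfile, one per line.
list_stages() {
    awk 'toupper($1) == "FROM" {
        for (i = 2; i < NF; i++)
            if (toupper($i) == "AS") print $(i + 1)
    }' "$1"
}
```

An alias found this way can also be passed to docker build --target <alias> to stop the build at that stage, which is handy when debugging the build stage on its own.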

Look at a simple example:

# Stage 1: compile the binary
FROM golang:1.11.4-alpine3.8 AS build-env

ENV GO111MODULE=off
ENV GO15VENDOREXPERIMENT=1
ENV GITPATH=github.com/lattecake/hello
RUN mkdir -p /go/src/${GITPATH}
COPY ./ /go/src/${GITPATH}
RUN cd /go/src/${GITPATH} && CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go install -v

# Stage 2: copy only the binary into a small runtime image
FROM alpine:latest
RUN apk --no-cache add ca-certificates
COPY --from=build-env /go/bin/hello /root/hello
WORKDIR /root
CMD ["/root/hello"]

Run docker build --rm -t hello3 . and then run docker images to see the size of the resulting image.

Multi-stage builds bring a lot of convenience. The biggest advantage is that they reduce the maintenance burden of the Dockerfile while keeping the runtime image small. We therefore highly recommend using multi-stage builds to package your code into Docker images.

Author: Cong Wang

Source of content: Yixin Institute of Technology