Putting Your App in a Docker Container
The days of provisioning servers and VMs by hand, or with complicated and heavy-handed toolchains like Chef, Puppet, and Ansible, are over. Docker simplifies the process by giving developers a simple domain-specific language for creating pre-configured container images, along with simple tools for building, publishing, and running them on the (virtual) hardware you're already using. In this guide, I will show you how to install your NodeJS application into a Docker container.
The Dockerfile Language
Docker containers are built using a single file called a Dockerfile. This file uses a simple domain-specific language (think SQL) to define how to configure an image to run your application. The language provides a small number of commands called "instructions" that define the steps required to build a new image, which runs as a "container". First, I'll explain which instructions we'll be using and what they'll be used for.
FROM
The FROM instruction is used to define a base image to use as the foundation for your custom image. You can use any local or published image with FROM. There are published images that contain only popular Linux distributions (or even Windows!), and there are also images that come preinstalled with popular software development stacks like NodeJS, Python, or .NET.
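For example, a Dockerfile that builds on a bare Debian image (the tag here is illustrative) would start like this:
# Use the official Debian "bullseye" release as the foundation
FROM debian:bullseye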
RUN
The RUN instruction is used in the image build process to run commands required to bootstrap your application environment. We'll use it mostly to install dependencies, but it's capable of running any command that your container OS supports.
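For instance, a RUN instruction to install a tool at build time might look like this (curl is just a stand-in for whatever your app needs):
# Runs at build time, inside the image being built
RUN apt-get update && apt-get install -y curl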
COPY
The COPY instruction is used to copy files from the local filesystem into the container. We will use this instruction to copy our application code, etc., into our image.
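Its basic shape is source first, then destination inside the image; the paths here are hypothetical:
# COPY <path on your machine> <path inside the image>
COPY src/ /app/src/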
ENTRYPOINT
The ENTRYPOINT instruction contains a command that will be run when your container is launched. It is different from RUN because the command passed to ENTRYPOINT does not run at build time. Instead, the command passed to ENTRYPOINT runs when your container is started via docker run (check out my Docker CLI Deep Dive post). Only a single ENTRYPOINT instruction per Dockerfile is allowed; if it is used multiple times, only the last usage will be operative. The value of ENTRYPOINT can be overridden when running a container image.
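As a quick sketch, an ENTRYPOINT for a NodeJS app could look like this, and it can be swapped out at runtime with docker run's --entrypoint flag:
# Runs when the container starts, not at build time
ENTRYPOINT ["node", "index.js"]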
CMD
The CMD instruction is an extension of the ENTRYPOINT instruction. The content passed to CMD is tacked onto the end of the command passed to ENTRYPOINT to create a complete command to start your application. As with ENTRYPOINT, only the final usage of CMD in a Dockerfile is operative, and the value given can be overridden at runtime.
EXPOSE
EXPOSE is a little different from the other instructions in that it has no functional effect. It exists only to provide metadata about the port the container expects to expose. You don't need to use EXPOSE in your Dockerfile, but anyone who needs to understand how to connect to your container will appreciate it.
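For example, if your app listens on port 3000 (a common NodeJS default, but an assumption here), you would document that like so:
# Metadata only: tells readers which port the app listens on
EXPOSE 3000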
And more...
More details about these Dockerfile instructions and some others that I didn't cover are available via the official Dockerfile reference.
Choosing a Base Image
Generally, choosing a base image will be simple. If you're using a popular language, there is most likely already an official image available that installs the language runtime. For instance, NodeJS application developers will most likely find it easiest to use the official NodeJS image provided via Docker Hub. This image is developed by the NodeJS team and comes pre-installed with everything you need to run basic NodeJS applications. Users of other popular languages will find comparable images available (e.g., Python, Ruby). Once you've chosen your base image, you also need to choose which specific version you will use. Generally, images are available for every supported version of a language's toolchain, so that a wide range of applications can be supported using official images. You can usually find a list of all available version tags on an image's Docker Hub page.
In addition to offering images with different versions of language tools installed, maintainers typically publish images built on different operating systems. Unless specified otherwise, images usually use the most recent Debian Linux release as their base. Since it's considered best practice to keep your images as small as possible, most languages also offer variants built on a "slim" version of Debian Linux, or on Alpine Linux, a Linux distribution designed for building Docker containers with tiny footprints. Both Debian Slim and Alpine ship with fewer system packages installed than the typical Debian Linux base image; they include only the packages required to run the language tools. This makes your Docker images more compact, but may mean more work to build your containers if you require specific system dependencies that are not preinstalled in those variants. Some languages, like .NET Core, even offer Windows-based images.
Though it's typically not necessary, you can choose a base operating system image without any language-specific tools installed. Images containing only Debian Linux, Debian Slim, or Alpine Linux are available, and images for many other popular operating systems, like Ubuntu Linux, Red Hat Linux, or Windows, are available as well. Choosing one of these images will add much more complexity to your Dockerfile. It is highly recommended that you use a more specific official image if your use case allows.
In the interest of keeping things simple for our example NodeJS app, we will choose the most recent (at the time of writing) Debian Linux version of the official NodeJS image. This image is called node:15.
Note that we have included only a major version number in the image's version tag (the "version tag" is the part after the colon that specifies a particular version of an image). The NodeJS team (like most other maintainers of official images) also publishes images tagged with more specific versions of Node. Using node:15 instead of node:15.5.1 means that my image will be automatically upgraded to new releases of NodeJS 15 at build time when an update is available. This is good for development, but for production workloads you may want to use a more specific version so you don't get surprised by upgrades to NodeJS that your application can't support.
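For example, a production Dockerfile might pin the exact release that was tested (15.5.1 here is just the example version from above):
# Pin an exact NodeJS release for reproducible production builds
FROM node:15.5.1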
Starting Your Dockerfile
Now that we've chosen an image, we will create our Dockerfile. This part is very easy, since the FROM instruction is going to do most of the work for us. To get started, simply create a new file in your project's root folder called Dockerfile. To this file, we will add this one simple line:
FROM node:15
With this single line, our image includes everything we need to run a basic NodeJS application, along with all of the system packages that come pre-installed in Debian Linux.
Installing Additional Dependencies
If your application is simple and only requires the NodeJS binaries to be installed and run, congratulations! You get to skip this section. Many developers won't be so lucky. If you use a tool like ImageMagick to process images, or wkhtmltopdf to generate PDFs, or any other library or tool that doesn't ship with your chosen language image or come installed by default on Debian Linux, you will need to add instructions to your Dockerfile so that Docker installs it when your image is built.
We will primarily use the RUN instruction to specify the operating system commands required to install our desired packages. If you recall, RUN is used to give Docker commands to run when building your image. We will use RUN to issue the commands required to install our dependencies. You may choose to use a package management system like Debian's apt-get (or Alpine's apk), or you may install from source. Installing via package manager is always the simplest route, but thanks to the simplicity of the RUN instruction, it's fairly straightforward to install from source if your required package isn't available via package management.
Installing Package Dependencies
Using a package manager is the easiest way to install dependencies. The package manager handles most of the heavy lifting, like resolving and downloading the packages you ask for along with anything they depend on. The node:15 image is based on Debian, so we will use the RUN instruction with the apt-get package manager to install ImageMagick for image processing. Add the following lines to the bottom of our Dockerfile:
RUN apt-get update && \
    apt-get install -y imagemagick
This is all the code you need in your Dockerfile to use the RUN instruction to install ImageMagick via apt-get. It's really not very different from how you would install it by hand on an Ubuntu or Debian host. If you've done that before, you probably noticed something unfamiliar: before installing with apt-get install, we had to run apt-get update. This is required because, in order to keep images small, Debian Linux containers don't come with any package manager metadata pre-downloaded. apt-get update bootstraps the OS with all the metadata it needs to install packages.
We've also added the -y option to apt-get install. This option automatically answers affirmatively to any yes/no prompts that apt-get would otherwise present to the user. This is necessary because you will not be able to respond to prompts while Docker is building your image.
Finally, we use the && operator to run both commands within the same shell context. When installing dependencies, it's a good practice to combine commands that are part of the same procedure under the same RUN instruction. This ensures that the whole procedure is contained in the same "layer" in the container image, so that Docker can cache and reuse it to save time in future builds. Check out the official documentation for more information on image layering and caching.
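To see why this matters, here's a sketch of what not to do, splitting the procedure across two RUN instructions:
# Don't do this: each RUN creates its own cached layer
RUN apt-get update
RUN apt-get install -y imagemagick
If the apt-get update layer is reused from cache while the install line changes, the install can run against stale package metadata and fail. Combining both commands in one RUN instruction guarantees they always run, and are cached, together.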
Installing Source Dependencies
Because package managers ship pre-compiled binaries, they sometimes offer a version of a dependency that doesn't line up with the version you need. In these cases, you'll need to install the dependency from source. If you've built it by hand before, the commands will be familiar. As with package installs, the only difference is that we use && to combine the whole procedure into a single RUN instruction. Let's install ImageMagick from source this time.
RUN wget https://download.imagemagick.org/ImageMagick/download/ImageMagick-7.0.10-60.tar.gz && \
    tar -xzf ImageMagick-7.0.10-60.tar.gz && \
    cd ImageMagick-7.0.10-60 && \
    ./configure --prefix /usr/local && \
    make install && \
    ldconfig /usr/local/lib && \
    cd .. && \
    rm -rf ImageMagick*
As you can see, there's a lot more going on in this instruction. First, we have Docker download the code for the specific ImageMagick version we want with wget and unpack it using tar. Once the source is unpacked, we navigate into the source directory with cd and use ./configure to prepare the code for compilation. Then make install and ldconfig are used to compile and install the binaries. Afterward, we navigate back to the parent directory and remove the source tarball and directory, since they are no longer needed.
Installing Your App
Now that we've installed our dependencies, we can start installing our own application into the container. We will use the COPY instruction to add our own node app's source code to the container, and RUN to install npm dependencies. We'll install the NPM dependencies first.
To get the most out of Docker's build caching, it's best to install external dependencies first, since your dependency tree changes less often than your application code. A single cache miss disables caching for the remainder of the build instructions, so we copy the application code in as late in the process as we possibly can. To install your application's NPM packages, add these lines to the end of your Dockerfile:
WORKDIR /var/lib/my-app
COPY package*.json .
RUN npm install
First, we use the WORKDIR instruction to change the Dockerfile's working directory to /var/lib/my-app. This is similar to using the cd command in a shell environment: it changes the working directory for all of the instructions that follow. Then we use COPY to copy our package.json and package-lock.json from the local filesystem to the working directory within the container. We used the wildcard operator (*) to copy both files with a single instruction. After the package files have been copied, we use RUN to execute npm install.
Finally, we will use COPY to bring the rest of our application code into the container:
COPY * .
This copies the rest of your NodeJS app's source code to the container, again using COPY and a much broader usage of the wildcard. However, since we're using * to copy everything, we need to introduce a new configuration file called .dockerignore to prevent some local files from being copied into the container. For example, we want to make sure that we aren't copying the contents of our local node_modules folder, so that the modules we installed in the previous step don't get overwritten by the ones installed on our development machine.
It's likely that your local build platform is different from the one in the container, so copying your local node_modules folder will likely cause your app to malfunction or not run at all. The .dockerignore file is very simple: just add the names of files or folders that Docker should ignore at build time. You can use the * character as a wildcard, just like you can in COPY instructions. Create a .dockerignore with this content:
node_modules/
You may wish to add additional entries to the .dockerignore. For example, if you're using git for version control, you'll want to add the .git/ folder, since it's not needed and will unnecessarily increase the size of your image. Any file or directory name you add will be skipped over when copying files via COPY at build time.
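A slightly expanded .dockerignore for a typical NodeJS project might look like this (the entries beyond node_modules/ are suggestions; adjust them for your project):
node_modules/
.git/
npm-debug.log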
Running Your App
Now that we've installed all our external dependencies and copied our application code into the container, we're ready to tell Docker how to run our application. We will run our app using node index.js. Docker provides the ENTRYPOINT and CMD instructions for this purpose. Both define the command the container should use to start the application when it is run, but ENTRYPOINT is less straightforward to override. They can be used together, in which case Docker concatenates their values and runs them as a single command. In that arrangement, you would provide the main application command to ENTRYPOINT (in our case, node) and any arguments to CMD (index.js in our case). However, we're just going to use CMD. Using both would make sense if NodeJS were our main process, but really, our main command is the whole node index.js expression. We could use only ENTRYPOINT, but it's more complicated to override an ENTRYPOINT instruction's value at runtime, and we will want to be able to override the main command easily so that it's simpler to troubleshoot issues within the container when they arise. With all that said, add the following to the end of your Dockerfile:
CMD ["node", "index.js"]
Now Docker understands what to do to start our application. We provide our command to CMD (and ENTRYPOINT, if it's used) in a different way than we supply commands to the RUN instruction. The form we're using for CMD is called "exec form", and the form used for RUN is called "shell form". Using shell form for RUN gives you access to all of the power of the sh shell environment: you can use variable and wildcard substitution in shell form, in addition to other shell features like piping and chaining commands with && and ||. When using exec form, you do not have access to any of these shell features; each element within the square brackets is joined with a space in between and run exactly as-is. Shell form is preferred for RUN so that you can leverage build arguments and chaining (recall we did that above for better layering/caching). It's better to use exec form for CMD or ENTRYPOINT so that it's always straightforward to understand which action the container takes at runtime.
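To make the distinction concrete, here's a sketch of both forms side by side (the echo command is just for illustration):
# Shell form: run through /bin/sh -c, so $() substitution and && work
RUN echo "Building with $(node --version)" && npm install
# Exec form: the binary is invoked directly with the listed arguments
CMD ["node", "index.js"]
And because CMD is easy to override, you can start a shell instead of your app to poke around inside the image, e.g. docker run -it my-app bash (assuming you tagged your image my-app; bash is present in the Debian-based node image).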
Conclusion
I hope this article has helped to demystify the process of getting your app into a container. Docker can have a steep learning curve if you're not already a seasoned systems administrator, but features like portability, distribution, and reproducible builds make getting over the hump totally worth it, especially for developers working in teams.