Developer Insights
Build IT Better - DevOps - Monitoring Roundup
On This Dot's Build IT Better show, I talk to people who make popular tools that help developers make great software. In my most recent series, we looked at application monitoring tools. Marcus Olssen from Grafana and Ben Vinegar from Sentry showed us how the tools they work on can help developers keep their applications running smoothly.

Grafana

Grafana is an organization that builds a number of open source observability and monitoring tools for collecting and visualizing application metrics. Their namesake product is a platform for aggregating and visualizing any kind of data from a near limitless number of sources via their rich plugin library. Grafana's commercial counterpart, Grafana Labs, maintains this plugin library as well as educational resources for the Grafana ecosystem and paid products and services for companies that are looking for help managing their own Grafana tooling.

Flexibility

Grafana is a platform for application monitoring and analytics that offers a huge amount of flexibility for collecting and analyzing application data. Instead of providing a hyper-focused application monitoring solution, Grafana provides unparalleled flexibility for collecting almost any kind of data. Grafana offers built-in integrations for the most popular SQL and non-SQL databases, as well as popular monitoring tools like Prometheus, Loki, and Tempo (and a handful of other popular sources). Community-developed plugins can be used to add support for most other platforms. This flexibility allows Grafana to be used outside the traditional application monitoring use cases. Some people are even using Grafana to track their own home energy usage and health data. You can analyze almost any kind of data in Grafana.
*A Grafana dashboard with custom metrics*

Datasource Compatibility

While flexibility allows Grafana to reach across industries to find users and use cases, it still excels at traditional application monitoring. Developers can use Prometheus to pull data out of their own applications. Most popular host operating systems and application development frameworks offer community-developed integrations with Prometheus that provide useful system and application data like resource usage and response time, as well as the ability to publish your own custom application data. Loki is a tool for aggregating and querying system and application logs. You can also use Tempo to aggregate distributed application trace data from tools like Jaeger, OpenTelemetry, and Zipkin. If you use all four tools together, you can visually trace transactions all the way through your application, even as the user shifts between different components of your microservice architecture.

Visualization and Analysis

All of this flexible data collection technology would be useless without Grafana's equally flexible visualization platform. Once you've integrated all your data sources, you can use Grafana to explore and visualize the data you've collected. You can use dashboards to create an array of visualizations of your data. As a DevOps engineer, one of my favorite things about Grafana is their dashboard library. The dashboard library contains community-developed dashboards for a number of popular (and not so popular) application frameworks and backend tools and systems. Instead of needing to make your own dashboards from scratch for monitoring Rails apps and PostgreSQL databases, you can simply add and modify community dashboards, saving you time and providing insights you may not have considered on your own. Finally, we have to mention the Explore tool.
It can be easy to overlook with everything that's possible with dashboards, but it allows users to easily view, query, and analyze data streams on the fly without needing to create permanent dashboards.

*Grafana Nginx Dashboard - available from the dashboard library*

This big tent collection of features makes Grafana a great platform for observing any amount of any kind of data. The flexibility does come with the cost of needing to know a lot about a number of different tools like Prometheus and Loki, which have a non-trivial learning curve of their own. As with any community-developed content, plugins and dashboards from the library don't always work as expected out of the box and will often need to be modified to line up with your DevOps procedures and environments.

Sentry

Sentry, like Grafana, is a tool for monitoring application health and performance. However, unlike Grafana, Sentry is laser-focused on providing curated experiences with deep first-party integrations for popular application development tools. It also provides some additional tools for tracking user errors and code changes, which it uses as the framing narrative for all of the data the Sentry platform surfaces for developers. Integrations are available for most popular frontend JavaScript frameworks (React, Angular, Vue, etc.) and backend applications in Python, Ruby, Go, and more. Sentry gives developers a huge amount of visibility without the overhead of more complex DevOps-driven platforms like Grafana.

Developer Focused

Sentry's primary goal is to help you understand what's wrong with all of the parts of your application. They do this by giving you a view into what errors your users are experiencing in real time. Sentry collects data on all of the exceptions thrown by applications which have a Sentry integration. As you investigate individual issues, Sentry provides you with a curated collection of datapoints to cross reference with the specific error.
Sentry provides some very traditional data about the user, such as their browser agent, OS, geographical location, and the URL they were visiting, but it also connects that error back to the code. Not only can you see the stack trace and the lines of code where the error manifested, but Sentry also uses its deep integration to provide what they call "Breadcrumbs." These are pieces of data about what actual activity led up to the error. Depending on what type of application you're troubleshooting, this might be things like log output, events fired from UI elements, or your own custom breadcrumb events. These can give you a better idea of the actions the user took leading up to the error.

*Sentry's Issue (aka Error) View*

*A sample of Sentry's Breadcrumbs*

Integrations

In addition to helping you identify the root cause of your errors, Sentry also aggregates errors to make it easier for you to understand which errors have the highest impact on your application. You can easily identify errors that are happening frequently and on critical paths. If you've enabled integration with a source control platform like GitHub, Sentry will even make suggestions as to which code commits introduced the problem. All these features together will help you tackle application health like a DevOps expert, without needing to be a DevOps expert.

Application Performance

Debugging and error surfacing aren't the only places where Sentry shines. I'm really excited to talk about Sentry's performance and application tracing platform. Using their deep framework and platform integrations, you're able to collect a lot of performance data from your applications and correlate it with user behaviors. Similar to the debugging experience, Sentry starts you from a broad view of your performance picture, shows you the slowest pages and endpoints of your application, and provides you with another curated experience for investigating and resolving performance problems.
The most interesting aspect of the performance investigation tools is transactions, or traces. When you choose a slow page to begin investigating, transactions appear alongside the individual performance metrics for that page. These transactions allow you to see the performance of your pages broken into waterfall graphs, like you might already be used to from browser dev tools. However, Sentry adds some really cool tricks since it is deeply integrated into all the parts of your application. Suppose you analyze a transaction that starts from your JavaScript app and see that a fetch request is taking a long time. If the API is part of your stack that's integrated with Sentry, you can click down into that fetch request within the Sentry UI, switch contexts to the API application, and see a waterfall graph of what the API did to handle that request. This lets you traverse your whole application to identify the exact source of performance problems. These transactions also benefit from the same Breadcrumb and code change data that's provided in the error analysis tools.

Conclusions

Sentry and Grafana are both strong tools to add to your DevOps toolbelt. While they both provide great features for observing application health and analyzing data, they fill two pretty different niches. Sentry provides curated developer experiences and deep integrations that will help developers dive head first into error and performance monitoring for their applications without needing to be experts. However, for experts and "data scientists," Grafana provides an incredibly powerful and flexible platform for analyzing not only application metrics and health, but really any data you can manage to get into a dashboard. Some organizations may even benefit from using both tools for different use cases....
Apr 20, 2021
7 mins
Comparing App Platforms with Heroku/Salesforce, AWS, and Vercel
Introduction

Recently on This Dot's Build It Better show, I had the opportunity to sit down with some folks from popular platform as a service vendors. I asked them to tell me about what got them excited about their platforms, and what advice they have to offer to help viewers choose which platform is best for them.

Salesforce/Heroku

I spoke with Julian Duque and Mohith Shrivastava from Salesforce about their "low code" products, and how those products are enhanced by their Heroku platform for "pro code" solutions. Salesforce makes it easy for almost anyone to get started building and running cloud apps whether they're software engineers or not. The Salesforce Platform offers a wide variety of low-code products that make it simple for non-developers to build apps, and Heroku provides a streamlined set of tools and services for running custom software applications using a wide array of application development frameworks and programming languages.

I personally really love the simplicity and flexibility of the Heroku platform. I've used it for tons of projects over the years. You can host almost any application, built using any language or framework, using Heroku's buildpack technology. Heroku Buildpacks are sets of scripts that automate your app's build and deployment steps. Official Buildpacks for a dozen different platforms are available, and in most cases, Heroku can automatically detect which one your app needs when you deploy your code. If your stack isn't supported by an official buildpack, you can build your own or use one of the many community-maintained buildpacks for languages and frameworks that don't have first-party support.

Another benefit of Heroku is that each Heroku app has an internal Git repository, and all you need to do to deploy your code is push it to that repository using git push. There are no additional tools required for deployments.
Not only does this simplify the process of deploying your code by hand, but it also means that Heroku is automatically compatible with any CI/CD system that supports Git, which is almost all of them by now. In addition to custom application hosting, Heroku also has integrated PaaS offerings for PostgreSQL, Redis, and Apache Kafka that can all be managed through the Heroku dashboard or CLI.

Even though I'm a long time user of Heroku, I wasn't really aware of everything that Salesforce brings to the table. Heroku offers a strong platform for pro code applications, but in addition, the Salesforce Platform provides a variety of low code tools that can be used to build applications by people who aren't experienced in custom software development. These tools allow businesses to begin the digital transformation process without needing to bring on a large in-house IT staff. There are point and click tools for managing authentication and identity as well as automating workflows and building user interfaces. They even offer a product called Einstein that can imbue your workflows with AI powers. However, you don't need to worry about outgrowing the low code solutions because the Salesforce Platform can also be integrated with pro code applications hosted in the Heroku ecosystem. This makes Salesforce/Heroku a great platform that businesses can rely on all the way through their digital transformation process.

Technology isn't the only thing that sets Salesforce and Heroku apart from their competition. They also provide a couple of huge documentation libraries. For the Salesforce Platform, you can head to their Salesforce Trailhead site. Trailhead offers interactive courses and learning tracks that will teach you how to build applications from the ground up on the Salesforce Platform.
Heroku also has an expansive documentation library. It doesn't just apply to the Heroku platform: I've personally used their documentation many times to help resolve problems with my applications on other platforms. The Heroku documentation site is not only comprehensive, but it's also easier to consume than that of many of their competitors (I'm looking at you, Amazon). And finally, when documentation isn't enough, Heroku and Salesforce also have excellent support teams who will work quickly to resolve any problems you're experiencing with their platform, and in many cases they can act proactively before you are aware you have a problem.

Vercel

I also spoke with Lee Robinson from Vercel. Vercel is a platform that's quite similar to Heroku in a lot of ways. However, they are laser-focused on providing a great hosting platform for your Jamstack applications. While Heroku can support a nearly limitless number of programming languages and application frameworks, Vercel is focused on providing the best possible experience for "serverless" JavaScript apps. These are apps that use a hybrid or static JavaScript framework for building frontends, with backends that are powered by NodeJS serverless functions. Serverless functions written in Python, Go, or Ruby are also supported, but there are no options for supporting functions written in languages that aren't officially supported.

Compared to Heroku's flexibility, one might take this to mean that Vercel is an inferior platform, but this isn't the case at all. What Vercel doesn't offer in terms of flexibility, they make up for in developer experience. Where Heroku provides the simplicity of being able to effortlessly scale your applications by dragging a slider, Vercel takes the simplicity to the extreme and automagically scales your applications without you ever needing to use the dashboard or CLI.
Not only do they completely automate and manage all the complexities of scaling your app to meet the demands of your users, you also get the benefit of having the Vercel Edge Network CDN to ensure your app is always available and performant no matter where your users are located geographically. This is all part of every single app hosted on Vercel, even on the free tier!

Vercel also provides additional tools to help you supercharge your development workflows and improvement cycles. "Develop. Preview. Ship" is Vercel's mantra. To help developers achieve this, not only do they provide Git-based deployments, but for each branch or pull request opened via version control, Vercel provides a "preview URL" which is connected to a preview version of your application that reflects the code on that branch/PR. This eliminates the need for complicated staging and QA workflows, since preview URLs provide isolated environments for testing and demoing new features.

Another mantra Lee shared with me is the idea that "developers are scientists." As developers, we can use data to inform how we build the solutions we work on, but often that data can be cumbersome or difficult to obtain. Vercel simplifies the data collection process by offering a high quality analytics platform to help you understand how your application performs, not only in terms of response performance but also by tracking frontend user experience metrics like layout shift and input delay. Being able to easily collect and visualize these metrics allows you to really be a scientist and justify priorities and improvements to your products with real user data.

Another interesting aspect of Vercel is that they've also created an in-house NodeJS application development framework called Next.js that is meant to pair perfectly with their platform. It provides a "zero-configuration" framework for building applications with NodeJS and React.
It's an incredibly flexible platform that can support the simplest one-page statically rendered applications, but can also support request-time server-side frontend rendering and custom backend API endpoints supported by Vercel's serverless functions. To help new and experienced developers alike, Vercel offers a library of starter projects using Next.js and/or other JavaScript frameworks that you can use to get your project started with just a few button clicks.

Amazon Web Services

I spoke with Nader Dabit from Amazon about their new Amplify platform. Amazon has been the biggest player in the PaaS marketplace for well over a decade now. Most developers have used an EC2 virtual server or stored application assets and uploads in S3. What developers may not know is that Amazon offers more than 200 different services for use by developers and other business users. EC2 and S3 are pretty simple and straightforward, but branching out into the broader ecosystem or learning to tie everything together can be pretty intimidating. This isn't a big deal for companies like Netflix or Airbnb who can afford to bring in DevOps engineers that are already AWS experts, but historically it's been a lot more difficult for less experienced developers to take full advantage of what AWS has to offer.

With Amplify, the AWS team is hoping to demystify the process and give new and experienced developers a way to work with the core AWS platform in a more streamlined way. Instead of having to understand which service to use out of a list of 200+ services with intimidating names, Amplify selects a smaller subset of these services and gives them less esoteric names. So Amazon Cognito becomes "Authentication" and AWS Lambda becomes "Functions". They also provide simplified client libraries over the traditional AWS SDK that are compatible with JavaScript, Android, iOS, and Flutter.
Another neat thing about the Amplify platform is that they, like Salesforce, are steering users toward Amazon's low code tools like AWS AppSync and API Gateway, and making it easier for developers to integrate with AWS tools for things like AI/ML predictions and PubSub. Also like Salesforce, if developers outgrow the low code tools, it's easier than ever to expand out to the broader ecosystem and some of the more specialized services that Amazon offers. In addition to making it easy to build your application's backend with little or no code, Amplify also offers the frontend components you need to build interactive web or mobile apps. Amplify UI components are available for React, Angular, Vue, and more. And of course, on top of the simplified Amplify toolchain, AWS still provides the same 200+ services they've traditionally offered. So if you outgrow Amplify, or need services that aren't compatible with it, you can always integrate with other AWS services outside of the Amplify ecosystem.

Another thing I really like about Amplify, and AWS in general, is the pricing. All of the Amplify services have a free tier. This makes it useful for hobby projects or to keep development costs low before you launch your applications. Also, it's important to note that other services like Heroku and Vercel are often based on AWS themselves. As such, buying services direct from AWS will usually save you at least a little bit of money over using a more managed service.

Conclusion

Developers have a ton of choices when they are choosing a platform to build their applications on. All of the vendors I spoke with have compelling solutions that will make your life as a developer better. I always personally reach for platforms like Heroku or Vercel first since they're quick and easy to get started with, but it's clear that AWS has taken note of that and is trying to close that gap. So really, there's not a bad choice if these are your options.
I hope I've explained them well enough so you can choose which one suits your project the best!...
Mar 17, 2021
8 mins
Putting Your NodeJS App in a Docker Container
Putting Your App in a Docker Container

The days of having to provision servers and VMs by hand or by using complicated and heavy handed toolchains like Chef, Puppet, and Ansible are over. Docker simplifies the process by providing developers with a simple domain specific language for creating pre-configured container images, and simple tools for building, publishing, and running them on the (virtual) hardware you're already using. In this guide, I will show you how to install your NodeJS application into a Docker container.

The Dockerfile Language

Docker images are built using a single file called a Dockerfile. This file uses a simple domain specific language (think SQL) to define how to configure an isolated environment to run your application. The language provides a small number of commands called "instructions" that define the steps required to build a new image, which Docker then runs as a "container". First, I'll explain which instructions we'll be using and what they'll be used for.

FROM

The FROM instruction is used to define a base image to use as the foundation for your custom image. You can use any local or published image with FROM. There are published images that only contain popular Linux distributions (or even Windows!) and there are also images that come preinstalled with popular software development stacks like NodeJS, Python, or .NET.

RUN

The RUN instruction is used in the image build process to run commands required to bootstrap your application environment. We'll use it mostly to install dependencies, but it's capable of running any command that your container OS supports.

COPY

The COPY instruction is used to copy files from the local filesystem into the container. We will use this instruction to copy our application code, etc., into our image.

ENTRYPOINT

The ENTRYPOINT instruction contains a command that will be run when your container is launched.
It is different from RUN because the command passed to ENTRYPOINT does not run at build time. Instead, the command passed to ENTRYPOINT will run when your container is started via docker run (check out my Docker CLI Deep Dive post). Only one ENTRYPOINT instruction takes effect per Dockerfile. If it is used multiple times, only the last usage will be operative. The value of ENTRYPOINT can be overridden when running a container image.

CMD

The CMD instruction is an extension of the ENTRYPOINT instruction. The content passed to CMD is tacked onto the end of the command passed to ENTRYPOINT to create a complete command to start your application. Like with ENTRYPOINT, only the final usage of CMD in a Dockerfile is operative, and the value given can be overridden at runtime.

EXPOSE

EXPOSE is a little different from the other instructions in that it doesn't change how your container is built or run. It only exists to provide metadata about what port the container expects to expose. You don't _need_ to use EXPOSE in your container, but anyone who has to understand how to connect to it will appreciate it.

And more...

More details about these Dockerfile instructions and some others that I didn't cover are available via the official Dockerfile reference.

Choosing a Base Image

Generally, choosing a base image will be simple. If you're using a popular language, there is most likely already an official image available that installs the language runtimes. For instance, NodeJS application developers will most likely find it easiest to use the official NodeJS image provided via DockerHub. This image is developed by the NodeJS team and comes pre-installed with everything you need to run basic NodeJS applications. Similarly, users of other popular languages will find similar images available (ex: Python, Ruby).

Once you've chosen your base image, you also need to choose which specific version you will use.
Generally, images are available with any supported version of a language's toolchain so that a wide range of applications can be supported using official images. You can usually find a list of all available version tags on an image's DockerHub page.

In addition to offering images with different versions of language tools installed, there are also typically additional images available with different operating systems. Unless specified otherwise, images usually use the most recent Debian Linux release as their base image. Since it's considered best practice to keep the size of your images as low as possible, most languages also offer variants of their images built with a "slim" version of Debian Linux, or built with Alpine Linux, a Linux distribution designed for building Docker containers with tiny footprints. Both Debian Slim and Alpine ship with fewer system packages installed than the typical Debian Linux base image. They only include the packages that are required to run the language tools. This will make your Docker images more compact, but may result in more work to build your containers if you require specific system dependencies that are not preinstalled in those versions. Some languages, like .NET Core, even offer Windows-based images.

Though it's typically not necessary, you can choose to use a base operating system image without any additional language specific tools installed by default. Images containing _only_ Debian Linux, Debian Slim, and Alpine Linux are available. Images for many other popular operating systems, such as Ubuntu Linux, Red Hat Linux, or Windows, are available as well. Choosing one of these images will add much more complexity to your Dockerfile. It is _highly recommended_ that you use a more specific official image if your use case allows.
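As a rough illustration of the variants described above, a Dockerfile would start with one of the FROM lines below. This is only a sketch: the tag names are assumptions based on how the official node and debian images were tagged on DockerHub at the time of writing, so check each image's DockerHub page for the tags that are actually published.

```dockerfile
# A single-stage Dockerfile uses exactly one FROM line; the alternatives are commented out.

# Default variant of the official NodeJS image, built on Debian
FROM node:15

# Debian Slim variant: fewer system packages, smaller image
# FROM node:15-slim

# Alpine Linux variant: smallest footprint, uses a different package manager
# FROM node:15-alpine

# Bare operating system image with no language tooling preinstalled
# FROM debian:buster-slim
```

The smaller variants trade image size for convenience: any system package your app needs that isn't preinstalled has to be added with a RUN instruction.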
In the interest of keeping things simple for our example NodeJS app, we will choose the most recent (at the time of writing) Debian Linux version of the official NodeJS image. This image is called node:15. Note that we have only included a major version number in the image's version tag (the "version tag" is the part after the colon that specifies a specific version of an image). The NodeJS team (as well as most other maintainers of official images) also publishes images with more specific versions of Node. Using node:15 instead of node:15.5.1 means that my image will be automatically upgraded to new versions of NodeJS 15 at build time when an update is available. This is good for development, but for production workloads, you may want to use a more specific version so you don't get surprised with upgrades to NodeJS that your application can't support.

Starting Your Dockerfile

Now that we've chosen an image, we will create our Dockerfile. This part is very easy since the FROM instruction is going to do most of the work for us. To get started, simply create a new file in your project's root folder called Dockerfile. To this file, we will add this one simple line: `FROM node:15`

Now we have installed everything we need to run a basic NodeJS application along with any other system dependencies that come pre-installed in Debian Linux.

Installing Additional Dependencies

If your application is simple and only requires NodeJS binaries to be installed and run, congratulations! You get to skip this section. Many developers won't be so lucky. If you use a tool like ImageMagick to process images or wkhtmltopdf for generating PDFs, or any other libraries or tools that are not included by your chosen language or don't come installed by default on Debian Linux, you will need to add instructions to your Dockerfile so that they will be installed by Docker when your image is built.
We will primarily use the RUN instruction to specify the operating system commands required to install our desired packages. If you recall, RUN is used to give Docker commands to run when building your image. You may choose to use a package management system like Debian's apt-get (or Alpine's apk) or you may install from source. Installing via package manager is always the simplest route, but thanks to the simplicity of the RUN instruction, it's fairly straightforward to install from source if your required package isn't available via package management.

Installing Package Dependencies

Using a package manager is the easiest way to install dependencies. The package manager handles most of the heavy lifting, like installing a package's own dependencies. The node:15 image is based on Debian, so we will use the RUN instruction with the apt-get package manager to install ImageMagick for image processing. Add the following lines to the bottom of our Dockerfile:

```
RUN apt-get update && \
    apt-get install -y imagemagick
```

This is all the code you need in your Dockerfile to use the RUN instruction to install ImageMagick via apt-get. It's really not very different from how you would install it by hand on an Ubuntu or Debian host. If you've done that before, you probably noticed that there are some unfamiliar instructions. Before we installed using apt-get install, we had to run apt-get update. This is required because, in order to keep Docker images small, Debian Linux containers don't come with any of the package manager metadata pre-downloaded. apt-get update bootstraps the OS with all the metadata it needs to install packages.

We've also added the -y option to apt-get install. This option automatically answers affirmatively to any yes/no prompts when apt-get would otherwise ask for a user response. This is necessary because you will not be able to respond to prompts when Docker is building your image.
Finally, we use the && operator to run both commands within the same shell context. When installing dependencies, it's a good practice to combine commands that are part of the same procedure under the same RUN instruction. This will ensure that the whole procedure is contained in the same "layer" in the container image so that Docker can cache and reuse it to save time in future builds. Check out the official documentation for more information on image layering and caching.

Installing Source Dependencies

Sometimes, since they use pre-compiled binaries, package managers will contain a version of a dependency that doesn't line up with the version you need. In these cases, you'll need to install the dependency from source. If you've done it by hand before, the commands used will be familiar. As with package installs, it's only different in that we use && to combine the whole procedure into a single RUN instruction. Let's install ImageMagick from source this time.

```
RUN wget https://download.imagemagick.org/ImageMagick/download/ImageMagick-7.0.10-60.tar.gz && \
    tar -xzf ImageMagick-7.0.10-60.tar.gz && \
    cd ImageMagick-7.0.10-60 && \
    ./configure --prefix /usr/local && \
    make install && \
    ldconfig /usr/local/lib && \
    cd .. && \
    rm -rf ImageMagick*
```

As you can see, there's a lot more going on in this instruction. First, we need Docker to download the code for the specific ImageMagick version we want to install with wget, and unpack it using tar. Once the source is unpacked, we have it navigate to the source directory with cd and use ./configure to prepare the code for compilation. Then, make install and ldconfig are used to compile and install the binaries from source. Afterward, we navigate back to the parent directory and remove the source tarball and directory since they are no longer needed.

Installing Your App

Now that we've installed dependencies, we can start installing our own application into the container.
We will use the COPY instruction to add our own Node app's source code to the container, and RUN to install npm dependencies. We'll install npm dependencies first. In order to get the most out of Docker's build caching, it's best to install external dependencies first, since your dependency tree will change less often than your application code. A single cache miss will disable caching for the remainder of the build instructions. Application code typically changes more often between builds, so we will copy it as late in the process as we possibly can. To install your application's npm packages, add these lines to the end of your Dockerfile:

``
WORKDIR /var/lib/my-app
COPY package*.json .
RUN npm install
``

First, we use the WORKDIR instruction to change the Dockerfile's working directory to /var/lib/my-app. This is similar to using the cd command in a shell environment. It changes the working directory for all of the following Docker instructions. Then we use COPY to copy our package.json and package-lock.json from the local filesystem to the working directory within the container. We used the wildcard operator (*) to copy both files with a single instruction. After the package files have been copied, we use RUN to execute npm install.

Finally, we will use COPY to bring the rest of our application code into the container:

``
COPY . .
``

This copies the rest of your Node.js app's source code into the container, again using COPY, but this time with the whole build context (.) as the source. A bare * wildcard would also match everything, but when a wildcard matches a directory, COPY copies that directory's contents rather than the directory itself, which flattens your folder structure. However, since we're copying everything, we need to introduce a new configuration file called .dockerignore to prevent some local files from being copied to the container at this time. For example, we want to make sure that we aren't copying the contents of our local node_modules folder so that the modules we installed previously don't get overwritten by the ones we've installed on our development machine.
It's likely that your local build platform is different from the one in the container, so copying your local node_modules folder will likely cause your app to malfunction or not run at all. The .dockerignore file is very simple. Just add the names of files or folders that Docker should ignore at build time. You can use the * character as a wildcard just like you can in COPY instructions. Create a .dockerignore with this content:

``
node_modules/
``

You may wish to add additional entries to the .dockerignore. For example, if you're using git for version control, you'll want to add the .git/ folder since it's not needed and will unnecessarily increase the size of your image. Any file or directory name you add will be skipped over when copying files via COPY at build time.

Running Your App

Now that we've installed all our external dependencies and copied our application code into the container, we're ready to tell Docker how to run our application. We will run our app using node index.js. Docker provides the ENTRYPOINT and CMD instructions for this purpose. Both instructions define the command that the container should use to start the application when it is run, but ENTRYPOINT is less straightforward to override. They can be used together, in which case Docker combines their values and runs them as a single command. In that case, you would provide the main application command to ENTRYPOINT (in our case, node) and any arguments to CMD (index.js in our case).

However, we're just going to use CMD. Using both would make sense if Node.js itself was our main process, but really, our main command is the whole node index.js expression. We could use only ENTRYPOINT, but it's more complicated to override an ENTRYPOINT instruction's value at runtime, and we want to be able to override the main command easily so that it's simpler to troubleshoot issues within the container when they arise.
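To make the distinction concrete, here is a sketch of the two alternatives discussed above. It is illustrative only: the image name my-app and the script name other-script.js that appear in the comments are hypothetical placeholders, not something defined in this article.

```dockerfile
# Option A (the approach this article takes): CMD holds the whole command.
# It can be replaced at runtime by appending a command to docker run, e.g.:
#   docker run -it my-app bash
CMD ["node", "index.js"]

# Option B (shown for comparison only, left commented out): ENTRYPOINT holds
# the executable and CMD supplies its default arguments, so the container
# starts "node index.js".
# Arguments given after the image name replace CMD only:
#   docker run my-app other-script.js      (starts "node other-script.js")
# Replacing the executable itself needs the --entrypoint option:
#   docker run -it --entrypoint bash my-app
# ENTRYPOINT ["node"]
# CMD ["index.js"]
```

With Option A, dropping into a shell for troubleshooting only requires naming the shell after the image, which is the convenience the paragraph above is describing.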
With all that said, add the following to the end of your Dockerfile:

``
CMD ["node", "index.js"]
``

Now Docker understands what to do to start our application. We provide our command to CMD (and ENTRYPOINT if it's used) in a different way than we supply commands to the RUN instruction. The form we're using for CMD is called "exec form" and the form used for RUN is called "shell form". Using shell form for RUN allows you to access all of the power of the sh shell environment. You can use variable and wildcard substitution in shell form, in addition to other shell features like piping and chaining commands using && and ||. When using exec form, you do not have access to any of these shell features. Each element within the square brackets is passed to the program as a separate argument, exactly as written, with no shell processing. Using shell form is preferred for RUN so that you can leverage build arguments and chaining (recall we did that above for better layering/caching). It's better to use exec form for CMD or ENTRYPOINT so that it's always straightforward to understand which action the container takes at runtime.

Conclusion

I hope this article has helped to demystify the process of getting your app into a container. Docker can have a steep learning curve if you're not already a seasoned systems administrator, but features like portability, distribution, and reproducible builds make getting over the hump totally worth it, especially for developers working in teams....
Feb 22, 2021
12 mins
Publishing Docker Containers
If you've read my previous blog posts, Getting Started With Docker and Building Docker Containers, you may find yourself wondering what your options are for publishing your custom Docker image. Thankfully, publishing your custom images is one of the simplest things you can do. I'll show you how. I'm going to assume you've already got Docker installed locally. If you don't have it and aren't sure how to get it, check out the older posts I linked before.

This Is How We Do It

The fastest and easiest way to get started publishing images is to make yourself a DockerHub account. Once you've done this, you'll need to log in with your Docker client. At a command prompt, enter the following: ` In return, you'll be prompted for your DockerHub username and password. Enter them to complete the login process. Once you've logged in successfully, you're ready to start publishing images.

Publishing a public Docker image to your personal account is incredibly easy. First, you need to make sure your image is tagged appropriately. You'll need to prefix the image's name with your DockerHub username so that Docker knows which account to publish it to. Then you can publish using docker push. The whole process goes like this: ` You will see the status of your image upload. Once it's complete, head over to hub.docker.com and log in to see your published image.

It's important to note that following this process will publish your image publicly. Anyone will be able to view your DockerHub profile and download your image. DockerHub does support private repositories, but only provides one free private repository per account (more are available with paid accounts). You can make the image you just uploaded private by navigating to it from your DockerHub dashboard, selecting the "Settings" tab, and clicking the "Make Private" button.
Alternatively, if you'd prefer to make sure your image is private as soon as you publish it, you may create your private repository on DockerHub _before_ you use docker push to publish it. Click "Create Repository" on the DockerHub dashboard (after logging in) and follow the instructions given.

Alternatives To DockerHub

DockerHub isn't the only place you can publish your Docker image artifacts online. There are a number of other image repository hosts, both managed and self-hosted, that offer a similar feature set to that of DockerHub. Amazon, Google, and Microsoft each have a container registry offering, so if you're already using one of those clouds for hosting, you can leverage those providers' own solutions to keep your billing consolidated. Alternatively, GitHub and GitLab users can choose to keep their container images in those services alongside their application code. These are just a handful of the options available to you. A quick Google search will reveal even more vendors like Sloppy.io and Quay.io.

For some, whether because of personal preference or business requirements, storing images on the public internet won't be desirable. The good news is that there are options for folks who need or want to host their images within their own private networks, or simply want to maintain control of their data. Two of the most popular self-hosted registry hosts are Harbor and Artifactory. Harbor is a Kubernetes (Cloud Native) focused solution. It also acts as a repository for hosting Helm Charts. Artifactory by JFrog is a one-stop shop for all your build artifact storage needs. In addition to being able to manage container images and Helm Charts, it can also manage RubyGems, NPM modules, or nearly any other sort of build artifact that you'd like to publish. These self-hosted options require administration and maintenance, so they are more labor-intensive solutions, but each is a great choice if you'd like to take image hosting into your own hands.
Publishing to Other Registries

If you choose to use a registry hosted somewhere other than DockerHub, your process for publishing images will change slightly. You'll still use the same tools, but the steps for tagging your image will be slightly different. You will need to log in to your preferred provider using docker login, and you will need to provide your registry's hostname and other required metadata in your image's tag. The process for publishing to each provider differs slightly, but here is an example using AWS Elastic Container Registry (ECR): ` You'll notice that the login process is different and requires you to use awscli to retrieve your password and pipe it into docker login, using "AWS" as your username. This is an added security measure. AWS changes your password regularly to keep your account secure. In ECR, all images are private by default, and you must create the repository, either via the AWS console or the command-line interface, before using docker push. The login and tagging process will differ slightly for each provider, but most provide straightforward and clear instructions for their process when you create your registry. Refer to your chosen provider's documentation for more info.

Conclusion

While Docker may be the company that introduced us to Linux container images, more and more vendors and open-source projects are getting involved in the hosting of images. You are no longer limited to using a host on the public internet or run by Docker. I hope this post has helped you understand more about all the other options available to you for image hosting....
Jan 20, 2021
4 mins
Building Docker Containers
Revisiting The Basics

In my earlier post, Getting Started with Docker, I covered building a basic Dockerfile using the FROM, COPY, RUN, and CMD instructions and how to use a .dockerignore file to keep unnecessary files out of your images and containers. If you haven't read that post, go check it out to learn the basics of building Docker images. In this post, I'll cover some more advanced techniques for building container images. In addition, I recently published a post exploring advanced Docker CLI usage. I recommend giving it a read, too, if you aren't already a CLI pro.

Installing Dependencies

Using FROM with an official image for your language or framework will get you a long way, but many applications will require a system dependency that's not included in the FROM image. For example, many applications use ImageMagick for processing image uploads, but it's not included by default in the Debian images that most language images are based on. You can use RUN and apt-get to install missing dependencies. ` We started the Dockerfile just like the example from my earlier post, using the official NodeJS 15 image, but then we add two steps to install ImageMagick using apt-get. To keep the base image size low, Debian does not come pre-loaded with all of the data it needs to install packages with apt-get, so we need to run apt-get update first so that apt-get has that info. Then, we simply use apt-get install -y imagemagick to install ImageMagick. The -y option is used to automatically respond with "yes" when apt-get prompts you to confirm the package installation.

RUN vs CMD (vs ENTRYPOINT)

By now you've probably noticed that there are two different instructions that run commands in your containers, RUN and CMD. While both are used to run commands, they're used in very different contexts. As we've seen in previous examples, RUN is used exclusively in the build process to run commands to modify the image as needed.
CMD is different because it specifies the command that will be run by the container when you launch it using docker run. You can have as many RUN instructions as you need, but only one CMD takes effect (if you specify several, the last one wins). If you need to run a different command at runtime, you can pass it as an argument when you launch the container with docker run (check out my Docker CLI Deep Dive post).

Additionally, Docker provides the ENTRYPOINT instruction. This is a command that the command you provide to the CMD instruction will be passed to as arguments. If you do not provide an ENTRYPOINT and your CMD is written in shell form, Docker runs it via /bin/sh -c, which causes your CMD command to execute in a basic Unix shell environment. This default behavior will satisfy most use cases. It's possible to override a container's CMD at runtime simply by passing a command to docker run, while changing its ENTRYPOINT requires the separate --entrypoint option. Docker's own ENTRYPOINT documentation goes into more detail about how it can be used.

In the example Dockerfile above, you probably noticed that the way commands are passed to CMD and RUN looks different. Typically, when using RUN you provide commands using shell syntax, and you provide commands to CMD (and ENTRYPOINT) using the exec syntax, but either instruction accepts either syntax. When using shell syntax, you can resolve shell expressions within your command. You can use shell variables and operators like output pipes (|) and redirects (>, >>), as well as boolean operations (&&, ||) to join commands. Exec syntax is much more straightforward. Each string within the bracketed array is passed to the program as a separate argument, exactly as provided, with no shell processing.

Layers and Caching

Each instruction in your Dockerfile adds a new layer to your image. For performance reasons, it's considered a best practice to limit the total number of layers that make up your finished image. There are a number of ways to do this. The simplest is by combining lines where RUN or COPY are used in close proximity to each other.
Consider the example above where we installed ImageMagick; instead of using two separate RUN instructions, we can combine them using the shell's && operator. ` Combining copy commands is a bit easier. The COPY instruction takes any number of arguments. All but the last argument are interpreted as the list of files to copy, and the final argument is the location to copy those files to. You can also use * as a wildcard character as I did in the first example when copying the package.json and package-lock.json files to the image.

Another thing to consider when thinking about how your image layers are composed is caching. When Docker processes your Dockerfile to build your image, it runs each of the instructions in order to create the layers of your image. Docker analyzes each instruction before it is run and checks its cache to determine whether or not there is an identical existing image layer. When analyzing RUN instructions, Docker looks for any cached image layer that was built using the exact same command and uses it instead of rebuilding the same layer. For COPY and ADD instructions, it analyzes the files to be copied and looks for a previously built layer that has the exact same file contents. If at any point any instruction requires its layer to be rebuilt, all of the following instructions will be rebuilt as well. Optimizing your Dockerfile to take advantage of the layer cache can greatly reduce the time it takes to build your image. Organize your Dockerfile so that the layers least likely to change are processed first (ex: installing dependencies) and those more likely to change (ex: copying application code) are processed later.

Conclusion

These techniques will help you create more advanced container images and hopefully help you optimize them. However, I've only covered a small slice of the options available to you when building container images.
If you dig deeper into the official Dockerfile reference you'll find information about all of the instructions available to you and more advanced concepts and use cases....
Dec 22, 2020
5 mins
Docker CLI Deep Dive
In my last post I covered all of the basics of getting started with Docker. In this post, I'll dive more deeply into the most common uses for the Docker CLI. I'm assuming that you've already got a working local Docker install. If not, you can refer back to my previous post.

Running Containers

docker run

This is the most important, and likely the most commonly used, Docker command. This is the command that is used to run container images. You can use it with images that you've built yourself, or you can use it to run images from a remote repository like DockerHub.

docker run IMAGE

This is the most basic way to use the run command. Docker will look for the named image locally first, and if it cannot find it, it will check to see if it's available from Docker Hub and download it. The image runs in the foreground and can be exited by pressing Ctrl+C.

docker run IMAGE COMMAND [ARGS]

Most Docker images will define a specific command to be executed when the container is run, but you can specify a custom command to run instead by adding it to your docker run command after the image tag. Optionally, you can also append any arguments that should be passed to your custom command. Keep in mind that the container will only run as long as the command executed continues to run. If your custom command exits for any reason, so will the container.

docker run -it IMAGE

By default, you cannot provide any input to a running container via STDIN. In order to respond to prompts, you need to add the --interactive option to keep the container's STDIN open, and the --tty option to allocate a terminal for the container. You can combine both options using the shorthand option -it.

docker run -p HOST_PORT:CONTAINER_PORT IMAGE

Often when running a container, you will want to make a connection from your local host machine into your local Docker container.
This is only possible if you use the --publish or -p option to specify a local host port to connect to the internal port exposed by the container.

docker run -d IMAGE

If you don't need to interact with your container and you'd rather not block your terminal shell, you can use --detach or -d to run your container in the background.

NOTE: All of these options can be combined as desired.

docker exec

You can use exec to run arbitrary commands inside of a running container. I use this most often when troubleshooting problems in containers that I'm building. If your container has bash or another shell available, you can use it to get an interactive shell inside of a container.

docker exec CONTAINER COMMAND [ARGS]

This is similar to docker run, but instead of giving it the name of a container image, you provide the ID or name of a running container. The command you specify will run inside the specified container in the foreground of your shell. You can use the -it and -d options with exec just like you can with run.

Managing Containers and Images

docker ps

List all of your running containers with their metadata.

docker ps -a

List all containers, including inactive ones.

docker stop CONTAINER

Terminate the container specified by the given ID or name via SIGTERM. This is the most graceful way to stop a container.

docker kill CONTAINER

Terminate the container specified by the given ID or name via SIGKILL.

docker rm CONTAINER

Delete the container specified by the given ID. This will completely remove it and it will no longer appear in docker ps -a.

docker stats

Starts a real-time display of stats like CPU and memory usage for your running containers. Press Ctrl+C to exit.

docker image list

List all the container images present in your local Docker registry.

docker image remove IMAGE_NAME[:TAG]

Delete the given image from your local repository.

docker image prune -a

Over time, you will accumulate a lot of images that take up disk space but are not in use.
This command will bulk delete any image you have stored locally that isn't currently being used by a container (stopped containers count as using their image).

Building Images

Aside from run, docker build is the other crucial Docker command. This command builds a portable container image from your Dockerfile and stores it in your local Docker registry.

docker build PATH

This is the most basic usage for build. PATH is the path to the folder your Dockerfile is in. The image is stored within Docker and identified by a hash derived from the image's contents.

docker build -t REPOSITORY_NAME[:VERSION_TAG] PATH

The automatically generated hash image names aren't easy to remember or refer back to, so I usually add a custom tag at build time using the --tag or -t option. If you don't provide a version tag, it will default to latest.

Publishing Images

docker tag

You may find that you need to re-tag an image after it's built. This is what docker tag is for.

docker tag SOURCE_IMAGE[:VERSION_TAG] TARGET_IMAGE[:VERSION_TAG]

To use tag, you simply need to provide the repository name and version tag of the source image, followed by the repository name and version tag for the new tag. As always, the version tags are optional and default to latest.

docker login

In order to pull images from private registries, you'll need to use docker login.

docker login [REGISTRY_HOST]

REGISTRY_HOST defaults to Docker Hub. You will be prompted for your username and password.

docker push

push is used to publish Docker images to a remote registry.

docker push REPOSITORY_NAME[:VERSION_TAG]

Publish the specified image to a registry. If your repository name does not include a registry host, it will be published to Docker Hub (https://hub.docker.com). If you want to use a custom registry, you will need to use docker tag to re-tag the image such that the repository name includes the registry host name (ex: docker tag my-image-repo my-registry.com/my-image-repo).
You will most likely need to use docker login to log in to your registry first.

Conclusion

Congratulations! You're on your way to being a Docker expert. However, it's worth noting that this list really only scratches the surface of the commands available in the Docker CLI. For more information, check out the CLI docs or simply type docker --help at your shell. You can also use --help with most other Docker CLI commands....
Dec 17, 2020
5 mins
Getting Started with Docker
Introduction

Docker is quickly becoming one of the most popular technologies for hosting web applications. It is a set of tools for packaging, distributing, and running software applications. Developers can write configuration files to create packages called images, which are distributed via decentralized, web-based repositories (some public, some private). Images downloaded from repositories are used as templates to create isolated environments called "containers" that run applications within them. Many containers may exist alongside each other on a single host. Memory and CPU resources are shared between all the containers running on a machine, but each container has its own fully isolated file system and environment. This is convenient for a number of reasons, but most of all, it simplifies the process of installing and running one or more applications on a single host machine.

Installing Docker

If you are on macOS or Windows, the best way to install Docker is by installing Docker Desktop. It provides a complete installation of Docker along with a GUI for managing it. You can use the GUI to start or stop your Docker daemon, or to manage installing software updates to the Docker platform. (Bonus: Docker Desktop can also manage a local Kubernetes cluster for you. It's not relevant to this article, but it provides a straightforward way to get started with Kubernetes, a platform for managing running containers across a scalable number of hosts). Linux users can install Docker from their distribution's package manager, but the Docker Desktop GUI is not included. Installation instructions for the most popular Linux distributions can be found in the Docker documentation.

Working With 3rd Party Containers

The first thing to try once you've installed Docker on your computer is running containers based on 3rd party images. This exercise is a great way to quickly display the power of Docker.
First, open your favorite system terminal and enter docker pull nginx. This command will download the official nginx image from Docker Hub. Docker Hub is a managed host for Docker images. You can think of it sort of like npm for Docker. We've pulled the newest version of the nginx image; however, as with npm, we could have chosen a specific version to download by changing the command to docker pull nginx:1.18. You can find more details about an image, including which versions are available for download, on its Docker Hub page.

Now that we've downloaded an image, we can use it to create a container on our local machine just as simply as we downloaded it. Run docker run -d -p 8080:80 nginx to start an nginx container. I've added a couple of options to the command. By default, nginx runs on port 80, and your system configuration likely prevents you from exposing port 80. Therefore, we use -p 8080:80 to bind port 80 on the container to port 8080 on your local machine. We use -d to detach the running container from the terminal session. This will allow us to continue using the same terminal while the nginx container continues to run in the background. Now, you can navigate to http://localhost:8080 with your web browser, and see the nginx welcome page that is being served from within Docker.

You can stop the nginx container running in the background by using the docker kill command. First, you'll need to use docker ps to get its container ID, then you can run docker kill followed by that ID. Now, if you navigate to http://localhost:8080 again, you will be met with an error, and docker ps will show no containers running.

The ability to simply download and run any published image is one of the most powerful features of Docker. Docker Hub hosts millions of already baked images, many of which are officially supported by the developer of the software contained within.
This allows you to quickly and easily deploy 3rd party software to your servers and workstations without having to follow bespoke installation processes. However, this isn't all that Docker can do. You can also use it to build your own images so that you can benefit from the same streamlined deployment processes for your own software.

Build Your Own

As I said before, Docker isn't only good for running software applications from 3rd parties. You can build and publish your own images, so that your applications can also benefit from the streamlined deployment workflows that Docker provides. Docker images are built using two configuration files, Dockerfile and .dockerignore. Dockerfile is the more important of the two. It contains instructions that tell Docker how to build your image and run your application within a container. The .dockerignore file is similar to Git's .gitignore file. It contains a list of project files that should never be copied into container images.

For this example, we'll Dockerize a dead simple "hello world" app, written with Node.js and Express. Our example project has a package.json and index.js like the following: package.json: ` --- index.js: ` The package.json manages our single express dependency, and configures an npm start command with which to start the application. In index.js, I've defined a basic express app that responds to requests on the root path with a greeting message.

The first step to Dockerizing this application is creating a Dockerfile. The first thing we should do with our empty Dockerfile is add a FROM directive. This tells Docker which image we want to use as the base for our application image. Any Docker image published to a repository can be used in your FROM directive. Since we've created a Node.js application, we'll use the official node Docker image. This will prevent us from needing to install Node.js on our own.
Add the following to the top of your empty Dockerfile: ` Next, we need to make sure that our npm dependencies are installed into the container so that the application will run. We will use the COPY and RUN directives to copy our package.json file (along with the package-lock.json that was generated when modules were installed locally) and run npm install. We'll also use the WORKDIR directive to create a folder and make it the image's working directory. Add the following to the bottom of your Dockerfile: ` Now that we've configured the image so that Docker installs the application dependencies, we need to copy our application code and tell Docker how to run our application. We will again use COPY, but we'll add CMD and EXPOSE directives as well. These will explain to Docker how to start our application and which ports it needs exposed to operate. Add these lines to your Dockerfile: ` Your completed Dockerfile should look like this: `

Now that we have a complete Dockerfile, we need to create a .dockerignore as well. Since our project is simple, we only need to ignore our local node_modules folder. That will ensure that the locally installed modules aren't copied from your local disk via the COPY . . directive in our Dockerfile after they've already been installed into the container image with npm. We'll also ignore npm debug logs since they're never needed, and it's a best practice to keep Docker images' storage footprints as small as possible. Add the following .dockerignore to the project directory: ` On a larger project, you would want to add things like the .git folder and any text and/or configuration files that aren't required for the app to run, like continuous integration configuration, or project readme files.

Now that we've got our Docker configuration files, we can build an image and run it! In order to build your Docker image, open your terminal and navigate to the same location where your Dockerfile is, then run docker build -t hello-world . (the trailing dot is part of the command and tells Docker to build from the current folder).
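For reference, the steps described above add up to a Dockerfile along the following lines. Treat this as a sketch rather than the article's exact listing: the node:15 tag, the /usr/src/app working directory, and the choice of npm start as the startup command are assumptions made for illustration, while port 3000 matches the docker run command used with this image.

```dockerfile
# Base image: the official Node.js image (the exact tag is an assumption).
FROM node:15

# Create the working directory and switch to it (the path is an assumption).
WORKDIR /usr/src/app

# Copy only the package manifests first so the npm install layer
# stays cached until the dependencies themselves change.
COPY package*.json ./
RUN npm install

# Copy the rest of the application code; .dockerignore keeps
# the local node_modules folder out of the image.
COPY . .

# Document the port the Express app listens on.
EXPOSE 3000

# Start the app via the npm start script defined in package.json.
CMD ["npm", "start"]
```

A matching .dockerignore for this sketch would list node_modules and npm-debug.log, one entry per line.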
Docker will look for your Dockerfile in the working folder, and will build an image, giving it a tag of "hello-world". The "tag" is just a name we can use later to reference the image.

Once your image build has completed, you can run it! Just as you did before with nginx, simply run docker run -d -p 3000:3000 hello-world. Now, you can navigate your browser to http://localhost:3000, and you will be politely greeted by our example application. You may also use docker ps and docker kill as before in order to verify or stop the running container.

Conclusion

By now, the power that Docker provides should be clear. Not only does Docker make it incredibly easy to run 3rd party software and applications in your cloud, it also gives you tools for making it just as simple to deploy your own applications. Here, we've only scratched the surface of what Docker is capable of. Stay tuned to the This Dot blog for more information about how you can use Docker and other cloud native technologies with your applications....
Dec 3, 2020
6 mins