Provisioning Docker containers with Ansible
@xav-b | September 3, 2015
Bring declarative syntax, templating and modules to Dockerfiles
The past months saw the rapid spread of container technology among developers and companies alike. Most of the excitement focused on Docker and its workflow to pack, share and deploy application environments. Other technologies caught up, and the automation tool Ansible quickly offered a powerful interface to manage containers on remote servers. In this article, I will explore why and how to leverage its extensible design to merge the best of both worlds.
TL;DR: Docker empowers us to ship more and more complex projects, yet we still have to configure those containers. In many ways Dockerfiles are weak, and this article describes how Ansible can bring a configuration manager's features behind a clearer syntax. Learn how to build any stack with just Python and Docker installed.
A year ago, The New Stack, an online blog backed by industry leaders, asked "Are Docker Users Migrating to Ansible and Away from Puppet and Chef?" The article (see Resources) makes several interesting points:
- Despite the rise of new workflows brought by containers, orchestration and configuration tools are thriving.
- New players like Ansible and Salt are challenging existing tools like Chef and Puppet.
- Many developers involved with Docker are also concerned about those tools.
A discussion on Hacker News, the famous startup incubator's community website, details the lack of robust tools that ship with Docker to offer an end-to-end experience, both for development and for production. Being able to spin up fully isolated stack environments in a matter of seconds, or replicate an exact setup between servers, seduced a large community. But we quickly needed more sophisticated stacks, and we shifted from the toy container to a coordinated fleet with reproducible builds and multi-host communication. The Docker team addressed those evolving needs with new clustering tools, trying to morph Docker into a reliable solution for running containers at scale.
None of these projects, however, challenge Dockerfiles and how we build them. We are still manually hard-coding tasks and repeating common setups. So, orchestration and the configuration management of containers are yet to be solved.
This article will explore how existing configuration managers fit in the container world, and how we can take advantage of their battle-tested skills to improve our provisioning workflow.
The rise of DevOps
Before diving into the world of configuration managers, let's deepen our understanding of the pain points mentioned above. Modern applications usually involve a complex deployment pipeline before hitting production.
Best practices suggest that you release code early and often, following each small iteration. Performing these tasks manually does not scale, so organisations started refining the process halfway between developers and sysadmins, and DevOps was born. Since then, agile teams have been trying to strengthen and automate the way code is tested and shipped to their users.
State of the art technologies and methodologies allow companies to gain confidence over the code on their server. Nevertheless, the whole thing is continuously challenged as applications are growing in size and complexity. More than ever, we need powerful, community-driven tooling to support our products.
Solutions and limitations
In this environment, Ansible offers an interesting framework to manage infrastructures. You can gain control over a server's definition, like packages to install or files to copy, and scale the configuration to thousands of them. Ansible playbooks constitute a safe representation of the cluster's desired state. Its YAML syntax and extensive list of modules produce readable configuration files any developer can quickly understand. Unlike Chef or Puppet, it is also agentless, meaning all it takes to perform commands on remote hosts is an SSH connection. This is good news given Ansible will handle DevOps complexity for us.
The Ansible project, however, was designed before the rapid rise of containers and their revolution in the cloud world. So, is it still relevant? Micro-services paradigms and complex development environments have introduced new requirements:
- Lightweight images. For ease of transportation or cost savings, they are stripped down to their minimal dependencies.
- Single purpose, single process. The SSH daemon should not run if it's not strictly needed by the application.
- Ephemeral. Containers are expected to die, move and resurrect all of the time.
In this context, Ansible's vision addresses two issues.
On the one hand, they developed a module to manage Docker hosts and containers at a higher level. While we can debate which orchestration tool is best suited (see Kubernetes, from Google, or New Relic's Centurion), it performs efficiently, and we will use it as is for our purpose.
On the other hand, they suggest that you build containers starting from their official Ansible image, and execute playbooks in local mode from the inside. This approach fits remarkably well with Packer and certainly suits many cases. But its drawbacks are deal breakers in many others.
- We're locked into one base image and can no longer take advantage of our special recipes or other stacks.
- The resulting artifact has Ansible and its dependencies installed, which has nothing to do with the actual application, and makes the artifact heavier.
- Although Ansible can manage thousands of servers, it will only provision a single container.
I think this approach treats containers as small VMs, where we should use a specific solution. Fortunately the project has a modular design. Modules are spread among different repositories and most of its capabilities can be extended through plugins. In the next section, we're going to set up a proper environment to adapt Ansible to our needs.
An effective environment
Let's take a step back and describe an alternative strategy. I want a tool that is trivial to deploy and that configures application environments in lightweight containers. Apart from those containers, we need a client node with Ansible installed to issue commands to a Docker daemon. This setup is shown in the accompanying figure.
As you can see, I am minimizing the dependencies to manage by executing Ansible from a container. This approach limits the host to a communication bridge between containers and commands.
Many options are available to bring Docker to your server.
- Use docker-machine to install it on remote hosts.
- Install locally. As a side note, you probably don't want to manage a serious container-based infrastructure by yourself at this point. Consider the next option for production goals.
- Rely on external providers.
- Use the awesome boot2docker on Windows and Mac.
Whatever solution you choose, make sure it deploys Docker version 1.3 or later (the release which introduced process injection). You also need to run an SSH server to securely process Ansible commands (see the accompanying listing for instructions).
A lot is left to be said regarding the configuration of SSH, but security concerns are beyond the scope of this article. The curious or paranoid reader should explore /etc/ssh/sshd_config to learn more about available options.
The next step is to load the public key on the client container running Ansible (see the accompanying listing). Instructions are adapted from the official build and automate a working installation from commit fbec8bfb90df1d2e8a0a4df7ac1d9879ca8f4dde on the Ansible master branch. For convenience, I also pack the hosts and ansible.cfg configuration files. Using a container guarantees we share the same environment but, for information, the Dockerfile installs Python 2.7.10 and Ansible 2.0.0.
The following steps build and validate the Ansible container from which we will issue commands (see the accompanying listing).
First, it is essential to export the DOCKER_HOST environment variable, since Ansible will use it to connect to the remote Docker daemon. When using an HTTP endpoint, we need to modify /etc/default/docker (see the accompanying listing). Running sudo service docker restart restarts the daemon so that it picks up the changes we made to its configuration file.
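As a rough illustration of what a client does with that variable, here is a minimal sketch in Python. The function name docker_endpoint and the sample address in the usage note are assumptions made for this example; only the DOCKER_HOST variable and the default Unix socket path come from Docker itself.

```python
import os
from urllib.parse import urlparse

# Docker falls back to this local socket when DOCKER_HOST is not set.
DEFAULT_DOCKER_HOST = "unix:///var/run/docker.sock"


def docker_endpoint(environ=None):
    """Return (scheme, address) describing where the Docker daemon listens.

    Hypothetical helper for illustration; it is not part of Ansible or Docker.
    """
    environ = os.environ if environ is None else environ
    parsed = urlparse(environ.get("DOCKER_HOST", DEFAULT_DOCKER_HOST))
    if parsed.scheme == "unix":
        # Unix sockets are addressed by a filesystem path.
        return "unix", parsed.path
    # Network endpoints (tcp://, http://) carry a host and an optional port.
    if parsed.port is None:
        return parsed.scheme, parsed.hostname
    return parsed.scheme, f"{parsed.hostname}:{parsed.port}"
```

Calling docker_endpoint({"DOCKER_HOST": "tcp://192.168.0.12:2375"}) with that made-up address yields the scheme and the host:port pair, while an empty environment falls back to the local socket.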
So far so good, we are able to issue commands from a container. From now on, we are going to leverage Ansible's Docker-specific extensions.
Dynamic inventory
At its core, Ansible automates its execution through playbooks: YAML files specifying every task to perform and their properties (see the accompanying listing).
The command in the accompanying listing queries the Docker host, imports facts, prints some and uses them to perform the second task against the very builder container we are in.
Ansible can't reach the container, since it doesn't run an SSH server. Running one, however, would add a process to manage that is completely unrelated to the actual application. Let's crush this difficulty.
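To make the inventory step more concrete, here is a minimal sketch of the JSON layout Ansible expects from a dynamic inventory source: groups mapping to host lists, plus a _meta.hostvars section with per-host variables. The container descriptions, the grouping by image name and the docker_id and docker_image variable names are assumptions for this example; a real script would obtain them from the Docker daemon.

```python
import json


def build_inventory(containers):
    """Group containers by image and expose a few facts as host variables.

    `containers` is a list of dicts with "name", "id" and "image" keys;
    this input shape is hypothetical and stands in for a Docker API call.
    """
    inventory = {"_meta": {"hostvars": {}}}
    for container in containers:
        name = container["name"]
        # One inventory group per image, created on first use.
        group = inventory.setdefault(container["image"], {"hosts": []})
        group["hosts"].append(name)
        inventory["_meta"]["hostvars"][name] = {
            "docker_id": container["id"],
            "docker_image": container["image"],
        }
    return inventory


def to_json(containers):
    """Serialize the inventory the way an inventory script would print it."""
    return json.dumps(build_inventory(containers), indent=2, sort_keys=True)
```

A playbook could then target a group such as the image name and read the per-host variables as facts.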
Optimization of command execution
Ansible's extensibility has us covered again. Connection plugins are classes implementing command transport, like SSH or local execution. Docker 1.3 came with docker exec and the ability to run tasks inside the container namespace. And since we learned earlier how to target specific containers, we can leverage this ability to process the playbook.
Like other plugin types, connection hooks (see the accompanying listing) are loaded once their code is placed in the expected location. This code hooks Ansible's methods to perform commands through a more native docker exec, instead of the default ssh. We still need to rearrange a few setup steps to instruct Ansible to use this plugin (see the accompanying listing).
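The idea behind such a connection hook can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class DockerExecConnection and its method names are hypothetical, and it deliberately does not subclass anything from Ansible, whose actual plugin interface differs between versions. Only the docker exec -i invocation is taken from the Docker CLI.

```python
import shlex
import subprocess


class DockerExecConnection:
    """Hypothetical transport that runs commands through `docker exec`."""

    def __init__(self, container, runner=subprocess.run):
        self.container = container
        # The runner is injectable so the class can be exercised without Docker.
        self._runner = runner

    def build_argv(self, command):
        # -i keeps stdin open so data can be piped into the container.
        return ["docker", "exec", "-i", self.container, "/bin/sh", "-c", command]

    def exec_command(self, command, in_data=None):
        """Run a shell command in the container; return (rc, stdout, stderr)."""
        result = self._runner(
            self.build_argv(command), input=in_data, capture_output=True
        )
        return result.returncode, result.stdout, result.stderr

    def put_file(self, in_path, out_path):
        """Copy a local file into the container by streaming it to `cat`."""
        with open(in_path, "rb") as source:
            payload = source.read()
        return self.exec_command(f"cat > {shlex.quote(out_path)}", in_data=payload)
```

The two methods cover what a configuration manager needs from a transport: running a command and pushing a file, both without an SSH daemon inside the container.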
At this point, we managed to execute Ansible tasks within containers without many requirements on them or the host. This is a huge win regarding our initial specs, but we still need to address remaining imprecisions.
Container lifecycle
The previous demonstration ran a task on the same node. A more realistic workflow would spin up a new base image, provision it and finally commit, push, and shut down the resulting artifact. Thanks to the built-in Docker module, we can achieve those steps without additional code (see the accompanying listing).
As I mentioned, it would be convenient to automatically name and store the image built on successful provisioning. Unfortunately, Ansible's Docker module does not implement methods to tag and push images, but we can overcome this limitation with plain shell commands.
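The shell fallback mentioned above can be sketched as follows. The helper names lifecycle_commands and run_all, and the container and repository names in the usage note, are assumptions for this example. The sketch names the image directly at commit time rather than tagging it in a separate step, which is one possible choice among several.

```python
import subprocess


def lifecycle_commands(container, repository, tag="latest"):
    """Return the docker CLI calls that store and clean up a provisioned container."""
    image = f"{repository}:{tag}"
    return [
        ["docker", "commit", container, image],  # snapshot and name the image
        ["docker", "push", image],               # publish it to the registry
        ["docker", "stop", container],           # shut the container down
        ["docker", "rm", container],             # remove the stopped container
    ]


def run_all(commands, execute=subprocess.check_call):
    """Run each command in order; check_call raises on the first failure."""
    for argv in commands:
        execute(argv)
```

For instance, run_all(lifecycle_commands("web-build", "registry.example.com/web", "1.0")) would commit, push, stop and remove a container with that made-up name.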
Our tool is taking shape, but we are still lacking an essential feature: layer caching.
Cache implementation
Building containers with Dockerfiles often involves many iterations to get it right. In order to significantly speed up the process, successful steps are cached and reused in subsequent runs.
To replicate this behavior, we need to commit the container state after each successful task. Then, we can restart the provisioning process from the last snapshot. Ansible promises idempotent tasks, so previously successful ones won't be processed twice.
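One way to decide where to restart is to give every snapshot a deterministic name derived from the base image and all tasks run so far. The sketch below uses a chained SHA-256 digest for that purpose; this scheme, like the function names snapshot_tags and resume_point, is an assumption of this example and not necessarily how the original plugin names its commits.

```python
import hashlib


def snapshot_tags(base_image, task_names):
    """Return one tag per task; each tag depends on every task before it."""
    digest = hashlib.sha256(base_image.encode("utf-8"))
    tags = []
    for name in task_names:
        # Feeding tasks into a running digest means an edited early task
        # changes the tags of all later snapshots too.
        digest.update(b"\0" + name.encode("utf-8"))
        tags.append(digest.hexdigest()[:12])
    return tags


def resume_point(base_image, task_names, existing_tags):
    """Return (number of cached tasks, tag to restart from or None)."""
    cached, last = 0, None
    tags = snapshot_tags(base_image, task_names)
    for index, tag in enumerate(tags, start=1):
        if tag not in existing_tags:
            break
        cached, last = index, tag
    return cached, last
```

With this naming, changing a task invalidates its own snapshot and every later one, which mirrors how Dockerfile layer caching behaves.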
Ansible lets you hook into task events with callback plugins. Those classes are expected to implement specific callbacks, triggered at various steps of the playbook lifecycle.
We register this plugin like we did with the 'docker exec connection', i.e. by uploading the code in the expected location and rebuilding the builder environment.
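The shape of such a callback can be sketched as below. The class CommitOnSuccess and its on_task_ok method are hypothetical stand-ins: a real plugin would implement the hook names defined by the Ansible version in use, and the step-plus-slug tag format is only an assumption of this example.

```python
import re
import subprocess


class CommitOnSuccess:
    """Hypothetical callback that snapshots a container after each successful task."""

    def __init__(self, repository, execute=subprocess.check_call):
        self.repository = repository
        # Injectable so the class can be exercised without a Docker daemon.
        self._execute = execute
        self.step = 0

    def on_task_ok(self, container, task_name):
        """Commit the container state and return the name of the new image."""
        self.step += 1
        # Reduce the task name to characters that are safe in an image tag.
        slug = re.sub(r"[^a-z0-9]+", "-", task_name.lower()).strip("-")
        image = f"{self.repository}:{self.step:02d}-{slug}"
        self._execute(["docker", "commit", container, image])
        return image
```

Each successful task leaves behind one intermediate image, so a later run can start from the most recent one instead of the base image.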
After rebuilding the builder image and re-running ansible-playbook, the module is automatically loaded and you can see how intermediate containers were created (see the accompanying listing).
Conclusion
Provisioning is a complex topic, and the implementation we followed lays the foundations for further development. The code itself was simplified and some steps still require human intervention. While we are free to choose the base image we need, Ansible needs Python to be installed in it (most of the time). The cache implementation also deserves a lot more attention, with more specific commit naming or cleanup capabilities for example.
Yet, we crafted a tool capable of executing Ansible playbooks to manage containers' configuration. This achievement brings a lot of benefits. We can leverage the full power of Ansible, combine, reuse and set up declarative build files for the micro-services of an infrastructure. We are also avoiding lock-in issues. The plugins we developed wrap playbooks that you can reuse against different targets, and minimal requirements make the project compatible with most providers.
I hope this article highlighted the limitations of Dockerfiles and some interesting ideas around the provisioning of complex containers. This is definitely an exciting time to build tooling with them.
Resources
- Check out the official Ansible documentation.
- Learn about Ansible's vision.
- Get started with the Docker documentation.
- Check out links to official Docker repositories.
- Compose is a tool for defining and running complex applications with Docker.
- Read a tutorial on installing Docker locally.
- Connect to the Bluemix developer community.
- Cloud computing service models (Dan Orlando, developerWorks, February 2011): In this three-part series find straightforward, real-world examples of cloud computing to help eliminate the confusion around the concept.
- Exploring IBM Bluemix (David Barnes, developerWorks, September 2013): Join IBM's David Barnes as he demonstrates IBM Bluemix, an in-the-cloud platform that provides the cloud application capabilities that will power the next generation of cloud applications and services.
- Read an introduction to Hadoop on the cloud using BigInsights on Bluemix.
- IBM Emerging Technologies YouTube channel keeps you informed on new technologies from IBM.
- To listen to interesting interviews and discussions for software developers, check out developerWorks podcasts.
- developerWorks technical events and webcasts: Stay current with developerWorks technical events and webcasts.
- Get Ansible, the powerful IT automation tool.
- Use docker-machine to install Docker on remote hosts.
- Consul is a tool for discovering and configuring services in your infrastructure.