Plug-and-play service discovery with Consul and Docker
@xav-b | October 15, 2015
This article is about tooling ourselves to face modern deployment, which often involves microservices at scale. Instead of creating a single monolithic application, I have split an application into single-purpose units that collaborate with one another. I therefore get modular development with a separation of concerns and horizontal scaling for free.
Is it really for free? Not quite, actually. While this trendy paradigm promotes a nice composition of specific components, we inherit the hassle of orchestrating all of those moving parts of the infrastructure.
The wide adoption of Docker, the container engine, highlights such limitations. Although Docker has unlocked an exciting workflow to develop, ship, and run programs, many developers hit a wall when considering multi-host deployment or old-school problems such as log management. Nevertheless, emerging projects offer reliable primitives to bridge this gap, such as Consul, described on GitHub as "a tool for service discovery, monitoring and configuration."
“My approach is to build a non-intrusive solution to service discovery, which will give us an essential tool for DevOps and will be immediately actionable without locking us behind frameworks.”
I hope this article will help you gain a better understanding of the challenges that come with ultra-agile cloud deployment.
Context and goals
First, let me introduce the pain point I want to solve. A typical web application today involves a front end of varying complexity, a back end, and a database, and it probably makes use of third-party services as well. All of these technologies communicate over the network, and we can take advantage of that fact: the back end is deployed where resources are available, and a database shard spins up nodes for performance considerations. Meanwhile, the whole setup dynamically evolves across the cluster to handle the load.
Now, how can the back end find the database URL in this changing cloud topology? We need to design a process that exposes to applications an up-to-date knowledge of the infrastructure.
Introducing Consul
Consul is one of the open-source projects developed by HashiCorp, the creator of Vagrant. It offers a distributed, highly available system to register services, store shared configuration, and keep an accurate view of multiple data centers. Finally, it is distributed as a simple Go binary, which makes it trivial to deploy.
To make the steps easy to follow (and consistent with our topic), we are going to use Docker. Installation for major platforms has been made as easy as possible and you can find step-by-step instructions on the official website. Once done, and thanks to progrium (aka Jeff Lindsay), the one-liner in Listing 1 is enough to bootstrap a Consul server.
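Something like the following, adapted from the progrium/consul README, should do (the hostname and port mappings are illustrative):

```sh
# Bootstrap a single Consul server
docker run -d --name consul -h node1 \
  -p 8400:8400 -p 8500:8500 -p 8600:53/udp \
  progrium/consul -server -bootstrap
```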
Note: While the official documentation recommends that you spin up at least three servers to handle failure cases, those considerations are beyond the scope of this article.
We are already able to query our infrastructure and discover one service: Consul itself (see Listing 2).
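A sketch of that query against the HTTP API; the output below is illustrative of the fields Consul returns:

```sh
# CONSUL_IP is the address of the server we just started
curl http://$CONSUL_IP:8500/v1/catalog/service/consul
# Illustrative output:
# [{"Node":"node1","Address":"172.17.0.2","ServiceID":"consul",
#   "ServiceName":"consul","ServiceTags":[],"ServicePort":8300}]
```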
As you can see, Consul stores important facts about services. It covers information and tags, the fundamental data for programmatically accessing remote services.
Declarative services
Let's take a look at the role that registration, external services, and Docker play in our solution. To illustrate, let's imagine a modern application that needs to store data in MongoDB and send emails through Mailgun. The latter is an external service, while we will run the former ourselves. Read on to see how we can handle both cases.
Registration
In order to expose those valuable properties, we first need to register the service. We will run a Consul agent on each node of our cluster, which is responsible for joining a Consul server, exposing the node's service, and performing a health check (see Listing 3).
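A minimal sketch of that agent invocation, assuming the Consul binary is installed on the node and the server address from earlier is at hand:

```sh
# On each node: start a local agent and join it to the Consul server
consul agent -data-dir /tmp/consul -join $CONSUL_IP
```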
With 10M+ downloads, MongoDB is a popular choice as a document database. Let's use it and save the following file in /etc/consul.d/mongo.json (see Listing 4).
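A sketch of such a definition; the tags, the script check, and its interval are arbitrary choices:

```json
{
  "service": {
    "name": "mongo",
    "port": 27017,
    "tags": ["database", "nosql"],
    "check": {
      "script": "mongo --eval \"db.runCommand('ping')\"",
      "interval": "10s"
    }
  }
}
```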
The syntax offers a concise, readable, and declarative way of defining service properties and our health check. You can keep those files in a version control system and immediately identify an application's components. The file above declares a service named "mongo" on port 27017. The check section gives the Consul agent a script to test whether the node is healthy. Indeed, when we query the server for a service, we need to be sure it returns reliable endpoints.
All that remains is starting the actual Mongo server and the local Consul agent (see Listing 5).
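Assuming both binaries are available on the node, something like this should do (the data directory is an arbitrary choice):

```sh
# Start the database...
mongod &
# ...then the agent, which loads /etc/consul.d/mongo.json and joins the cluster
consul agent -data-dir /tmp/consul -config-dir /etc/consul.d -join $CONSUL_IP
```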
Did it work? Let's query the Consul HTTP API (see Listing 6).
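A sketch, with illustrative addresses in the output:

```sh
curl http://$CONSUL_IP:8500/v1/catalog/service/mongo
# Illustrative output:
# [{"Node":"node2","Address":"10.0.1.2","ServiceID":"mongo",
#   "ServiceName":"mongo","ServiceTags":["database","nosql"],
#   "ServicePort":27017}]
```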
Given a Consul agent or server address, any piece of code in the cluster capable of HTTP requests is now able to consume that information. Shortly, I will explain how to process it all, but before that, let's cover how to register services that are out of our control, and, as a bonus point, how to automate the steps above with Docker.
External services
Usually, developers should avoid adopting a "not invented here" attitude and going on to reinvent the wheel. That's the reason we are willing to integrate third-party services in our application. But in our case, it means we can't start a Consul agent on the appropriate node. Once again, however, Consul has us covered (see Listing 7).
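One way is Consul's catalog registration endpoint, sketched below; the datacenter, node name, and check interval are assumptions:

```sh
# Register Mailgun as an external service, with an HTTP health check
curl -X PUT http://$CONSUL_IP:8500/v1/catalog/register -d '{
  "Datacenter": "dc1",
  "Node": "mailgun",
  "Address": "api.mailgun.net",
  "Service": {
    "Service": "mailgun",
    "Port": 443
  },
  "Check": {
    "Name": "mailgun-api",
    "HTTP": "https://api.mailgun.net/v3",
    "Interval": "30s"
  }
}'
```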
Since Mailgun is a web service, we use the HTTP field to check API availability. To dive deeper into Consul superpowers, you can refer to its comprehensive documentation (see Resources below).
Docker integration
So far, a Go binary, a single JSON file, and a few HTTP requests have given us a service discovery workflow. We are not tied to a particular technology, but as mentioned earlier, this agile setup is especially suitable for microservices.
In this context, Docker lets us package services into a reproducible, self-registering container. Given our existing mongo.json, all it takes is the Dockerfile and Procfile in Listing 8.
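A sketch of such a Dockerfile; the Consul version is pinned arbitrarily, and a pre-built goreman binary is assumed to sit next to it:

```dockerfile
FROM mongo:3.0

# Install the Consul agent binary
RUN apt-get update && apt-get install -y curl unzip && \
    curl -sSL -o /tmp/consul.zip \
      https://releases.hashicorp.com/consul/0.5.2/consul_0.5.2_linux_amd64.zip && \
    unzip /tmp/consul.zip -d /usr/local/bin && \
    rm /tmp/consul.zip

# Goreman supervises both processes from the Procfile
COPY goreman /usr/local/bin/goreman

# Self-registration: the service definition from Listing 4
COPY mongo.json /etc/consul.d/mongo.json
COPY Procfile /app/Procfile

WORKDIR /app
CMD ["goreman", "start"]
```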
Dockerfiles let us define a single command to run when booting up containers.
However, we now need to run both MongoDB and Consul. Goreman lets us achieve just that: it reads a configuration file named Procfile that defines multiple processes to manage (lifecycle, environment, logs, and so on). Running several processes inside a container is a debate on its own, and other solutions exist; for now, Goreman does the job in a simple manner.
Here is the Procfile, a minimal sketch that assumes the Consul server address arrives through the CONSUL_IP environment variable:
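```
mongo: mongod
consul: consul agent -data-dir /tmp/consul -config-dir /etc/consul.d -join $CONSUL_IP
```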
And here are the shell commands to build and run the container (the image tag is arbitrary):
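```sh
docker build -t mongo-consul .
# -P publishes the exposed ports on random host ports;
# CONSUL_IP points at the server started earlier
docker run -d -P -e CONSUL_IP=$CONSUL_IP mongo-consul
```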
Awesome. Having Docker and service discovery working together definitely makes us look good!
We can fetch more details by querying $CONSUL_IP:8500/v1/catalog/service/mongo as in Listing 6 and, in particular, find the service port. Since Consul exposes the container IP as the service address, this approach works as long as the container exposes the port, even if Docker maps it to a random value on the host. On multi-host topologies, however, you will need to explicitly map the container's port to the same port on the host. If that turns out to be a limitation, projects like Weave might be able to help.
To sum up, here's how we can expose services information throughout several data centers:
- Launch at least one Consul server and store its address.
- On each node:
  - Download the Consul binary.
  - Write service and check definitions in the Consul configuration directory.
  - Launch the application.
  - Launch the Consul agent with the address of another agent or server.
Create infrastructure-aware applications
We have built a convenient and non-intrusive workflow to deploy and register new services. The next logical step is to export this knowledge to dependent applications.
The Twelve-Factor App makes a serious case about storing configurations in the environment:
- Maintain strict separation of configuration from code.
- Avoid having sensitive information checked into repositories.
- Keep the language and operating system agnostic.
We will write a wrapper capable of querying a Consul endpoint for available services, exporting their connection properties into the environment, and executing the given command. Choosing the Go language gives us a potential cross-platform binary (like the other tools so far) and access to the official client API (see Listing 9).
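Listing 9 below is a minimal sketch of that wrapper. The NAME_HOST/NAME_PORT naming scheme is a convention of my own, and error handling is deliberately terse:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"os/exec"
	"strings"

	"github.com/hashicorp/consul/api"
)

func main() {
	if len(os.Args) < 2 {
		log.Fatalf("usage: %s <command> [args...]", os.Args[0])
	}

	// DefaultConfig honors CONSUL_HTTP_ADDR and falls back to 127.0.0.1:8500
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Fetch the catalog of registered services
	services, _, err := client.Catalog().Services(nil)
	if err != nil {
		log.Fatal(err)
	}

	env := os.Environ()
	for name := range services {
		// Resolve each service to its first known endpoint
		entries, _, err := client.Catalog().Service(name, "", nil)
		if err != nil || len(entries) == 0 {
			continue
		}
		addr := entries[0].ServiceAddress
		if addr == "" {
			addr = entries[0].Address // fall back to the node address
		}
		prefix := strings.ToUpper(strings.Replace(name, "-", "_", -1))
		env = append(env,
			fmt.Sprintf("%s_HOST=%s", prefix, addr),
			fmt.Sprintf("%s_PORT=%d", prefix, entries[0].ServicePort),
		)
	}

	// Execute the wrapped command with the enriched environment
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = env
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}
```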
The next command compiles this prototype and validates its behavior.
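For instance (the file and binary names are illustrative):

```sh
go build -o wrapper main.go
CONSUL_HTTP_ADDR=$CONSUL_IP:8500 ./wrapper env
```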
The last command should print something like MONGO_PORT=27017, among other variables. Any command should now be able to read service data from its environment.
Reconfigure the infrastructure dynamically
One situation is still likely to challenge our current implementation: a web app could start like the one above and successfully connect to MongoDB, yet bad things could still happen during database failures or migrations. What we want is to dynamically update the application's knowledge when the infrastructure undergoes changes, expected or not.
While designing a robust solution to this problem might require an article on its own, we can explore an interesting approach with the project Consul Template.
Consul Template queries a Consul instance and updates any number of specified templates on the file system. As an added bonus, Consul Template can execute arbitrary commands when a template update completes. Therefore, we can use Consul Template to monitor services (addresses and health) and automatically restart the application whenever a change is detected. Since our wrapper will fetch services data, the runtime environment will mirror the correct state of the infrastructure (see Listing 11).
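An invocation along these lines would do; the template path, destination, and restart script are hypothetical:

```sh
# Watch the catalog, re-render the config, then restart the app
consul-template \
  -consul $CONSUL_IP:8500 \
  -template "secrets.ctmpl:config/secrets.js:./restart-app.sh"
```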
Bonus point: we now enjoy all the benefits of a templated configuration file. Here is an example adapted from hackathon-starter.
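A sketch of that secrets.ctmpl; the mongo interpolation is Consul Template syntax, while the Mailgun credentials stay in the environment as plain twelve-factor configuration (the database name is arbitrary):

```js
// secrets.ctmpl, rendered by Consul Template into config/secrets.js
module.exports = {
  // Interpolate the healthy mongo endpoint registered in Consul
  db: 'mongodb://{{range service "mongo"}}{{.Address}}:{{.Port}}{{end}}/hackathon',

  // External credentials remain environment-driven
  mailgun: {
    login: process.env.MAILGUN_LOGIN,
    password: process.env.MAILGUN_PASSWORD
  }
};
```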
This experience requires more thought. In particular, it could be tricky to restart the application just to update its knowledge of services. We could instead send it a specific signal, giving it a chance to handle the changes gracefully. However, that requires us to step into the application's code base, whereas until now it didn't need to be aware of anything. Moreover, the rise of microservices on fallible cloud providers should encourage us to run stateless, failure-resilient apps. I think Martin Fowler makes a good point in his article on Phoenix servers.
Nevertheless, the composition of powerful tools with clear contracts allows us to integrate distributed applications into complex infrastructures, without limiting ourselves to a particular provider or application stack.
Conclusion
Service discovery, and more broadly services orchestration, is one of the most exciting challenges of modern development. Big players, along with the developer community, are stepping in and pushing technologies and ideas further.
IBM Bluemix™, for example, addresses this challenge with workload scheduling, smart databases, monitoring, cost management, data synchronization, REST APIs, and more. With only a handful of tools, developers can focus solely on the loosely coupled modules of their application.
Thanks to Consul and Go, we have been able to take a step in this direction and build a set of services featuring:
- Self-registration
- Self-update
- Stack agnosticism
- Drop-in deployment
- Container friendliness
Nevertheless, we have covered only the basics of a production deployment. We could go further and extend our wrapper with encryption, and offer a consistent integration to safely expose credentials such as service tokens. Overall, contemplating a plug-and-play approach to service discovery frees us to think about the other parts of a modern deployment pipeline without many of the usual constraints.
Resources
Learn
- The Twelve-Factor App is a methodology for building software-as-a-service apps.
- Get started with the Docker documentation.
- Read a blog post by Benjamin Wootton on Microservices.