Last week I made a little presentation (a kind of overview) on using Docker and microservices in a real production project. Today I want to share some of its slides with you. Why? Because I spent the whole day making them! :) Plus, I really do believe they might be useful for someone.
Let’s start with the most interesting part. Here is the deployment scheme of our application.
Looks a bit complicated, yeah? But don’t worry - by the end of this article you will understand all the details!
I suggest we start our journey by answering two extremely important questions.
What is the microservices architecture?
Mr. Fowler says that
The microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms.
In short, we can consider it a reincarnation of plain old SOA, but with the following pros:
- short development and release cycles (dealing with small components is much easier than with a big monolith);
- independent deployment;
- appropriate technologies / instruments for each task (if your task needs java - use java, if your task is already implemented in Ruby - just take Ruby!);
- scalability and fault-tolerance by design (an ability to have a dozen instances of the same service solves both of the problems);
- multi-teams development (since each of your services has strict api / boundaries, you can easily split development process across several teams);
- easy to test (small apps are easier to cover with mock tests);
- infrastructure automation (the only way to deal with dozens or even thousands of services is to automate everything in your infrastructure - so you have no choice but to do it :);
And the following cons:
- personal responsibility (when you gather all the services together and run them as a whole application, supporting such a system becomes a bit complicated - so your developers must always be involved in this process);
- dealing with a highly distributed system (such systems are always more complicated than monoliths):
- remote calls (since you have a lot of small services which have to speak with each other, be ready to pay for remote calls),
- sophisticated monitoring and logging (you have to gather logs and app metrics from all of your cluster);
- skilled team (distributed technologies require more qualification from your developers and support engineers);
What is Docker and why do I need it?
Docker is a platform for lightweight virtualization and software distribution. What does it mean?
Well, let’s start with lightweight virtualization. The following picture (stolen from the internet) shows the main difference between Docker and classic hypervisor-based virtualization.
In other words, Docker uses Linux kernel features (e.g. cgroups and namespaces) to pack a process into its own sandbox and control the amount of system resources available to it.
As for software distribution, Docker packs your application with all its dependencies (including the OS libraries) into a special file - an “image”. Then you upload this image into a special repository in your intranet (or into the public Docker Hub) and can easily install it on every server where you have Docker.
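As a sketch of what such an image might look like (the base image, jar name, and registry address below are my assumptions, not details from the project):

```dockerfile
# Hypothetical Dockerfile for a Spring Boot microservice.
# Base image, jar path, and port are illustrative assumptions.
FROM openjdk:8-jre
COPY target/xyz-service.jar /app/xyz-service.jar
EXPOSE 8080
CMD ["java", "-jar", "/app/xyz-service.jar"]
```

After `docker build -t registry.internal/xyz-service:1.0 .` and `docker push registry.internal/xyz-service:1.0`, any host with Docker can pull and run exactly the same bits.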
So, what’s the profit?
- software distribution (you just build an app on your local machine or CI server, push it into the registry and install wherever you want);
- platform independence (nobody can say “it works on my local machine” when a build fails in the test environment - all the libs are included in the image);
- lightweight containerization (you no longer have to pay for a hypervisor and a guest OS);
- appropriate technologies / instruments for each project;
- infrastructure as a part of your app (you can ship your infrastructure components - logstash, consul, etc - as docker containers and consider them as a part of your application);
What about the scheme above?
At the very beginning we have a host (one of many) with the Docker service installed on it. Nothing more. A clean sheet of paper.
Let’s start with the installation of one of our microservices (we use Spring Boot). Technically, we have much more than one service per host, but I want to keep this scheme as simple as possible :)
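For illustration, launching one such service might look like this docker-compose fragment (the image name, registry, and port are made-up placeholders):

```yaml
# Hypothetical docker-compose fragment for a single Spring Boot microservice.
xyz-service:
  image: registry.internal/xyz-service:1.0
  ports:
    - "8080"            # publish the app port on a random host port
  environment:
    SERVICE_NAME: xyz   # a convention Registrator can use for the service name
```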
Since our application is distributed, we must have a logging system which gathers all the logs from every host into a single storage. Logstash is the client part of the ELK stack, which we use for that.
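A minimal Logstash pipeline for such a setup could look like this (the log path and the Elasticsearch address are assumptions for the sketch):

```
# Hypothetical Logstash config: tail Docker container logs, ship them to ES.
input {
  file {
    path  => "/var/lib/docker/containers/*/*.log"
    codec => "json"
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch.internal:9200"]
  }
}
```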
Our microservices must have a way to find out about each other. That’s where Consul comes into play. Each host runs a Consul client, so our services can ask it where to look for other services. All the Consul clients are also connected to a Consul cluster (hosted on separate servers). That is the standard approach for Consul.
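A per-host Consul client configuration might be as small as this (the datacenter name and server hostnames are hypothetical):

```json
{
  "server": false,
  "datacenter": "dc1",
  "data_dir": "/var/consul",
  "retry_join": ["consul-1.dc1.internal", "consul-2.dc1.internal", "consul-3.dc1.internal"]
}
```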
But wait, Consul is not a part of Docker - it is an absolutely separate project. How does it know which services are currently running on the hosts?
Fortunately, we have Registrator - a third-party tool which listens on the Docker socket and registers (and deregisters) your Docker containers in various service discovery tools (Consul, etcd, SkyDNS).
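Running Registrator itself is a matter of mounting the Docker socket into its container; here is a sketch following the gliderlabs/registrator README (the Consul address assumes a local Consul client on each host):

```yaml
# Registrator watches the Docker socket and (de)registers containers in Consul.
registrator:
  image: gliderlabs/registrator
  network_mode: host
  volumes:
    - /var/run/docker.sock:/tmp/docker.sock
  command: consul://localhost:8500
```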
The next stop is communication between services. Since every service runs on its own port, Consul returns information about registered services as SRV records. So, you have two possible ways: use a client-side solution for resolving SRV records and load balancing, or use Nginx. As we say here in Russia - in every unclear situation, use Nginx :)
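To illustrate the client-side option, here is a small Python sketch that picks an instance from already-resolved SRV records (lowest priority group first, then weighted random); the record values are made up, and a real client would first query Consul’s DNS or HTTP API:

```python
import random

# Each SRV record is (priority, weight, host, port); the values are made up.
SRV_RECORDS = [
    (1, 10, "10.0.0.5", 32768),
    (1, 30, "10.0.0.7", 32771),
    (2, 10, "10.0.0.9", 32770),  # higher priority number = backup instance
]

def pick_instance(records):
    """Pick one (host, port): take the lowest-priority group, then choose
    an instance at random, proportionally to the SRV weights."""
    best = min(r[0] for r in records)
    group = [r for r in records if r[0] == best]
    weights = [r[1] for r in group]
    _, _, host, port = random.choices(group, weights=weights, k=1)[0]
    return host, port

host, port = pick_instance(SRV_RECORDS)
print(f"calling http://{host}:{port}/")
```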
But in this case we also need Consul Template - an official utility from Consul’s authors for using Nginx together with Consul. It (re)reads the Consul state and updates the Nginx config when necessary.
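For example, a template for one service could look like this (the service name “xyz” is illustrative; the Go-template syntax is Consul Template’s own):

```
# nginx.ctmpl - Consul Template renders live instances of the "xyz" service.
upstream xyz {
  {{range service "xyz"}}
  server {{.Address}}:{{.Port}};
  {{end}}
}

server {
  listen 80;
  location /xyz/ {
    proxy_pass http://xyz/;
  }
}
```

Consul Template would then be started roughly as `consul-template -template "nginx.ctmpl:/etc/nginx/conf.d/services.conf:nginx -s reload"`, re-rendering the config and reloading Nginx whenever the set of instances changes.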
Now our services can invoke each other through Nginx on localhost (e.g. the “xyz” service will be available at http://hostname/xyz).
Do you remember that Consul is distributed across the whole data center? That means that our services can invoke any other service in their DC.
Now let’s gather all the things together!
Notice that DCs are absolutely independent. Each data center is a self-sufficient unit, which solves two problems at once: load balancing and fault tolerance.
The only problem here is Elasticsearch. We want to observe our logs in a single interface, but according to the ES documentation (and common sense) you shouldn’t build an ES cluster that spans different data centers. Fortunately, ES provides the “tribe node” mechanism for exactly this case.
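A tribe node is an ordinary ES node with a special configuration that lets it join several clusters at once; a sketch of its elasticsearch.yml (cluster names and hostnames are hypothetical):

```yaml
# Hypothetical tribe-node config federating the per-DC logging clusters.
tribe:
  dc1:
    cluster.name: logs-dc1
    discovery.zen.ping.unicast.hosts: ["es-master.dc1.internal"]
  dc2:
    cluster.name: logs-dc2
    discovery.zen.ping.unicast.hosts: ["es-master.dc2.internal"]
```

Kibana then talks to the tribe node and sees the logs from both data centers through a single interface.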
How to support this?
That is the most interesting and difficult question :)
Thanks to Docker, for the first time you can say that all your containers are black boxes. Support just deploys them and doesn’t try to understand what is going on inside. In case of problems, they simply ask a developer for help. Surprisingly, it works… but only until the first more or less serious problem.
So, the answer to this question comes from the points I mentioned at the very beginning:
- you should have a very skilled team;
- developers and ops (support) must work together (literally, in the same room);