
Mesos as a Docker container farm

When a developer starts their first microservice application, they usually don't worry about such an interesting thing as orchestration. All they have are 2-4 servers and Ansible scripts which solve most of the problems. But once your application becomes bigger, or you decide to use one environment for different projects, you need more servers… and a tool to manage the services running on these servers.

But wait! I've just mentioned Ansible, isn't that a solution to the problem? Well... no, it isn't :) Ansible solves only one problem - deployment. With it you still have to handle a lot of other microservice-related issues: remember how many resources each of your servers has available, manually manage the inventory files to fit the servers' capacity, monitor whether your app is up and running, restart services when one of your nodes goes down, control port number clashes, etc. Even with 4 servers and 10 services these problems become noticeable.

And that's where we want to ask Mesos for help. What is Mesos? It is a kind of cluster manager which helps you run your apps in a distributed environment. The key benefits of Mesos are:

  • resource management and effective utilization;
  • application lifecycle control;
  • Docker containers support;

The last point makes Mesos a perfect solution for our case :)

Now that you are interested, let's consider how to set up a Mesos-driven environment and deploy a simple distributed web application into it. The rest of the article is based on an existing GitHub project, so you can always see how an example from the article can be used in the real world :)

Installation

Since finding 7 unused servers is a big problem for any organization, we are going to use Vagrant (a perfect tool for managing virtualization) to build our cluster on a local machine (HP Z230, i7-4770 3.40GHz, 16GB RAM). We've also named Ansible as a convenient deployment tool, so nothing stops us from splitting the installation process into Ansible roles and encapsulating most of the standard Linux commands behind them.

Environment

First of all we need a basis for our Mesos cluster - a set of virtual machines, which we'll use further on. Luckily, that is a piece of cake with Vagrant.

A snippet from the Vagrantfile:

HOSTS = {  
  "masters" => {
    "master1" => "192.168.99.11",
    "master2" => "192.168.99.12",
    "master3" => "192.168.99.13"
  },
  "nodes" => {
    "node1"   => "192.168.99.21",
    "node2"   => "192.168.99.22",
    "node3"   => "192.168.99.23"
  },
  "logs" => {
    "log1"    => "192.168.99.24"
  }
}

Vagrant.configure(2) do |config|  
  config.vm.box = "bento/centos-7.1"

  HOSTS['masters'].each_with_index do |host, index|
    config.vm.define host[0] do |machine|
      machine.vm.hostname = host[0]

      machine.vm.provider "virtualbox" do |vb|
        vb.memory = "1024"
        vb.cpus = "2"
      end

      machine.vm.network "private_network", :ip => host[1]

      machine.vm.provision "ansible" do |ansible|
          ansible.playbook = "ansible/master.yml"
          ansible.groups = { "masters" => HOSTS['masters'].keys }
          ansible.extra_vars = {
            zk_id: "#{index + 1}",
            zk_host1: "master1",
            zk_host2: "master2",
            zk_host3: "master3",
            mesos_quorum: 2,
            marathon_host1: "master1",
            marathon_host2: "master2",
            marathon_host3: "master3",
            hosts: HOSTS['masters'].merge(HOSTS['nodes'])
          }
      end

    end
  end

end  

As you can see, we use a bit of Ruby magic to perform the following steps:

  • build the virtual machines based on the CentOS 7.1 image;
  • set up their resource limits (CPU, memory);
  • combine them into one network;
  • launch a special Ansible playbook (ansible/master.yml) with a predefined set of variables on each host;

In short, the Vagrantfile describes how to build 7 virtual machines: 3 masters, 3 slaves and 1 log storage node (technically, this part should be distributed too, but the resources of the host machine are limited). All you need to make them run is one simple command:

vagrant up  
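
If you want to make sure all seven machines actually came up, a quick sanity check with Vagrant's own commands (nothing project-specific is assumed here) could look like this:

# list the state of every VM defined in the Vagrantfile
vagrant status

# open a shell on one of the masters to poke around
vagrant ssh master1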

Initial cluster state

Since the first 3 steps are Vagrant-specific and aren't that interesting for us, let's move on to the main part of this article - setting up the cluster through Ansible.

Cluster

As you can see from the Vagrantfile, there are 3 major playbooks with their own roles:

Below we'll consider each role separately, but for now there is one important thing you have to know: every role depends on the "os" role, which sets up the DNS config for the local network and the YUM repos. Strictly speaking, this dependency should be moved up to the playbook level, but for now things are how they are :)

Zookeeper

Zookeeper is a major component of our system, a glue layer for the Mesos infrastructure. It helps us build the cluster and allows other applications from the Mesos ecosystem (e.g. Bamboo, MesosDns) to communicate with it.

Its installation is quite simple and doesn't require any tricks except for a repo with the RPM package. I know there are a lot of people who like to build apps from source, but here we are going to use the vendor-provided Mesosphere packages.

All you need to do is install the package, set the node's ID, update the config and start the service.

A snippet from the Zookeeper installation task:

- name: Install Zookeeper
  yum: pkg=mesosphere-zookeeper state=latest update_cache=yes

- name: Configure Zookeeper ID
  lineinfile: dest=/var/lib/zookeeper/myid create=yes line="{{zk_id}}"

- name: Configure Zookeeper hosts
  template: src=zoo.cfg.j2 dest=/etc/zookeeper/conf/zoo.cfg

- name: Enable Zookeeper
  service: name=zookeeper state=started enabled=yes

Here we have to recall what the Vagrantfile looks like. Remember this part?

ansible.extra_vars = {  
  zk_id: "#{index + 1}",
  zk_host1: "master1",
  zk_host2: "master2",
  zk_host3: "master3",
  mesos_quorum: 2,
  marathon_host1: "master1",
  marathon_host2: "master2",
  marathon_host3: "master3",
  hosts: HOSTS['masters'].merge(HOSTS['nodes'])
}

Vagrant passes all the necessary variables into Ansible, so that we can easily use them in the roles.

As for Zookeeper's config, all we need to provide are the host addresses:

server.1={{zk_host1}}:2888:3888  
server.2={{zk_host2}}:2888:3888  
server.3={{zk_host3}}:2888:3888

tickTime=2000  
initLimit=10  
syncLimit=5

dataDir=/var/lib/zookeeper/  
clientPort=2181  
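
Once the ensemble is up, a minimal way to check it is Zookeeper's standard "four-letter-word" commands over the client port (2181, as configured above); the host names below are the ones from our cluster:

# "are you ok?" - a healthy node answers "imok"
echo ruok | nc master1 2181

# show the node's mode (leader / follower) and connection stats
echo stat | nc master2 2181 | grep Mode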

Here is a screenshot of a simple Zookeeper UI (ZK UI) for our cluster in the final state:

Cluster state after the step

Mesos

As mentioned before, the folks from Mesosphere have kindly assembled a Mesos package for us, so the installation process won't be difficult.

The only point here is the fact that Mesos consists of two parts:

  • master nodes (which are responsible for all the management logic);
  • slave nodes (which run the apps and gather info about host resources);

In other words, we have to implement different installation logic for master and slaves. Fortunately, that's quite simple with Ansible!

A snippet from the Mesos installation task:

- name: Install Mesos
  yum: pkg=mesos state=latest update_cache=yes


- name: Configure Mesos ZK settings
  shell: echo "zk://{{zk_host1}}:2181,{{zk_host2}}:2181,{{zk_host3}}:2181/mesos" > /etc/mesos/zk

- name: Configure Mesos / Master quorum
  shell: echo "{{mesos_quorum}}" > /etc/mesos-master/quorum
  when: mesos_type == "master"

- name: Configure Mesos / Slave containerizers
  shell: echo "docker,mesos" > /etc/mesos-slave/containerizers
  when: mesos_type == "slave"

- name: Configure Mesos / Slave deployment time
  shell: echo "10mins" > /etc/mesos-slave/executor_registration_timeout
  when: mesos_type == "slave"

- name: Configure Mesos / Slave resources
  shell: echo "ports(*):[10000-11000]" > /etc/mesos-slave/resources
  when: mesos_type == "slave"


- name: Enable Mesos Master
  service: name=mesos-master state=started enabled=yes
  when: mesos_type == "master"

- name: Disable Mesos Slave
  service: name=mesos-slave state=stopped enabled=no
  when: mesos_type == "master"


- name: Disable Mesos Master
  service: name=mesos-master state=stopped enabled=no
  when: mesos_type == "slave"

- name: Enable Mesos Slave
  service: name=mesos-slave state=started enabled=yes
  when: mesos_type == "slave"

In both cases we install the "mesos" package (remember, the related repo is provided by the "os" role) and set the Zookeeper URL.

The next stop is settings. Here we provide only the mandatory settings like the quorum size for master nodes and Docker support for slave nodes. If you are interested in more specific configuration, consult the official Mesos docs.

Since the Mesosphere package provides both master and slave services in one package, we have to disable the redundant parts depending on the current role (master/node).
For that Ansible's "conditional" mechanism is used. If you take a look at the "master" playbook more attentively, you'll find that we pass a special mesos_type variable:

{ role: "mesos", mesos_type: "master", tags: "mesos" }

Together with the conditional, that is the simplest way to split the role into two flows.

After Mesos is installed, you can take a look at its Web UI and try clicking the buttons: http://192.168.99.11:5050 Note that if the current host isn't the cluster leader, the UI redirects you to the leader's host. And since we use DNS shortcuts (e.g. "master1") inside our virtual machines, you should define the same names on your host machine too.
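
A minimal sketch of such /etc/hosts entries on the host machine, with the IPs taken from the Vagrantfile:

# /etc/hosts on the host machine
192.168.99.11 master1
192.168.99.12 master2
192.168.99.13 master3
192.168.99.21 node1
192.168.99.22 node2
192.168.99.23 node3
192.168.99.24 log1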

So... that is it - we have our cluster!
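
If you prefer the command line to the Web UI, a couple of quick sanity checks (a sketch: the service names come from the Mesosphere packages installed above, and state.json is the standard Mesos master endpoint):

# on a master: only the master service should be running here
systemctl status mesos-master mesos-slave

# ask the leading master for its state; the "slaves" array should list node1-3
curl -s http://master1:5050/master/state.json | python -m json.tool | less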

Before you open the champagne, there are a few interesting points to mention.

First of all, remember - each task has its own context, a so-called "sandbox". You can open it and analyse all the output (see the screenshots from the Marathon section). Keep in mind that Docker containers must be pulled first - so, if you didn't allow enough time for the container to start, be ready for failed tasks without any messages in the UI (you can still find them in /var/log/messages on the related node):

Dec 15 15:06:13 node1 mesos-slave[1028]: I1215 15:06:13.167913 2223 slave.cpp:3882] Terminating executor service_fibonacci.3e0f49bd-a33d-11e5-bcc9-080027ee6311 of framework f1ed9d76-36c0-477a-8293-71666f54159a-0000 because it did not register within 1mins  

To fix that, configure the executor_registration_timeout as shown in the snippet above.

Also do not forget to set up the grace period for health checks (a Java app might take quite a long time to start :). Otherwise your app will be killed and restarted before it manages to start (health check settings will be discussed later):

Starting task service_fibonacci.52cd1fd7-a33f-11e5-bcc9-080027ee6311  
2015-12-15 15:20:14.575 INFO 1 --- [ main] n.k.mesos.fibonacci.Application : Starting Application on 1e7ede2d7f35 with PID 1 (/app.jar started by root in /)  
...
2015-12-15 15:21:15.703 INFO 1 --- [ost-startStop-1] o.s.b.c.embedded.FilterRegistrationBean : Mapping filter: 'applicationContextIdFilter' to: [/*]  
2015-12-15 15:21:20.718 INFO 1 --- [ main] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService 

Killing docker task  
Shutting down  

Cluster state after the step

Docker

Since we are going to run Docker containers on our slave hosts, it would be quite logical to install Docker itself on those hosts :) Luckily, the installation is extremely simple:

A snippet from the Docker installation task:

- name: Install Docker
  yum: pkg=docker,device-mapper-event-libs state=latest

- name: Enable Docker
  service: name=docker state=started enabled=yes
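
A quick way to verify that Docker is usable on a slave node (nothing here is specific to our project):

# on any slave node
docker info                    # the daemon is up and reachable
docker run --rm hello-world    # images can be pulled and run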

Cluster state after the step

Marathon

Despite the fact that our Mesos cluster is up and running, we still can't run our Docker containers on it. Strictly speaking, the only thing we can run on Mesos itself is a Mesos Framework. And of course, we have one which suits our requirements - Marathon.

From a technical point of view, Marathon is just a plain Java application which can be started with the java -jar command. Gladly, we have an RPM package, so we don't need to worry about its daemonization, configuration and control. Moreover, since this package is also built by Mesosphere, it uses the same configuration files (Zookeeper URL), so we don't have to set them up!

A snippet from the Marathon installation task:

- name: Install Marathon
  yum: pkg=marathon state=latest update_cache=yes

- name: Enable marathon
  service: name=marathon state=started enabled=yes
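
Besides the Web UI, Marathon exposes a REST API. A couple of read-only requests against its standard /v2/info and /v2/apps endpoints make a quick check (a sketch):

# basic info about the Marathon instance (version, leader, framework id)
curl -s http://master1:8080/v2/info

# list of deployed applications (empty right after the installation)
curl -s http://master1:8080/v2/apps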

Marathon also has a Web UI which is accessible by the following URL: http://master1:8080

Let's have some fun and deploy a simple REST service (the service and deployment settings will be discussed later):

Now we can monitor its state and the assigned port:

So we can invoke it and check if it works (of course, it does).

curl http://node2:31997/1  
1  

We can even scale our service, if necessary (at the moment it doesn't make much sense):
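
Scaling can also be done through the same REST API instead of the UI. A sketch, assuming the service was deployed under the id we use later in this article (/service/fibonacci):

# scale the application to 3 instances
curl -X PUT -H "Content-Type: application/json" \
     -d '{"instances": 3}' \
     http://master1:8080/v2/apps/service/fibonacci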

And analyse its logs from the "sandbox" (in Mesos UI):

Cluster state after the step

Bamboo and HaProxy

In the example above we deployed only one instance of the service. But what if we want to have a lot of instances and load balance among them? Well, part of the answer is already well known - HaProxy, a really good load balancer. But how do we configure it? For that there is the "Bamboo" project, which connects to Zookeeper, reads the Mesos state and produces the HaProxy config (using user-defined rules for every Mesos application).

Its installation could be a very simple process, but unfortunately there is no public RPM repo with an assembled Bamboo package at the moment. You can build it manually and install it from a local file, but be ready for a lot of adventures. See the role file for instructions (plus this issue for RPM packages).

A snippet from the Bamboo installation task:

- name: Install HaProxy
  yum: pkg=haproxy state=present update_cache=yes

- name: Enable HaProxy
  service: name=haproxy state=restarted enabled=yes

- name: Install Bamboo
  yum: pkg=/tmp/bamboo-1.0.0_1-1.noarch.rpm state=present

- name: Configure Bamboo (copy config)
  template:
    src: production.json.j2
    dest: /var/bamboo/production.json

- name: Configure Bamboo (HaProxy template)
  copy: src=haproxy_template.cfg
        dest=/opt/bamboo/config/
        mode=0644

- name: Configure SystemD
  copy: src=bamboo-server.service
        dest=/usr/lib/systemd/system/
        owner=root group=root mode=0644

- name: Reload SystemD
  shell: systemctl daemon-reload

- name: Enable Bamboo server
  service: name=bamboo-server state=restarted enabled=yes

After Bamboo has been installed you can set it up through its Web UI: http://master1:8000

And access our service through HaProxy: http://master1/services/fibonacci/1

Notice that we have a separate Ansible playbook (master_bamboo.yml) for the Bamboo installation. The reason is the need to upload its RPM package to the virtual host before running the playbook.
Since Vagrant automatically performs Ansible provisioning during VM initialization, the only way to do that is to extract the Bamboo tasks into a separate playbook and perform the following steps (see the sketch after this list):

  • start a VM with vagrant up;
  • upload the RPM file into the VM through SCP (see the role file);
  • change ansible.playbook in the Vagrantfile;
  • run the vagrant provision master1 command;
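
Here is a sketch of steps 2-4, assuming the RPM was built locally with the name used in the role above (bamboo-1.0.0_1-1.noarch.rpm); the scp trick simply reuses Vagrant's own SSH configuration:

# 2. upload the RPM into the VM using Vagrant's SSH settings
vagrant ssh-config master1 > /tmp/vagrant-ssh-config
scp -F /tmp/vagrant-ssh-config bamboo-1.0.0_1-1.noarch.rpm master1:/tmp/

# 3. point ansible.playbook in the Vagrantfile to ansible/master_bamboo.yml, then
# 4. re-run provisioning for that host
vagrant provision master1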

As you can see, Bamboo is the messiest component in the ecosystem, so it's worth taking a look at its alternatives, e.g. Marathon Load Balancer.

Cluster state after the step

MesosDns

We've missed one major question - what if our services have to communicate with each other? Is there a way to implement service discovery inside a Mesos cluster? Yes, there is! And its name is quite obvious - Mesos DNS :) The idea is very simple - read the Mesos cluster's state and publish it through DNS (A and SRV records) and an HTTP API. The last point is very useful, because it allows us to build client-side load balancing without any effort [1, 2].

Installation is a bit tricky, but nothing special.

A snippet from the MesosDns installation task:

- name: Download MesosDNS binary
  get_url: url=https://github.com/mesosphere/mesos-dns/releases/download/{{mesos_dns_version}}/mesos-dns-{{mesos_dns_version}}-linux-amd64
    dest=/usr/bin/mesos-dns mode=0550

- name: Configure MesosDNS (check folder)
  file: path=/etc/mesos-dns state=directory

- name: Configure MesosDNS (copy config)
  template:
    src: config.json.j2
    dest: /etc/mesos-dns/config.json

- name: Set up MesosDNS service
  copy: src=mesos-dns.service dest=/etc/systemd/system mode=0644

- name: Enable MesosDNS service
  service: name=mesos-dns state=started enabled=yes

- name: Replace resolv.conf with MesosDNS
  copy: src=resolv.conf dest=/etc/resolv.conf mode=0644

The config doesn't have anything unexpected either:

{
  "zk": "zk://{{zk_host1}}:2181,{{zk_host2}}:2181,{{zk_host3}}:2181/mesos",
  "refreshSeconds": 60,
  "ttl": 60,
  "domain": "mesos",
  "port": 53,
  "resolvers": ["8.8.8.8"],
  "timeout": 5,
  "httpon": true,
  "dnson": true,
  "httpport": 8123,
  "email": "root.mesos-dns.mesos"
}

You can check the installed instance with the following SRV record request: http://172.17.42.1:8123/v1/services/_fibonacci-service._tcp.marathon.mesos.

2015-12-14 09:58:39.495 INFO 1 --- [nio-8099-exec-8] n.k.m.f.s.d.MesosDnsDiscoveryService : DNS request: 

2015-12-14 09:58:39.497 INFO 1 --- [nio-8099-exec-8] n.k.m.f.s.d.MesosDnsDiscoveryService : DNS Response: [  
{
"service": "_fibonacci-service._tcp.marathon.mesos.",
"host": "fibonacci-service-83b5n-s0.marathon.slave.mesos.",
"ip": "192.168.99.21",
"port": "31681"
},
{
"service": "_fibonacci-service._tcp.marathon.mesos.",
"host": "fibonacci-service-e3j11-s1.marathon.slave.mesos.",
"ip": "192.168.99.22",
"port": "31300"
},
{
"service": "_fibonacci-service._tcp.marathon.mesos.",
"host": "fibonacci-service-3oxjq-s2.marathon.slave.mesos.",
"ip": "192.168.99.23",
"port": "31198"
}
]
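
The same records are also available over plain DNS. For example, from one of the cluster nodes (where /etc/resolv.conf now points at Mesos-DNS), using the standard Mesos-DNS naming scheme:

# A records: one IP per running task
dig +short fibonacci-service.marathon.mesos

# SRV records: IP/port pairs, the same data as the HTTP API above
dig +short _fibonacci-service._tcp.marathon.mesos SRV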

Cluster state after the step

Logging

Ooophh! We are almost done! I must admit that if you've managed to read this far, the topic must be really interesting to you :)

Happily, there is not a lot to say here. Logging is logging. We just install Logstash on all the Mesos slave nodes...

A snippet from the LogStash installation task:

- name: Install LogStash packages
  yum: pkg=logstash,java state=present update_cache=yes

- name: Create LogStash config file
  template: src=logstash.conf.j2 dest={{logstash_config_dir}}/logstash.conf backup=yes mode=0644

- name: Enable Logstash
  service: name=logstash state=restarted enabled=yes sleep=5

and set up its config to publish data to the log node:

input {  
  tcp {
    port => 5959
    codec => json
  }
}

output {  
  elasticsearch {
    hosts => ["{{es_logs_host}}:9200"]
  }
}
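
Since the input is just a TCP socket with a JSON codec, a minimal smoke test (run on a slave node, where Logstash listens on port 5959) could be:

# send a hand-crafted JSON event to the local Logstash instance
echo '{"message": "logstash smoke test", "level": "INFO"}' | nc localhost 5959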

At the same time we roll out ElasticSearch and Kibana on the log node.

A snippet from the ELK installation task:

- name: Install ElasticSearch
  shell: docker run -d -p 9200:9200 -p 9300:9300 --name elasticsearch elasticsearch

- name: Install Kibana
  shell: docker run -d --link elasticsearch:elasticsearch -p 5601:5601 --name kibana kibana

The only reason to use Docker here is simplicity. Of course, in a real setup we shouldn't store log data inside the container, etc.

After installation you can analyse the logs in ES through the Kibana web interface: http://log1:5601
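
Or, bypassing Kibana, you can ask ElasticSearch directly whether the events are arriving (the _cat API and the default logstash-* index pattern are standard):

# list indices - logstash-YYYY.MM.DD indices should appear once events flow in
curl -s "http://log1:9200/_cat/indices?v"

# count documents across all Logstash indices
curl -s "http://log1:9200/logstash-*/_count"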

Target architecture

Here is the target architecture that we've finally built. Looks nice, doesn't it?

The only single point of failure in our setup is the HaProxy / Bamboo pair, but that can easily be fixed by deploying the pair on all the master nodes and using DNS-based round robin for clients.

Distributed Service

So, we have the cluster. Now it's time to take a look at a distributed service which we want to run on it (running simple apps would be too boring :).

I've developed a tiny SpringBoot-based REST service which calculates the N-th member of the Fibonacci sequence. The killer feature of this service is that it invokes other instances of itself to calculate the previous values of the sequence.

@RestController
public class FibonacciController {  
    private static final org.slf4j.Logger LOG = org.slf4j.LoggerFactory.getLogger(FibonacciController.class);

    @Autowired
    private FibonacciClient client;

    @RequestMapping(value = "/{n}", method = RequestMethod.GET)
    @ResponseBody
    public Integer calculate(@PathVariable("n") Integer n) throws ExecutionException, InterruptedException {
        LOG.info("Invoked Fibonacci calculation for " + n);

        Integer value;
        if (n < 0) {
            throw new IllegalArgumentException("N can't be less than 0");
        }

        if (n <= 1) {
            value = n;

        } else {
            // Invoke both calculations in parallel to double the computation speed
            // (yes, this solution has huge overhead, but this example is about distributed systems, not effective calculations)
            Future<ResponseEntity<Integer>> n1Future = client.invoke(n - 1);
            Future<ResponseEntity<Integer>> n2Future = client.invoke(n - 2);

            Integer n1 = n1Future.get().getBody();
            LOG.debug("Calculated N1");

            Integer n2 = n2Future.get().getBody();
            LOG.debug("Calculated N2");

            value = n1 + n2;
        }

        LOG.info("Calculation for " + n + " completed: " + value);
        return value;
    }
}

I know that this implementation is extremely inefficient and even deadlock-prone (guess why), but my main objective here is to illustrate cross-service communication.

The service uses MesosDNS HTTP API for service discovery:

@Service
@ConditionalOnExpression("${dns.enabled:true}")
public class MesosDnsDiscoveryService implements DiscoveryService {

    private static final org.slf4j.Logger LOG = org.slf4j.LoggerFactory.getLogger(MesosDnsDiscoveryService.class);

    @Value("${dns.host}")
    private String host;

    @Value("${dns.port:8123}")
    private String port;

    private String url;
    private RestTemplate restTemplate;

    @Autowired
    @Qualifier("dnsClientHttpRequestFactory")
    private SimpleClientHttpRequestFactory dnsClientHttpRequestFactory;

    @PostConstruct
    public void init() {
        this.url = "http://" + host + ":" + port + "/v1/services/";
        this.restTemplate = new RestTemplate(dnsClientHttpRequestFactory);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public List<Pair<String, Integer>> getServiceInstances(String serviceName) {
        List<Pair<String, Integer>> instances = new ArrayList<>();

        try {
            String srvName = "_" + serviceName + "._tcp.marathon.mesos.";
            String request = url + srvName;

            LOG.debug("DNS request: {}", request);
            String response = restTemplate.getForObject(request, String.class);
            LOG.debug("DNS Response: {}", response);

            JSONArray jsonServices = new JSONArray(response);

            if (0 == jsonServices.length()) {
                throw new DiscoveryServiceException("Service " + serviceName + " has no instances");
            }

            for (int i = 0; i < jsonServices.length(); i++) {

                String ip = jsonServices.getJSONObject(i).getString("ip");
                Integer port = Integer.valueOf(jsonServices.getJSONObject(i).getString("port"));

                Pair<String, Integer> instance = new ImmutablePair<>(ip, port);
                instances.add(instance);
            }

        } catch (DiscoveryServiceException e) {
            throw e;

        } catch (Exception e) {
            throw new DiscoveryServiceException("Service " + serviceName + " can't be resolved", e);
        }

        return instances;
    }
}

and simple client-side load balancing (we don't want an extra point of failure, do we?):

@Service
public class RandomLoadBalancer implements LoadBalancer {

    private Random randomGenerator = new Random();

    @Autowired
    private DiscoveryService discoveryService;

    /**
     * {@inheritDoc}
     */
    public Pair<String, Integer> getInstance(String serviceName) {
        Pair<String, Integer> instance = null;

        List<Pair<String, Integer>> instances = discoveryService.getServiceInstances(serviceName);
        if (instances.size() > 0) {
            int index = randomGenerator.nextInt(instances.size());
            instance = instances.get(index);
        }

        return instance;
    }
}

Deploy

Before we start our deployment we have to build a Docker image for our application. I'm not going to describe the process here, but you can take a look at the Gradle config and the Dockerfile, if interested.

After that we have to publish the image to a Docker registry (this can also be done by Gradle), so that Marathon can download it through the Docker daemons on the slave nodes. You can find my example here: https://hub.docker.com/r/krestjaninoff/fibonacci-service/
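
For reference, the manual equivalent of what the Gradle build does is just the usual build-and-push pair (the image name is the one used in the deploy playbook below):

# build the image from the project's Dockerfile and push it to Docker Hub
docker build -t krestjaninoff/fibonacci-service:latest .
docker login
docker push krestjaninoff/fibonacci-service:latest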

And, finally, the major part - Marathon. As you could see before, we can use the Marathon UI for deployment. But that is not a "technical" approach :) Marathon also has its own REST API, which we are going to use with a simple "curl" client:

A snippet from the SpringBoot Marathon deployment manifest (springboot.json):

{
  "id": "{{app_id}}",

  "cpus": 1.0,
  "mem": {{memory}},
  "instances": {{instances}},

  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "{{image_name}}:{{image_version}}",
      "forcePullImage": true,
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": {{service_port}}, "hostPort": 0, "servicePort": 0, "protocol": "tcp" }
      ],
      "parameters": [
        { "key": "dns", "value": "172.17.42.1"},
        { "key": "env", "value": "LOGSTASH_HOST=172.17.42.1"},
        { "key": "env", "value": "dns.enabled=true"},
        { "key": "env", "value": "dns.host=172.17.42.1"},
        { "key": "env", "value": "dns.port=8123"},
        { "key": "env", "value": "srv.name={{service_name}}"}
      ]
    }
  },

  "backoffFactor": 2,
  "backoffSeconds": 30,
  "maxLaunchDelaySeconds": 3600,

  "healthChecks": [
    {
      "protocol": "HTTP",
      "portIndex": 0,
      "path": "/healthcheck/",
      "gracePeriodSeconds": 60,
      "intervalSeconds": 30,
      "maxConsecutiveFailures": 3
    }
  ],

  "upgradeStrategy": {
      "minimumHealthCapacity": 0.5,
      "maximumOverCapacity": 0.5
  }
}

Let me briefly describe what we have in this config (block by block):

  • application id (used in the Bamboo configuration);
  • resource limits (the application won't be started if the cluster doesn't have the necessary capacity - especially important for VM-based clusters) and the number of instances;
  • container settings:
    • forcePullImage is the only way to make your containers update when it's time;
    • SpringBoot allows overriding configs through environment variables, so that is a good way to manipulate the container;
    • since the $HOST variable provides a DNS name (and isn't even mentioned in the official Marathon docs), the only way to get the host's local IP from inside a Docker container is the default Docker bridge interface, 172.17.42.1 (yep, that smells);
  • the backoff settings protect your cluster from "sick" applications which fail to start over and over again (by delaying every next launch attempt);
  • health checks allow Marathon to understand whether an instance of your app is ok or must be restarted;
  • upgradeStrategy helps you update your app without downtime (by deploying a new version before stopping the current one);

The last stop is Bamboo, which can also be configured through its REST API. This one is much simpler:

curl -i -X PUT -d '{"id":"{{app_id}}", "acl":"path_beg -i {{app_id}}"}' http://{{bamboo_host}}:8000/api/services/{{app_id}}  

Putting it all together, we've got the "deploy" role...

A snippet from the deployment task:

- name: Update Marathon manifest
  template: src={{manifest}}.j2 dest=/tmp/{{manifest}}

- name: Uninstall Marathon task
  command: "curl -X DELETE http://{{marathon_host}}:8080/v2/apps/{{app_id}}"

- name: Install Marathon task
  command: "curl -X POST http://{{marathon_host}}:8080/v2/apps -d @/tmp/{{manifest}} -H 'Content-type: application/json'"

- name: Clean up marathon manifest
  command: rm -f /tmp/{{manifest}}

- name: Add BamBoo rule
  command: "curl -i -X PUT -d '{\"id\":\"{{app_id}}\", \"acl\":\"path_beg -i {{app_id}}\"}' http://{{bamboo_host}}:8000/api/services/{{app_id}}"

which can be invoked by running the "deploy" playbook:

- hosts: localhost
  connection: local
  vars:
    - marathon_host: 192.168.99.11
    - bamboo_host: 192.168.99.11
    - app_id: /service/fibonacci
    - manifest: springboot.json
    - memory: 256.0
    - image_name: krestjaninoff/fibonacci-service
    - image_version: latest
    - service_name: fibonacci-service
    - service_port: 8099
    - instances: 6
  roles:
    - deployment
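
Assuming the playbook is saved as ansible/deploy.yml (the exact path in the project may differ), running it is a one-liner; since it only talks to the Marathon and Bamboo APIs, it runs from your host machine:

ansible-playbook ansible/deploy.yml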

Results

That's all! Now you can:

  • invoke the service: curl http://master1:5000/service/fibonacci/15 (or curl http://node2:31135/15 for direct invocation);
  • check the result (the correct value is 610);
  • take a look at the logs (http://log1:5601);

The last point is very interesting. You can see which host was invoked or how long each response took:

P.S.

Yes, this article is too long for a simple "Hello World" manual. But let's be honest - it was interesting... wasn't it?
