Microservices: CI with LambdaCD – The underlying infrastructure (1/3)

Abstract

In the last two months, we started our journey towards a new microservices architecture. Among other things, we found that our existing CD tools were not ready to scale with the new requirements. So we tried a new approach: defining our pipelines in code using LambdaCD. In combination with a Mesos cluster, we can deploy new applications within a few minutes and see how they fit into our architecture by running tests against existing services.

Part 1: The underlying infrastructure
Part 2: Microservices and continuous integration
Part 3: Current architecture and vision for the future

What is the goal for our infrastructure?

Virtualization has become very popular over the last ten years and is now part of the data centers of almost all companies, making better use of the available hardware. Of course we use it for our applications, and it is very flexible: if I need a new VM, I just open an operations ticket and wait for an hour. But that is exactly the problem. We are talking about virtual machines; if a machine is virtual, there should be a way to replace this manual process with automation. As a developer I don't need to know which physical or virtual machine my application is running on, which is why it would be nice to have a platform that lets me deploy my application via a simple REST API.

Overview

This first part of my article will give you the big picture of the infrastructure we use to run microservices.

Docker

Each microservice is shipped in its own Docker container and can be deployed on any machine that has the corresponding runtime installed. You can think of Docker containers as lightweight VMs built from arbitrary layers. You choose a base image, e.g. Ubuntu 14.04, and add the software and files you need to run your application. Each command (copy files into the image, install new packages, set an environment variable, …) creates a new layer. The advantage: if I want to run my container on another machine, I only have to transfer the layers that are not already there. In the best case this is only the last layer, containing your code changes.
Back to our overview: deploying your application in a container that requires only the Docker runtime is very convenient, because we don't have to care about the underlying machine. For a quick test, start the container on your local machine; for the deployment, do the same on the server, without first preparing the server by installing packages in specific versions. The only thing you need is Docker.
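
As a minimal sketch (the image name and port are assumptions for illustration): once the image is built, the same command works on your laptop and on any server that runs Docker:

    # start the container locally, mapping container port 8080 to the same host port
    docker run -p 8080:8080 my-project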

Mesos

Apache Mesos is an open-source project that abstracts all the hardware resources in your cluster. Every slave tells the master how many CPUs and how much RAM and disk space it can offer.

The master collects all this information and acts as the contact point for Mesos frameworks. At OTTO we use Marathon to run applications in the cluster. Mesos sends the collected hardware information to the framework, and the framework selects a slave that offers the necessary resources.

The Mesos master forwards the framework's decision and the task (start the Docker container with the specified URL) to the slaves, which run the task and allocate the resources. Finally, the slaves send their updated resource offers back to the master.

It is important to know that if you only want to run one instance of an application, you can't distribute it over different slaves; you have to choose one that offers the resources you need. But if you run your application in HA mode with more than one instance, you can tell Marathon to use a different slave for every instance. If you don't set this option, all your instances may end up on the same slave, and when that slave goes down your application is offline until Marathon restarts it on another one.
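
In Marathon you set this option as a placement constraint in the app definition. A minimal sketch, showing only the relevant fields:

    {
      "id": "/develop/my-project",
      "instances": 2,
      "constraints": [["hostname", "UNIQUE"]]
    }

The UNIQUE operator on the hostname field forces every instance onto a different slave.
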
One problem we encountered is that applications running in the Mesos cluster can't persist any data: after a restart, an application can't access the files it wrote before. It is possible to mount a filesystem from the host, but then you have to guarantee that your application is always deployed to the same slave and that the same filesystem is assigned to it. To solve this problem we use databases running outside of the cluster in traditional VMs. This, however, is only a workaround for the coming months.
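
The host mount mentioned above would look roughly like this in the container section of the app definition; the paths are hypothetical, a sketch rather than our production setup:

    "container": {
      "type": "DOCKER",
      "volumes": [{
        "containerPath": "/data",
        "hostPath": "/var/data/my-project",
        "mode": "RW"
      }]
    }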

Marathon

If you want to start an application in your Mesos cluster, you can't send your requests directly to the master. You have to use a framework that speaks the Mesos protocol, i.e. that understands the hardware offers from the slaves and can define your tasks. A very popular framework for this is Marathon, which provides a user-friendly API to deploy Docker containers together with their hardware requirements.
For every application you create a JSON configuration file which defines a unique id, the required resources, the URL of the Docker container, etc. Marathon stores these configuration files, so you can restart an application with an old one if you notice difficulties after a deployment.
To start a new application or to restart a running one you have to use the Marathon REST API or the web UI. Both offer the same functionality, so we use the former for automated processes, i.e. deployments in our shell scripts, and the latter for manual processes like scaling, restarting and troubleshooting.


If your application crashes, Marathon restarts it, avoiding downtimes that would otherwise require manual intervention. To use this feature, every application has to implement a simple endpoint which responds to health checks. This check can return an error code when internal processes go wrong or when the application should be restarted, e.g. to reload its configuration.
Another important feature is the rolling restart. You can configure how many instances should always be running, which is necessary if you want to guarantee a zero-downtime deployment.
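
As a hedged sketch of how both features appear in a Marathon app definition (the path /health and the numeric values are assumptions for illustration, not our actual settings):

    {
      "healthChecks": [{
        "protocol": "HTTP",
        "path": "/health",
        "intervalSeconds": 10,
        "maxConsecutiveFailures": 3
      }],
      "upgradeStrategy": { "minimumHealthCapacity": 0.5 }
    }

A minimumHealthCapacity of 0.5 tells Marathon to keep at least half of the configured instances healthy during a rolling restart.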

Proxy

The third component in our infrastructure is a proxy server which also acts as a load balancer.
When Marathon starts or scales an application, we create a new proxy configuration with URLs derived from the application IDs. On the slaves, every public port of an application is mapped to a random port to avoid clashes.
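
We do not show our proxy configuration generator here, but as a sketch of the underlying idea: Marathon's REST API exposes the host and the randomly assigned ports of every running task, which is all a load balancer template needs:

    # list all running tasks with their slave hosts and assigned ports
    curl -s -H "Accept: application/json" "${MARATHON_HOST}/v2/tasks"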

Deployment

Let’s have a look at the deployment process to see how all components interact with each other.

  1. Build a Docker container
    First, you have to create a Dockerfile which describes your container:

    FROM ubuntu:14.04
    RUN mkdir -p /var/opt/my-project
    ADD my-project.jar /var/opt/my-project/my-project.jar
    WORKDIR /var/opt/my-project
    CMD java -jar my-project.jar

    To build your container, use the Docker CLI:

    docker build -t my-project .


  2. Store the container in your Docker registry
    To let other people and Mesos use your container, you have to publish it. Your company has to run an internal Docker registry for this purpose. Tag the image with the registry host (the host name below is only a placeholder) and push it:

    docker tag my-project registry.example.com/my-project
    docker push registry.example.com/my-project

  3. Create a Marathon configuration file
    A Marathon configuration file can have many parameters which describe how the application should be deployed, e.g. how many instances you need or which port your application should use. A fuller sketch of the container block follows after this list.

    {
      "id": "/develop/my-project",
      "container": {
        "type": "DOCKER",
        ...
      },
      "cpus": 0.5,
      "mem": 512.0,
      "instances": 2,
      ...
    }

    For a complete example, have a look at the Marathon REST API documentation.

  4. Deploy the application
    The easiest way to deploy the application is to use the Marathon REST API. You can also use the web UI, but deployments should be an automated process, which is why we prefer the former.

    curl -s -X PUT -H "Content-Type: application/json" --connect-timeout 5 -d @config.json "${MARATHON_HOST}/v2/apps/${ID}"


  5. Monitor the deployment
    If you want to know the state of your deployment, you can use the Marathon web UI. Marathon will look for a slave that can satisfy your requirements.
    To finish the process, the application has to respond to the health checks after the deployment.
  6. Update your DNS and proxy
    In most cases you want to reach your application from outside the Mesos cluster, which is why you have to update the configuration of your DNS server and your proxy. Currently, we use a simple Jenkins pipeline which is triggered by Marathon.
  7. Modify the configuration
    In the web UI you can scale the number of instances up and down, but to change any other parameter you have to use the REST API again. For example, if you want to deploy a new version of your project, just build a new Docker container, change the image URL in the Marathon configuration file and send it to Marathon (see the sketch after this list). The instances with the old configuration shut down after the new instances are deployed and healthy.
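
For steps 3 and 7: as a hedged sketch (image tag, port and network mode are assumptions for illustration), the container block is where the Docker image URL lives. Deploying a new version means changing the image reference and sending the configuration to Marathon again:

    {
      "id": "/develop/my-project",
      "container": {
        "type": "DOCKER",
        "docker": {
          "image": "registry.example.com/my-project:2",
          "network": "BRIDGE",
          "portMappings": [{ "containerPort": 8080, "hostPort": 0 }]
        }
      }
    }

A hostPort of 0 tells Marathon to assign a random host port, which matches the port mapping described in the proxy section above.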

Conclusion

Our infrastructure is very powerful and gives us the ability to deploy new applications within a few minutes. We can integrate our microservices at an early stage, which gives us a fast feedback loop and keeps us from developing our services in the wrong direction.
For you as a developer it is very easy to test new implementations, and you don't have to wait for someone else to create a VM for you. Just deploy your application, and destroy it if you don't like it. This infrastructure allows us to create microservices in a very dynamic way.

In the next part we will start digging into the world of microservices. You will learn how we define the term "microservice" and why we prefer this architectural style. Additionally, you will be introduced to LambdaCD, which lets us define CI pipelines in code.

Hi, I'm Simon Monecke, 25 years old, an aspiring software developer from Hamburg. As a student at FH Wedel, I am doing a six-month internship at Otto to write my master's thesis there. My main topics are microservices and continuous integration.
