Microservices: CI with LambdaCD – The underlying infrastructure (1/3)

Abstract

In the last two months, we started our journey towards a new microservices architecture. Among other things, we found that our existing CD tools were not ready to scale with those new requirements. So we tried a new approach, defining our pipelines in code using LambdaCD. In combination with a Mesos cluster, we can deploy new applications within a few minutes to see how they fit into our architecture by running tests against existing services.

Part 1: The underlying infrastructure
Part 2: Microservices and continuous integration
Part 3: Current architecture and vision for the future

What is the goal for our infrastructure?

Virtualization has become very popular in the last ten years and is now part of the data centers of almost all companies to make better use of the available hardware. Of course we use it for our applications, and it is very flexible: if I need a new VM, I just open an operations ticket and wait for an hour. But that is exactly the problem. We are talking about virtual machines. If it is virtual, there should be a way to automate the whole process. As a developer I don't need to know which physical or virtual machine my application is running on, which is why it would be nice to have a platform that lets me deploy my application via a simple REST API.

Overview

This first part of my article will give you the big picture of the infrastructure we use to run microservices.

Docker

Each microservice is shipped in its own Docker container, and we can deploy it on any machine that has the corresponding runtime installed. You can think of Docker containers as lightweight VMs that are built from arbitrary layers. You choose a base image, e.g. Ubuntu 14.04, and add the software and files you need to run your application. Each command (copy files into it, install new packages, set an environment variable, …) creates a new layer. The advantage is that if I want to run my container on another machine, I only have to transfer the layers which are not already there. In the best case this is only the last layer containing your code changes.
Back to our overview: it is very convenient to deploy your application in a container which requires only the Docker runtime, because we don't have to care about the underlying hardware. To run a quick test, start the container on your local machine; for the deployment, do the same on the server, without having to prepare the server by installing packages in specific versions. The only thing you need is Docker.
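By the way, once you have built an image (like the my-project image from the deployment section below), you can make the layer mechanism visible with the Docker CLI:

    docker history my-project

Each line of the output corresponds to one layer, together with the instruction that created it and its size.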

Mesos

Apache Mesos is an open-source project that abstracts all hardware resources in your cluster.
Every slave tells the master how many CPUs and how much RAM and disk space it can offer.

The master collects all this information and acts as the contact point for Mesos frameworks. At OTTO we use Marathon to run applications in the cluster. Mesos sends the collected hardware information to the framework, which then selects a slave that offers the necessary resources.

The Mesos master forwards the decision of the framework and the task (start a Docker container with the specified URL) to the slaves, which run the task and allocate the resources. Finally, the slaves send their updated resource offers back to the master.

It is important to know that if you only want to run one instance of an application, you can't distribute it over different slaves. You have to choose one which offers the resources you need. But if you run your application in HA mode with more than one instance, you can tell Marathon to use a different slave for every instance. If you don't set this option, all your instances may end up on the same slave, and when this slave goes down your application is offline until Marathon restarts it on another slave.
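In Marathon you express this with a constraint in the application configuration (more on these configuration files below). A minimal sketch that tells Marathon to place every instance on a different slave:

    "constraints": [
      ["hostname", "UNIQUE"]
    ]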
One problem we encountered is that applications running in the Mesos cluster can't persist any data. After a restart the application can't access the files it wrote before. It is possible to mount a filesystem from the host, but then you have to guarantee that your application is deployed to the same slave and that the same filesystem is assigned to it. To solve this problem we use databases running outside of the cluster in traditional VMs. This, however, is only a workaround for the next months.
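For completeness: such a host mount would be declared in the container section of the Marathon configuration. The paths in this sketch are only placeholders:

    "container": {
      "type": "DOCKER",
      "docker": { "image": "my-project" },
      "volumes": [
        { "containerPath": "/var/opt/data", "hostPath": "/mnt/data", "mode": "RW" }
      ]
    }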

Marathon

If you want to start an application in your Mesos cluster, you can't send your requests directly to the master. You have to use a framework which speaks the Mesos protocol to understand the hardware offers from the slaves and to define your tasks. A very popular framework for this is Marathon, which provides a user-friendly API to deploy Docker containers with their hardware requirements.
For every application you have to create a JSON configuration file which defines a unique id, the required resources, the URL of the Docker container, etc. Marathon stores these configuration files, and you can restart an application with an old one if you notice difficulties after a deployment.
To start a new application or to restart a running one you have to use the Marathon REST API or the web UI. Both offer the same functionality, so we use the former for automated processes, i.e. deployments in our shell scripts, and the latter for manual processes like scaling, restarting and troubleshooting.


If your application crashes, Marathon starts it again, avoiding downtimes that would otherwise require manual intervention. To use this feature, every application has to implement a simple endpoint which responds to health checks. This check can return an error code when internal processes go wrong or when the application should be restarted, e.g. to reload its configuration.
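The health check itself is declared in the application configuration. A minimal sketch with made-up timings, assuming your application exposes a /health endpoint:

    "healthChecks": [
      {
        "protocol": "HTTP",
        "path": "/health",
        "gracePeriodSeconds": 30,
        "intervalSeconds": 10,
        "timeoutSeconds": 5,
        "maxConsecutiveFailures": 3
      }
    ]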
Another important feature is the rolling restart. You can configure how many instances have to be running at any time, which is necessary if you want to guarantee a zero-downtime deployment.
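This is expressed with the upgradeStrategy section of the configuration. With a minimumHealthCapacity of 1.0, Marathon first starts the new instances and only shuts down the old ones once enough new ones are healthy:

    "upgradeStrategy": {
      "minimumHealthCapacity": 1.0
    }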

Proxy

The third component in our infrastructure is a proxy server which also acts as a load balancer.
When Marathon starts or scales an application, we generate a new proxy configuration with URLs derived from the application IDs. On the slaves, every public port of an application is mapped to a random port to avoid clashes, so the proxy configuration also has to track the current host and port of every instance.
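As an illustration, if the proxy were HAProxy, a generated configuration could look like the following sketch. The hostnames and ports are made up; the actual host/port pairs of all running instances can be read from Marathon's /v2/tasks endpoint:

    frontend http-in
        bind *:80
        acl is_my_project hdr(host) -i my-project.example.com
        use_backend my_project if is_my_project

    backend my_project
        balance roundrobin
        # one server line per running instance, regenerated on every deployment
        server task1 slave1.example.com:31001 check
        server task2 slave2.example.com:31002 check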

Deployment

Let’s have a look at the deployment process to see how all components interact with each other.

  1. Build a Docker container
    First, you have to create a Dockerfile which describes your container:

    FROM ubuntu:14.04
    RUN mkdir -p /var/opt/my-project
    ADD my-project.jar /var/opt/my-project/my-project.jar
    WORKDIR /var/opt/my-project
    CMD java -jar my-project.jar

    To build the container image, use the Docker CLI:

    docker build -t my-project .


  2. Store the container in your Docker registry
    To let other people and Mesos use your container, you have to publish it. Your company has to run an internal Docker registry for this purpose. Tag the image with the registry host (the host below is only a placeholder) and push it:

    docker tag my-project registry.example.com/my-project
    docker push registry.example.com/my-project

  3. Create a Marathon configuration file
    A Marathon configuration file can have many parameters which describe how the application should be deployed, how many instances you need or which port your application should use.

    {
      "id": "/develop/my-project",
      "container": {
        "type": "DOCKER",
        ...
      },
      "cpus": 0.5,
      "mem": 512.0,
      "instances": 2,
      ...
    }

    For a complete example, have a look at the Marathon REST API documentation.
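    One part worth a closer look is the container section: it specifies the image and the port mapping. In this sketch the image URL is a placeholder, and a hostPort of 0 tells Marathon to assign a random host port, which is exactly the mapping the proxy has to deal with:

    "container": {
      "type": "DOCKER",
      "docker": {
        "image": "registry.example.com/my-project",
        "network": "BRIDGE",
        "portMappings": [
          { "containerPort": 8080, "hostPort": 0 }
        ]
      }
    }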

  4. Deploy the application
    The easiest way to deploy the application is to use the Marathon REST API. You can also use the web UI, but deployments should be an automated process, which is why we prefer the former.

    curl -s -X PUT -H "Content-Type: application/json" --connect-timeout 5 -d @config.json "${MARATHON_HOST}/v2/apps/${ID}"


  5. Monitor the deployment
    If you want to know the state of your deployment, you can use the Marathon web UI. Marathon will look for a slave that can satisfy your requirements.
    To finish the process, the application has to reply to the health checks after the deployment.
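    If you prefer the command line, the same information is available from the REST API, e.g. the list of all deployments that are currently in progress:

    curl -s "${MARATHON_HOST}/v2/deployments"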
  6. Update your DNS and proxy
    In most cases you want to connect to your application from outside the Mesos cluster, which is why you have to update the configuration of your DNS server and your proxy. Currently, we use a simple Jenkins pipeline for this which is triggered by Marathon.
  7. Modify the configuration
    In the web UI you can scale the number of instances up and down, but to change any other parameter you have to use the REST API again. For example, if you want to deploy a new version of your project, just create a new Docker container, change the URL in the Marathon configuration file and send it to Marathon. The instances with the old configuration will shut down after the new instances are deployed and healthy.

Conclusion

Our infrastructure is very powerful and gives us the ability to deploy new applications within a few minutes. We can integrate our microservices at an early stage, which gives us a fast feedback loop. Because of this we avoid developing our services in the wrong direction.
For you as a developer it is very easy to test new implementations, and you don't have to wait for someone else to spend time creating a VM for you. Just deploy your application and destroy it if you don't like it. This infrastructure allows us to create microservices in a very dynamic way.

In the next part we will start digging into the world of microservices. You will learn how we define the term "microservice" and why we prefer this architecture style. Additionally, you will be introduced to LambdaCD, which lets us define CI pipelines in code.