About Mesos in the INDIGO-DataCloud Project

Mesos Cluster

This document describes how Apache Mesos is used by the INDIGO-DataCloud PaaS layer. INDIGO-DataCloud (start date: 01/04/2015, end date: 30/09/2017) is a project funded under the Horizon 2020 framework programme of the European Union and led by the National Institute for Nuclear Physics (INFN). It developed a data and computing platform targeting scientific communities, deployable on multiple hardware platforms and provisioned over hybrid (private or public) e-infrastructures. The INDIGO solutions are being evolved in the context of other European projects such as DEEP Hybrid-DataCloud, eXtreme-DataCloud and EOSC-Hub.

The INDIGO-DataCloud PaaS relies on Apache Mesos for:

  • managed service deployment

  • user application execution

The instantiation of the highly available Mesos cluster is managed by the INDIGO Orchestrator in a fully automated way as soon as a user request, described by a TOSCA template, is submitted. Once the cluster is up and running, it can be reused for subsequent requests.

Mesos manages cluster resources (CPU, memory), providing isolation and sharing across distributed applications (frameworks).

Marathon and Chronos are two powerful frameworks that can be deployed on top of a Mesos cluster.

Sophisticated two-level scheduling and efficient resource isolation are the key features of the Mesos middleware that the INDIGO PaaS exploits in order to run different workloads (long-running services, batch jobs, etc.) on the same resources while preserving isolation and prioritizing their execution.
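The idea of two-level scheduling can be illustrated with a toy model: a first level offers the free resources of each node to the frameworks, and each framework decides on its own which of its pending tasks to launch on an offer. The sketch below is only an illustration and does not use the Mesos API; the class names, node names and task sizes are hypothetical, and the frameworks are served in a fixed order, whereas a real Mesos master applies its own allocation policy to decide who receives an offer.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Free resources on one node (hypothetical toy model, not the Mesos API)."""
    agent: str
    cpus: float
    mem: int  # MiB


@dataclass
class Framework:
    """Second scheduling level: picks which of its own tasks use an offer."""
    name: str
    pending: list = field(default_factory=list)   # (task, cpus, mem) tuples
    launched: list = field(default_factory=list)  # (task, agent) tuples

    def consider(self, offer: Offer) -> None:
        """Launch every pending task that fits; unused resources stay in the offer."""
        still_pending = []
        for task, cpus, mem in self.pending:
            if cpus <= offer.cpus and mem <= offer.mem:
                offer.cpus -= cpus
                offer.mem -= mem
                self.launched.append((task, offer.agent))
            else:
                still_pending.append((task, cpus, mem))
        self.pending = still_pending


def allocate(offers, frameworks):
    """First scheduling level: hand each offer to the frameworks in a fixed order."""
    for offer in offers:
        for framework in frameworks:
            framework.consider(offer)


offers = [Offer("slave-1", cpus=4.0, mem=8192), Offer("slave-2", cpus=2.0, mem=4096)]
marathon = Framework("marathon", pending=[("web", 1.0, 1024), ("db", 2.0, 4096)])
chronos = Framework("chronos", pending=[("nightly-job", 2.0, 2048)])

allocate(offers, [marathon, chronos])
print(marathon.launched)
print(chronos.launched)
```

In this run the long-running services and the batch job end up on different nodes because the first node no longer has enough free CPU for the job once the services have been placed.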

INDIGO PaaS uses:

  • Marathon to deploy, monitor and scale long-running services, ensuring that they are always up and running.

  • Chronos to run user applications (jobs), taking care of fetching input data, handling dependencies among jobs and rescheduling failed jobs.

  • Automatic deployment through Ansible recipes embedded in TOSCA and HOT templates

    • All the services run in Docker containers;

  • High-availability of the cluster components:

    • Leader election among master nodes managed by Zookeeper;

    • HA load balancing;

  • Service discovery through Consul, which also provides DNS functionality and health checks:

    • services are automatically registered in Consul as soon as they are deployed on the cluster;

  • The external access to the deployed services is ensured through load-balancers in HA (unique entrypoint: cluster Virtual IP)

  • Cluster elasticity and application auto-scaling through the CLUES plugin

  • GPU support

    • The Ansible roles and TOSCA templates have been extended to support the usage of GPUs.
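As an illustration of the division of labour between Marathon and Chronos listed above, the following sketch builds the kind of JSON documents the two frameworks accept: a Marathon app definition for a Docker-based long-running service (such documents are posted to Marathon's `/v2/apps` endpoint) and Chronos job descriptions, either time-scheduled or dependent on parent jobs. The helper names, the image name, the job names and the resource figures are hypothetical, and only a small subset of the available fields is shown; nothing is sent to a cluster.

```python
import json


def marathon_app(app_id, image, cpus=0.5, mem=256, instances=1):
    """Build a minimal Marathon app definition for a Docker-based service."""
    return {
        "id": app_id,
        "cpus": cpus,
        "mem": mem,
        "instances": instances,
        "container": {"type": "DOCKER", "docker": {"image": image}},
    }


def chronos_job(name, command, schedule=None, parents=None, retries=2):
    """Build a minimal Chronos job: time-scheduled or dependent on parent jobs."""
    if (schedule is None) == (parents is None):
        raise ValueError("give exactly one of schedule or parents")
    job = {"name": name, "command": command, "retries": retries,
           "cpus": 0.5, "mem": 256}
    if schedule is not None:
        job["schedule"] = schedule       # ISO 8601 repeating interval
    else:
        job["parents"] = list(parents)   # run after these jobs succeed
    return job


# Hypothetical service and jobs, for illustration only.
app = marathon_app("/demo/web", "nginx:stable", instances=2)
fetch = chronos_job("fetch-input", "echo fetching",
                    schedule="R/2017-01-01T00:00:00Z/PT24H")
process = chronos_job("process", "echo processing", parents=["fetch-input"])

print(json.dumps(app, indent=2))
print(json.dumps([fetch, process], indent=2))
```

Scaling the service then amounts to changing `instances` in the app definition, while job dependencies are expressed by naming the parent jobs instead of giving a schedule.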

The INDIGO components developed for Mesos (Ansible roles, Docker images, TOSCA custom types and templates) have been used to support different use cases:

  • Lifewatch-Algaebloom for water quality modeling and analysis

  • Compact Muon Solenoid (CMS) analysis cluster on-demand

  • Dariah Zenodo-based repository in the cloud using Marathon

The core components are ZooKeeper, Mesos, Consul, Marathon, Chronos, marathon-lb and keepalived. They are distributed on the cluster nodes as shown in the diagram below.

Figure: INDIGO Mesos cluster architecture

  • Master nodes

    • On every master node the following (dockerized) components run: zookeeper, mesos master, consul server, marathon, chronos

  • Slave nodes

    • On every slave node the following (dockerized) components run: mesos slave, consul agent

  • Load-balancers

    • On the two load-balancers the following (dockerized) components run: keepalived and marathon-lb. keepalived ensures the high availability of the load-balancers by managing the cluster Virtual IP.
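Because every master runs a Consul server and every slave a Consul agent, a deployed service can be located either through Consul DNS or through Consul's HTTP health API. The sketch below assumes Consul's default DNS domain (`consul`) and the response layout of the `/v1/health/service/<name>` endpoint; the function names, the service name and the sample addresses are hypothetical, and the sample data stands in for a real HTTP response.

```python
def consul_dns_name(service, domain="consul"):
    """DNS name under which Consul answers for a registered service."""
    return f"{service}.service.{domain}"


def passing_endpoints(health_entries):
    """Extract (address, port) pairs for instances whose checks all pass.

    `health_entries` is the decoded JSON list returned by Consul's
    /v1/health/service/<name> endpoint.
    """
    endpoints = []
    for entry in health_entries:
        checks = entry.get("Checks", [])
        if all(check.get("Status") == "passing" for check in checks):
            service = entry["Service"]
            # An empty service address means the node address should be used.
            address = service.get("Address") or entry["Node"]["Address"]
            endpoints.append((address, service["Port"]))
    return endpoints


# Hand-written sample in the shape of a Consul health response (illustrative).
sample = [
    {"Node": {"Address": "10.0.0.11"},
     "Service": {"Address": "", "Port": 31001},
     "Checks": [{"Status": "passing"}]},
    {"Node": {"Address": "10.0.0.12"},
     "Service": {"Address": "10.0.0.12", "Port": 31002},
     "Checks": [{"Status": "critical"}]},
]

print(consul_dns_name("web"))
print(passing_endpoints(sample))
```

Filtering on check status mirrors what the health checks are for: an instance that is registered but failing its checks is left out of the returned endpoints.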

Release     Component versions
indigo_1    Mesos 0.28.0, Marathon 1.1.1, Chronos 2.4.0
indigo_2    Mesos 1.1.0, Marathon 1.4.1, Chronos 3.0.2
deep_1      Mesos 1.5.0, Marathon 1.5.6, Chronos 3.0.2 patched for GPU support

You can use this guide to deploy a Mesos cluster on a set of hosts using the indigo-dc Ansible roles.

These Ansible roles are published on Ansible Galaxy and can be installed with the ansible-galaxy command: ansible-galaxy install indigo-dc.rolename

You can use this TOSCA template to set up a complete Mesos cluster on cloud resources.