The Nephelae platform

CERSAT conceived the Nephelae demonstration platform to test existing "big data" technologies and to serve as a proof of concept supporting both locally and remotely managed reprocessing activities on complete satellite mission archives. It relies heavily on recent developments in petascale file systems and cloud computing. In Greek mythology, the Nephelae are the nymphs of the clouds.

Context and requirements

Nephelae was conceived to address some of the issues raised by massive data processing and mining. It was also motivated by the immediate needs of large-scale reprocessing campaigns undertaken for ESA (ERS Wave Mode reprocessing) and CATDS (SMOS salinity reprocessing). In these contexts, the goal was to demonstrate that recent technologies - in particular virtualization and cloud computing - now make it possible to circumvent the limitations and extra efforts that have often hampered such endeavors:

  • minimize the resources required to set up and deploy a mission-dedicated processing environment: each reprocessing campaign uses existing (or extended) hardware, simply by deploying a virtual server image matching the requirements of the reprocessing software (Linux distribution, compiler versions, ...), with no impact on other processing chains running on the same cloud (and using different images). Heterogeneous software and environments can run on the same hardware without reconfiguring the physical servers.
  • make it easy to switch back to a former or a new processing environment once a campaign is completed. Even if a reprocessing is finished and the cloud is being used for other tasks, a partial (or complete) reprocessing (with updated software or configuration) can be resumed or repeated at any time simply by restoring the server image, which brings a lot of flexibility to reprocessing campaigns. This is a key issue for the long-term preservation of archives: campaigns can be replayed at any time.
  • make full use of the processing resources (CPU), and scale them easily, by allocating more virtual servers and using a smart batch processing scheduler. The cloud resources can easily be adapted to the required processing speed. Previous tests demonstrated processing capabilities similar to those of the in-house Ifremer meso-scale scientific cluster with significantly fewer processing nodes.
  • demonstrate, at low risk and on real-life applications (since it is not much different from running directly on a physical platform), the potential of emerging technologies for exploiting historical data from past and ongoing missions.

The CERSAT Nephelae demonstrator was also built around the following technical requirements:

  • horizontal scalability for storage, computation and overall performance (network, I/O, ...)
  • redundancy for reliability and data/services availability
  • ease of deployment, monitoring, maintenance and upgrade
  • compatibility with the existing environment of local users (Linux servers and workstations using NFS mounts)
  • native application compatibility (no specific API to access data; OS compatibility)
  • use of standard, inexpensive hardware components for hard drives and servers


Technical solution

Overview of the Nephelae storage and cloud computing platform tested at Ifremer / CERSAT. The processing resource is brought directly to the data (instead of bringing the data to the processors over a network) by distributing and attaching the storage disks (3 × 2 TB) to each of the roughly 20 processing nodes, along with 300 GB of fast disk for server images and workspaces. Dedicated nodes are also deployed to control and execute the virtual machines, or to host common services (typically the job schedulers and the management services of the distributed file system).

Physically, the platform is based on a dozen servers, each with good processing power, large storage capacity and a gigabit network connection. More precisely, Dell R510 servers were used, with 2 Intel Xeon 6-core CPUs (X5650 @ 2.67 GHz), 24 GB RAM, 10 × 2 TB disks behind a RAID 5 controller, and the Ubuntu operating system. Note that homogeneous hardware was used for the servers in this prototype to simplify this early-stage implementation, but it is not a requirement: the platform must support heterogeneous hardware to accommodate hardware upgrades over time.
The platform is also fully reconfigurable in terms of hardware: any physical server is just a "block" that can be added or removed seamlessly and harmlessly (except perhaps for performance). This is achieved by overlaying the hardware with application layers managing the distributed storage, process scheduling, and OS and application deployment.
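The "bring the processing to the data" principle described above can be sketched as a simple placement heuristic: schedule each job on the node that already holds most of its input chunks. This is only an illustration, not CERSAT's actual scheduler, and the file-to-node mapping below is hypothetical (in practice it would be queried from the distributed file system's master server).

```python
from collections import Counter

def pick_node(input_files, chunk_locations):
    """Choose the processing node that already stores most of the
    job's input chunks, so that reads are mostly local.

    chunk_locations maps a file name to the list of nodes holding
    a replica of its chunks (hypothetical mapping)."""
    tally = Counter()
    for f in input_files:
        for node in chunk_locations.get(f, []):
            tally[node] += 1
    # If nothing is known about the inputs, no preference can be made.
    return tally.most_common(1)[0][0] if tally else None

# Hypothetical example: replicas of three granules on three nodes.
locations = {
    "orbit_001.nc": ["node03", "node07"],
    "orbit_002.nc": ["node03"],
    "orbit_003.nc": ["node12"],
}
print(pick_node(["orbit_001.nc", "orbit_002.nc"], locations))  # node03
```

A real scheduler would also weigh node load and queue state, but the locality preference is the essential idea.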

  • Storage layer: MooseFS. MooseFS is an open-source, fault-tolerant, network-distributed file system. It spreads data over several physical servers that are visible to the user as a single resource; it is designed for highly reliable petabyte-scale storage. It makes the storage in Nephelae easy to scale horizontally (disk space and I/O performance), and provides additional features to secure the data, such as replication, snapshots and a trash mechanism. Moreover, access to the file system is transparent for applications (POSIX-compliant, though not 100%). It was preferred over the other solutions tested (GlusterFS, HDFS, ...). Currently, the Nephelae platform uses a virtual disk space of 150 TB distributed over 10 servers.
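The POSIX transparency mentioned above means applications use ordinary file I/O with no MooseFS-specific API. A minimal sketch, using a temporary directory to stand in for a mount point such as /mfs (the path is an assumption; on the real platform the same calls run unchanged against the mounted file system):

```python
import os
import tempfile

# Stand-in for a MooseFS mount point (e.g. /mfs). Plain file I/O
# works identically on the real mount: no special library needed.
mount = tempfile.mkdtemp()
path = os.path.join(mount, "products", "sst_19930501.nc")
os.makedirs(os.path.dirname(path), exist_ok=True)

with open(path, "wb") as f:   # ordinary POSIX write
    f.write(b"granule payload")

with open(path, "rb") as f:   # ordinary POSIX read
    data = f.read()

print(len(data))  # 15
```

This is precisely the "native application compatibility" requirement listed earlier: existing processing software runs against the distributed storage without modification.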
  • Private cloud computing layer: OpenStack. We use the open-source OpenStack solution to manage the Nephelae private cloud. It integrates well with the Ubuntu servers it runs on, and allows self-service provisioning of machine images. The machine images contain everything required to run applications or services on demand, which makes it possible to run a variety of software on a single platform, mixing operating systems as needed. It replaced Eucalyptus, which was used in the first version.
  • Batch processing layer: Torque/Maui & CERSAT management tools. To schedule large batch processing jobs on the platform, we use the Torque/Maui solution to manage the available resources, queue priorities and job distribution. Our typical batch processing jobs often need to run hundreds of thousands of very short processes, which was quite difficult with the Torque/Maui software as is. Moreover, we wanted to run the processing closer to the data. To address this, we built home-made tools that provide a smarter scheduler and enhanced monitoring.
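A common way to cope with hundreds of thousands of very short processes on a batch scheduler is to bundle them into far fewer, longer-running jobs, so the scheduler's per-job overhead is paid thousands of times instead of hundreds of thousands. A minimal sketch of such bundling (the actual CERSAT tools are home-made and not shown here; the task names are hypothetical):

```python
def bundle(tasks, tasks_per_job=1000):
    """Group many short tasks into batch jobs: each job submitted to
    the scheduler then iterates over its bundle of tasks sequentially."""
    return [tasks[i:i + tasks_per_job]
            for i in range(0, len(tasks), tasks_per_job)]

# Hypothetical campaign: 250,000 one-granule tasks -> 250 scheduler jobs.
tasks = [f"process orbit_{n:06d}.nc" for n in range(250_000)]
jobs = bundle(tasks)
print(len(jobs), len(jobs[0]))  # 250 1000
```

Each bundle would typically be submitted as a single Torque job whose script loops over its task list.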
  • Real-time processing layer: Gearman & CERSAT management tools. Some processes, such as interactive web applications, need fast real-time execution (something our batch processing layer handles poorly). Deploying the Gearman solution on the platform makes it possible to distribute and load-balance this kind of request. It is not transparent for the application, but was easy to use for our needs.
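Gearman itself routes requests through a job server to registered workers on the processing nodes. The self-contained sketch below illustrates only the underlying load-balancing pattern (round-robin dispatch over a worker pool); it does not use the real Gearman API, which requires a running gearmand server, and the worker functions are hypothetical stand-ins:

```python
import itertools

class Dispatcher:
    """Toy round-robin dispatcher: spreads real-time requests over a
    pool of workers, in the spirit of a Gearman job server."""
    def __init__(self, workers):
        self._cycle = itertools.cycle(workers)

    def submit(self, request):
        worker = next(self._cycle)   # pick the next worker in the pool
        return worker(request)

# Hypothetical workers; each would normally run on a different node.
workers = [lambda r, n=n: f"worker{n} handled {r}" for n in range(3)]
d = Dispatcher(workers)
print(d.submit("render map tile"))      # worker0 handled render map tile
print(d.submit("extract time series"))  # worker1 handled extract time series
```

In the real deployment, a Gearman client submits named functions to the job server, which forwards each request to the first available registered worker rather than cycling blindly.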