Google, Ganeti and Paravirtualization

Published: 03 Sep 2007

Google recently announced the first beta release of Ganeti, open source virtual server management software built on top of Xen and other open source software. Naturally, since I've been using VMware for the last year or so, I was curious about Ganeti.

What is Ganeti?

In short, Ganeti is a cluster virtualization platform based on Xen. An "intent to package" (ITP) has been filed with Debian, which describes Ganeti as "a virtual server cluster management software tool built on top of the Xen virtual machine monitor and other Open Source software. After setting it up it will provide you with an automated environment to manage highly available virtual machine instances."

What is Xen?

Xen is a free software virtual machine monitor for IA-32, x86-64, IA-64 and PowerPC architectures. It is software that runs on a host operating system and allows several guest operating systems to be run on top of the host on the same computer hardware at the same time. Modified versions of Linux and NetBSD can be used as hosts. Several modified Unix-like operating systems may be employed as guest systems; on certain hardware, as of Xen version 3.0, unmodified versions of Microsoft Windows and other proprietary operating systems can also be used as guests.

Xen originated as a research project at the University of Cambridge, led by Ian Pratt, senior lecturer at Cambridge and founder of XenSource, Inc. This company now supports the development of the open source project and also sells enterprise versions of the software. The first public release of Xen was made available in 2003. In a letter to customers and partners on 15 August 2007, XenSource, Inc. announced that Citrix had signed a definitive agreement to acquire XenSource. The acquisition is expected to close in Q4 2007.

What is Paravirtualization?

If you wanted to know what paravirtualization was you might have gone to wikipedia.org like I did and encountered this definition:

...paravirtualization is a virtualization technique that presents a software interface to virtual machines that is similar but not identical to that of the underlying hardware. This requires operating systems to be explicitly ported to run on top of the virtual machine monitor (VMM), which the owner of exclusive rights in a proprietary operating system may decline to allow for strategic purposes, but may enable the VMM itself to be simpler or virtual machines that run on it to achieve performance closer to non-virtualized hardware.

That sounds all fine and dandy but reading this a couple times didn't really give me a good idea of what paravirtualization actually means. While I'm at it, what's a virtual machine monitor?

After reading more about Xen, I arrived at a simpler definition. Basically, paravirtualization is a collaboration between the guest operating system and two interfaces, the management API (m) and the virtual hardware API (vh), where vh works as a function of the available resources x (CPU, memory, disk space, etc.), to achieve optimal performance, p.

p = m + vh(x)
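
One way to picture that collaboration is a toy model in Python. This is only a sketch under my own assumptions: the Hypervisor class and its method names are hypothetical and are not Xen's actual interface. It contrasts an unmodified guest, whose every privileged page-table write has to be trapped and emulated by the monitor, with a ported guest that asks the monitor directly and batches its updates into a single request.

```python
class Hypervisor:
    """Toy virtual machine monitor that owns the real page table."""

    def __init__(self):
        self.page_table = {}
        self.traps = 0       # privileged writes caught and emulated
        self.hypercalls = 0  # explicit requests from a ported guest

    def trap_and_emulate(self, vaddr, paddr):
        # Unmodified guest: it believes it wrote to real hardware, so the
        # monitor has to intercept every single write and emulate it.
        self.traps += 1
        self.page_table[vaddr] = paddr

    def hypercall_update_batch(self, mappings):
        # Ported guest: it knows the monitor exists, asks it directly,
        # and can hand over many updates in one request.
        self.hypercalls += 1
        self.page_table.update(mappings)


def unmodified_guest(hv, mappings):
    """Apply each mapping as a separate trapped write."""
    for vaddr, paddr in mappings.items():
        hv.trap_and_emulate(vaddr, paddr)


def paravirtualized_guest(hv, mappings):
    """Apply all mappings through one explicit call to the monitor."""
    hv.hypercall_update_batch(mappings)


def compare(n_pages=4):
    """Return (traps, hypercalls, same_result) for n_pages mappings."""
    mappings = {page: page + 100 for page in range(n_pages)}
    full, para = Hypervisor(), Hypervisor()
    unmodified_guest(full, mappings)
    paravirtualized_guest(para, mappings)
    return full.traps, para.hypercalls, full.page_table == para.page_table
```

In this toy, compare() returns (4, 1, True): four trapped writes versus one explicit call, with the same page table at the end. Real monitors are far more involved, but the shape of the trade is the same: the guest has to be ported, and in exchange the monitor does less guessing.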

Why does Google care about paravirtualization?

In their own words...

Here at Google, we've used Ganeti in the internal corporate environment to facilitate cluster management of virtual servers in commodity hardware, increasing the efficiency of hardware usage and saving space, power and cooling. Ganeti also provides fast and simple recovery after physical failures.

  1. It's open source. Google has consistently leveraged open source software and extended it to suit their needs.
  2. It saves them tons of money. When you have server farms as large as Google's, every incremental cost savings on space, energy and time is huge.
  3. Fast failover. By now everyone expects a company like Google never to fail, but inevitably people make mistakes and machines break. That is when seamless or fast failover shines: when one machine fails, there's always one or more other machines to take over the workload.
  4. Minimizes certification headaches. The I/O model of paravirtualization re-uses standard Linux device drivers, which also goes a long way toward near-native performance.
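
The fast failover idea in point 3 can be sketched in a few lines of Python. This is a toy model, not Ganeti's code or API: the fail_over function, the node names, and the assumption that every virtual machine has one primary node and one standby node are all mine, for illustration only.

```python
def fail_over(instances, failed_node):
    """Return a new placement with every instance moved off failed_node.

    instances maps an instance name to a (primary, standby) node pair.
    A standby of None means the instance has no spare until one is repaired.
    """
    placement = {}
    for name, (primary, standby) in instances.items():
        if primary == failed_node:
            # Promote the standby so the instance keeps running.
            placement[name] = (standby, None)
        elif standby == failed_node:
            # Still running on its primary, but without a safety net.
            placement[name] = (primary, None)
        else:
            # Untouched by the failure.
            placement[name] = (primary, standby)
    return placement


cluster = {
    "web": ("node1", "node2"),
    "db": ("node2", "node3"),
    "mail": ("node3", "node1"),
}
after_failure = fail_over(cluster, "node1")
```

After node1 fails in this example, "web" runs on node2, "db" is unaffected, and "mail" keeps running on node3 but has lost its standby.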

Why should Software Engineers Care About Virtualization?

The paper Xen and the Art of Virtualization (PDF) explains that the negligible overhead of virtualization is a small price to pay for the agility gained in configuring the environment that software needs in order to run.

...provides an extremely high level of flexibility since the user can dynamically create the precise execution environment their software requires. Unfortunate configuration interactions between various services and applications are avoided (for example, each Windows instance maintains its own registry).

Further Reading

  1. Ganeti: Open source virtual server management software released (original post from the Google Code blog).
  2. Ganeti Project Site - get the docs, downloads and source here.
  3. Xen Community site
  4. Official Xen Project Site - hosted at the University of Cambridge.
  5. VMWare.com - commercial virtualization software.