Thursday, July 1, 2010

How to cheaply turn a single machine into a cluster

To develop membase, which is a distributed storage system, I often need to run it on a cluster of machines. Luckily I have two machines, both at home and at work, so with trivial use of rsync & ssh, running a two-node cluster was easy. But I sometimes need more than two nodes, so I decided to find something cheap that lets me run multiple true nodes even when I have only a single machine. This matters more now, because I'll be on a business trip for the next month and will have just one machine at my disposal.

One approach is to use virtualization. I tried it around a year ago for some project. Starting a complete OS just to run a single application is a bit too slow, though that can be alleviated by using snapshots. In practice this is too painful: even with snapshots it's slow. And the free software virtualization products either get it wrong (VirtualBox) or are buggy (KVM & QEMU). Bridged networking is relatively slow to come up, and I remember having some networking issues when restoring from a snapshot.

Yesterday I finally tamed a kind of 'virtualization' that's cheap and works. My approach is to use LXC, which is Linux's built-in container implementation. It supports network virtualization as its core feature, and it doesn't require a separate root filesystem for its instances. So with my current solution I can create a large number of 'servers' that all share one filesystem but have different hostnames and network stacks. It's reliable, it starts quickly and it's easy to kill.

One of the problems was that I needed a virtual host-only network connecting the host and all the containers. The current implementation of the macvlan link type doesn't support networking with the host. I worked around that by creating a virtual ethernet (veth) pair and attaching the macvlans to one side of it, while using the other side as the host's end. So far it works beautifully!
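To make the veth workaround concrete, here is a rough sketch of the `ip` commands such a layout might involve. It is not the actual script: the interface names (`veth-host`, `veth-lxc`, `mv0`, ...), the `10.0.3.1/24` address and the macvlan `bridge` mode are all assumptions for illustration, and the macvlan interfaces may well be created by LXC itself from the container configuration rather than by hand. The function only builds the commands; running them would need root.

```python
def host_only_network_commands(host_if="veth-host", peer_if="veth-lxc",
                               host_cidr="10.0.3.1/24", n_containers=2):
    """Return `ip` commands (as argv lists) for a veth + macvlan layout.

    All names and addresses are hypothetical defaults, not values taken
    from the original script.
    """
    cmds = [
        # Create the veth pair; its two ends are wired back to back.
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", peer_if],
        # The host keeps one end and gets an address on the host-only subnet.
        ["ip", "addr", "add", host_cidr, "dev", host_if],
        ["ip", "link", "set", host_if, "up"],
        ["ip", "link", "set", peer_if, "up"],
    ]
    for i in range(n_containers):
        # One macvlan per container, stacked on the other end of the pair.
        # "mode bridge" is an assumption; it lets macvlans talk to each other.
        cmds.append(["ip", "link", "add", "link", peer_if, "name", f"mv{i}",
                     "type", "macvlan", "mode", "bridge"])
    return cmds


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in host_only_network_commands():
        print(" ".join(argv))
```

Because the macvlans sit on the far end of the pair, traffic between the host and a container crosses the veth link instead of being dropped by the macvlan driver.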

The main script is at the following gist: It takes care of allocating an IP address & hostname and simply runs the provided command inside a container. It also kills everything inside the container when the main command exits. This is useful for killing any daemons (e.g. the Erlang port mapper) that might still be running inside.
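The allocation step could look roughly like the sketch below. This is only an illustration of the idea, not the gist's code: the `allocate_node` helper, the `node<N>` hostname scheme, the `10.0.3.0/24` subnet and the convention that containers start at `.2` (leaving `.1` for the host) are all assumptions.

```python
import ipaddress


def allocate_node(used_indices, subnet="10.0.3.0/24", first_host=2):
    """Pick the lowest free slot and derive (index, hostname, ip) from it.

    Hypothetical scheme: slot N gets hostname "node<N>" and the address
    network + first_host + N, so the host itself can keep the .1 address.
    """
    net = ipaddress.ip_network(subnet)
    # Stop before the broadcast address (the last address in the subnet).
    for index in range(net.num_addresses - 1 - first_host):
        if index not in used_indices:
            ip = net.network_address + first_host + index
            return index, f"node{index}", str(ip)
    raise RuntimeError("no free addresses left in " + subnet)


if __name__ == "__main__":
    # Slots 0 and 1 are taken, so the next container gets slot 2.
    print(allocate_node({0, 1}))
```

Deriving both the hostname and the address from one slot number keeps the two in sync, so only the set of slots in use has to be tracked.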

Here's what I added to the project's makefile to launch multiple instances of membase:

'make lxc-run' starts one instance of membase inside a container. 'make lxc-cluster' starts three instances in tabs of a new terminal window.
