Server Failover For the Cheap and Forgetful
August 2 2012
I've long been of the opinion that real computer people should run their own servers. Most obviously, doing so gives you huge flexibility to do unusual things and share them easily with other people. Less obviously, I think it's hard to be a good programmer unless you're also a competent sysadmin. And so, for the last 12 years or so, I've hosted my own e-mail, web sites, and so on. This has proved to be a good choice: my systems have been more flexible and reliable than my various employers', for example. I hope that I'm not dogmatic about doing everything myself, so when a better alternative appears, I switch to it. For example, after several years hosting git repositories for different projects myself, I now use GitHub (and its ilk) almost exclusively, because they do a better job of multi-user repositories than I can on my own. But, in general, I prefer to do things myself.
However, there is one undeniable truth when hosting everything yourself: when things go wrong, there's no-one else to blame. If software crashes, hardware dies, or - and this has happened to me twice - the water company cuts through the power cable, it's your sites that are down. After the water company cut through the power cable for a second time, I realised that not having my main server running for three days wasn't very much fun. I eventually managed to route things through a cobbled together backup server, but it was a huge pain. I broke out in a cold sweat every time I wondered how long the next downtime might be: a week? or more? Even worse, I realised, I might not be available at the time to fix things.
I therefore started looking at ways to put in place a reasonable failover system. Importantly, I wanted the most important services on the main server to fail over to a backup until the main server came back to life. I naively assumed there would be a fairly easy way of doing this, but, if there is one, I couldn't find it at the time (there are now several web-based DNS services which appear to offer this service, though I haven't tried any of them). Presumably this is why most people (including some surprisingly large organisations) never put into place any sort of failover solution, and those that do tend to have large systems teams behind them. Some of the material I read made my eyes hurt: it involved tweaking vast numbers of options in vast numbers of configuration files. As someone who aims to spend, on average, a few minutes a week on sysadmin duties, I knew this was not a sensible long-term option. I'd inevitably forget all the complicated details, which, from a sysadmin viewpoint, is storing up problems for a rainy day.
After some thought, I cobbled together a relatively simple solution that has served me well for a couple of years or more. I think of it as failover for the cheap and forgetful. Nothing in here is new, as such, but it's difficult to find it written down in one place. Dedicated sysadmins will probably snigger at how crude it is, but it works well enough for my needs. It's not intended to deal with, for example, large scale denial of service attacks nor to keep every possible service up and running. It's designed to keep the main services (e.g. web and email) online when the main server is down. This article outlines the basic idea, gives some example configuration files, and the shell script I use to tie everything together.
Overall, my setup is analogous to having a main server in a data centre, and using a home computer for a backup server. The details are as follows.
I have a relatively beefy server (which came to me for nowt, courtesy of a friend-of-a-friend clearing out a data centre) sitting on a fast internet connection. I have a very unbeefy backup server (a 4 year old, passively cooled Tranquil T7, a fanless Atom-based machine; they no longer make that model, but I highly recommend their other machines, which are exquisitely made and totally silent) sitting on a DSL line. The main server runs some fairly hefty services (including a large database) which the backup server can't, but those aren't critical. The only advantage the backup server has is a vast disk, which I use for backing up my music; the main server has faster, but much smaller, disks.
The servers are hosted with completely different network providers and are physically distant, with well over 100 miles between them. Both decisions are deliberate: I don't want network or location specific problems taking down both machines at the same time (many bigger providers, from Redbus to Amazon, have regretted locating too many services in one location). Both machines run the latest stable release of OpenBSD, so moving configuration options between machines is fairly easy.
What I want out of my system is fairly straightforward. The main server should do as much as it possibly can. The backup server should generally do relatively little, otherwise those who share its DSL line might become upset that their connection has slowed down. When the main server is down, it should be reasonably easy to access services on the backup server. If the backup server has taken over from the main server it should hand back to the main server as soon as that comes back up.
I also want as much of the setup as possible to be under my own control. Experience has taught me that unless (and, often, even if) you pay decent money, large systems run by large groups aren't as reliable as I would like: it's too easy for a small part of such a system to go down and its recovery not to be prioritised. Plus, doing things yourself is informative and good, clean fun.
Why a solution isn't easy
There are some complete, fully transparent, solutions for failover (e.g. CARP), but these generally have preconditions (e.g. machines being on the same network) which rule them out for my purposes.
Most people are familiar with the idea of having multiple mail servers. DNS allows a given domain name to specify multiple MX hosts, each with a priority. When sending mail to a domain, the MX servers are tried in order of priority until one accepts delivery. Failover for the sending of mail is therefore simple. But what about the reading of mail? And how do we see the same set of e-mail at both machines? DNS doesn't have any answer for that. And DNS doesn't have a real-world answer for failover of anything except e-mail sending. Well, it does in theory (see here), but few people implement it, so it doesn't in practice. That means that HTTP failover, for example, must be handled on a per-domain basis.
There are some half-measures which provide partial solutions, but they're unsatisfactory. Round robin DNS, for example, allows requests to be shared between servers. Often used as a simple form of load balancing, it seems like it might also be a way of implementing failover on the cheap. However, it will happily point a proportion of clients at a server which is down, which isn't a good failover strategy.
Outline of a solution
The basis of a solution is relatively simple. First, we have to have a way of rewriting DNS records so that when the main server goes down, the DNS records for the domain are changed to point to the backup server (and back again, when the main server comes up). Second, we need a way of keeping the data within services in sync. As an example of the latter service, I'll describe how I synchronise e-mail.
DNS has a reputation for being a dark art. In large part, I suspect this is because the configuration files for BIND are so hard to read. For some years I used a web-based DNS service just to avoid the pain of working out how to configure BIND. To get failover up and running easily, I decided to tackle my DNS fear head on. I ended up using NSD instead of BIND, partly because it's a little easier to configure, but mostly because, as I understand things, it's likely to displace BIND in OpenBSD one day.
The basic tactic I use is to continually rewrite the main DNS entry to point at the preferred server. The backup server contains the master DNS. It regularly polls the main server by trying to download index.html via HTTP. If the download succeeds, it rewrites the DNS records to point at the main server; if the transfer fails, the DNS is rewritten to point at the backup server. The main server runs a slave DNS, which is notified of any changes to the backup server's records. This might seem backwards, but if the main server were to contain the master DNS, it wouldn't be able to rewrite DNS to point at the backup server.
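The polling decision can be sketched in a few lines of shell. This is only an illustration, not the actual script: the function names are made up, the two addresses are the example ones this article assumes for the main and backup servers, and curl is assumed to be installed (on OpenBSD the base system's ftp(1) can also fetch HTTP URLs and could be swapped in).

```sh
#!/bin/sh
# Example addresses as assumed in this article; not real server addresses.
MAIN_IP="18.104.22.168"
BACKUP_IP="22.214.171.124"

# probe_main URL: succeed only if the page can be downloaded.
# Assumes curl is available; any command that exits non-zero on a
# failed download would do instead.
probe_main() {
    curl --silent --fail --max-time 20 --output /dev/null "$1"
}

# favoured_ip PROBE_CMD URL: print the address DNS should point at.
# PROBE_CMD is passed in so the decision can be exercised without a network.
favoured_ip() {
    if "$1" "$2"; then
        echo "$MAIN_IP"
    else
        echo "$BACKUP_IP"
    fi
}
```

A call such as `favoured_ip probe_main http://www.example.com/index.html` (the hostname is a placeholder) prints the main server's address while the download works and the backup's address otherwise.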
The rewriting process is in some ways crude and in some ways a bit clever.
It's crude, because I wanted a solution that needs only basic shell scripting (so it can run on a fresh OpenBSD install immediately) and normal text files (no key exchange, or anything which involves configuring things that I might forget). I therefore manually write DNS zone files as a normal user with two magic markers, @serial number@ and @IP@:
@serial number@ is a monotonically increasing ID which allows other DNS servers to work out if they've got the latest version of the zone file or not.
@IP@ is the IP address of the currently favoured server, yo-yoing between the main and backup servers.
It's in some ways a bit clever because the @serial number@ isn't an arbitrary integer: it's conventionally in the format YYYYMMDDSS (YYYY is the year; MM the month; DD the day; SS the increment during the day). What this means is that in any given day, we can only increment SS 99 times before we run out of options. Therefore we only increment it when @IP@ has changed since the last write of the file. Each outage costs two increments (one when the main server goes down, one when it comes back up), so the main server can go down and come back up 49 times in a day before we run out of IDs, which seems enough to cope with all but the most bizarre outcomes.
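The serial number arithmetic is easy to get subtly wrong in shell (08 and 09 look like invalid octal numbers), so here is a minimal sketch of it. The function name `next_serial` is hypothetical, and the sketch assumes the first serial of a day ends in 00 and that the clock never goes backwards.

```sh
#!/bin/sh
# next_serial OLD TODAY: print the serial that should follow OLD.
# OLD is a YYYYMMDDSS serial; TODAY is YYYYMMDD, e.g. from `date +%Y%m%d`.
next_serial() {
    old=$1
    today=$2
    old_day=${old%??}        # drop the trailing two-digit SS
    old_ss=${old#????????}   # keep only the two-digit SS

    if [ "$old_day" != "$today" ]; then
        # First change of a new day: restart the counter.
        echo "${today}00"
        return 0
    fi

    # Strip one leading zero so that 08 and 09 are not read as octal.
    ss=$(( ${old_ss#0} + 1 ))
    if [ "$ss" -gt 99 ]; then
        echo "next_serial: no serials left for $today" >&2
        return 1
    fi
    printf '%s%02d\n' "$today" "$ss"
}
```

For example, `next_serial 2012080209 20120802` gives 2012080210, while the same old serial on the following day gives 2012080300.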
I therefore wrote a script called zonerw which automatically copes with updating the zone files only when the main server has changed status since the previous execution (i.e. up to down, or down to up), and then notifies the slaves. There's also one other time when we want to rewrite the zone files: when the user has manually changed one or more entries in them (e.g. added a new entry). zonerw therefore has two modes: minimal, which is used when regularly polling the state of the main server and only increments @serial number@ if a change is detected; and forced, which the user can manually invoke when they have changed the zone files and which forcibly increments @serial number@.
zonerw is relatively simple. We need slightly different NSD config files on the main server (DNS slave), which we'll assume has IP address 18.104.22.168, and on the backup server (DNS master), which we'll assume is 22.214.171.124. The two config files are stored in ~/etc/ (so they're easily synchronised across machines). The user then creates a base zone file in ~/var/zones/nsd/ with the magic markers placed as comments (after the ; character), e.g. tratt.net.base (they get rewritten to /var/zones/nsd/, the normal place for NSD zone files). zonerw then runs as a cronjob on the backup server every 5 or so minutes.
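To make the two modes and the marker rewriting concrete, here is a rough sketch of how such a script might be structured. It is a guess at the shape, not zonerw itself: the function names are hypothetical, and it assumes that each marker sits in a trailing comment and that the value to rewrite is the token immediately before the ; character.

```sh
#!/bin/sh
# should_bump MODE OLD_IP NEW_IP: succeed if the serial must be incremented.
should_bump() {
    case $1 in
        forced)  return 0 ;;            # user edited the zone: always bump
        minimal) [ "$2" != "$3" ] ;;    # polling: bump only on a status change
        *)       echo "unknown mode: $1" >&2; return 2 ;;
    esac
}

# render_zone BASE SERIAL IP: print BASE with the marked values replaced.
# Assumed base-file lines look like:
#     2012080100 ; @serial number@
#     www IN A 0.0.0.0 ; @IP@
render_zone() {
    sed -e "s/[^[:space:]]\{1,\}\([[:space:]]*;[[:space:]]*@serial number@\)/$2\1/" \
        -e "s/[^[:space:]]\{1,\}\([[:space:]]*;[[:space:]]*@IP@\)/$3\1/" \
        "$1"
}
```

A wrapper would call `should_bump`, work out the new serial if needed, write the output of `render_zone` into the NSD zone directory and then ask NSD to reload and notify its slaves; those last steps depend on the NSD version and are left out here.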
The zone files are relatively simple (assuming you can read such things in the first place: see e.g. this tutorial for pointers if you can't), but a few things are worthy of note. First, the domains have a short refresh time of 5 minutes but a long expiry time of 12 weeks. The short refresh time informs other DNS servers that they should regularly check to see if the domain's settings have changed: if the server status changes, we want the updated IP address to be quickly reflected in DNS lookups. The long expiry time means that, if all of my DNS servers are unavailable, other DNS servers should hold onto the records they previously fetched for a long time. This is because the complete disappearance of a domain from DNS (i.e. all of a domain's DNS servers are down) is a nightmare scenario: mail starts bouncing with fatal errors and, in general, people (and other servers) assume you've died and aren't coming back. Finally, notice that the master DNS server notifies two servers of DNS changes. As well as my two servers in the UK, I also use the free service at afraid.org as a DNS slave, just in case network connectivity to the UK is lost (it has happened in the past though, admittedly, not for many, many years). It also reduces the DNS request load on the backup server. If I had a server outside the UK, I might do this myself, but I don't, so I'm grateful for afraid's service.
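Zone files express these timings in seconds, so it's worth spelling the arithmetic out. The sketch below prints an SOA record with the refresh and expiry times described above; the name server and hostmaster names, and the retry and minimum values, are assumptions made purely for illustration.

```sh
#!/bin/sh
REFRESH=$(( 5 * 60 ))                  # 5 minutes, from the text
EXPIRE=$(( 12 * 7 * 24 * 60 * 60 ))    # 12 weeks, from the text
RETRY=60                               # assumed: not stated in the article
MINIMUM=300                            # assumed: not stated in the article

# soa_record DOMAIN SERIAL: print an SOA record for DOMAIN.
# ns1.DOMAIN and hostmaster.DOMAIN are placeholder names.
soa_record() {
    cat <<EOF
$1. IN SOA ns1.$1. hostmaster.$1. (
        $2 ; @serial number@
        $REFRESH ; refresh
        $RETRY ; retry (assumed)
        $EXPIRE ; expire
        $MINIMUM ) ; minimum (assumed)
EOF
}
```

Running `soa_record example.com 2012080200` shows the two values that matter here: a refresh of 300 seconds and an expiry of 7257600 seconds.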
Our DNS configuration gives us a nice way of making sure the main DNS entry always points to an active server. That doesn't solve the problem of making the content on the servers synchronise. In other words, it's not much good pointing people to a backup server if the content it serves is out of date. In my case, my websites are relatively simple: a script updates them both from the same git directory, so they're nearly always in sync. Other services are, in general, more difficult. Some - most notably databases - have built-in support for synchronising across servers. Most, however, take a little more thought, and each needs to be done on a case-by-case basis. As a simple case study, I'm going to show how I synchronise e-mail across two machines.
E-mail is interesting in that both servers accept e-mail deliveries: we therefore have a two-way synchronisation problem. Traditionally, e-mail was delivered to a single mbox file: synchronising that would be very hard. Most mail servers (and many mail clients) now support the maildir format, where each e-mail is stored in an individual file. Maildirs also try very hard to make each e-mail's filename globally unique (chiefly by using the server's hostname as part of the filename). We thus know that e-mails delivered to the same user on different machines will have different filenames. We can then frame the solution in terms of synchronising directories, which is much easier.
Fortunately, there are already good directory synchronisation tools available, so we don't need to make our own. I use Unison, though there are alternatives. The backup server connects to the main server every 10 minutes and synchronises the maildirs across both machines. The connection is done using an unprivileged user and SSH keys (so that user doesn't have to type in a password) and sudo (allowing the unprivileged user to only execute a mail synchronisation script as root).
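As a rough illustration of the Unison step for a single user's maildir, consider the sketch below. The function name, user name, host name and paths are all hypothetical, and the SSH key and sudo arrangement described above is not shown; setting DRY_RUN prints the command instead of running it.

```sh
#!/bin/sh
# sync_maildir LOCAL_DIR REMOTE_USER REMOTE_HOST REMOTE_DIR
# Two-way synchronise a local maildir with one on another machine.
# REMOTE_DIR is assumed to be absolute, which gives a URL of the form
# ssh://user@host//path (note the double slash).
sync_maildir() {
    # -batch: never ask questions; -silent: print nothing except errors.
    set -- unison "$1" "ssh://$2@$3/$4" -batch -silent
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

Run from cron every 10 minutes on the backup server, a call such as `sync_maildir /home/alice/Maildir mailsync main.example.com /home/alice/Maildir` would keep that one maildir in step on both machines.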
The best bit about this solution is how easy it is to set up. It's also easy to extend to multiple users on a Unix box, and it gives a high degree of reliability. Mail that is delivered to one machine appears on the other with a maximum 10 minute gap - in other words, I can never lose more than 10 minutes' worth of e-mail.
[As a side note, since I also read my mail through maildirs (using mutt), I also use Unison to synchronise my mail between my desktop, laptop and servers. By default, my mail fetching script connects to the main server. I can manually tell it to collect mail from the backup server, so I can always download my mail, even when the main server is down. If you're wondering why I don't use the main DNS entry for this purpose, it's because Unison uses the host name of the remote server in its archive file; if the hostname you're connecting to later points to a different machine, Unison gets very upset.]
Of course, some services are much harder to synchronise than e-mail, and there is no general strategy. But hopefully this gives an idea for one such strategy.
As I said earlier, the techniques outlined in this article aren't particularly sophisticated: they won't cope with DoS attacks, for example. I wouldn't even want my backup server to be subject to a slashdotting (or whatever the kids are calling it these days) - it doesn't really have the bandwidth to support it. But for the sort of usage my servers typically see, and for virtually no ongoing maintenance cost, these techniques have proved very useful. Hopefully documenting them might be of use to somebody else.