Why uptime is bad

Growing up in the world of Linux, uptime was always considered a good thing. On IRC, every once in a while someone would post an uptime. Everyone else in the channel would then check their own and, if it was close or higher, post it too. Most of these systems were home Linux boxes used for compiling random programs or maybe hosting a web server for experimenting. It was fun to see how long we could keep them running. Since those days I have come to realize that high uptimes are a bad thing.

Keeping a server up for months or even years means you aren't maintaining it. It hasn't been kept up to date with new kernels that fix security holes. It doesn't have new packages or tools that could help it run more efficiently and make it easier to use. It's also out of step with the newer servers being deployed, which means people logging into your high-uptime server have to adjust to older software and possibly missing tools.

Hardware fails, colos lose power or network connections, and sometimes they catch on fire. If your entire system depends on a single server, say a MySQL master, it's going to fail. I know there are MySQL servers out there that have been up for years. Those are going to fail too. It's inevitable. If your system is not designed to withstand the failure of a master, it should be fixed. Jeremy Cole and I gave a tutorial at the 2006 MySQL User Conference about MySQL replication and failover. See Jeremy's blog for links to the presentation and photos.
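For reference, the basic shape of that kind of replication setup is small. This is a minimal sketch, not the tutorial material itself; the hostname, replication user, password, and log coordinates below are all placeholders you would fill in from your own SHOW MASTER STATUS output:

    -- On the master: make sure binary logging is on (log-bin in
    -- my.cnf) and note the current log file and position.
    SHOW MASTER STATUS;

    -- On the slave: point it at the master using those coordinates.
    CHANGE MASTER TO
        MASTER_HOST='master.example.com',   -- placeholder hostname
        MASTER_USER='repl',                 -- replication account you created
        MASTER_PASSWORD='secret',           -- placeholder
        MASTER_LOG_FILE='mysql-bin.000042', -- from SHOW MASTER STATUS
        MASTER_LOG_POS=12345;               -- from SHOW MASTER STATUS
    START SLAVE;

    -- Verify both replication threads are running.
    SHOW SLAVE STATUS\G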

“But I can’t take down my master to fix it!” It’s much better to do a planned downtime than to get paged at 3am because the master died and the whole site is down. Take some time. Plan to take down the master and fix the system. It will be worth it in the end. If your manager says no to a planned downtime to make your website fault tolerant, find a new job. Preferably at Yahoo! :)

By building a system that can handle the failure of a master, it becomes much easier to upgrade MySQL and take advantage of all the nifty new features.
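As a rough sketch of what that planned switch can look like (the statements are standard MySQL; the steps are simplified and assume a single slave that is in sync):

    -- On the master: stop taking writes so the slave can catch up.
    -- (Simplified: read_only doesn't stop users with SUPER.)
    SET GLOBAL read_only = 1;

    -- On the slave: wait until Seconds_Behind_Master reaches 0...
    SHOW SLAVE STATUS\G

    -- ...then stop replication and forget the old master.
    STOP SLAVE;
    RESET SLAVE;

    -- Point the application at the new master, then take the old
    -- one down and upgrade it at your leisure.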

5 Comments

  1. Dean Swift says:

I’m glad that others have this opinion. Although a server may be patched and updated, high uptime means an outdated kernel.

  2. gloomy says:

    Hi,

I think you should distinguish between machines/services that go down too often because they are unstable and those that go down for maintenance. How often do you need to upgrade the kernel on a stable FreeBSD/Linux machine? I’d guess once every one or two years, not more often. Tools and services can be upgraded/updated without actually rebooting the machine. An updated machine with high uptime and load can show the experience of the sysadmin and the quality of the hardware.

  3. Eric Bergen says:

SuSE auto update seems to kick out a new kernel every month or so. Still, an uptime of two years probably means you aren’t doing practice failovers.

  4. gloomy says:

And don’t forget the marketing stuff (-; Many people judge hosting quality by looking at the server uptime graph.

  5. Luke Hollins says:

Generally I would agree, but in some cases you can have a long-running BSD or Linux box where, by fluke, none of the security advisories have warranted a reboot. This keeps happening to me on FreeBSD 4 boxes that are now soon to be replaced. In some cases a server may be doing a very specific task and just doesn’t need to be updated. While with MySQL, Apache, PHP, etc. you don’t want to be running a four-year-old version and missing out on great new features, there are cases where you may not need them or your odd application isn’t worth upgrading. If you’re really lucky, you host a client’s server that has the long uptime because they won’t accept any proposals to update their software/sites.
