Contributed by jose on from the don't-reboot! dept.
"This is not such a newsworthy topic, but my question is, who has the longest uptime with OpenBSD? My 2.8 box has 425 days of uptime, but I'm sure that's not the longest out there. Are there any original 2.1 boxes which have never rebooted? When I get a chance I'm going to upgrade to 3.1 and then the uptime starts back at zero again. "I think a more important question is how do you reliably and securely maintain a machine and aim for high availability (long uptimes)? How do you do this without ignoring application and library upgrades? This gets to be a bit trickier than just leaving it on and exposed with security holes. How have the readers here accomplished this?
(Comments are closed)
By Jedi/Sector One () j@bitchy-sex.com on http://www.bitchy-sex.com/
By Anonymous Coward () on
By Ben Goren () ben@trumpetpower.com on http://www.trumpetpower.com/
I know the answer to this one: Art does! He has the longest uptime!
Seriously, you need to evaluate the situation. Until recently, the chances of a need to upgrade a production server were slim. For example, a lot of OpenBSD computers out there aren't running Apache; why (rush to) patch a non-running but vulnerable Apache? For those that were running a vulnerable httpd, it could be taken care of by upgrading and then an apachectl one-liner, with nobody ever noticing that the server was unavailable for a fraction of a second. Even still, some don't need upgrading, such as a bridging firewall.
Even in the worst case scenario, how many people running OpenBSD systems really can't afford five minutes of downtime at, say, three in the morning? There are lots of guides out there for upgrading, but the short version is that you can replace kernel and binaries and update /etc on a running system. Once you've done that, just reboot--no system (with clean disks) should take more than a few minutes to reboot.
If you can't even afford that kind of downtime, you're doing some kind of clustering / load balancing / whatever, and things get easy: take down one of your nodes, deal with it at your leisure. When you're done, bring it back up again and repeat with another node.
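The node-at-a-time approach described above can be sketched roughly as follows. This is only an illustration: the `drain`, `upgrade`, `healthy` and `restore` callbacks, the node names and the in-memory "pool" are all hypothetical stand-ins for whatever your load balancer and upgrade procedure actually provide.

```python
from typing import Callable, List, Sequence


def rolling_upgrade(
    nodes: Sequence[str],
    drain: Callable[[str], None],
    upgrade: Callable[[str], None],
    healthy: Callable[[str], bool],
    restore: Callable[[str], None],
) -> List[str]:
    """Upgrade nodes one at a time and return the ones that went back in."""
    upgraded: List[str] = []
    for node in nodes:
        drain(node)        # take this one node out of rotation
        upgrade(node)      # deal with it at your leisure
        if not healthy(node):
            # Stop here: the bad node stays out, the rest keep serving.
            raise RuntimeError(f"{node} failed its health check")
        restore(node)      # bring it back up before touching the next one
        upgraded.append(node)
    return upgraded


# Toy in-memory stand-ins (hypothetical; no real hosts are contacted).
in_rotation = {"node1", "node2", "node3"}
versions = {"node1": "2.8", "node2": "2.8", "node3": "2.8"}
min_in_rotation = len(in_rotation)


def drain(node: str) -> None:
    global min_in_rotation
    in_rotation.discard(node)
    # Track the smallest pool size seen, to show only one node is ever out.
    min_in_rotation = min(min_in_rotation, len(in_rotation))


def upgrade(node: str) -> None:
    versions[node] = "3.1"


def healthy(node: str) -> bool:
    return versions[node] == "3.1"


def restore(node: str) -> None:
    in_rotation.add(node)


result = rolling_upgrade(sorted(in_rotation), drain, upgrade, healthy, restore)
print(result, min_in_rotation)
```

The point of the loop is that the pool never loses more than one node at a time, and a failed upgrade stops the rollout instead of spreading to the remaining nodes.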
If downtime is ever a concern, you should absolutely have either a test box or a spare server. Practice on the test machine 'til you've got it perfect, and the real upgrade will be no problem. Or, use your spare server and swap network cables when you're done.
Careful planning is always your friend.
Cheers,
b&
By Scott Walters () scott@slowass.net on http://www.slowass.net
By Scott Walters () scott@slowass.net on http://www.slowass.net
The ideal case is that a close network of machines will have a dedicated admin. Slightly less optimal is a single admin maintaining a campus littered with machines, or a fleet of servers. Frightening cases include a machine that a client paid to have set up, but is not being maintained at all. Physical security makes this more realistic. I've seen OS9 machines abandoned for 10 years in the office of a movie theatre, Linux 1 boxes on autopilot for years, countless other examples. Typically, a machine on a good UPS will run until the dustbunnies kill it. Frequently the machine is replaced first. The movie theatre machine was replaced with a machine that sits on the Internet and talks to the home office. It will receive as little maintenance as the OS9 box, but it no longer enjoys physical isolation. Saying "hire an admin!" won't fix the problem. The profit margins are extremely low due to advertising and distributors' cuts on admissions. Paying above minimum wage for the popcorn shovelers is out of the question. Given the choice between an OS9 box freshly cleaned of dust bunnies and a brand new install of an OS that is likely to have a buffer overrun in a service when running a minimal complement, the OS9 solution is smarter. The fact that uptimes seldom exceed a year on a box free of remote buffer exploits clearly says to me that modern OSs aren't ready for the prime time of far reaching autonomous deployment.
-scott
(oops, sorry for double submit)
By Miod Vallat () miod@openbsd.org on mailto:miod@openbsd.org
By pravus () on
i guess it doesn't really count unless you think about the machine only being turned off for about 2 hours in 2 years.
the funny thing is that it is an old Pentium-200 (clocked to 225) and it's more stable than my newer toys... sometimes you just can't beat the older stuff.
By KryptoBSD () krypto@uncompiled.com on http://www.uncompiled.com
By Jeff Flowers () jeffrey@jeffreyf.net on http://www.jeffreyf.net
The URI is http://uptimes.wonko.com/
IIRC, a NetBSD box is leading the pack.
By Anonymous Coward () on
10:38PM up 845 days, 20 mins, 1 user, load averages: 0.42, 0.31, 0.17
By Anonymous Coward () on
Anyway, I finally got a card called a PC Weasel (www.realweasel.com) which lets me access the screen over a serial connection. I am going to send it off, along with a 3.1 CD and some new IDE drives, as soon as I get a chance, and upgrade this thing. Does anyone have experience with these with OpenBSD?
Any other thoughts or experiences on upgrading or managing OpenBSD boxes in cases where you really can't get to it? I realize that PC hardware was never designed for this kind of remote use, but that's all I can afford at the moment.
My current plan is to send a couple of new IDE drives and the PC Weasel card. They can install it there. I will back up the RAID array onto the IDE drives, and then install 3.1, and then copy the data back over from the IDE drives to the RAID. After that I will have tons of extra IDE space, maybe to back up some Oggs, etc. How does this sound as a safe way to do this?
By edu () on
My problem was really that it had reached ~70 days of uptime before I got a UPS and I was too greedy to shut it down in order to install the UPS. Well, I got 691 days of uptime before there was a power outage about 1½ months ago that took it down.
Still, I think that it was close to a miracle to get 691 days of uptime without a UPS, as last summer there was a huge power outage in Helsinki, Espoo and Vantaa (Finland) which lasted for a few hours (affecting about 700,000 people). The funny thing was that a small part of the city of Espoo had electricity and I was lucky enough to live in that area ;)
Getting a high uptime for an OpenBSD box doing nothing isn't such a big deal, as it would be close to a miracle if it crashed under no stress, and to me a hardware failure seems to be the most likely cause of a crash.
By Scientifik () devont@sdf.lonestar.org on mailto:devont@sdf.lonestar.org
By Anonymous Coward () on
ns1 up 524 days, 2:31, load average: 0.07 0.08 0.08
ns2 up 367 days, 2 mins, load average: 0.09 0.08 0.08
I was 1800 miles away at a week-long conference in late Jan/early Feb of 2001 when the BIND vulnerability surfaced. I was running a version that was vulnerable, and I would be in big trouble if they all went down. I had good connectivity at the hotel I was staying at (10 Meg, going through a T1).
There was a message in misc@ from Oct 2000 that talked about remote updating to 2.8, even to snapshots. I had tried it and had good success with running the latest snapshots, so I remotely upgraded a box to a snapshot of 2.8 at the time:
OpenBSD 2.8-current (GENERIC) #487: Sun Jan 28 03:46:59 MST 2001
I compiled the latest bind ver 8.x and got it up as a test name server. A few zones, about 6,000 hosts in the biggest one. It worked, so I went for it, remotely upgrading the two name servers to the latest snapshot and installing the non-vulnerable version of bind.
I looked like a hero when we started getting bind attacks on the name servers within 24 hours.
Been running ever since. Hats off to you OpenBSD developers!
What did I learn from the experience? That THAT PARTICULAR SNAPSHOT WAS ROCK SOLID! Nothing more. I have had other snapshots that installed fine and then crashed when I rebooted. I took a chance and dodged the bullet. It was either wait and hope that we didn't get hacked, try the install and hope it worked, or get a plane ticket back several days early, to clean up after either the hackers or a failed remote install of a _snapshot_.
Now uptime is cute, when comparing with the Windows Server folks, who struggle to stay up longer than 2 weeks. Oh, and by the way, these nameservers handle about 50 queries per second, day in, day out, month in, month out, year in, year out. Now I'm going to kill the uptimes and upgrade to fix the ssh and new bind vulnerabilities. If you can keep it running for years without worrying about vulnerabilities, and it is stable, go for it, is my advice.
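For a sense of scale, here is a quick back-of-the-envelope calculation of what that query rate adds up to over the uptimes quoted above. It is only a rough sketch: it assumes each server sustained a steady 50 queries per second for its whole uptime, which the figures above only approximate, and the helper name `total_queries` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_queries(qps: float, days: float) -> int:
    """Queries answered at a constant rate over the given number of days."""
    return round(qps * days * SECONDS_PER_DAY)


# Uptimes in whole days, taken from the ns1/ns2 lines above.
totals = {host: total_queries(50, days) for host, days in (("ns1", 524), ("ns2", 367))}

for host, count in totals.items():
    print(host, f"{count:,}")
```

Under those assumptions ns1 works out to roughly 2.3 billion queries and ns2 to roughly 1.6 billion.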
By Tom () tom@php.lu on mailto:tom@php.lu
1:37PM up 5 days, 13:09, 1 user, load averages: 0.15, 0.10, 0.09
But hey, it's 3.1-stable ... security is more important than uptime!
By Cindy () on
I am glad we are talking about uptimes of OpenBSD and not of guys. :)
-- Cindy
By Anonymous Coward () on
3:14PM up 664 days, 17:06, 1 user, load averages: 0.15, 0.10, 0.09
It has handled my mail and webpages perfectly, but I guess I really should upgrade...
By TASM () on