As many of you probably know by now, one of the Arthmoor servers is down.
Late last night, ironically as I was setting up and preparing the first set of regular database backups, the drive in the server died. One minute it was running fine, doing whatever, the next it was sitting there clicking. The infamous click of death. I spent only a short amount of time trying to revive it, since I knew failures like this are normally mechanical and that further attempts could have done more damage. So when it didn't come back, I killed the power and pulled it out.
This morning I took the drive down to datamechanix.com in Irvine and handed it over to them for 3 hours. I'm not sure if I believe the conclusion the guy reached since he was doing everything he could to try and keep me from wanting to drive down there on a weekend. Supposedly something called the "servo track" is either corrupted or destroyed. He said he couldn't see "the pattern" on the scope. Maybe he's right, I don't know. But because he seemed more interested in getting rid of me than telling me how to fix this I brought the drive back home and am seeking second opinions from other data recovery firms.
So what does all this mean for you?
The following sites are going to be down, or operating in a severely limited fashion until something more can be done:
www.afkmud.com - The codebase files are safe. However, the database containing 5 years' worth of posts is gone.
www.fussproject.org - Similar situation. The codebase files are safe, but the database containing 5+ years' worth of information is gone.
www.arthmoor.com - The static pages for the site are still operational. But what little was in the forum is lost. Email service is being migrated to the other server. DNS service is going to be impacted slightly due to running on the slave system. Boralis is effectively dead. Crondonia is up and its databases have all been backed up to off-system storage.
www.iguanadons.net - The code is safe. Like anyone really cares about this, but I do. Recovery on this one is feasible from Google cache and such, there wasn't a ton of stuff there yet.
IMC2 Server02 - Down until it can be moved. Anyone connected to it will just need to be patient.
www.alsherok.net - Offline. Not lost though. The irony of never having moved it up to non-static HTML pages.
There are other sites affected by this, obviously. 15 customers in total have been directly impacted. Their sites are now down. Hopefully they've heeded my repeated advice in the past to keep their own backups. I find myself wishing that advice had been heeded here. Two of my immortal staff have probably lost years' worth of blog posts. One of their girlfriends might have lost the whole thing going back to 2002. There's probably a bunch of other miscellaneous losses that have resulted from this too.
So what now? Well. I'm going to get some other recovery opinions. If that doesn't go well then I suppose there's nothing left to do but pick up the pieces and rebuild as best as possible. The lesson to be learned here is not to put off doing backups. Especially after having hounded so many people to do them. I know. I should know better considering I work in IT and see this stuff happen a lot. The other lesson is next time don't be a cheap bastard and put off buying the necessary drives to do RAID mirroring. Had I done so when the servers got upgraded this would be a simple matter of popping in a new drive and waiting for the hardware to fix it. Hindsight is 20/20.
If anyone happens to know of a top notch data recovery firm that can give me a definitive answer, please let me know.
I'm going to be putting in a new pair of drives and setting it up in a RAID1 configuration. I should be getting those in tomorrow and once I've done that I'll rebuild the virtual servers as needed. So I'd rather wait to do that if possible. But if you'd like something up faster I can arrange that on the Crondonia box.
Alright. There's a ray of hope. As I was scouring Google for blog entries I ran across one that reminded me I never blanked the old drives from the server upgrades back in February. This isn't an ideal situation but it beats the hell out of trying to scrape through the other crap backup I had from 3 years ago.
The major downside of course is that anything newer than the first week of February 2007 is still gone. But I know that at least for me, losing 8 months of data from sites is a lot better than losing it all. And it makes the drudge work of actually reinstalling most of it pretty much vanish. The only thing left to do will be to make any necessary upgrades and skinning repairs.
The new server parts will be in later this afternoon and I'll get Boralis rebuilt as quickly as I can and then copy over the recovered data.
Not at this point. The cost for data recovery would be prohibitively high. I've had 3 different estimates come in ranging from $1000 to $4000, and they all say that if the first assessment of the servo track (apparently this is actually quite important) is correct, it would require nothing short of a miracle to recover anything at all. Which would drive the cost up even higher.
You need a friend in police computer data retrieval :) My dad used to do it back in the day when he was a police officer, though he did computer crime and fraud investigation... like, 10 years ago. But hey, I'm sure you'll find someone!