Wednesday, August 1, 2012

RAID 5 Hell

In the IT world it's apparently a given that RAID 5 sucks. Maybe the IT tech you talk to won't use those exact words when you ask about RAID 5, but that will be the message behind them. It's true, it's true. Before this week I hadn't run into a RAID 5 problem, because where I work we have a fairly humble system of simply backing up all data and images to a NAS, so if there is a problem we restore from our backups. I'm not sure this non-RAID system is good for every network, so don't think I'm promoting a non-RAID data protection plan for you; just know that it's doable and not a terrible plan.

One of my co-IT-workers is out on sick leave this week and probably for two more weeks, so all calls from his department come to me. On Monday I get a call that a lady can't remote in to her mapping server. She said she went to the server to see if it was powered on (good idea!) and it was on, but it was stuck at "press F1 to resume." She said that when she hit F1, the next screen showed a message that was something about "RAID-5 degraded not bootable." My first thought: CRAP. I know that a RAID 5 array needs at least three hard drives and can survive only one of them failing. If one fails, you can swap the bad drive out for a new drive of the same size or larger (matching the model is the safe choice) and rebuild the array, but if two fail then there is a serious possibility of not being able to rebuild it. In fact, I don't know of anyone who has rebuilt a RAID 5 array after two out of three hard drives failed; maybe someone out there has, and if you have then please tell me the story, because I would love to add it to my growing list of tech stories.

Anyway, I check things out, and after about an hour of troubleshooting I come to the conclusion that two of the three hard drives have in fact failed. CRAP. I let the user of the server know that I can run to the local Staples, grab two new drives, and just see if I can rebuild the array. She says, "cool," and that's the plan. I write down the type of hard drive used in the array: 1 TB Seagate Barracuda 7200 RPM.
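For anyone wondering why one dead drive is survivable and two are not, here is a minimal sketch of the XOR parity idea behind RAID 5. It is a simplification, not how any particular controller works: the `xor_blocks` helper and the four-byte "drives" are made up for illustration, and a real array rotates parity across all the drives instead of keeping it on one.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical three-drive array: two data blocks plus parity.
drive_a = b"MAPS"
drive_b = b"DATA"
parity = xor_blocks([drive_a, drive_b])

# One drive lost: XOR the two survivors to get the missing block back.
rebuilt_b = xor_blocks([drive_a, parity])
rebuilt_a = xor_blocks([drive_b, parity])
print(rebuilt_b)  # b'DATA'

# Two drives lost: only drive_a survives. Any guess for drive_b can be paired
# with a parity block that makes the stripe look valid, so the one surviving
# drive cannot tell us which guess was the real data.
for guess in (b"DATA", b"JUNK"):
    implied_parity = xor_blocks([drive_a, guess])
    assert xor_blocks([drive_a, implied_parity]) == guess
```

With two of three drives gone, there is nothing left to XOR against, which is why the second failure is the one that hurts.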

I arrive at Staples, they have that type of drive, and I immediately do my happy dance. I take the card to the cashier, she calls for a tech to go in the back and grab two drives for me, and he comes back to the front with one drive and informs me that they only had one left. I then execute my sad face. I go ahead and purchase the one drive. When I get back to work, I inform the user of the happenings. She is somewhat sad. I tell her we can try to rebuild the array with the working drive, the new drive, and one of the failed drives, just to see if it works (I'm somewhat of a hopeful guy who likes to try almost-certain-failure scenarios), and to no one's surprise the RAID would not rebuild. We went through the immediate actions we could take:
* Reinstall Windows Server 2008 R2 on the new drive, add the surviving drive as a backup drive, then order a third drive and install it as another backup drive when it arrives. The server would be up and running in just a few hours.
* Wait for another new drive, which would arrive in two days at the earliest, then install it with only a small possibility of the RAID 5 array rebuilding. The server would not be up and running soon, and two departments use it to access maps out on the road and at remote locations.

We went with the first option. From this experience it's plain to me that RAID 5 sucks unless you have a five- or six-drive RAID 5 array. I would rather have up-to-date system-image backups of the computer for simple restore options in case of system failure. RAID 6 and RAID 10 are nice arrays, I think, but this is the second time I've had problems with Intel's RAID 5 arrays. The first time, I was able to shut down the PC, disconnect and reconnect the SATA cables, and the array rebuilt itself. Until now I had never had drives actually fail in a RAID 5 array. The good thing in this experience is that this mapping server pulled its data from another PC, so there was actually zero critical data loss. The only thing lost was availability. The mapping company will have to come down to our location and install the mapping software, but that should happen by the end of this week.
