Level: Introductory
Daniel Robbins (drobbins@gentoo.org), President/CEO, Gentoo Technologies, Inc.
01 Feb 2001
The new 2.4 kernel is finally here, and now's an ideal time to track down a spare PC, put Linux on it, and see what it can do. In this two-part series, Daniel Robbins introduces you to Linux 2.4 Software RAID, a technology used to increase disk performance and reliability by distributing data over multiple disks. In this article, Daniel explains what software RAID-1, 4, and 5 can and cannot do for you and how you should approach the implementation of these RAID levels in a production environment. In the second half of the article, Daniel walks you through the simulation of a RAID-1 failed drive replacement.
Real-world RAID
In my previous article, I introduced you to
Linux 2.4's software RAID functionality, showing you how to set up linear,
RAID-0, and RAID-1 volumes. In this article, we look at what you need to know
in order to use RAID-1 to increase availability in a production environment.
This requires a lot more understanding and knowledge than just setting up
RAID-1 on a test server or at home -- specifically, you'll need to know exactly
what RAID-1 will protect you against, and how to keep your RAID volume up and
running in case of a disk failure. In this article, we'll cover these topics,
starting with an overview of what RAID-1, 4, and 5 can and can't do for
you, and ending with a complete test simulation of a failed RAID-1 drive
replacement -- something that you should actually do (with this article as your guide) if at all
possible. After going through the simulation, you'll have all the
experience you need to handle a RAID-1 failure in a real-world environment.
What RAID doesn't do
The fault-tolerant features of RAID are designed to protect you from the
negative impacts of a spontaneous complete drive failure. That's a good thing. But RAID isn't a
perfect fix for every kind of reliability problem. Before implementing a
fault-tolerant form of RAID (1, 4, or 5) in a production environment, it's extremely
important that you know exactly what RAID will and will not do for you.
When we're in a situation where we're depending on RAID to perform, we don't
want to make any false assumptions about what it does. Let's start by
dispelling common myths about RAID 1, 4, and 5.
A lot of people think that if they place all their important data on a RAID
1/4/5 volume, then they won't have to perform regular backups. This is
completely false -- here's why. RAID 1/4/5 helps to protect against
unplanned downtime caused by a random drive failure. However, it offers
no protection against accidental or malicious data corruption. If you
type "cd /; rm -rf *" as root on a RAID volume, you'll lose a lot of very
important data in a matter of seconds, and the fact that you have a 10-drive
RAID-5 configuration will be of little significance. Also, RAID won't help you
if your server is physically stolen or if there's a fire in your building. And
of course, if you don't implement a backup strategy, you won't have an archive
of past data -- if someone in your office deletes a bunch of important files,
you won't be able to recover them. That alone should be enough to convince you
that, in most circumstances, you should plan and implement a backup strategy
before even thinking about tackling RAID-1, 4, or 5.
Another mistake is to implement software RAID on a system composed of
low-quality hardware. If you're putting together a server that's going to do
something important, it makes sense to purchase the highest-quality hardware
that's still comfortably within your budget. If your system is unstable or
improperly cooled, you'll run into problems that RAID can't solve. On a
similar note, RAID obviously can't give you any additional uptime in the case
of a power outage. If your server is going to be doing anything relatively
important, make sure that it's been equipped with an uninterruptible power
supply (UPS).
Next, we move on to filesystem issues. The filesystem exists "on top" of your
software RAID volume. This means that using software RAID does not allow you
to escape filesystem issues, such as long and potentially problematic fscks if
you happen to be using a non-journalled or flaky filesystem. So, software RAID
isn't going to make the ext2 filesystem more reliable; that's why it's so
important that the Linux community has ReiserFS, as well as JFS and XFS in the
works. Software RAID and a reliable journalling filesystem make a great
combination.
RAID - intelligent implementation
Hopefully, the previous section dispelled any RAID myths that you might have
had. When you implement RAID-1, 4, or 5, it's very important that you
view the technology as something that will enhance uptime. When you
implement one of these RAID levels, you're protecting yourself against a very
specific situation -- a spontaneous complete drive failure (a single drive for
RAID-4 and 5; RAID-1 can survive more than one if the mirror has more than two
disks). If you experience this situation, software RAID will allow the system
to continue running, while you make arrangements to replace the failed drive
with a new one. In other words, if you implement RAID-1, 4, or 5, you'll be
reducing your risk of having a long, unplanned downtime due to a complete drive
failure. Instead, you can have a short planned downtime -- just enough time to
replace the dead drive. Obviously, this means that if having a
highly-available system isn't a priority for you, then you shouldn't be
implementing software RAID, unless you plan to use it primarily as a way to
boost file I/O performance.
A smart system administrator uses software RAID for a specific purpose --
to improve the reliability of an already very reliable server. If you're a
smart sysadmin, you've already covered the basics. You've protected your
organization against catastrophe by implementing a regular backup plan. You've
hooked your server up to a UPS, and have the UPS monitoring software up and
running so that your server will shut down safely in the case of an extended
power outage. Maybe you're using a journalling filesystem such as ReiserFS to
reduce fsck time and increase filesystem reliability and performance. And
hopefully, your server is well-cooled and is composed of high-quality hardware,
and you've paid close attention to security issues. Now, and only now, should
you consider implementing software RAID-1, 4 or 5 -- by doing so, you'll
potentially give your server a few more percentage points of uptime by guarding
it against a complete drive failure. Software RAID is that added layer of
protection that makes an already rugged server even better.
A RAID-1 walkthrough
Now that you've read about what RAID can and can't do, I hope you have reasonable
expectations and the right attitude. In this section, I'll walk you through
the process of simulating a disk failure, and then bringing your RAID volume
back out of degraded mode. If you have the ability to set up a RAID-1
volume on a test machine and follow along with me, I highly recommend that you
do so. This kind of simulation can be fun. And having a little fun right now
will help to ensure that when a drive really fails, you'll be calm and collected,
and know exactly what to do.
OK, our first step is to set up a RAID-1 volume; refer to my previous article if you need a refresher on how to do this.
To perform this test, it's essential that you set up your RAID-1 volume so
that you can still boot your Linux system with one hard drive unplugged,
because this is how we're going to simulate a drive failure.
Once you've set up your volume, you'll see something like Listing 1 if you cat /proc/mdstat.
Note that I'm using devfs, and that's why you see the extremely long device
names listed above. I'm actually using /dev/hda5 and /dev/hde1 as my RAID-1
disks. At the moment, the kernel software RAID code is synchronizing the
drives so that they're exact mirrors of each other. If your RAID-1 volume is
at this point, you can go ahead and create a filesystem on the volume, and then
mount it somewhere. Copy some files over to it, and then set up your
/etc/fstab so that the volume (/dev/md0) will be mounted when your system
boots. Here's the line I added to my fstab; yours may differ slightly:
/dev/md0 /mnt/raid1 reiserfs defaults 0 0
OK; we're almost ready to simulate a drive failure, but not quite. First, cat
/proc/mdstat again, and wait until all your volume's disks are synchronized.
When they are, your /proc/mdstat will look like Listing 2.
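If you'd rather not keep re-running cat by hand, here's a minimal sketch of a wait loop. It assumes that your kernel marks an in-progress synchronization in /proc/mdstat with a progress line containing "resync =" or "recovery ="; check that against what your own kernel prints before relying on it. The function names sync_in_progress and wait_for_sync are hypothetical helpers of my own, not part of raidtools.

```shell
# Hypothetical helpers; not part of raidtools.
# Assumption: an in-progress sync shows up in /proc/mdstat as a line
# containing "resync =" or "recovery =".

# Succeeds (exit status 0) while the given mdstat file reports a sync.
# $1 = file to inspect (default: /proc/mdstat)
sync_in_progress() {
    grep -Eq '(resync|recovery) *=' "${1:-/proc/mdstat}"
}

# Blocks until no sync is reported, polling every $2 seconds (default: 10).
wait_for_sync() {
    while sync_in_progress "${1:-/proc/mdstat}"; do
        sleep "${2:-10}"
    done
}
```

Called with no arguments, wait_for_sync polls /proc/mdstat every ten seconds and returns once the progress line has disappeared.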
The simulation begins
OK, now that the resync is complete, we're ready for the simulation.
Go ahead and shut down your machine and power it down. Then, open it up and
unplug one of the hard disks that make up your RAID-1 array. Of course, you
won't want to unplug the disk that contains your Linux root partition -- we'll
need to boot Linux again! OK, now that the hard drive is unplugged, bring
the machine back up. Once you log in, you should find that /dev/md0 is mounted
and that you're still able to use the volume. When you cat /proc/mdstat,
you'll see the major change:
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md0 : active raid1 ide/host0/bus0/target0/lun0/part5[0]
4610496 blocks [2/1] [U_]
unused devices: <none>
Here, you can see that my /dev/md0 volume is running in degraded mode. I
unplugged drive /dev/hde, so /dev/hde1 wasn't found when the kernel booted and
tried to autostart my array. Fortunately, the kernel found /dev/hda5,
and /dev/md0 was able to start in degraded mode. As you can see, the /dev/hde1 partition
isn't listed in /proc/mdstat, and one of the RAID disks is marked as "down"
("[U_]" instead of "[UU]"). But hey, since /dev/md0 is still going, software
RAID-1 is doing what it's supposed to do -- keeping our data available.
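The "[U_]" marker is easy to check for in a script. Here's a minimal sketch that prints the name of every array whose status map contains an underscore. It relies only on the /proc/mdstat layout shown above (an "mdN :" line followed by a line carrying the status map); the function name degraded_arrays is hypothetical.

```shell
# Hypothetical helper; not part of raidtools.
# Prints the name of every md array whose status map (for example "[U_]")
# shows a missing member.
# $1 = file to inspect (default: /proc/mdstat)
degraded_arrays() {
    awk '
        /^md[0-9]+ *:/    { name = $1 }                     # remember which array we are in
        /\[[U_]*_[U_]*\]/ { if (name != "") print name }    # an underscore marks a missing disk
    ' "${1:-/proc/mdstat}"
}
```

Run against the output above, degraded_arrays prints md0; on a healthy system it prints nothing, which makes it easy to call from a cron job that mails you when something is wrong.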
Recovery
Right now, we're experiencing a simulated drive failure. If the drive that
currently doesn't have power actually failed while the system was running, this
is the kind of situation we'd be in. Our RAID-1 volume would be running in
degraded mode, meaning that our volume is still available but without any
redundancy. At a convenient time, we'd want to shut down the system, replace
the failed drive, and start the system back up again. Our RAID-1 volume would
still be running in degraded mode at this point.
Once we have the new drive in the machine, we'd want to create a RAID
autodetect ("FD") partition of the appropriate size on our new disk. An
additional reboot may be needed so that Linux can reread the disk's partition
tables. Once the new partition is visible to the system, we're ready to
restore our degraded RAID-1 array -- then, we'll have some redundancy again.
Of course, we're only performing a simulation. To practice adding a partition
back into our RAID array, we can do one of two things, depending on what kind
of scenario you'd like to prepare for. You can either shut down your machine,
plug the drive in, boot it up, and add the old partition back to the array, or
you can shut down your machine, plug the drive in, boot up, wipe the drive,
create a new RAID autodetect ("FD") partition to add to the array (of the
correct size, of course -- at least as big as the partition it's replacing) and
then add this brand-new partition to the array. The second choice would be
closer to what would happen in the event of a real drive failure, while the
first would simulate something like a failed disk controller or bad cable
situation -- where one of your mirror drives was temporarily unavailable,
causing /dev/md0 to run in degraded mode, and requiring one of the partitions
to be added back to the volume after the problem was remedied. Whichever
simulation you choose to do, the "fix" is the same -- after the new partition
is ready, we need to manually add it back to the /dev/md0 volume.
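Before adding a brand-new partition, it's worth confirming that it really is big enough. Here's a minimal sketch that compares block counts from /proc/partitions, assuming the usual layout with the block count in the third column and the partition name in the fourth. It checks the new partition against a reference partition of your choosing (the surviving half of the mirror is a conservative choice). The function names part_blocks and is_big_enough are hypothetical, and the partition names must be spelled exactly as /proc/partitions shows them on your system.

```shell
# Hypothetical helpers; not part of raidtools.
# Assumption: /proc/partitions lists "#blocks" in column 3 and the
# partition name in column 4.

# Prints the block count of partition $1 as listed in file $2
# (default: /proc/partitions).
part_blocks() {
    awk -v name="$1" '$4 == name { print $3 }' "${2:-/proc/partitions}"
}

# Succeeds if partition $2 (the new one) has at least as many blocks as
# partition $1 (the reference). Fails if either partition is not listed.
is_big_enough() {
    ref_blocks=$(part_blocks "$1" "${3:-/proc/partitions}")
    new_blocks=$(part_blocks "$2" "${3:-/proc/partitions}")
    [ -n "$ref_blocks" ] && [ -n "$new_blocks" ] && [ "$new_blocks" -ge "$ref_blocks" ]
}
```

For example, "is_big_enough hda5 hde1 && echo OK" prints OK only when hde1 is at least as large as hda5.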
Looking at dmesg
Before we add the partition back to our array, this would be a good time to
take a look at our kernel boot messages. If you type "dmesg | more", you'll be
able to view the kernel boot messages. You should see a bunch of text similar to Listing 3.
Now would be a good time to carefully read these messages, because they'll help
you to understand the process that the kernel uses to autostart /dev/md0,
giving you another valuable insight into the inner workings of Linux software
RAID. If you read the kernel output listed above, you'll find that my kernel
found /dev/hda5 and /dev/hde1, but hde1 was out of sync with hda5. So, the
kernel started up /dev/md0 in degraded mode, using /dev/hda5 and not touching
/dev/hde1 at all. Now, it's time to add our original (or newly created)
partition to our volume. Here's how.
Restoration continues
First, if your replacement partition has a new device name, update /etc/raidtab
so that it reflects this new information. Then, add the new partition to the
volume using the following command, replacing /dev/hde1 with the device name of
the partition you're adding:
# raidhotadd /dev/md0 /dev/hde1
Your hard drive lights should begin glowing as reconstruction begins. Go ahead
and cat /proc/mdstat to check the status of the RAID-1 reconstruction that's
now in progress (see Listing 4).
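If you want just the progress figure rather than the whole file, here's a minimal sketch that pulls the percentage out of the progress line. It assumes that your kernel reports reconstruction as "recovery = NN.N%" (or "resync = NN.N%") in /proc/mdstat; the exact wording may differ between kernel versions, so compare it with your own output first. The function name rebuild_percent is hypothetical.

```shell
# Hypothetical helper; not part of raidtools.
# Assumption: the progress line looks like "... recovery = 12.3% (...)".
# Prints the percentage (without the % sign) for each array being rebuilt.
# $1 = file to inspect (default: /proc/mdstat)
rebuild_percent() {
    awk '
        match($0, /(resync|recovery) *= *[0-9.]+%/) {
            s = substr($0, RSTART, RLENGTH)   # e.g. "recovery = 12.3%"
            sub(/^[^0-9]*/, "", s)            # drop the leading keyword and "="
            sub(/%$/, "", s)                  # drop the trailing percent sign
            print s
        }
    ' "${1:-/proc/mdstat}"
}
```

When no reconstruction is running, rebuild_percent prints nothing.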
In a matter of minutes, your RAID-1 volume will be back to normal (see Listing 5). Voila! We've successfully recovered from a simulated drive failure, and
you're ready to start using RAID-1 in a production environment. You can now
affix your homemade "RAID-1 certified" sticker to your forehead and begin
flapping your arms and running around the office to the delight of your
coworkers. Actually, maybe that isn't such a great idea. See you next article :)
About the author
Daniel Robbins lives in Albuquerque, New Mexico. He is the
President/CEO of Gentoo Technologies, Inc., the Chief Architect of the
Gentoo Project and a contributing author to several books published by MacMillan: Caldera OpenLinux Unleashed, SuSE Linux Unleashed, and Samba Unleashed. Daniel has been involved with computers
in some fashion since the second grade, when he was first exposed to the Logo programming language as well as a potentially dangerous dose of Pac-Man. This probably explains why he has since served as a Lead Graphic Artist at SONY Electronic Publishing/Psygnosis. Daniel enjoys spending time with his wife Mary, and his new baby daughter Hadassah. You can contact Daniel at drobbins@gentoo.org.