Intro to RAID

Jarret W. Buse · Jul 10, 2013

RAID (Redundant Array of Inexpensive Disks)

Many file systems include support for redundancy and fault tolerance.
Redundancy means duplicating components so that the failure of one part does not bring down the whole system.
Fault tolerance is the ability of a computer system to continue working after a failure.

RAID provides fault tolerance for data stored on hard drives. It can be implemented by a dedicated hardware controller or in software (in this case, the file system).
To start with, a disk array is a grouping of physical hard disks that act as one virtual disk drive. Depending on the RAID type, the virtual disk may provide fault tolerance.
Some Linux file systems natively support one or more of these three RAID types. They are:
1. RAID 0 (Striping)
2. RAID 1 (Mirroring)
3. RAID 1+0 or RAID 10 (Striping/Mirror)

RAID 0 (Striping)
RAID 0 does not provide fault tolerance, but it does provide a performance boost.
To understand striping, start with interleaving, a method of laying out logically contiguous blocks of data on a disk. With interleaving, blocks that are logically consecutive may not be laid out physically next to one another. The read/write heads hover over the disk as it spins. Depending on the speed of the disk and the ability of the read/write heads, it may not be possible to read each physically contiguous block in a single pass. If not, the disk has to spin all the way around again to bring the next block under the heads, which wastes time. To speed up reading and writing, the interleave may be set to every second or third block. With an interleave of every second block, the heads read every second physical block as it passes under them. By interleaving in this way, the data can be read in logical order without having to spin the disk around again and waste time.
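The interleave described above can be sketched in a few lines of Python. This is only an illustration: the function name `interleaved_layout` and the track sizes are assumptions made for the example, not something taken from a real disk driver.

```python
def interleaved_layout(num_sectors, interleave):
    """Return the physical order of logical sectors on one track.

    layout[p] is the logical sector stored in physical slot p.
    interleave=1 puts consecutive logical sectors side by side;
    interleave=2 places each next logical sector two slots further on.
    """
    layout = [-1] * num_sectors  # -1 marks a slot that is still free
    slot = 0
    for logical in range(num_sectors):
        # If the target slot is already taken, move on to the next free one.
        while layout[slot] != -1:
            slot = (slot + 1) % num_sectors
        layout[slot] = logical
        slot = (slot + interleave) % num_sectors
    return layout


if __name__ == "__main__":
    print(interleaved_layout(8, 1))
    print(interleaved_layout(8, 2))
```

With an interleave of 2 on an 8-sector track, logical sectors 0, 1, 2 and 3 land in physical slots 0, 2, 4 and 6, so the heads have one slot of "breathing room" between consecutive reads.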
In RAID 0, two or more disks are used and the interleave starts on the first disk. It then moves to the second disk and so on, until it returns to the first disk and goes around the other disks again. The method is referred to as "striping" since it writes a section of data on one disk, then on the second, and so on. It writes data in "stripes" across each disk. Since each disk is "striped" identically in size, each disk partition used must be the same size. For example, two 500 GB disks and one 750 GB disk would result in only 500 GB being used from each disk, for a total of 1500 GB (1.5 TB).
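As a rough sketch of the striping idea, the following Python maps a logical block to a disk and computes the usable size of the array. The function names are hypothetical, and the stripe unit is assumed to be a single block; real implementations use larger, configurable stripe sizes.

```python
def stripe_location(logical_block, num_disks):
    """Map a logical block to (disk index, block offset on that disk).

    Assumes a stripe unit of one block, written round-robin across disks.
    """
    return logical_block % num_disks, logical_block // num_disks


def raid0_capacity_gb(disk_sizes_gb):
    """Usable RAID 0 size: the smallest member's size is used on every disk."""
    return min(disk_sizes_gb) * len(disk_sizes_gb)


if __name__ == "__main__":
    for block in range(6):
        print(block, stripe_location(block, 3))
    print(raid0_capacity_gb([500, 500, 750]))
```

For the example in the text, `raid0_capacity_gb([500, 500, 750])` gives 1500, matching the 1.5 TB figure.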
The reason for the performance increase is that if one disk is busy reading or writing part of one file, another part of a file can be accessed from a separate disk that is not busy. For example, if a system process or a background application is running from a stripe on the first disk and another application is started from a stripe on a second disk, the processes will not "fight" for hard disk throughput. Of course, this works much better if each hard disk is on its own physical controller, so that one controller does not become a bottleneck for data bandwidth. A single shared controller is also a single point of hardware failure.
No matter the number of disks used in the disk array, if one disk fails, the whole virtual drive is inaccessible. The main thing to remember about RAID 0 is that it does not provide redundancy of any sort, only a performance boost. RAID 0 is no substitute for regular data backups.
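To put a number on that risk, here is a small Python sketch. It assumes each disk fails independently with the same probability over some period, which is a simplification, and the 5% figure below is purely hypothetical.

```python
def raid0_failure_probability(per_disk_failure, num_disks):
    """Chance that a RAID 0 array is lost, given that any one failure loses it.

    Assumes disks fail independently with the same probability.
    """
    return 1.0 - (1.0 - per_disk_failure) ** num_disks


if __name__ == "__main__":
    # Hypothetical 5% chance of failure per disk over the period considered.
    for disks in (1, 2, 4):
        print(disks, raid0_failure_probability(0.05, disks))
```

Under these assumptions, every disk added to the stripe raises the chance of losing the whole virtual drive.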

RAID 1 (Mirroring)
RAID 1 is mirroring, sometimes called disk cloning. RAID 1 provides fault tolerance, along with some performance boost.
RAID 1 requires at least two disks, usually arranged in pairs. With two disks, for example, the same data is stored on both. The second disk is an exact "mirror" of the first disk. Any data changed on one disk is also changed on the other. If one disk fails, the second disk remains online and continues to provide access to the files. Once the failed disk is repaired or replaced, the data can be mirrored again and fault tolerance is restored.
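The behavior described above can be modeled with a toy Python class. The `Mirror` class and its method names are invented for this illustration and do not correspond to any real RAID tool; each "disk" is simply a dictionary of blocks.

```python
class Mirror:
    """Toy RAID 1 set: every healthy member holds the same blocks."""

    def __init__(self, num_disks=2):
        # Each member maps block number -> data; None marks a failed disk.
        self.disks = [{} for _ in range(num_disks)]

    def write(self, block, data):
        # A write goes to every member that is still healthy.
        for disk in self.disks:
            if disk is not None:
                disk[block] = data

    def read(self, block):
        # A read is served by the first healthy member.
        for disk in self.disks:
            if disk is not None:
                return disk[block]
        raise IOError("all mirror members have failed")

    def fail(self, index):
        self.disks[index] = None

    def rebuild(self, index):
        # Re-mirror: copy every block from a surviving member.
        survivors = [d for d in self.disks if d is not None]
        if not survivors:
            raise IOError("no healthy member to copy from")
        self.disks[index] = dict(survivors[0])


if __name__ == "__main__":
    m = Mirror()
    m.write(0, b"data")
    m.fail(0)
    print(m.read(0))  # still readable from the second disk
    m.rebuild(0)
```

After `fail(0)`, reads keep working from the remaining disk, and `rebuild(0)` stands in for re-mirroring onto a repaired or replaced disk.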
As with RAID 0, performance and fault tolerance can be increased by using two disk controllers instead of one for both disks.
It is possible to have four disks, with two being mirrored to the other two disks. For example, Disk 1 is mirrored to Disk 2, while Disk 3 is mirrored to Disk 4. At this point you may be asking about the loss of usable drive space. If two 500 GB drives are mirrored, then only 500 GB of data can be written to the mirror. Until the mirror is broken, the total usable drive space is the size of the smallest drive in the mirrored set. To look at it another way, if a 500 GB disk is used with a 750 GB disk, the mirror could only be as large as 500 GB. The other 250 GB on the larger disk could be a separate partition for data, but this can slow the mirror's performance.
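The space arithmetic above can be written out as a short Python sketch. The function names are hypothetical and sizes are in GB.

```python
def mirror_usable_gb(disk_sizes_gb):
    """Usable size of a mirrored set: the size of its smallest member."""
    return min(disk_sizes_gb)


def leftover_gb(disk_sizes_gb):
    """Space on each member that the mirror cannot use."""
    usable = min(disk_sizes_gb)
    return [size - usable for size in disk_sizes_gb]


if __name__ == "__main__":
    print(mirror_usable_gb([500, 500]))
    print(mirror_usable_gb([500, 750]), leftover_gb([500, 750]))
```

For the 500 GB and 750 GB pair, the mirror is 500 GB and the larger disk has 250 GB left over.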
Be aware that when one disk in the mirror fails, the RAID 1 fault tolerance is no longer available to provide redundancy. If both drives are connected to the same disk controller and it fails, both drives are no longer available.

RAID 1+0 - RAID 10 (Stripe/Mirror)
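RAID 1+0, sometimes called RAID 10, is a combination of striping and mirroring. A minimum of four disks is required. In the layout described here, the first two drives are striped as in a RAID 0 configuration, and the data is then mirrored to the last two drives. (Strictly speaking, a mirror of two striped sets is usually labeled RAID 0+1, while RAID 1+0 stripes data across mirrored pairs; both combine the two techniques.) The result combines the performance of RAID 0 with the fault tolerance of RAID 1.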
If any one disk fails, the fault tolerance is lost until the disk is replaced and the RAID 1 mirror is repaired between the disks. The RAID 0 performance is still in place for the remaining complete striped set. For example, if Disks 1 and 2 are striped and then mirrored to Disks 3 and 4, and Disk 3 fails, the set of Disks 3 and 4 is no longer available. RAID 0 on Disks 1 and 2 is still active. Once Disk 3 is replaced or repaired, RAID 1 can be reactivated and the fault tolerance is once again in place.
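The four-disk example above (Disks 1 and 2 striped, then mirrored to Disks 3 and 4) can be checked with a small Python sketch. The function name and the way the striped sets are represented are assumptions made for illustration.

```python
def array_available(failed_disks, stripe_sets=((1, 2), (3, 4))):
    """Return True if at least one striped set has all of its disks healthy.

    Models the layout from the text: each tuple is one striped set, and
    the sets are mirrors of each other.
    """
    return any(
        all(disk not in failed_disks for disk in stripe_set)
        for stripe_set in stripe_sets
    )


if __name__ == "__main__":
    print(array_available({3}))     # Disks 1 and 2 still serve the data
    print(array_available({3, 4}))  # the whole second set is gone, first survives
    print(array_available({1, 3}))  # one disk lost from each set
```

In this model the array survives any single failure, but losing one disk from each striped set takes the whole array offline.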

Again, place each disk or set of disks on a separate controller for better performance and fault tolerance. As with any fault-tolerant setup, RAID is not a 100% guarantee against all data loss. Always perform a backup of all data regularly.
