RAID Technology

RAID Technology (Redundant Array of Independent Disks)

Several storage virtualization technologies are in use today to provide better functionality and more advanced features within a storage system. RAID is one such technology, defined by David Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987.

This post walks you through the main aspects of RAID.

RAID Defined

RAID is the acronym for “Redundant Array of Independent Disks”. RAID combines multiple small, independent disk drives into a single logical unit, which can yield performance exceeding that of one large drive. Such logical units are commonly known as RAID arrays, and each appears to the computer as a single virtual drive. The hard disks in a RAID array can be accessed in parallel.

Why do we use RAID Technology?

The need for RAID comes down to two points:

1) Overall increase in I/O performance: an array of disks accessed in parallel gives more data throughput than a single disk.

2) Data redundancy: RAID provides fault tolerance by storing information redundantly in various ways.

RAID Implementations

There are two forms of implementation for RAID Technology, hardware and software.

Software RAID:

Software RAID is part of the operating system and works at the partition level. It runs on the server’s CPU, so it depends directly on that CPU’s performance and load. It is best suited to RAID 0 and RAID 1 setups. It is a low-cost solution, which makes it a good fit for home and small business users. Because it consumes host memory and CPU cycles, it may degrade server performance, so it is the least recommended option for higher RAID levels that use parity. Another drawback is that it often lacks support for advanced RAID features such as hot swapping (replacing a faulty hard disk without shutting down the server).

Hardware RAID:

Hardware RAID uses a dedicated RAID controller to manage the array. Since the RAID logic runs on the controller’s own processor, it places little or no overhead on the host’s CPU and RAM. The host CPU can execute applications while the controller’s processor simultaneously executes array functions. Hardware RAID controllers connect to the system and hard drives through interfaces such as SCSI or IDE/ATA. A RAID controller can either be built into the system, on the motherboard or as a controller card (Bus-Based or Controller Card Hardware RAID), or be an external dedicated hardware solution (Intelligent, External RAID Controller). Since hardware RAID generally provides the highest performance, it can be used for mission-critical applications.

RAID Levels:

RAID can be classified into six levels based on how the RAID controller distributes the data of the virtual hard disk across the physical disks. The six levels are RAID-0 to RAID-5. Each RAID level has a different disk fault tolerance and different trade-offs in features and performance. Among these levels, only RAID 0, 1, 3 and 5 are commonly used.
Apart from the levels mentioned above, certain combinations such as RAID 10 and RAID 01 are also possible.

RAID-0

Figure 1 RAID-0 layout

Technique(s) used: Block Level Striping.

Explanation: In RAID 0, the data is broken into block-sized units and striped across the hard disks. In this example, the first block of data, A0, is written to the first disk, the second block, A1, is written to the second disk, the third block, B0, is written to the first disk again, and so on. Because consecutive blocks land on different disks and can be written in parallel, the overall write performance of the disk set is usually much faster than that of a single disk. When the RAID controller receives a read request, it can read A0 and A1 at the same time since they are on separate disks, roughly doubling read performance relative to a single disk in this two-disk example. Performance is very good, but the failure of any one disk in the array results in data loss.
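The round-robin placement described above can be sketched in a few lines of Python. This is only an illustration, not how any particular controller works: the function name locate_block and the stripe unit of exactly one block are assumptions made for the example.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes a stripe unit of exactly one block and round-robin placement.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least 2 disks")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % num_disks, logical_block // num_disks


# Two-disk array: logical blocks 0, 1, 2 play the role of A0, A1, B0 in the text.
for block in range(4):
    disk, offset = locate_block(block, num_disks=2)
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Consecutive logical blocks always fall on different disks, which is why a request covering several blocks can be served by all the disks at once.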

The minimum number of disks needed to set up a RAID-0 array is 2.

The capacity of RAID 0 can be calculated as:

Capacity = n * min(disk sizes)

where n = number of disks in the array.
min(disk sizes) = minimum common capacity across the drives.

Merits:

High I/O performance

De-merits:

No data redundancy.

RAID-1

Figure 2 RAID-1 layout

Technique(s) used: Mirroring

Explanation: RAID 1 is the first RAID level that offers redundancy, which it achieves through mirroring. RAID 1 writes each incoming block of data to one drive and creates a mirror image (copy) of it on a second drive. In this illustration, when block A1 is written to disk 0, the same block is also written to disk 1. Since both copies can be written in parallel, the overall write performance of a RAID-1 array is about the same as that of a single disk. Read performance can be better than that of a single disk, because a read can be served from either drive. If one of the drives fails, the other drive still has all the data that existed on the system. This provides full redundancy for the data on the system.
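To make the mirroring behaviour concrete, here is a toy model in Python. It is a sketch under simplifying assumptions: the Mirror class and its methods are hypothetical names, and each "disk" is just an in-memory dictionary rather than a real device.

```python
class Mirror:
    """Toy two-way mirror: each 'disk' is a dict of block number -> bytes."""

    def __init__(self) -> None:
        self.disks: list[dict[int, bytes]] = [{}, {}]
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to all healthy disks.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Any healthy disk can serve the read.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("all mirrors have failed")

    def fail(self, index: int) -> None:
        # Simulate a drive failure: its contents are gone.
        self.failed[index] = True
        self.disks[index].clear()


m = Mirror()
m.write(1, b"A1")
m.fail(0)          # disk 0 dies
print(m.read(1))   # the copy on disk 1 is still readable
```

A real implementation would also spread reads across both drives, which is where the read speed-up comes from; this sketch simply picks the first healthy disk.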

The minimum number of disks needed to set up a RAID-1 array is 2.

Storage capacity is only as large as the smallest drive. Capacity of the RAID-1 array can be calculated using

Capacity = min(disk sizes)

where min(disk sizes) = minimum common capacity across the drives.

Merit:

Provides full redundancy of data and good read performance.

De-Merits:

Limited write performance.

RAID-2

Technique(s) used: Bit level striping + Hamming code error correction.

Explanation: RAID 2 stripes data at the bit level instead of the block level and uses a Hamming code for error correction. In RAID 2, the first bit is written to the first disk, the second bit to the second disk, and so on. A Hamming code is then calculated over those bits, and the resulting check bits are stored on separate disks. This level of RAID was intended for drives that do not have built-in error detection. If one of the disks fails, the remaining bits of the byte and the associated ECC (Error Correction Code) bits can be used to reconstruct the data.

RAID-2 is no longer really used, since modern hard disks have built-in error correction.
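For readers curious about how a Hamming code recovers a bad bit, here is a small Python sketch. The article does not say which Hamming code RAID 2 uses, so the classic Hamming(7,4) code (4 data bits plus 3 check bits) is an assumption chosen for illustration, as are the function names. Think of each of the 7 bits as living on its own disk.

```python
def hamming74_encode(data: list[int]) -> list[int]:
    """Encode 4 data bits as a 7-bit codeword (check bits at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code: list[int]) -> list[int]:
    """Return the 4 data bits, repairing at most one flipped bit."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_pos = s1 + 2 * s2 + 4 * s4  # 0 = no error, else 1-based bit position
    if error_pos:
        c[error_pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
word = hamming74_encode(data)
word[4] ^= 1  # corrupt the bit held by one "disk"
print(hamming74_correct(word) == data)
```

The three check bits together point at the position of the damaged bit, so it can be flipped back without knowing in advance which disk misbehaved.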

RAID-3

Figure 3 RAID-3 layout

Technique(s) used: Byte level striping + parity.

Explanation: RAID 3 uses multiple data disks and a dedicated disk to store parity. A chunk of data is split into bytes. Byte A0 is written to disk 0, byte A1 to disk 1, byte A2 to disk 2, and so on. The parity of bytes A0, A1 and A2 is then computed using the XOR operation and written to disk 3. Because of this parity, a RAID-3 array can withstand the loss of any single drive without losing data or access to data. Data on the failed drive can be reconstructed by XORing the corresponding bytes on the remaining drives, parity included, without much degradation in performance. Accessing a single block of data requires access to more than one hard disk, so the spindles should be synchronized.
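The XOR parity and rebuild steps described above can be tried out with a short Python sketch. The helper name xor_bytes and the sample byte values are made up for the example; the point is only to show that XORing the survivors with the parity gives back the lost data.

```python
from functools import reduce


def xor_bytes(chunks: list[bytes]) -> bytes:
    """XOR equal-length byte strings together, position by position."""
    if len({len(c) for c in chunks}) != 1:
        raise ValueError("all chunks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# Hypothetical contents of data disks 0, 1 and 2.
data_disks = [b"\x0f\x10", b"\xf0\x22", b"\x3c\x7e"]
parity = xor_bytes(data_disks)  # what would be written to disk 3

# Simulate losing disk 1 and rebuilding it from the survivors plus parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_bytes(survivors)
print(rebuilt == data_disks[1])
```

This works because XORing a value with itself cancels out: once the surviving data bytes are XORed into the parity, only the missing byte is left.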

The minimum number of disks needed to set up a RAID-3 array is 3. All the disks have to be identical.

Capacity of the RAID-3 array can be calculated using

Capacity = min(disk sizes) * (n-1)

where n = number of disks in the array.
min(disk sizes) = minimum common capacity across the drives.

Merits:

1) RAID-3 provides high throughput (both read and write) for large data transfers.
2) Disk failures do not significantly slow down throughput.

De-merits:

1) Not well suited to software RAID, since it is more resource intensive.
2) Performance is slower for small I/O operations.
3) Controller design is fairly complex.

RAID-4

Figure 4 RAID-4 layout

Technique(s) used: Block level striping + Parity

Description: RAID 4 is quite similar to RAID 3. It also uses a dedicated parity disk, but the difference is that it stripes the data at the block level. A chunk of data is split into blocks. Block A0 is written to Disk-0, A1 to Disk-1, A2 to Disk-2, and so on. The parity of the blocks (Ap) is calculated and stored on the dedicated parity disk. Since the data is striped at the block level, the hard disks can be accessed independently during data reads and writes. It can tolerate the loss of one drive.

The minimum number of disks needed to set up a RAID-4 array is 3 (two disks for storing the data and a third for storing parity information). All the disks should be identical.

Capacity of RAID-4 array can be calculated using:

Capacity = min(disk sizes) * (n-1)

where n = number of disks in the array.
min(disk sizes) = minimum common capacity across the drives.

Merits:

1) Unlike RAID 3, it does not require synchronized spindles.
2) Good read performance since all of the drives can be read at the same time.

De-merits:

Write performance is limited because every write must also update the dedicated parity drive, which becomes a bottleneck.
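The parity-drive bottleneck is easier to see with a small example. The article does not describe how parity is updated on a small write, so the sketch below assumes the commonly used read-modify-write shortcut (new parity = old parity XOR old data XOR new data); the function names and sample values are hypothetical.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity after one data block changes: P' = P xor D_old xor D_new."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


# Hypothetical stripe with three data blocks (A0, A1, A2) and parity Ap.
a0, a1, a2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
ap = xor_blocks(xor_blocks(a0, a1), a2)

# Overwrite A1 only: read old A1 and old Ap, then write new A1 and new Ap.
new_a1 = b"\xaa\xbb"
new_ap = updated_parity(ap, a1, new_a1)

# The shortcut matches recomputing parity over the whole stripe.
print(new_ap == xor_blocks(xor_blocks(a0, new_a1), a2))
```

Whichever data disk a small write targets, it needs a read and a write on the one parity disk, so concurrent writes queue up there.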

RAID-5

Figure 5 RAID-5 layout

Technique(s) used: Striping + Distributed parity

Description: This is the most popular RAID level. It uses block level striping, and parity is distributed across all the drives in the array rather than kept on a dedicated disk, so every drive holds both data and parity. It generally offers better overall performance than levels 2 to 4 because no single disk has to handle every parity update. It also provides high storage capacity.
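One way to picture distributed parity is to print which disk holds the parity block in each stripe. The Python sketch below assumes a simple rotation in which parity starts on the last disk and moves one disk to the left per stripe; real implementations use several different rotation schemes, and the function name is made up for the example.

```python
def raid5_stripe_layout(stripe: int, num_disks: int) -> list[str]:
    """Label each disk in a stripe: 'P' for parity, 'D0', 'D1', ... for data.

    Assumes parity sits on the last disk for stripe 0 and shifts one disk
    to the left on each following stripe (an illustrative choice).
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    parity_disk = (num_disks - 1 - stripe) % num_disks
    labels = []
    data_index = 0
    for disk in range(num_disks):
        if disk == parity_disk:
            labels.append("P")
        else:
            labels.append(f"D{data_index}")
            data_index += 1
    return labels


for stripe in range(4):
    print(f"stripe {stripe}: {raid5_stripe_layout(stripe, num_disks=3)}")
```

Because the parity block lands on a different disk in each stripe, parity updates are spread over all the drives instead of piling up on one.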

The minimum number of disks needed is 3. Parity takes up the equivalent of one disk’s capacity, which leaves two disks’ worth of space for data in a three-disk array. All the disks should be identical.

Capacity of the RAID-5 array can be calculated using

Capacity = min(disk sizes) * (n-1)

where n = number of disks in the array.
min(disk sizes) = minimum common capacity across the drives.
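The capacity formulas given in this article for RAID 0, RAID 1 and the parity levels can be collected into one small Python helper. The function name usable_capacity and the sample drive sizes are assumptions for illustration; RAID 2 is left out because the article gives no formula for it.

```python
def usable_capacity(level: int, disk_sizes: list[int]) -> int:
    """Usable capacity of an array, using the formulas given in this article.

    disk_sizes holds one size per drive, all in the same unit (e.g. GB).
    """
    n = len(disk_sizes)
    min_disks = {0: 2, 1: 2, 3: 3, 4: 3, 5: 3}
    if level not in min_disks:
        raise ValueError(f"no capacity formula given for RAID {level}")
    if n < min_disks[level]:
        raise ValueError(f"RAID {level} needs at least {min_disks[level]} disks")
    smallest = min(disk_sizes)
    if level == 0:
        return n * smallest      # striping only, no redundancy
    if level == 1:
        return smallest          # every drive holds the same data
    return (n - 1) * smallest    # one drive's worth of space goes to parity


sizes = [500, 500, 1000]  # hypothetical drive sizes in GB
for level in (0, 1, 5):
    print(f"RAID {level}: {usable_capacity(level, sizes)} GB usable")
```

Note how the 1000 GB drive only contributes 500 GB in every case, which is why identical disks are recommended.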

Merits:

1) Good data redundancy/availability (can tolerate the loss of 1 drive).
2) Very good read performance since all of the drives can be read at the same time.

De-merits:

1) Write performance is only adequate (though better than RAID-4).
2) Write performance for small I/O is poor.

Conclusion

RAID is a good solution for those who need more transfer performance, redundancy, and storage capacity in their data storage systems. I hope you’ve enjoyed reading this article and maybe even learned more about RAID.
