RAID 1 description. Comparison of the standard levels. What is a RAID array?

Today we will talk about RAID arrays: what they are, why we need them, what kinds exist, and how to use all this magnificence in practice.

So, first things first: what is a RAID array, or simply RAID? The abbreviation stands for "Redundant Array of Independent Disks". To put it simply, a RAID array is a collection of physical disks combined into one logical disk.

Usually it is the other way around: a single physical disk is installed in the system unit, and we split it into several logical ones. Here the situation is reversed: several hard drives are first combined into one, which the operating system then perceives as a single disk. That is, the OS firmly believes that it physically has only one disk.

RAID arrays come in two varieties: hardware and software.

Hardware RAID arrays are created before the OS boots by means of special utilities hardwired into the RAID controller - something like a BIOS. As a result, when such a RAID array is created, the distribution kit already "sees" a single disk at the OS installation stage.

Software RAID arrays are created by OS tools. That is, during boot the operating system "understands" that it has several physical disks, and only after the OS starts are the disks combined into arrays by software. Naturally, the operating system itself cannot reside on a software RAID array, since it is installed before the array is created.

"Why is all this needed?" - you ask? The answer is: to increase the speed of reading/writing data and/or increase fault tolerance and security.

"How RAID array can increase speed or secure data?" - to answer this question, consider the main types RAID arrays, how they are formed and what it gives as a result.

RAID-0, also called "Stripe". Two or more hard drives are combined into one, and their volumes are summed. That is, if we take two 500 GB disks and create a RAID-0 from them, the operating system will perceive it as a single 1 TB disk. The read/write speed of such an array is twice that of a single disk: if, for example, a database physically spans two disks in this way, one user can read data from one disk while another user writes to the other disk at the same time, whereas with the database on a single disk the HDD would execute different users' read/write tasks sequentially. RAID-0 allows reading and writing in parallel. Consequently, the more disks in a RAID-0 array, the faster the array works. The dependence is directly proportional: the speed increases N times, where N is the number of disks in the array.
A RAID-0 array has only one drawback, but it outweighs all the advantages of using it: a complete lack of fault tolerance. If one of the physical disks of the array dies, the entire array dies. There is an old joke about this: "What does the '0' in RAID-0 stand for? The amount of information recovered after the death of the array!"
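To make striping concrete, here is a minimal sketch in Python of how a RAID-0 layer might map a logical byte offset to a physical disk and an offset on it (the 64 KB stripe size is an assumption for the example, not a universal default):

    STRIPE = 64 * 1024                              # assumed stripe size: 64 KB

    def raid0_locate(logical_offset, num_disks):
        # Return (disk_index, physical_offset) for a logical byte offset.
        stripe_no = logical_offset // STRIPE        # which stripe the byte falls in
        disk = stripe_no % num_disks                # stripes alternate between disks
        stripe_on_disk = stripe_no // num_disks     # stripe position on that disk
        return disk, stripe_on_disk * STRIPE + logical_offset % STRIPE

    # Adjacent stripes land on different disks, so two requests can often be
    # serviced by different spindles at the same time.
    for offset in range(0, 4 * STRIPE, STRIPE):
        print(offset, raid0_locate(offset, 2))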

RAID-1, also called "Mirror". Two or more hard drives are combined into one by parallel mirroring. That is, if we take two 500 GB disks and create a RAID-1 from them, the operating system will perceive it as a single 500 GB disk. The read/write speed of such an array is the same as that of a single disk, since information is read from and written to both disks simultaneously. RAID-1 gives no gain in speed, but it provides greater fault tolerance: if one of the hard drives dies, a complete duplicate of the information always remains on the second drive. Remember that this fault tolerance covers only the death of one of the array's disks: if data is deleted deliberately, it is deleted from all disks of the array simultaneously!

RAID-5. A safer variant of RAID-0. The volume of the array is calculated by the formula (N - 1) * DiskSize, where N is the number of disks and DiskSize is the size of each disk; creating a RAID-5 from three 500 GB disks, we get a 1 TB array. The essence of RAID-5 is that the disks are combined as in RAID-0, while a capacity equal to one disk holds the so-called "checksums" - service information intended to restore one of the array's disks if it dies. The write speed of RAID-5 is somewhat lower, since time is spent calculating and writing the checksums, but the read speed is the same as in RAID-0.
If one of the RAID-5 disks dies, the read/write speed drops sharply, since all operations are accompanied by additional manipulations. In effect, RAID-5 turns into RAID-0, and if recovery of the RAID array is not taken care of in a timely manner, there is a significant risk of losing the data completely.
With a RAID-5 array you can use a so-called Spare disk, i.e., a spare. During stable operation of the RAID array this disk sits idle and is not used. However, in a critical situation, rebuilding of the RAID array starts automatically: the information from the damaged disk is reconstructed onto the spare disk using the checksums.
RAID-5 is created from at least three disks and protects against single errors. If different errors occur simultaneously on different disks, RAID-5 does not save you.

RAID-6 is an improved version of RAID-5. The essence is the same, except that checksums occupy not one but two disks' worth of capacity, and the two sets of checksums are calculated by different algorithms, which significantly increases the fault tolerance of the whole RAID array. RAID-6 is assembled from at least four disks. The formula for the array volume is (N - 2) * DiskSize, where N is the number of disks in the array and DiskSize is the size of each disk. That is, creating a RAID-6 from five 500 GB disks, we get a 1.5 TB array.
The write speed of RAID-6 is about 10-15% lower than that of RAID-5, due to the extra time spent calculating and writing the checksums.
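The capacity formulas for the basic levels are easy to collect into one helper. A minimal sketch in Python (the level numbers and formulas are exactly those given above):

    def usable_capacity(level, n, disk_size_gb):
        # Usable capacity of an array of n equal disks, per the formulas above.
        if level == 0:
            return n * disk_size_gb        # RAID-0: volumes are summed
        if level == 1:
            return disk_size_gb            # RAID-1: a full mirror, one disk's worth
        if level == 5:
            return (n - 1) * disk_size_gb  # RAID-5: one disk's worth of checksums
        if level == 6:
            return (n - 2) * disk_size_gb  # RAID-6: two disks' worth of checksums
        raise ValueError("unsupported level")

    print(usable_capacity(5, 3, 500))  # 1000 GB - the 1 TB array from the example
    print(usable_capacity(6, 5, 500))  # 1500 GB - the 1.5 TB array from the example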

RAID-10, also sometimes called RAID 0+1 or RAID 1+0, is a symbiosis of RAID-0 and RAID-1. The array is built from at least four disks: a RAID-0 stripe on the first channel and a RAID-0 stripe on the second to increase read/write speed, mirrored to each other as a RAID-1 to increase fault tolerance. RAID-10 thus combines the advantages of the first two variants: fast and fault-tolerant.

RAID-50. Like RAID-10, this is a symbiosis, here of RAID-0 and RAID-5: in effect a RAID-5 is built, except that its constituent elements are not independent hard disks but RAID-0 arrays. RAID-50 therefore gives very good read/write speed while retaining the stability and reliability of RAID-5.

RAID-60 is the same idea: in effect we have a RAID-6 assembled from several RAID-0 arrays.

There are also other combined arrays, RAID 5+1 and RAID 6+1. They resemble RAID-50 and RAID-60, the only difference being that the basic elements of the array are not RAID-0 stripes but RAID-1 mirrors.

As you can see, the combined RAID arrays - RAID-10, RAID-50, RAID-60 and the RAID X+1 variants - are direct descendants of the basic array types RAID-0, RAID-1, RAID-5 and RAID-6 and serve only to increase either read/write speed or fault tolerance, while carrying over the functionality of the basic, parent RAID types.

If we move on to practice and talk about how particular RAID arrays are used in real life, the logic is quite simple:

RAID-0 is not used in its pure form at all;

RAID-1 is used where read/write speed is not particularly important but fault tolerance is - for example, it is good to install operating systems on a RAID-1. In that case nothing but the OS accesses the disks, the speed of the hard disks themselves is quite sufficient, and fault tolerance is ensured;

RAID-5 is installed where both speed and fault tolerance are needed but there is not enough money for more hard drives, or where damaged arrays must be rebuilt without stopping work - hot Spare drives help here. A common application of RAID-5 is data storage;

RAID-6 is used where the simultaneous death of several disks in the array is a real threat (or simply a fear). In practice it is quite rare, mainly among the paranoid;

RAID-10 is used where operation must be both fast and reliable. The main targets for RAID-10 are file servers and database servers.

Simplifying further, we conclude that where there is no heavy, large-scale work with files, RAID-1 is quite sufficient - operating system, AD, TS, mail, proxy, etc. Where serious file work is required: RAID-5 or RAID-10.

The ideal solution for a database server is a machine with six physical disks: two combined into a RAID-1 mirror with the OS installed on it, and the remaining four combined into RAID-10 for fast and reliable data handling.

If, after reading all of the above, you decide to install RAID arrays on your servers but don't know how or where to start - contact us! We will help you select the necessary equipment and carry out the installation work to implement the RAID arrays.

All modern motherboards are equipped with an integrated RAID controller, and top models even have several. How much home users actually need integrated RAID controllers is a separate question. In any case, a modern motherboard gives the user the ability to create a RAID array of several disks. However, not every home user knows how to create a RAID array, which array level to choose, or has much idea of the pros and cons of using RAID arrays.
In this article, we will give brief recommendations on creating RAID arrays on home PCs and use a specific example to demonstrate how you can independently test the performance of a RAID array.

History of creation

The term "RAID array" first appeared in 1987, when the American researchers Patterson, Gibson and Katz of the University of California, Berkeley, described, in their article "A Case for Redundant Arrays of Inexpensive Disks (RAID)," how several low-cost hard drives can be combined into one logical device so that the capacity and performance of the system increase while the failure of individual drives does not lead to failure of the entire system.

More than 20 years have passed since that article was published, but the technology of building RAID arrays has not lost its relevance. The only thing that has changed since then is the expansion of the RAID acronym: RAID arrays came to be built on disks that were by no means cheap, so the word Inexpensive was changed to Independent, which was closer to the truth.

Operating principle

So, RAID is a redundant array of independent disks (Redundant Array of Independent Disks) that is tasked with ensuring fault tolerance and increasing performance. Fault tolerance is achieved through redundancy: part of the disk capacity is set aside for service purposes and becomes inaccessible to the user.

Increased performance of the disk subsystem is ensured by the simultaneous operation of several disks, and in this sense, the more disks in the array (up to a certain limit), the better.

The joint operation of disks in an array can be organized using either parallel or independent access. With parallel access, disk space is divided into blocks (stripes) for recording data, and information to be written to disk is divided into the same blocks. When writing, individual blocks go to different disks, several blocks being written to different disks simultaneously, which increases write performance. The necessary information is likewise read in separate blocks simultaneously from several disks, which also increases performance in proportion to the number of disks in the array.

It should be noted that the parallel access model works only if the size of a write request is larger than the block size; otherwise parallel writing of several blocks is practically impossible. Imagine that the size of an individual block is 8 KB and the size of a write request is 64 KB. The source information is then cut into eight blocks of 8 KB each. With a four-disk array, four blocks (32 KB) can be written at a time, so the request completes in two rounds. Clearly, in this example the write and read speeds will be four times higher than with a single disk. This holds only for the ideal situation, however: the request size is not always a multiple of the block size and the number of disks in the array.
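The arithmetic of this example can be written out directly (a sketch using the 8 KB block, 64 KB request and four disks from the text):

    BLOCK = 8 * 1024        # stripe block size from the example: 8 KB
    REQUEST = 64 * 1024     # write request size from the example: 64 KB
    DISKS = 4               # four-disk array

    blocks = REQUEST // BLOCK          # 8 blocks of 8 KB each
    per_round = DISKS * BLOCK          # 32 KB written in parallel per round
    rounds = -(-REQUEST // per_round)  # ceiling division: 2 rounds in total

    print(blocks, per_round // 1024, rounds)  # -> 8 32 2, i.e. ideally 4x one disk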

If the size of the data being written is smaller than the block size, a fundamentally different model is implemented - independent access. This model can also be used when the size of the data being written is larger than one block. With independent access, all the data of a single request is written to a single disk, i.e., the situation is identical to working with one disk. The advantage of the independent access model is that if several write (read) requests arrive simultaneously, they are all executed on separate disks independently of each other. This situation is typical, for example, for servers.

The different types of access correspond to different types of RAID arrays, which are usually characterized by RAID levels. Besides the type of access, RAID levels differ in how they place and generate redundant information. Redundant information can either be placed on a dedicated disk or distributed among all disks. There are many ways to generate this information; the simplest is complete duplication (100 percent redundancy), or mirroring. In addition, error correction codes and parity calculations are used.

RAID levels

Currently, there are several RAID levels that can be considered standardized - these are RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5 and RAID 6.

Various combinations of RAID levels are also used, which allows you to combine their advantages. Typically this is a combination of some kind of fault-tolerant level and a zero level used to improve performance (RAID 1+0, RAID 0+1, RAID 50).

Note that all modern RAID controllers support the JBOD (Just a Bunch Of Disks) function, which is not intended for creating arrays: it provides the ability to connect individual disks to the RAID controller.

It should be noted that the RAID controllers integrated on motherboards for home PCs do not support all RAID levels. Two-port RAID controllers support only levels 0 and 1, while RAID controllers with more ports (for example, the six-port RAID controller integrated into the southbridge of the ICH9R/ICH10R chipset) also support levels 10 and 5.

In addition, motherboards based on Intel chipsets implement the Intel Matrix RAID function, which allows RAID matrices of several levels to be created on the same hard drives simultaneously, allocating part of the disk space to each of them.

RAID 0

RAID level 0, strictly speaking, is not a redundant array and, accordingly, does not provide reliable data storage. Nevertheless, this level is actively used where high performance of the disk subsystem is required. When creating a RAID level 0 array, information is divided into blocks (sometimes called stripes), which are written to separate disks; that is, a system with parallel access is created (if, of course, the block size allows it). By allowing simultaneous I/O on multiple disks, RAID 0 provides the fastest data transfer speeds and maximum disk space efficiency, since no storage space is required for checksums. The implementation of this level is very simple. RAID 0 is mainly used in areas where fast transfer of large amounts of data is required.

RAID 1 (Mirrored disk)

RAID level 1 is an array of two disks with 100 percent redundancy. That is, the data is simply completely duplicated (mirrored), due to which a very high level of reliability (as well as cost) is achieved. Note that implementing level 1 does not require first partitioning the disks and data into blocks. In the simplest case, two disks contain the same information and form one logical disk. If one disk fails, its functions are performed by the other (which is absolutely transparent to the user). Restoring the array is performed by simple copying. In addition, this level doubles the speed of reading information, since that operation can be performed on both disks simultaneously. This kind of storage scheme is used mainly where the cost of data security is much higher than the cost of implementing the storage system.

RAID 5

RAID 5 is a fault-tolerant disk array with distributed checksum storage. When writing, the data stream is divided into blocks (stripes), which are written to all disks of the array simultaneously, in cyclic order.

Suppose the array contains n disks, with stripes denoted d. For each portion of (n - 1) stripes, a checksum p is calculated.

Stripe d1 is written to the first disk, stripe d2 to the second, and so on up to stripe dn-1, which is written to the (n - 1)-th disk. The checksum pn is then written to the n-th disk, and the process repeats cyclically from the first disk, on which stripe dn is written.

The writing of the (n - 1) stripes and their checksum is performed simultaneously across all n disks.

The checksum is calculated using a bitwise exclusive-or (XOR) operation applied to the data blocks being written. So, if there are n hard drives and d is a data block (stripe), the checksum is calculated by the formula:

pn = d1 ⊕ d2 ⊕ ... ⊕ dn-1.
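A sketch of this cyclic layout in Python (the rotation order below is one common convention; real controllers may rotate the parity position differently):

    def raid5_row(row, n):
        # One stripe row of an n-disk RAID 5: (n - 1) data stripes plus one
        # parity stripe whose position shifts by one disk on every row.
        parity_disk = (n - 1 - row) % n       # parity cycles over the disks
        cells, d = [], row * (n - 1) + 1      # running data-stripe number
        for disk in range(n):
            if disk == parity_disk:
                cells.append("p%d" % (row + 1))
            else:
                cells.append("d%d" % d)
                d += 1
        return cells

    for row in range(4):                      # a few rows of a five-disk array
        print(raid5_row(row, 5))
    # ['d1', 'd2', 'd3', 'd4', 'p1']
    # ['d5', 'd6', 'd7', 'p2', 'd8']  ... and so on, cyclically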

If any disk fails, the data on it can be restored using the control data and the data remaining on the working disks.

As an illustration, consider blocks of four bits. Suppose there are only five disks for storing data and recording checksums. If there is a sequence of bits 1101 0011 1100 1011, divided into blocks of four bits, then to calculate the checksum it is necessary to perform the following bitwise operation:

1101 ⊕ 0011 ⊕ 1100 ⊕ 1011 = 1001.

Thus, the checksum written to the fifth disk is 1001.

If one of the disks, for example the fourth, fails, then the block d4 = 1011 will not be available when reading. However, its value can easily be restored from the checksum and the values of the remaining blocks using the same exclusive-or operation:

d4 = d1 ⊕ d2 ⊕ d3 ⊕ p5.

In our example we get:

d4 = (1101) ⊕ (0011) ⊕ (1100) ⊕ (1001) = 1011.
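This worked example is easy to verify in a few lines of Python (a sketch using the four-bit blocks from above):

    from functools import reduce

    d = [0b1101, 0b0011, 0b1100, 0b1011]        # data blocks d1..d4 from the example
    p5 = reduce(lambda a, b: a ^ b, d)          # parity: XOR of all data blocks
    print(format(p5, "04b"))                    # -> 1001, written to the fifth disk

    survivors = d[:3] + [p5]                    # disk four fails and d4 is lost
    d4 = reduce(lambda a, b: a ^ b, survivors)  # XOR of the rest recovers it
    print(format(d4, "04b"))                    # -> 1011, the lost block d4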

In a RAID 5 array all disks are the same size, but the total capacity of the disk subsystem available for writing is exactly one disk smaller. For example, if five disks are 100 GB each, the actual size of the array is 400 GB, since 100 GB is allocated for the parity information.

RAID 5 can be built on three or more hard drives. As the number of hard drives in an array increases, its redundancy decreases.

RAID 5 has an independent access architecture, which allows multiple reads or writes to be performed simultaneously.

RAID 10

RAID level 10 is a combination of levels 0 and 1. The minimum requirement for this level is four drives. In a four-drive RAID 10 array, the drives are combined in pairs into level 0 arrays, and both of these arrays, as logical drives, are combined into a level 1 array. Another approach is also possible: the disks are first combined into mirrored level 1 arrays, and the logical drives based on these arrays are then combined into a level 0 array, as in the sketch below.
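A toy model of the second (RAID 1+0) variant shows where each logical stripe physically lands (a sketch; the disk names are arbitrary):

    # Four disks arranged as two RAID 1 pairs, striped over by RAID 0.
    mirrors = [["disk0", "disk2"], ["disk1", "disk3"]]

    def raid10_targets(stripe_no, mirrors):
        # The stripe column selects a mirror pair; both disks get a copy.
        return mirrors[stripe_no % len(mirrors)]

    for s in range(4):
        print(s, raid10_targets(s, mirrors))
    # Every stripe is written to exactly two physical disks, which is why the
    # array survives a single-disk failure yet reads like a stripe set.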

Intel Matrix RAID

RAID arrays of levels 5 and 1 are rarely used at home, which is primarily due to the high cost of such solutions. Most often, a two-disk level 0 array is used in home PCs. As we have already noted, RAID level 0 does not provide secure data storage, so end users face a choice: create a fast but unreliable RAID level 0 array or, doubling the cost of disk space, a RAID level 1 array that provides reliable data storage but no significant performance benefit.

To solve this difficult problem, Intel developed Intel Matrix Storage Technology, which combines the benefits of level 0 and level 1 arrays on just two physical disks. And to emphasize that in this case we are talking not just about a RAID array but about an array combining both physical and logical disks, the word "matrix" is used in the technology's name instead of the word "array."

So, what is a two-disk RAID matrix using Intel Matrix Storage technology? The basic idea is that if the system has several hard drives and a motherboard with an Intel chipset supporting Intel Matrix Storage Technology, the disk space can be divided into several parts, each of which functions as a separate RAID array.

Let us look at a simple example of a RAID matrix consisting of two 120 GB disks. Each disk can be divided into two logical disks, for example 40 and 80 GB. Two logical drives of the same size (for example, 40 GB each) can then be combined into a RAID level 1 matrix, and the remaining logical drives into a RAID level 0 matrix.

In principle, using two physical disks it is also possible to create just one or two RAID level 0 matrices, but it is impossible to obtain level 1 matrices alone. That is, if the system has only two disks, Intel Matrix Storage technology allows you to create the following types of RAID matrices:

  • one level 0 matrix;
  • two level 0 matrices;
  • level 0 matrix and level 1 matrix.

If the system has three hard drives, the following types of RAID matrices can be created:

  • one level 0 matrix;
  • one level 5 matrix;
  • two level 0 matrices;
  • two level 5 matrices;
  • level 0 matrix and level 5 matrix.

If the system has four hard drives, then it is additionally possible to create a RAID matrix of level 10, as well as combinations of level 10 and level 0 or 5.

From theory to practice

If we talk about home computers, the most popular RAID arrays are those of levels 0 and 1; RAID arrays of three or more disks in home PCs are rather an exception. This is because, on the one hand, the cost of a RAID array increases in proportion to the number of disks in it, and on the other hand, for home computers it is the capacity of the disk array that matters most, not its performance and reliability.

Therefore, in what follows we will consider RAID levels 0 and 1 based on two disks only. The objective of our testing is to compare the performance and functionality of RAID 0 and RAID 1 arrays created on several integrated RAID controllers, and to study how the speed characteristics of a RAID array depend on the stripe size.

The point is that although in theory the read and write speed should double with a RAID level 0 array, in practice the increase in speed is much more modest, and it varies between RAID controllers. The same is true for a RAID level 1 array: although in theory the read speed should double, in practice things are not so smooth.

For our comparative testing of RAID controllers, we used the Gigabyte GA-EX58A-UD7 motherboard. This board is based on the Intel X58 Express chipset with the ICH10R southbridge, which has an integrated six-port SATA II RAID controller supporting RAID arrays of levels 0, 1, 10 and 5 with the Intel Matrix RAID function. In addition, the GIGABYTE SATA2 RAID controller is integrated on the board, with two SATA II ports and the ability to organize RAID arrays of levels 0 and 1 as well as JBOD.

The GA-EX58A-UD7 board also integrates a Marvell 9128 SATA III controller, which implements two SATA III ports with the ability to organize RAID arrays of levels 0 and 1 as well as JBOD.

Thus, the Gigabyte GA-EX58A-UD7 board has three separate RAID controllers, on each of which RAID 0 and RAID 1 arrays can be created and compared with one another. Recall that the SATA III standard is backward compatible with SATA II; therefore, on the Marvell 9128 controller, which supports SATA III drives, RAID arrays can also be built from drives with the SATA II interface.

The testing stand had the following configuration:

  • processor - Intel Core i7-965 Extreme Edition;
  • motherboard - Gigabyte GA-EX58A-UD7;
  • BIOS version - F2a;
  • hard drives - two Western Digital WD1002FBYS drives, one Western Digital WD3200AAKS drive;
  • integrated RAID controllers:
  • ICH10R,
  • GIGABYTE SATA2,
  • Marvell 9128;
  • memory - DDR3-1066;
  • memory capacity - 3 GB (three modules of 1024 MB each);
  • memory operating mode - DDR3-1333, three-channel operating mode;
  • video card - Gigabyte GeForce GTS295;
  • power supply - Tagan 1300W.

Testing was carried out under Microsoft Windows 7 Ultimate (32-bit). The operating system was installed on the Western Digital WD3200AAKS drive, connected to a port of the SATA II controller integrated into the ICH10R southbridge. The RAID arrays were assembled on the two WD1002FBYS drives with the SATA II interface.

To measure the speed characteristics of the created RAID arrays, we used the IOmeter utility, which is the industry standard for measuring the performance of disk systems.

IOmeter utility

Since we intended this article as a kind of user guide to creating and testing RAID arrays, it is logical to start with a description of the IOmeter (Input/Output meter) utility, which, as noted, is a de facto industry standard for measuring the performance of disk systems. This utility is free and can be downloaded from http://www.iometer.org.

The IOmeter utility is a synthetic test that can work with hard drives not partitioned into logical partitions, so you can test drives regardless of the file structure and reduce the influence of the operating system to zero.

When testing, it is possible to create a specific access model, or "pattern," which specifies the operations the hard drive is to execute. When creating a specific access model, the following parameters can be changed:

  • size of the data transfer request;
  • random/sequential distribution (in%);
  • distribution of read/write operations (in%);
  • number of individual I/O operations running in parallel.

The IOmeter utility does not require installation on a computer and consists of two parts: IOmeter itself and Dynamo.

IOmeter is the monitoring part of the program, with a graphical user interface that allows you to make all the necessary settings. Dynamo is a load generator with no interface of its own. Each time you run IOmeter.exe, the Dynamo.exe load generator starts automatically.

To start working with the IOmeter program, just run the IOmeter.exe file. This opens the main window of the IOmeter program (Fig. 1).

Fig. 1. Main window of the IOmeter program

It should be noted that the IOmeter utility allows you to test not only local disk systems (DAS) but also network drives (NAS). For example, it can be used to test the performance of a server's disk subsystem (file server) with several network clients. Therefore, some of the tabs and tools in the IOmeter window relate specifically to the program's network settings. Clearly, when testing disks and RAID arrays we will not need these capabilities, so we will not explain the purpose of all tabs and tools.

So, when you start IOmeter, a tree of all running load generators (Dynamo instances) is displayed on the left side of the main window (the Topology window). Each running Dynamo instance is called a manager. In addition, IOmeter is multithreaded, and each individual thread running in a Dynamo instance is called a Worker. The number of running Workers always corresponds to the number of logical processor cores.

In our case we use only one computer with a quad-core processor supporting Hyper-Threading technology, so only one manager (one Dynamo instance) and eight Workers (matching the number of logical processor cores) are launched.

Actually, nothing needs to be changed or added in this window to test disks.

If you select the computer name with the mouse in the tree of running Dynamo instances, the Disk Target tab of the Target window displays all disks, disk arrays and other drives (including network drives) installed in the computer. These are the drives IOmeter can work with. Media may be marked yellow or blue: logical partitions of media are marked yellow, while physical devices without logical partitions are marked blue. A logical partition may or may not be crossed out. The point is that before the program can work with a logical partition, the partition must first be prepared by creating a special file on it equal in size to the capacity of the entire partition. If the logical partition is crossed out, it is not yet prepared for testing (it will be prepared automatically at the first stage of testing); if it is not crossed out, a file has already been created on it and it is completely ready for testing.

Note that, despite the supported ability to work with logical partitions, it is best to test drives that are not partitioned into logical partitions. Deleting a logical disk partition is very simple - through the Disk Management snap-in. To access it, right-click the Computer icon on the desktop and select Manage in the menu that opens. In the Computer Management window, select Storage on the left side, and within it Disk Management. All connected drives will then be displayed on the right side of the Computer Management window. Right-click the desired disk and select Delete Volume... in the menu that opens to delete a logical partition on a physical disk. Remember that when you delete a logical partition, all information on it is deleted without the possibility of recovery.

In general, the IOmeter utility can be used to test only blank disks or disk arrays; you cannot test a disk or disk array on which the operating system is installed.

So, let us return to describing the IOmeter utility. On the Disk Target tab of the Target window, select the disk (or disk array) to be tested. Then open the Access Specifications tab (Fig. 2), where the testing scenario can be defined.

Fig. 2. Access Specifications tab of the IOmeter utility

The Global Access Specifications window contains a list of predefined test scripts that can be assigned to the load manager. We will not need these scripts, so they can all be selected and deleted (there is a Delete button for this). Then click the New button to create a new test script. In the Edit Access Specification window that opens, you can define the load scenario for a disk or RAID array.

Suppose we want to find out how the speed of sequential (linear) reading and writing depends on the size of the data transfer request block. To do this, we generate a sequence of load scripts in sequential read mode with different block sizes, and then a sequence of load scripts in sequential write mode with different block sizes. Typically, block sizes form a series in which each member is twice the previous one, the first member being 512 bytes. That is, the block sizes are: 512 bytes, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 KB, 1 MB. There is no point in making the block larger than 1 MB for sequential operations, since at such large block sizes the speed of sequential operations no longer changes.
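Since this gives twelve scripts per mode, it can be convenient to list the names and sizes in advance. A tiny helper sketch in Python (the naming pattern follows the Sequential_Read_512 example used below):

    # The block-size series for the load scripts: doubling from 512 bytes to 1 MB.
    sizes = [512 * 2 ** i for i in range(12)]

    def label(n):
        if n >= 2 ** 20:
            return "%dMB" % (n // 2 ** 20)
        if n >= 1024:
            return "%dKB" % (n // 1024)
        return "%d" % n

    for mode in ("Sequential_Read", "Sequential_Write"):
        for s in sizes:
            print("%s_%s" % (mode, label(s)))  # Sequential_Read_512, ..., _1MB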

So, let us create a load script in sequential read mode for a 512-byte block.

In the Name field of the Edit Access Specification window, enter a name for the load script, for example Sequential_Read_512. Then, in the Transfer Request Size field, set the data block size to 512 bytes. Move the Percent Random/Sequential Distribution slider (the percentage ratio between sequential and random operations) all the way to the left so that all our operations are sequential only. Move the Percent Read/Write Distribution slider, which sets the percentage ratio between read and write operations, all the way to the right so that all our operations are reads only. The other parameters in the Edit Access Specification window do not need to be changed (Fig. 3).

Fig. 3. Edit Access Specification window for creating a sequential read load script with a data block size of 512 bytes

Click OK, and the first script we created will appear in the Global Access Specifications window on the Access Specifications tab of the IOmeter utility.

Similarly, you need to create scripts for the remaining data blocks. To make the work easier, do not create each script anew with the New button; instead, select the last created scenario and press the Edit Copy button. The Edit Access Specification window then opens again with the settings of the last created script, and it is enough to change only the name and the block size. Having completed this procedure for all the other block sizes, you can begin creating scripts for sequential writing, which is done in exactly the same way, except that the Percent Read/Write Distribution slider, which sets the percentage ratio between read and write operations, must be moved all the way to the left.

Similarly, you can create scripts for selective writing and reading.

After all the scripts are ready, they must be assigned to the load manager, that is, we must indicate which scripts Dynamo will work with.

To do this, check again that the computer name (that is, the load manager on the local PC) is highlighted in the Topology window, rather than an individual Worker. This ensures that the load scenarios are assigned to all Workers at once. Then select all the load scenarios we created in the Global Access Specifications window and press the Add button. All selected load scenarios will be added to the Assigned Access Specifications window (Fig. 4).

Fig. 4. Assigning the created load scenarios to the load manager

After this, go to the Test Setup tab (Fig. 5), where you can set the execution time of each script we created. To do this, set the execution time of the load scenario in the Run Time group. Three minutes is quite enough.

Fig. 5. Setting the execution time of the load scenario

In addition, in the Test Description field you must specify a name for the whole test. In principle, this tab has many other settings, but they are not needed for our tasks.

After all the necessary settings have been made, it is recommended to save the created test by clicking the button with the floppy disk image on the toolbar. The test is saved with the *.icf extension. Later, you can reuse the created load scenario by running not IOmeter.exe but the saved *.icf file.

Now you can start testing by clicking the button with the flag. You will be asked to specify a name for the file with the test results and select its location. The test results are saved in a CSV file, which can then easily be exported to Excel; by setting a filter on the first column, you can select the desired test result data.

During testing, intermediate results can be seen on the Result Display tab, and the load scenario they belong to can be determined on the Access Specifications tab: in the Assigned Access Specifications window, the running script is shown in green, completed scripts in red, and pending scripts in blue.

So, we have covered the basic techniques of working with the IOmeter utility that are needed for testing individual disks or RAID arrays. We have not described all the capabilities of IOmeter, but that would be beyond the scope of this article.

Creating a RAID array based on the GIGABYTE SATA2 controller

So, we begin creating a RAID array of two disks using the GIGABYTE SATA2 RAID controller integrated on the board. Of course, Gigabyte itself does not produce chips, so the GIGABYTE SATA2 chip hides a relabeled chip from another company. As the driver INF file reveals, it is a JMicron JMB36x series controller.

Access to the controller setup menu is possible at the system boot stage: press Ctrl+G when the corresponding prompt appears on the screen. Naturally, in the BIOS settings you must first define the operating mode of the two SATA ports belonging to the GIGABYTE SATA2 controller as RAID (otherwise access to the RAID array configurator menu will not be possible).

The setup menu of the GIGABYTE SATA2 RAID controller is quite simple. As we have already noted, the controller is dual-port and allows you to create RAID arrays of level 0 or 1. Through the controller settings menu, you can delete or create a RAID array. When creating a RAID array, you can specify its name, select the array level (0 or 1), set the stripe size for RAID 0 (128, 64, 32, 16, 8 or 4 KB), and determine the size of the array.

Once the array is created, no changes to it are possible. That is, you cannot later change, for example, its level or stripe size. To do that, you first have to delete the array (losing the data) and then create it anew. This is not unique to the GIGABYTE SATA2 controller: the inability to change the parameters of created RAID arrays is a feature of all controllers and follows from the very principle of implementing a RAID array.

Once an array based on the GIGABYTE SATA2 controller has been created, current information about it can be viewed using the GIGABYTE RAID Configurer utility, which is installed automatically along with the driver.

Creating a RAID array based on the Marvell 9128 controller

Configuring the Marvell 9128 RAID controller is only possible through the BIOS settings of the Gigabyte GA-EX58A-UD7 board. In general, it must be said that the Marvell 9128 configurator menu is somewhat crude and can mislead inexperienced users. We will discuss these minor shortcomings a little later; for now, let us consider the main functionality of the Marvell 9128 controller.

So, although this controller supports SATA III drives, it is also fully compatible with SATA II drives.

The Marvell 9128 controller allows you to create a two-disk RAID array of level 0 or 1. For a level 0 array, you can set the stripe size to 32 or 64 KB and specify the name of the array. In addition, there is an option called Gigabyte Rounding, which needs explanation. Despite the name, which echoes the name of the board manufacturer, the Gigabyte Rounding function has nothing to do with it. Moreover, it is in no way connected with the RAID level 0 array, although in the controller settings it can be defined specifically for an array of this level. This is the first of those shortcomings of the Marvell 9128 configurator that we mentioned. The Gigabyte Rounding function is meaningful only for RAID level 1: it allows the use of two drives (for example, from different manufacturers or of different models) whose capacities differ slightly from each other. In effect, it sets the allowable difference between the sizes of the two disks used to create the RAID level 1 array; in the Marvell 9128 controller it can be set to 1 or 10 GB.

Another flaw of the Marvell 9128 configurator is that when creating a RAID level 1 array the user is given the option of selecting a stripe size (32 or 64 KB), although the concept of a stripe is not defined for RAID level 1 at all.

Creating a RAID array based on the controller integrated into the ICH10R

The RAID controller integrated into the ICH10R southbridge is the most widespread one. As already noted, this RAID controller has six ports and supports not only RAID 0 and RAID 1 arrays but also RAID 5 and RAID 10.

Access to the controller setup menu is possible at the system boot stage: press Ctrl + I when the corresponding prompt appears on the screen. Naturally, in the BIOS settings you must first define the operating mode of this controller as RAID (otherwise access to the RAID array configurator menu will be impossible).

The RAID controller setup menu is quite simple. Through the controller settings menu, you can delete or create a RAID array. When creating a RAID array, you can specify its name, select the array level (0, 1, 5 or 10), set the stripe size for RAID 0 (128, 64, 32, 16, 8 or 4 KB), and determine the size of the array.

RAID performance comparison

To test RAID arrays using the IOmeter utility, we created sequential read, sequential write, selective read, and selective write load scenarios. The data block sizes in each load scenario were as follows: 512 bytes, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 KB, 1 MB.

On each of the RAID controllers, we created a RAID 0 array with all allowable stripe sizes and a RAID 1 array. In addition, in order to be able to evaluate the performance gain obtained from using a RAID array, we also tested a single disk on each of the RAID controllers.

So, let's look at the results of our testing.

GIGABYTE SATA2 Controller

First of all, let's look at the results of testing RAID arrays based on the GIGABYTE SATA2 controller (Fig. 6-13). In general, the controller turned out to be literally mysterious, and its performance was simply disappointing.

Fig. 6. Speed of sequential and selective operations for the Western Digital WD1002FBYS disk

Fig. 7. Speed of sequential and selective operations for RAID 0 with a stripe size of 128 KB (GIGABYTE SATA2 controller)

Fig. 12. Speed of sequential and selective operations for RAID 0 with a stripe size of 4 KB (GIGABYTE SATA2 controller)

Fig. 13. Speed of sequential and selective operations for RAID 1 (GIGABYTE SATA2 controller)

If you look at the speed characteristics of one disk (without a RAID array), the maximum sequential read speed is 102 MB/s, and the maximum sequential write speed is 107 MB/s.

When creating a RAID 0 array with a stripe size of 128 KB, the maximum sequential read and write speed increases to 125 MB/s, an increase of approximately 22%.

With stripe sizes of 64, 32, or 16 KB, the maximum sequential read speed is 130 MB/s, and the maximum sequential write speed is 141 MB/s. That is, with the specified stripe sizes, the maximum sequential read speed increases by 27%, and the maximum sequential write speed increases by 31%.

Frankly, this is not much for a level 0 array; one would like the maximum speed of sequential operations to be higher.

With a stripe size of 8 KB, the maximum speed of sequential operations (read and write) remains approximately the same as with 64, 32 or 16 KB stripes, but there are obvious problems with selective reading. As the data block size increases up to 128 KB, the selective read speed grows (as it should) in proportion to the block size. But at data block sizes above 128 KB, the selective read speed drops almost to zero (approximately 0.1 MB/s).

With a stripe size of 4 KB, it is not only the selective read speed that drops at block sizes above 128 KB: the sequential read speed also drops at block sizes above 16 KB.

Using a RAID 1 array on the GIGABYTE SATA2 controller does not significantly change the sequential read speed compared to a single drive, but the maximum sequential write speed drops to 75 MB/s. Recall that for a RAID 1 array the read speed should increase, and the write speed should not decrease, compared to a single disk.

Only one conclusion can be drawn from the results of testing the GIGABYTE SATA2 controller: it makes sense to use it to create RAID 0 and RAID 1 arrays only if all the other RAID controllers (Marvell 9128, ICH10R) are already occupied - although such a situation is quite difficult to imagine.

Marvell 9128 controller

The Marvell 9128 controller demonstrated much higher speeds than the GIGABYTE SATA2 controller (Fig. 14-17). In fact, the differences appear even when the controller works with a single disk: whereas the GIGABYTE SATA2 controller's maximum sequential read speed is 102 MB/s, reached at a data block size of 128 KB, the Marvell 9128's maximum sequential read speed is 107 MB/s, reached at a data block size of 16 KB.

When creating a RAID 0 array with stripe sizes of 64 and 32 KB, the maximum sequential read speed increases to 211 MB/s, and sequential write speed increases to 185 MB/s. That is, with the specified stripe sizes, the maximum sequential read speed increases by 97%, and the maximum sequential write speed increases by 73%.

There is no significant difference in the speed performance of a RAID 0 array with a stripe size of 32 and 64 KB, however, the use of a 32 KB stripe is more preferable, since in this case the speed of sequential operations with a block size of less than 128 KB will be slightly higher.

When creating a RAID 1 array on a Marvell 9128 controller, the maximum sequential operation speed remains virtually unchanged compared to a single disk. So, if for a single disk the maximum speed of sequential operations is 107 MB/s, then for RAID 1 it is 105 MB/s. Also note that for RAID 1, selective read performance degrades slightly.

In general, it should be noted that the Marvell 9128 controller has good speed characteristics and can be used both to create RAID arrays and to connect single disks to it.

ICH10R controller

The RAID controller built into the ICH10R turned out to be the highest-performing of all those we tested (Fig. 18-25). When working with a single drive (without a RAID array), its performance is virtually the same as that of the Marvell 9128 controller: the maximum sequential read and write speed is 107 MB/s, achieved at a data block size of 16 KB.

Fig. 18. Speed of sequential and selective operations for the Western Digital WD1002FBYS disk (ICH10R controller)

As for the RAID 0 array on the ICH10R controller, the maximum sequential read and write speed does not depend on the stripe size and is 212 MB/s; only the data block size at which that maximum is reached depends on the stripe size. The test results show that for RAID 0 based on the ICH10R it is optimal to use a 64 KB stripe: then the maximum sequential read and write speed is reached at a data block size of only 16 KB.

So, to summarize, we emphasize once again that the RAID controller built into the ICH10R significantly outperforms all the other integrated RAID controllers. Given that it also has greater functionality, it is optimal to use this particular controller and simply forget about the existence of all the others (unless, of course, the system uses SATA III drives).

If you have encountered or expect to soon encounter one of the following problems on your computer:

  • The physical capacity of a single hard drive as one logical drive is clearly insufficient. Most often this problem arises when working with large files (video, graphics, databases);
  • The performance of the hard drive is clearly insufficient. Most often this problem arises when working with non-linear video editing systems or when a large number of users simultaneously access files on the drive;
  • The reliability of the hard drive is clearly insufficient. Most often this problem arises when working with data that must never be lost or that must always be available to the user. Sad experience shows that even the most reliable equipment sometimes breaks down and, as a rule, at the most inopportune moment.

Creating a RAID system on your computer can solve these and some other problems.

What is "RAID"?

In 1987, Patterson, Gibson, and Katz of the University of California, Berkeley, published "A Case for Redundant Arrays of Inexpensive Disks (RAID)." The article described different types of disk arrays under the abbreviation RAID - Redundant Array of Independent (or Inexpensive) Disks. RAID is based on the following idea: by combining several small and/or cheap disk drives into an array, you can obtain a system that surpasses the most expensive single disk drives in capacity, speed, and reliability. On top of that, from the computer's point of view such a system looks like one single disk drive.

It is known that the mean time between failures of a drive array equals the mean time between failures of a single drive divided by the number of drives in the array. As a result, the array's mean time between failures may be too short for many applications. However, a disk array can be made tolerant of the failure of a single drive in several ways.
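A quick back-of-the-envelope illustration of that rule (the 500,000-hour single-drive MTBF is an invented figure, used only for the arithmetic):

    mtbf_single = 500000    # hypothetical MTBF of one drive, in hours
    for n in (1, 2, 4, 8):
        print(n, "drives:", mtbf_single / n, "hours")
    # With 8 drives the array fails, on average, eight times as often as a
    # single drive - hence the need for redundancy against single failures.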

In this article, five types (levels) of disk arrays were defined: RAID-1, RAID-2, ..., RAID-5. Each type provided fault tolerance as well as different advantages over a single drive. Along with these five types, the RAID-0 disk array, which is NOT redundant, has also gained popularity.

What RAID levels are there and which one should you choose?

RAID-0. Typically defined as a non-redundant group of disk drives without parity. RAID-0 is sometimes called “Striping” based on the way information is placed on the drives included in the array:

Since RAID-0 does not have redundancy, failure of one drive leads to failure of the entire array. On the other hand, RAID-0 provides maximum data transfer speed and efficient use of disk drive space. Because RAID-0 does not require complex math or logic calculations, its implementation costs are minimal.

Scope of application: audio and video applications requiring the high-speed continuous data transfer that a single drive cannot provide. For example, research conducted by Mylex to determine the optimal disk system configuration for a non-linear video editing station shows that, compared to a single drive, a RAID-0 array of two drives gives a 96% increase in write/read speed, and one of three drives a 143% increase (according to the Miro VIDEO EXPERT Benchmark test).

RAID-1. Better known as "Mirroring" ("disk mirror"): a pair of disk drives containing the same information and making up one logical disk:


Writing is performed to both drives in each pair; however, the drives in a pair can perform read operations simultaneously. Thus, "mirroring" can double the read speed, while the write speed remains unchanged. RAID-1 has 100% redundancy, and the failure of one drive does not lead to failure of the whole array - the controller simply switches read/write operations to the remaining drive.

RAID-1 provides the highest speed among all types of redundant arrays, especially in a multi-user environment, but the worst use of disk space. Since RAID-1 does not require complex mathematical or logic calculations, its implementation costs are minimal.

The minimum number of drives in the array is 2.

To increase write speed and ensure reliable data storage, several RAID-1 arrays can in turn be combined into RAID-0. This configuration is called "two-level" RAID, or RAID-10 (RAID 0+1).

The minimum number of drives in the array is 4.

Scope of application: cheap arrays in which the main thing is reliability of data storage.

RAID-2. Distributes data into sector-sized stripes across a group of disk drives. Some drives are dedicated to ECC (Error Correction Code) storage. Since most drives store ECC codes on a per-sector basis by default, RAID-2 does not provide special advantages compared to RAID-3 and, therefore, is practically not used.

RAID-3. As in the case of RAID-2, data is distributed over stripes of one sector in size, and one of the array drives is allocated to store parity information:

RAID-3 relies on the ECC codes stored in each sector to detect errors. If one of the drives fails, the information stored on it can be restored by computing the exclusive OR (XOR) of the information on the remaining drives. Each record is typically distributed across all drives, so this type of array is good for disk-intensive applications. Since each I/O operation accesses all the disk drives in the array, RAID-3 cannot perform multiple operations simultaneously; it is therefore good for single-user, single-tasking environments with long records. When working with short records, the rotation of the drives must be synchronized, since otherwise a reduction in exchange speed is inevitable. RAID-3 is rarely used, since it is inferior to RAID-5 in terms of disk space usage, and its implementation requires significant costs.

RAID-4. RAID-4 is identical to RAID-3 except that the stripe size is much larger than one sector. In this case, reads are performed from a single drive (not counting the drive that stores parity information), so multiple read operations can be performed simultaneously. However, since each write operation must update the contents of the parity drive, it is not possible to perform multiple write operations simultaneously. This type of array does not have any noticeable advantages over a RAID-5 array.

RAID-5. This type of array is sometimes called a "rotating parity array." It successfully overcomes the inherent disadvantage of RAID-4 - the inability to perform several write operations simultaneously. Like RAID-4, this array uses large stripes, but, unlike RAID-4, the parity information is stored not on one drive but on all drives in turn:

Write operations access one drive with data and another drive with parity information. Since parity information for different stripes is stored on different drives, performing multiple simultaneous writes is impossible only in those rare cases where either the data stripes or the stripes with parity information are on the same drive. The more drives in the array, the less often the location of the information and parity stripes coincides.

Scope of application: reliable large-volume arrays. Implementation requires significant costs.

The minimum number of drives in the array is 3.

RAID-1 or RAID-5?

RAID-5, compared to RAID-1, uses disk space more economically, since for redundancy it stores not a "copy" of the information but a check number. As a result, RAID-5 can combine any number of drives, of which only the equivalent of one will contain redundant information.

But the higher disk space efficiency comes at the expense of a lower information exchange rate. When writing to RAID-5, the parity information must be updated each time. To do this, it must be determined which parity bits have changed. First the old information to be updated is read. It is then XORed with the new information. The result of this operation is a bit mask in which each bit set to 1 means that the value in the parity information at the corresponding position must be flipped. The updated parity information is then written to the appropriate location. Therefore, for each program request to write information, RAID-5 performs two reads, two writes, and two XOR operations.
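Here is a sketch of that parity update for one block position, with small integers standing in for disk blocks:

    def raid5_small_write(old_data, old_parity, new_data):
        # Read old data and old parity, XOR twice, then write new data and
        # new parity: two reads, two XORs, two writes per small update.
        change_mask = old_data ^ new_data     # 1-bits mark positions that flipped
        return old_parity ^ change_mask       # flip the same positions in parity

    old_d, old_p, new_d = 0b1100, 0b1001, 0b1010
    print(format(raid5_small_write(old_d, old_p, new_d), "04b"))  # -> 1111
    # The other data disks are never touched, which is what makes this
    # read-modify-write update cheaper than recomputing the whole stripe.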

This is the price of more efficient use of disk space (storing a parity block instead of a copy of the data): extra time is required to generate and write the parity information. As a result, the write speed on RAID-5 is lower than on RAID-1 by a ratio of 3:5 or even 1:3 (i.e., the RAID-5 write speed is 3/5 to 1/3 of the RAID-1 write speed). Because of this, it is pointless to build RAID-5 in software. Nor can it be recommended where write speed is critical.

Which RAID implementation method should you choose – software or hardware?

After reading the descriptions of the various RAID levels, you will notice that no specific hardware requirements for implementing RAID are mentioned anywhere. From this one might conclude that all that is needed to implement RAID is to connect the required number of disk drives to the controller already in the computer and install special software. This is true, but not entirely!

Indeed, RAID can be implemented in software. An example is Microsoft Windows NT 4.0 Server, in which software implementation of RAID-0, -1 and even RAID-5 is possible. However, this solution should be considered extremely simplified, not allowing the capabilities of a RAID array to be fully realized. Suffice it to note that with a software RAID the entire burden of placing information on the drives, calculating the control codes, and so on falls on the CPU, which naturally does not improve the performance or reliability of the system. For the same reasons there are practically no service functions, and all operations of replacing a faulty drive, adding a new drive, or changing the RAID level are carried out with complete loss of data and a complete prohibition of any other operations. The only advantage of software RAID is its minimal cost.

A hardware implementation of RAID using dedicated RAID controllers provides far more possibilities:

  • a specialized controller takes the burden of RAID operations off the central processor, and its effect is the more noticeable the higher the RAID complexity level;
  • controllers, as a rule, come with drivers that allow a RAID to be created under almost any popular OS;
  • the controller's built-in BIOS and the bundled management programs allow the system administrator to easily connect, disconnect or replace drives in the RAID, create several RAID arrays (even at different levels), monitor the state of the disk array, etc. On "advanced" controllers these operations can be performed "on the fly", i.e. without turning off the system unit; many can be performed in the background, i.e. without interrupting current work, and even remotely, i.e. from any workstation (given the appropriate access);
  • controllers can be equipped with buffer memory (a "cache") that stores the most recently used blocks of data; with frequent access to the same files this can significantly increase the performance of the disk system.

The disadvantage of a hardware RAID implementation is the relatively high cost of RAID controllers. However, on the one hand, everything has its price (reliability, speed, service). On the other hand, recently, with the development of microprocessor technology, the cost of RAID controllers (especially the entry-level models) has fallen sharply and has become comparable to the cost of ordinary disk controllers, which makes it possible to install RAID systems not only in expensive mainframes but also in entry-level servers and even workstations.

© Andrey Egorov, 2005, 2006. TIM Group of Companies.

Forum visitors ask us: "Which RAID level is the most reliable?" Everyone knows that the most common level is RAID5, but it is not without serious drawbacks that are not obvious to non-specialists.

RAID 0, RAID 1, RAID 5, RAID 6, RAID 10: what are the RAID levels?

In this article, I will try to characterize the most popular RAID levels, and then formulate recommendations for using these levels. To illustrate the article, I created a diagram in which I placed these levels in the three-dimensional space of reliability, performance and cost efficiency.

JBOD (Just a Bunch of Disks) is a simple spanning of hard drives that is not formally a RAID level. A JBOD volume can be an array of a single disk or an aggregation of several disks. The RAID controller does not need to perform any calculations to operate such a volume. In our diagram the JBOD drive serves as the "unit", or starting point: its reliability, performance and cost are those of a single hard drive.

RAID 0 ("Striping") has no redundancy and distributes information immediately across all disks in the array in the form of small blocks ("stripes"). This significantly increases performance, but reliability suffers. As with JBOD, we get 100% of the disk capacity for our money.

Let me explain why the reliability of data storage on any composite volume decreases: if any of the hard drives in it fails, all the information is lost completely and irretrievably. Mathematically, in accordance with probability theory, the reliability of a RAID0 volume equals the product of the reliabilities of its constituent disks, each of which is less than one, so the total reliability is obviously lower than the reliability of any single disk.

RAID 1 ("Mirroring", the "mirror") is a good level. It protects against failure of half of the available hardware (in the general case, one of two hard drives), provides an acceptable write speed, and gains read speed thanks to the parallelization of requests. The disadvantage is that you pay the cost of two hard drives for the usable capacity of one.

Initially, a hard drive is assumed to be a reliable thing. Accordingly, the probability of both disks failing at once is equal (by the formula) to the product of the individual probabilities, i.e. orders of magnitude lower! Unfortunately, real life is not theory! The two hard drives come from the same batch and operate under the same conditions, and if one of them fails, the load on the remaining one increases; so in practice, if one of the disks fails, urgent measures must be taken to restore redundancy. For this purpose, with any RAID level (except zero), it is recommended to use HotSpare hot-spare disks. The advantage of this approach is constant reliability; the disadvantage is even greater cost (i.e. the cost of three hard drives to store the volume of one).
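
The probability argument is easy to reproduce in a few lines of Python (the per-disk survival probability is a made-up figure, and the disks are treated as independent, which, as just noted, real life does not guarantee):

    p = 0.95  # hypothetical probability that one disk survives a given period

    raid0_survives = p * p            # RAID 0 dies if EITHER disk dies
    raid1_survives = 1 - (1 - p)**2   # RAID 1 dies only if BOTH disks die

    print(f"single disk:     {p:.4f}")
    print(f"RAID 0, 2 disks: {raid0_survives:.4f}")  # 0.9025, worse than one disk
    print(f"RAID 1, 2 disks: {raid1_survives:.4f}")  # 0.9975, better than one disk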

A mirror on many disks is the RAID 10 level. With this level, mirrored pairs of disks are arranged in a "chain", so the resulting volume can exceed the capacity of a single hard drive. The advantages and disadvantages are the same as for RAID1. As in other cases, it is recommended to include HotSpare hot-spare disks in the array, at the rate of one spare for every five working disks.

RAID 5 is indeed the most popular of the levels, primarily thanks to its economy. By sacrificing the capacity of just one disk of the array to redundancy, we gain protection against the failure of any one of the volume's hard drives. Writing to a RAID5 volume requires additional resources, since extra calculations are needed, but reading (compared to a separate hard drive) brings a gain, because the data streams from the several drives of the array are parallelized.

The disadvantages of RAID5 show up when one of the disks fails: the entire volume goes into critical mode, all write and read operations are accompanied by additional manipulations, performance drops sharply, and the disks begin to heat up. If immediate action is not taken, the entire volume may be lost. Therefore (see above), a Hot Spare disk should definitely be used with a RAID5 volume.

In addition to the basic levels RAID0 through RAID5 described in the standard, there are combined levels RAID10, RAID30, RAID50, RAID15, which different manufacturers interpret differently.

The essence of such combinations is briefly as follows. RAID10 is a combination of one and zero (see above). RAID50 is a level "0" built from level 5 volumes. RAID15 is a "mirror" of "fives". And so on.

Thus, the combined levels inherit the advantages (and disadvantages) of their "parents". The appearance of a "zero" in RAID 50 adds nothing to its reliability, but has a positive effect on performance. RAID 15 is probably very reliable, but it is not the fastest and, moreover, extremely uneconomical: the useful capacity of the volume is less than half the capacity of the original disk array.

RAID 6 differs from RAID 5 in that each row of data (stripe) has not one but two checksum blocks. The checksums are "multidimensional", i.e. independent of each other, so even the failure of two disks in the array leaves the original data recoverable. Calculating checksums by the Reed-Solomon method requires more intensive computation than RAID5, which is why the sixth level used to be practically unused. Now it is supported by many products, since manufacturers have begun to install specialized chips that perform all the necessary mathematical operations.

According to some studies, restoring integrity after a single disk failure on a RAID5 volume composed of large SATA disks (400 and 500 gigabytes) ends in data loss in 5% of cases. In other words, in one case out of twenty, the second disk may fail while a RAID5 array is being regenerated onto a Hot Spare disk... Hence the recommendations: 1) always make backups; 2) use RAID6!

Recently the new levels RAID1E, RAID5E and RAID5EE have appeared. The letter "E" in the name stands for Enhanced.

RAID level-1 Enhanced (RAID level-1E) combines mirroring and data striping. This mixture of levels 0 and 1 is arranged as follows: the data in a row is distributed exactly as in RAID 0, i.e. the data row itself has no redundancy; the next row of blocks copies the previous one with a shift of one disk. Thus, as in standard RAID 1, every data block has a mirror copy on one of the disks, so the useful volume of the array equals half the total volume of its hard drives. RAID 1E requires three or more drives (see the sketch below).
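
A minimal Python sketch of this layout for a three-disk array (the shift direction is an assumption for illustration; implementations vary):

    N_DISKS = 3  # RAID 1E needs three or more drives

    def raid1e_layout(n_blocks, n_disks=N_DISKS):
        """Lay out data rows striped as in RAID 0, each followed by a shifted mirror row."""
        rows = []
        for base in range(0, n_blocks, n_disks):
            data_row = [f"D{base + i}" for i in range(n_disks)]
            mirror_row = data_row[-1:] + data_row[:-1]  # same row shifted by one disk
            rows.append(data_row)
            rows.append(mirror_row)
        return rows

    for row in raid1e_layout(6):
        print(row)
    # ['D0', 'D1', 'D2']   <- data row, no redundancy
    # ['D2', 'D0', 'D1']   <- its mirror, shifted by one disk
    # ['D3', 'D4', 'D5']
    # ['D5', 'D3', 'D4']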

I really like the RAID1E level. For a powerful graphics workstation or even a home computer it is an optimal choice! It has all the advantages of the zero and first levels: excellent speed and high reliability.

Now let's move on to RAID level-5 Enhanced (RAID level-5E). This is the same as RAID5, only with a spare drive built into the array. The integration is done as follows: on every disk of the array, 1/N of the space is left free and is used as a hot spare if one of the disks fails. Thanks to this, RAID5E demonstrates better performance along with reliability, since reading/writing is parallelized across a larger number of drives and the spare drive does not stand idle, as it does in RAID5. Obviously, a spare drive included in the volume cannot be shared with other volumes (dedicated vs. shared). A RAID 5E volume is built on a minimum of four physical disks. The useful capacity of the logical volume equals that of N-2 disks.

RAID level-5E Enhanced (RAID level-5EE) is similar to RAID level-5E, but it distributes the spare drive's space more efficiently and, as a result, recovers faster. Like RAID5E, this level distributes blocks of data and checksums in rows, but it also distributes the free blocks of the spare drive instead of simply reserving part of the disk space for them. This reduces the time needed to reconstruct the integrity of a RAID5EE volume. The spare disk included in the volume cannot be shared with other volumes, as in the previous case. A RAID 5EE volume is built on a minimum of four physical disks. The useful capacity of the logical volume equals that of N-2 disks.
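
The capacity arithmetic quoted for these levels is easy to check with a small script (the disk counts and sizes are made-up figures for illustration):

    def usable_capacity_gb(n_disks, disk_size_gb, level):
        """Usable capacity under the rules described above."""
        if level == "RAID5":                  # one disk's worth of parity
            return (n_disks - 1) * disk_size_gb
        if level in ("RAID5E", "RAID5EE"):    # parity plus an integrated spare
            return (n_disks - 2) * disk_size_gb
        raise ValueError(f"unknown level: {level}")

    print(usable_capacity_gb(4, 500, "RAID5"))    # 1500 GB
    print(usable_capacity_gb(4, 500, "RAID5EE"))  # 1000 GB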

Oddly enough, I could not find any mention of a RAID 6E level on the Internet: so far no manufacturer offers or has even announced such a level. Yet a RAID6E (or RAID6EE?) level could be proposed on the same principle as the previous one. A HotSpare disk should necessarily accompany any RAID volume, including RAID 6. Of course, we will not lose information if one or two disks fail, but it is extremely important to begin regenerating the integrity of the array as early as possible, to bring the system out of "critical" mode quickly. Since the need for a Hot Spare disk is beyond doubt for us, it would be logical to go further and "spread" it over the volume, as is done in RAID 5EE, in order to gain the benefits of using a larger number of disks (better read/write speed and faster integrity recovery).

RAID levels in “numbers”.

I have collected in a table some important parameters of almost all the RAID levels, so that you can compare them with each other and better understand their essence.

(The table itself did not survive conversion. Its columns were: Level; Redundancy; Disk capacity utilization; Read performance; Write performance; Built-in spare disk; Min. number of disks; Max. number of disks.)

All “mirror” levels are RAID 1, 1+0, 10, 1E, 1E0.

Let's try once more to understand thoroughly how these levels differ.

RAID 1.
This is a classic “mirror”. Two (and only two!) hard drives work as one, being a complete copy of each other. Failure of either of these two drives does not result in loss of your data, as the controller continues to operate on the remaining drive. RAID1 in numbers: 2x redundancy, 2x reliability, 2x cost. Write performance is equivalent to that of a single hard drive. Read performance is higher because the controller can distribute read operations between two disks.

RAID 10.
The essence of this level is that the disks of the array are combined in pairs into "mirrors" (RAID 1), and then all these mirrored pairs are, in turn, combined into a common striped array (RAID 0). That is why it is sometimes written as RAID 1+0. An important point: RAID 10 can only combine an even number of disks (minimum 4, maximum 16). Advantages: reliability is inherited from the "mirror", and performance (both read and write) from the "zero".
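
As a rough sketch, the mapping of logical blocks onto mirrored pairs can be written like this in Python (a four-disk array is assumed for illustration):

    def raid10_location(logical_block, n_disks):
        """Return (stripe_row, pair_of_disks) for a logical block."""
        assert n_disks % 2 == 0 and n_disks >= 4, "RAID 10 needs an even number of disks, at least 4"
        n_pairs = n_disks // 2
        stripe = logical_block // n_pairs
        pair = logical_block % n_pairs
        return stripe, (2 * pair, 2 * pair + 1)  # the block lives on both disks of the pair

    for block in range(4):
        stripe, disks = raid10_location(block, n_disks=4)
        print(f"logical block {block}: stripe {stripe}, mirrored on disks {disks}")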

RAID 1E.
The letter "E" in the name means "Enhanced", i.e. "improved". The principle of the improvement is as follows: the data is striped in blocks across all disks of the array and then striped again with a shift of one disk. RAID 1E can combine from three to 16 disks. Reliability matches that of the "ten", and performance becomes a little better thanks to the greater "alternation".

RAID 1E0.
This level is implemented like this: a "zero" array is built out of RAID1E arrays. Hence the total number of disks must be a multiple of three: a minimum of three and a maximum of sixty! In this case we are unlikely to gain speed, and the complexity of the implementation may adversely affect reliability. The main advantage is the ability to combine a very large number of disks (up to 60) into a single array.

What all the RAID 1X levels have in common is their redundancy: for the sake of reliability, exactly 50% of the total capacity of the array's disks is sacrificed.


When working with disk subsystems, IT specialists often face two main problems.

  • The first is low read/write speed; sometimes even the speed of an SSD drive is not enough.
  • The second is disk failure, which means loss of data whose recovery may prove impossible.

Both of these problems are solved by RAID technology (redundant array of independent disks): a storage virtualization technology that combines several physical disks into one logical element.

Depending on the selected RAID specification, read/write speeds and/or data loss protection may be improved.

The basic RAID specification levels are: 0, 1, 2, 3, 4, 5, 6. In addition, there are combinations: 01, 10, 50, 05, 60, 06. In this article we will look at the most common types of RAID arrays. But first, note that RAID arrays come in hardware and software varieties.

Hardware and software RAID arrays

  • Software arrays are created after the installation of the Operating System using software products and utilities, which is the main disadvantage of such disk arrays.
  • Hardware RAIDs create a disk array before installing the Operating System and are not dependent on it.

RAID 1

RAID 1 (also called "Mirror") involves complete duplication of data from one physical disk to another.

The disadvantage of RAID 1 is that you get half the disk space. I.e., if you use TWO 250 GB disks, the system will see only ONE disk 250 GB in size. This type of RAID gives no gain in speed, but it significantly increases fault tolerance, because if one disk fails there is always a complete copy on the other. Writing and erasing happen on both disks simultaneously, so if information was deleted intentionally, there is no way to restore it from the other disk.

RAID 0

RAID 0 (also called Striping) involves dividing information into blocks and simultaneously writing different blocks to different disks.

This technology increases the read/write speed and lets the user exploit the full total capacity of the disks, but it reduces fault tolerance, or rather reduces it to zero: if one of the disks fails, restoring the information is next to impossible. For building RAID 0 it is recommended to use only highly reliable disks.
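
A minimal sketch of this round-robin block distribution in Python (a two-disk array is assumed purely for illustration):

    def raid0_location(logical_block, n_disks):
        """Return (disk, stripe_row) for a logical block in a RAID 0 array."""
        disk = logical_block % n_disks      # which drive holds the block
        stripe = logical_block // n_disks   # position of the block on that drive
        return disk, stripe

    for block in range(6):
        disk, stripe = raid0_location(block, n_disks=2)
        print(f"logical block {block} -> disk {disk}, stripe {stripe}")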

RAID 5

RAID 5 can be called a more advanced RAID 0. It uses three or more hard drives: data is written, as in RAID 0, to all disks but one, and a special checksum is written to the remaining one, which allows the information on the hard drives to be saved if one of them "dies" (but no more than one). The operating speed of such an array is high, but replacing a failed disk and rebuilding the array takes a long time.
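
The recovery relies on the XOR parity described earlier in this document; a toy Python example (the block values are arbitrary):

    def xor_blocks(blocks):
        """XOR a list of integer blocks together."""
        result = 0
        for b in blocks:
            result ^= b
        return result

    # toy stripe: three data blocks plus their parity
    data = [0b1010, 0b0110, 0b0001]
    parity = xor_blocks(data)

    # suppose the disk holding data[1] "dies"; XOR of the survivors and the
    # parity reconstructs the lost block
    recovered = xor_blocks([data[0], data[2], parity])
    assert recovered == data[1]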

RAID 2, 3, 4

These are methods of distributed information storage using disks dedicated to parity codes. They differ from one another only in block size. In practice they are rarely used, because a large share of disk capacity has to be given over to storing ECC and/or parity codes, and because of low performance.

RAID 10

It is a mix of RAID 1 and RAID 0, combining the advantages of each: high performance and high fault tolerance.

The array must contain an even number of disks (minimum 4) and is the most reliable option for storing information. The disadvantage is the high cost of the disk array: the effective capacity is half of the total capacity of the disk space.

RAID 50

It is a mix of RAID 5 and RAID 0: a RAID 5 is built, but its components are not independent hard drives but RAID 0 arrays.

Peculiarities.

If the RAID controller breaks down, it is almost impossible to restore the information (this does not apply to the Mirror). Even if you buy exactly the same controller, there is a high probability that the RAID will be assembled from different disk sectors, which means the information on the disks will be lost.

As a rule, disks are purchased in one batch, so their service life is likely to be about the same. It is therefore recommended to buy a spare right away, at the time of purchasing the disks for the array. For example, to configure RAID 10 from 4 disks, you should buy 5, so that if one of them fails you can quickly replace it with a new one before the other disks fail.

Conclusions.

In practice, most often only three types of RAID arrays are used. These are RAID 1, RAID 10 and RAID 5.

In terms of cost/performance/fault tolerance, it is recommended to use:

  • RAID 1 (mirroring) for the disk subsystem hosting operating systems.
  • RAID 10 for data with high write and read speed requirements, for example for storing 1C:Enterprise databases, a mail server, or AD.
  • RAID 5 for storing file data.

According to most system administrators, the ideal server solution is a server with six disks: two disks are "mirrored" into RAID 1, on which the operating system is installed, and the four remaining drives are combined into RAID 10 for fast, trouble-free, reliable system operation.