Applies to SUSE Linux Enterprise Server 15 SP4

7 Software RAID configuration #


The purpose of RAID (redundant array of independent disks) is to combine several hard disk partitions into one large virtual hard disk to optimize performance, data security, or both. Most RAID controllers use the SCSI protocol, because it can address a larger number of hard disks more effectively than the IDE protocol and is better suited to processing commands in parallel. Some RAID controllers support IDE or SATA hard disks. Software RAID provides the advantages of RAID systems without the additional cost of hardware RAID controllers. However, it consumes CPU time and memory, which makes it unsuitable for truly high-performance computers.

Important: RAID on cluster file systems

Software RAID underneath clustered file systems needs to be set up using a cluster multi-device (Cluster MD). Refer to the Administration Guide for SUSE Linux Enterprise High Availability.

SUSE Linux Enterprise offers the option of combining several hard disks into one soft RAID system. RAID covers several strategies for combining hard disks, each with different goals, advantages, and characteristics. These variations are commonly known as RAID levels.

7.1 Understanding RAID levels #

This section describes common RAID levels 0, 1, 2, 3, 4, 5, and 6, as well as nested RAID levels.

7.1.1 RAID 0 #

This level improves the performance of your data access by spreading out blocks of each file across multiple disks. Strictly speaking, this is not a RAID, because it does not provide any data redundancy, but the name RAID 0 for this type of system has become the norm. With RAID 0, two or more hard disks are pooled together. The performance is very good, but the RAID system is destroyed and your data is lost if even one hard disk fails.
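The striping idea can be sketched with a little shell arithmetic. This is a simplified illustration only, assuming equally sized member disks and a plain round-robin layout; the function name raid0_locate is hypothetical and is not part of any RAID tool.

```shell
# Map a logical chunk number to the member disk that holds it and to the
# chunk's position on that disk, for an idealized RAID 0 array.
raid0_locate() {
    chunk=$1      # logical chunk number, counting from 0
    ndisks=$2     # number of member disks in the array
    echo "disk=$(( chunk % ndisks )) offset=$(( chunk / ndisks ))"
}

# Consecutive chunks on a two-disk array alternate between the disks.
for c in 0 1 2 3 4 5; do
    raid0_locate "$c" 2
done
```

Because neighboring chunks land on different disks, large reads and writes can be served by all disks at once, and for the same reason a single failed disk leaves gaps throughout every large file.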

7.1.2 RAID 1 #

This level provides adequate security for your data, because the data is copied 1:1 to another hard disk. This is known as hard disk mirroring. If a disk is destroyed, a copy of its contents is available on another mirrored disk. All disks except one could be damaged without endangering your data. However, if damage is not detected, the damaged data might be mirrored to the intact disk, corrupting the data there as well. Write performance suffers a little from the copying process compared to single disk access (10 to 20 percent slower), but read access is significantly faster than on any single physical hard disk, because the data is duplicated and can be read in parallel. RAID 1 generally provides nearly twice the read transaction rate of single disks and almost the same write transaction rate as single disks.

7.1.3 RAID 2 and RAID 3 #

These are not typical RAID implementations. Level 2 stripes data at the bit level rather than the block level. Level 3 provides byte-level striping with a dedicated parity disk and cannot service simultaneous multiple requests. Both levels are rarely used.

7.1.4 RAID 4 #

Level 4 provides block-level striping like Level 0 combined with a dedicated parity disk. If a data disk fails, the parity data is used to create a replacement disk. However, the parity disk might create a bottleneck for write access. Nevertheless, Level 4 is sometimes used.

7.1.5 RAID 5 #

RAID 5 is an optimized compromise between Level 0 and Level 1 in terms of performance and redundancy. The usable hard disk space equals the capacity of the number of disks used minus one. The data is distributed over the hard disks as with RAID 0. Parity blocks, distributed across the member partitions, are there for security reasons. The data blocks of a stripe are linked to each other with XOR, so that the contents of a missing block can be reconstructed from the remaining blocks and the corresponding parity block if a disk fails. With RAID 5, no more than one hard disk can fail at the same time. If one hard disk fails, it must be replaced as soon as possible to avoid the risk of losing data.
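The XOR relationship can be tried out on small numbers. The following is a minimal sketch, not how the kernel implements RAID 5: each "block" is a single integer, and the helper name xor_blocks is made up for this illustration.

```shell
# XOR all arguments together and print the result.
xor_blocks() {
    result=0
    for b in "$@"; do
        result=$(( result ^ b ))
    done
    echo "$result"
}

# Three data blocks of one stripe, represented as single bytes.
d0=165; d1=60; d2=15

# The parity block is the XOR of all data blocks in the stripe.
parity=$(xor_blocks "$d0" "$d1" "$d2")

# Assume the disk holding d1 fails: XOR the surviving data blocks with
# the parity block to get the lost block back.
recovered=$(xor_blocks "$d0" "$d2" "$parity")

echo "parity=$parity recovered=$recovered"
```

The recovered value equals d1 (60). If a second block of the same stripe were missing as well, the XOR would no longer have enough information, which is why RAID 5 tolerates only one failed disk.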

7.1.6 RAID 6 #

RAID 6 is an extension of RAID 5 that allows for additional fault tolerance by using a second independent distributed parity scheme (dual parity). Even if two of the hard disks fail during the data recovery process, the system continues to be operational, with no data loss.

RAID 6 provides extremely high data fault tolerance by sustaining multiple simultaneous drive failures. It handles the loss of any two devices without data loss. Accordingly, it requires N+2 drives to store N drives' worth of data. It requires a minimum of four devices.

The performance for RAID 6 is slightly lower but comparable to RAID 5 in normal mode and single disk failure mode. It is very slow in dual disk failure mode. A RAID 6 configuration needs a considerable amount of CPU time and memory for write operations.

Table 7.1: Comparison of RAID 5 and RAID 6 #

Feature	RAID 5	RAID 6
Number of devices	N+1, minimum of 3	N+2, minimum of 4
Parity	Distributed, single	Distributed, dual
Performance	Medium impact on write and rebuild	More impact on sequential write than RAID 5
Fault-tolerance	Failure of one component device	Failure of two component devices
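The device counts from this section can be turned into a quick capacity estimate. The sketch below is an illustration under simplifying assumptions (all members have the same size, metadata overhead is ignored, RAID 1 is treated as one device's worth of data regardless of the number of mirrors); the function name raid_usable_devices is hypothetical.

```shell
# Print how many devices' worth of data an array can store.
# Usage: raid_usable_devices LEVEL NUMBER_OF_DEVICES
raid_usable_devices() {
    level=$1
    n=$2
    case "$level" in
        0) min=2; usable=$n ;;            # striping only, no redundancy
        1) min=2; usable=1 ;;             # every member holds a full copy
        5) min=3; usable=$(( n - 1 )) ;;  # N+1: one device's worth of parity
        6) min=4; usable=$(( n - 2 )) ;;  # N+2: two devices' worth of parity
        *) echo "unsupported RAID level: $level" >&2; return 2 ;;
    esac
    if [ "$n" -lt "$min" ]; then
        echo "RAID $level needs at least $min devices" >&2
        return 1
    fi
    echo "$usable"
}

# Four devices: RAID 5 stores three devices' worth of data, RAID 6 two.
echo "RAID 5: $(raid_usable_devices 5 4)  RAID 6: $(raid_usable_devices 6 4)"
```

Multiplying the result by the size of the smallest member gives a rough figure for the usable space.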

7.1.7 Nested and complex RAID levels #

Other RAID levels have been developed, such as RAIDn, RAID 10, RAID 0+1, RAID 30, and RAID 50. Some are proprietary implementations created by hardware vendors. Examples for creating RAID 10 configurations can be found in Chapter 9, Creating software RAID 10 devices.

7.2 Soft RAID configuration with YaST #

The YaST soft RAID configuration can be reached from the YaST Expert Partitioner. This partitioning tool also enables you to edit and delete existing partitions and create new ones that should be used with soft RAID. These instructions apply to setting up RAID levels 0, 1, 5, and 6. Setting up RAID 10 configurations is explained in Chapter 9, Creating software RAID 10 devices.

Launch YaST and open the Partitioner.
If necessary, create partitions that should be used with your RAID configuration. Do not format them and set the partition type to 0xFD Linux RAID. When using existing partitions, it is not necessary to change their partition type; YaST will do so automatically. Refer to Section 10.1, “Using the Expert Partitioner” for details.
It is strongly recommended to use partitions stored on different hard disks to decrease the risk of losing data if one is defective (RAID 1 and 5) and to optimize the performance of RAID 0.
For RAID 0 at least two partitions are needed. RAID 1 requires exactly two partitions, while at least three partitions are required for RAID 5. A RAID 6 setup requires at least four partitions. It is recommended to use only partitions of the same size because each segment can contribute only the same amount of space as the smallest sized partition.
In the left panel, select RAID.
A list of existing RAID configurations opens in the right panel.
At the lower left of the RAID page, click Add RAID.
Select a RAID Type and Add an appropriate number of partitions from the Available Devices dialog.
You can optionally assign a RAID Name to your RAID. It will make it available as /dev/md/NAME. See Section 7.2.1, “RAID names” for more information.
Figure 7.1: Example RAID 5 configuration #

Proceed with Next.
Select the Chunk Size and, if applicable, the Parity Algorithm. The optimal chunk size depends on the type of data and the type of RAID. See https://raid.wiki.kernel.org/index.php/RAID_setup#Chunk_sizes for more information. More information on parity algorithms can be found with man 8 mdadm when searching for the --layout option. If unsure, stick with the defaults.
Choose a Role for the volume. Your choice here only affects the default values for the upcoming dialog. They can be changed in the next step. If in doubt, choose Raw Volume (Unformatted).
Under Formatting Options, select Format Partition, then select the File system. The content of the Options menu depends on the file system. Usually there is no need to change the defaults.
Under Mounting Options, select Mount partition, then select the mount point. Click Fstab Options to add special mounting options for the volume.
Click Finish.
Click Next, verify that the changes are listed, then click Finish.
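For comparison, an array can also be created on the command line with mdadm --create. The sketch below only prints a candidate command so that it can be reviewed before being run with root privileges. The helper name print_mdadm_create and the partitions /dev/sdb1, /dev/sdc1, and /dev/sdd1 are assumptions made for this example; replace them with your own unformatted RAID partitions.

```shell
# Print (do not run) an mdadm command line that would create a new array.
# Usage: print_mdadm_create NAME LEVEL DEVICE...
print_mdadm_create() {
    name=$1
    level=$2
    shift 2
    # After the shift, $# is the number of member devices and $* lists them.
    echo "mdadm --create /dev/md/$name --level=$level --raid-devices=$# $*"
}

# Example: a RAID 5 named myRAID built from three hypothetical partitions.
print_mdadm_create myRAID 5 /dev/sdb1 /dev/sdc1 /dev/sdd1
```

Printing first is a deliberate choice: creating an array overwrites data on the member partitions, so it is worth checking the device list before executing the command with sudo.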

Important: RAID on disks

While the partitioner makes it possible to create a RAID on top of disks instead of partitions, we do not recommend this approach for a number of reasons. Installing a boot loader on such a RAID is not supported, so you need to use a separate device for booting. Tools like fdisk and parted do not work properly with such RAIDs, which may lead to incorrect diagnosis and actions by a person who is unaware of the RAID's particular setup.

7.2.1 RAID names #

By default, software RAID devices have numeric names following the pattern mdN, where N is a number. As such they can be accessed as, for example, /dev/md127 and are listed as md127 in /proc/mdstat and /proc/partitions. Working with these names can be clumsy. SUSE Linux Enterprise Server offers two ways to work around this problem:

Providing a named link to the device

You can optionally specify a name for the RAID device when creating it with YaST or on the command line with mdadm --create '/dev/md/NAME'. The device name will still be mdN, but a link /dev/md/NAME will be created:

> ls -og /dev/md
total 0
lrwxrwxrwx 1 8 Dec  9 15:11 myRAID -> ../md127

The device will still be listed as md127 under /proc.
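When a script needs the mdN name behind such a link, for example to look the array up in /proc/mdstat, the link can be resolved. The following is a small sketch that assumes GNU readlink is available; the function name md_kernel_name is hypothetical.

```shell
# Print the kernel device name (for example md127) behind a /dev/md/NAME link.
md_kernel_name() {
    # readlink -f follows the symbolic link to the real device node.
    target=$(readlink -f "$1") || return 1
    basename "$target"
}

# Example call; prints a notice instead if the link does not exist.
md_kernel_name /dev/md/myRAID || echo "no link /dev/md/myRAID on this system"
```

With the link from the listing above, the function would print md127.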

Providing a named device

In case a named link to the device is not sufficient for your setup, add the line CREATE names=yes to /etc/mdadm.conf by running the following command:

> echo "CREATE names=yes" | sudo tee -a /etc/mdadm.conf

This will cause names like myRAID to be used as a “real” device name. The device will not only be accessible at /dev/myRAID, but also be listed as myRAID under /proc. Note that this will only apply to RAIDs configured after the change to the configuration file. Active RAIDs will continue to use the mdN names until they get stopped and re-assembled.

Warning: Incompatible tools

Some tools may not support named RAID devices. If a tool expects a RAID device to be named mdN, it will fail to identify devices that use other names.

7.3 Configuring stripe size on RAID 5 on AArch64 #

By default, the stripe size is set to 4 kB. If you need to change the default stripe size, for example, to match the typical page size of 64 kB on AArch64, you can configure the stripe size manually on the command line:

> echo 16384 | sudo tee /sys/block/md1/md/stripe_size

The above command sets the stripe size to 16 kB. You can set other values, such as 4096 or 8192, but the value must be a power of 2.
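Because the value must be a power of 2, a script can check a candidate value before writing it. This is a minimal sketch; the function name is_power_of_two is made up for the example, and the check says nothing about any upper or lower limits the kernel may enforce.

```shell
# Succeed if the argument is a positive power of 2.
is_power_of_two() {
    n=$1
    # A power of 2 has exactly one bit set, so n & (n - 1) is 0.
    [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

size=16384
if is_power_of_two "$size"; then
    echo "$size is a valid stripe size"
else
    echo "$size is not a power of 2" >&2
fi
```

A value such as 12288 (12 kB) would be rejected by this check.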

7.4 Monitoring software RAIDs #

You can run mdadm as a daemon in the monitor mode to monitor your software RAID. In the monitor mode, mdadm performs regular checks on the array for disk failures. If there is a failure, mdadm sends an email to the administrator. To define the time interval of the checks, run the following command:

> sudo mdadm --monitor --mail=root@localhost --delay=1800 /dev/md2

The command above turns on monitoring of the /dev/md2 array in intervals of 1800 seconds. In the event of a failure, an email will be sent to root@localhost.
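For a quick manual check in addition to the monitor, the status lines in /proc/mdstat can be inspected. There, each array shows a string such as [UU], where an underscore marks a missing or failed member. The sketch below is an illustration, not a replacement for mdadm --monitor; the function name degraded_arrays is hypothetical.

```shell
# Print the names of arrays whose member status string contains "_".
# Reads /proc/mdstat by default, or the file given as the first argument.
degraded_arrays() {
    awk '
        # Array header lines look like "md1 : active raid1 sdd1[1]".
        $2 == ":" && $1 != "Personalities" { name = $1; next }
        # The following status line ends in a string such as [UU] or [_U].
        name != "" && match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_"))
                print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Running degraded_arrays without an argument on a system with software RAID prints one line per degraded array and nothing when all members are present.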

Note: RAID checks are enabled by default

The RAID checks are enabled by default. If the interval between checks is too short, you may encounter warnings. In that case, increase the interval by setting a higher value with the --delay option.

7.5 More information #

Configuration instructions and more details for soft RAID can be found in the HOWTOs at:

The Linux RAID wiki: https://raid.wiki.kernel.org/
The Software RAID HOWTO in the /usr/share/doc/packages/mdadm/Software-RAID.HOWTO.html file

Linux RAID mailing lists are also available, such as linux-raid at http://marc.info/?l=linux-raid.