Legal Notices
Copyright © 2006–2015 SUSE Linux GmbH. and contributors. All rights reserved.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”.
All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LINUX GmbH, the authors, nor the translators shall be held liable for possible errors or the consequences thereof.
Trademarks
| For Novell trademarks, see the Novell Trademark and Service Mark list. |
| Linux* is a registered trademark of Linus Torvalds. All other third party trademarks are the property of their respective owners. |
| A trademark symbol (®, ™, etc.) denotes a Novell trademark; an asterisk (*) denotes a third party trademark. |
This guide provides information about how to manage storage devices on a SUSE Linux Enterprise Server 11 Service Pack 4 (SP4) server.
This guide is intended for system administrators.
We want to hear your comments and suggestions about this manual and the other documentation included with this product. Please use the User Comments feature at the bottom of each page of the online documentation, or go to www.novell.com/documentation/feedback.html and enter your comments there.
For the most recent version of the SUSE Linux Enterprise Server 11 Storage Administration Guide, visit the SUSE Documentation Web site for SUSE Linux Enterprise Server 11.
For information about partitioning and managing devices, see “Advanced Disk Setup” in the SUSE Linux Enterprise Server 11 Deployment Guide.
SUSE Linux Enterprise Server ships with a number of different file systems from which to choose, including Btrfs, Ext3, Ext2, ReiserFS, and XFS. Each file system has its own advantages and disadvantages.
Professional high-performance setups might require highly available storage systems. To meet the requirements of high-performance clustering scenarios, SUSE Linux Enterprise Server includes OCFS2 (Oracle Cluster File System 2) and the Distributed Replicated Block Device (DRBD) in the SLES High-Availability Storage Infrastructure (HASI) release. These advanced storage systems are not covered in this guide. For information, see the SUSE Linux Enterprise 11 SP4 High Availability Extension Guide.
Metadata: A data structure that is internal to the file system. It assures that all of the on-disk data is properly organized and accessible. Essentially, it is “data about the data.” Almost every file system has its own structure of metadata, which is one reason that the file systems show different performance characteristics. It is extremely important to keep metadata intact, because otherwise all data on the file system could become inaccessible.
Inode: A data structure on a file system that contains various information about a file, including size, number of links, pointers to the disk blocks where the file contents are actually stored, and date and time of creation, modification, and access.
Journal: In the context of a file system, a journal is an on-disk structure containing a type of log in which the file system stores what it is about to change in the file system’s metadata. Journaling greatly reduces the recovery time of a file system because it has no need for the lengthy search process that checks the entire file system at system startup. Instead, only the journal is replayed.
SUSE Linux Enterprise Server offers a variety of file systems from which to choose. This section contains an overview of how these file systems work and which advantages they offer.
It is very important to remember that no file system best suits all kinds of applications. Each file system has its particular strengths and weaknesses, which must be taken into account. In addition, even the most sophisticated file system cannot replace a reasonable backup strategy.
The terms data integrity and data consistency, when used in this section, do not refer to the consistency of the user space data (the data your application writes to its files). Whether this data is consistent must be controlled by the application itself.
Unless stated otherwise in this section, all the steps required to set up or change partitions and file systems can be performed by using YaST.
Btrfs is a copy-on-write (COW) file system developed by Chris Mason. It is based on COW-friendly B-trees developed by Ohad Rodeh. Btrfs is a logging-style file system. Instead of journaling the block changes, it writes them in a new location, then links the change in. Until the last write, the new changes are not committed.
Because Btrfs is capable of storing snapshots of the file system, it is advisable to reserve twice the amount of disk space suggested by the standard storage proposal. This is done automatically by the YaST Partitioner in the Btrfs storage proposal for the root file system.
Btrfs provides fault tolerance, repair, and easy management features, such as the following:
Writable snapshots that allow you to easily roll back your system if needed after applying updates, or to back up files.
Multiple device support that allows you to grow or shrink the file system. The feature is planned to be available in a future release of the YaST Partitioner.
Compression to efficiently use storage space.
Use Btrfs commands to set up transparent compression. Compression and encryption functionality for Btrfs is under development and is currently not supported on SUSE Linux Enterprise Server.
Different RAID levels for metadata and user data.
Different checksums for metadata and user data to improve error detection.
Integration with Linux Logical Volume Manager (LVM) storage objects.
Integration with the YaST Partitioner and AutoYaST on SUSE Linux.
Offline migration from existing Ext2, Ext3, and Ext4 file systems.
Bootloader support for /boot on Btrfs is planned to
be available beginning in SUSE Linux Enterprise 12.
Btrfs creates a default subvolume in its assigned pool of space. It allows you to create additional subvolumes that act as individual file systems within the same pool of space. The number of subvolumes is limited only by the space allocated to the pool.
If Btrfs is used for the root (/) file system, the
YaST Partitioner automatically prepares the Btrfs file system for use
with Btrfs subvolumes. You can cover any subdirectory as a subvolume. For
example, Table 1.1, “Default Subvolume Handling for Btrfs in YaST”
identifies the subdirectories that we recommend you treat as subvolumes
because they contain files that you should not snapshot for the reasons
given:
| Path | Reason to Cover as a Subvolume |
|---|---|
| | Contains third-party add-on application software packages. |
| | Contains |
| | Contains temporary files. |
| | Contains memory dumps of crashed kernels. |
| | Contains system and applications’ log files, which should never be rolled back. |
| | Contains run-time variable data. |
| | Contains data that is awaiting processing by a program, user, or administrator, such as news, mail, and printer queues. |
| | Contains temporary files or directories that are preserved between system reboots. |
After the installation, you can add or remove Btrfs subvolumes by using the YaST Expert Partitioner. For information, see “Managing Btrfs Subvolumes using YaST” in the SUSE Linux Enterprise Server Deployment Guide.
Btrfs provides writable snapshots with the SUSE Snapper infrastructure
that allow you to easily roll back your system if needed after applying
updates, or to back up files. Snapper allows you to create and delete
snapshots, and to compare snapshots and revert the differences between
them. If Btrfs is used for the root (/) file system,
YaST automatically enables snapshots for the root file system.
For information about Snapper and its integration in ZYpp (snapper-zypp-plugin) and YaST (yast2-snapper), see “Snapshots/Rollback with Snapper” in the SUSE Linux Enterprise Server Administration Guide.
To prevent snapshots from filling up the system disk, you can make the Snapper cleanup defaults more aggressive in the /etc/snapper/configs/root configuration file, or in the configuration for other mount points. Snapper provides three algorithms to clean up old snapshots; they are executed in a daily cron job. The cleanup frequency is defined in the Snapper configuration for the mount point. Lower the TIMELINE_LIMIT parameters for daily, monthly, and yearly to reduce both how long snapshots are retained and how many are kept. For information, see “Adjusting the Config File” in the SUSE Linux Enterprise Server Administration Guide.
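As a rough sketch of lowering the limits, the following shell helper rewrites a KEY="value" line in a configuration file that uses shell-style assignments. The key names TIMELINE_LIMIT_DAILY, TIMELINE_LIMIT_MONTHLY, and TIMELINE_LIMIT_YEARLY, the sample value 10, and the new values are assumptions chosen for illustration; check the names and defaults in your own /etc/snapper/configs/root. The example deliberately works on a scratch copy, not on the live configuration.

```shell
#!/bin/bash
# Rewrite KEY="VALUE" in a config file that uses shell-style assignments.
set_limit() {
    local key=$1 value=$2 file=$3
    sed -i "s/^${key}=.*/${key}=\"${value}\"/" "$file"
}

# Scratch copy with assumed sample content; not the live configuration.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
TIMELINE_LIMIT_DAILY="10"
TIMELINE_LIMIT_MONTHLY="10"
TIMELINE_LIMIT_YEARLY="10"
EOF

# Keep fewer timeline snapshots (values are illustrative).
set_limit TIMELINE_LIMIT_DAILY 5 "$cfg"
set_limit TIMELINE_LIMIT_MONTHLY 3 "$cfg"
set_limit TIMELINE_LIMIT_YEARLY 1 "$cfg"
cat "$cfg"
```

To apply the same change for real, point set_limit at the configuration file of the mount point as root, after making a backup copy.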
For information about the SUSE Snapper project, see the Snapper Portal wiki at OpenSUSE.org.
The scrub check and repair functionality is available as part of the Btrfs command line tools. It verifies the integrity of data and metadata, assuming the tree structure is intact. You can run scrub periodically on a mounted file system; it runs as a background process during normal operation.
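A periodic scrub could be scripted along the following lines. This is a minimal sketch, assuming the btrfs scrub start subcommand is available in your version of the Btrfs tools; the function names are made up for this example. Because subvolumes of one file system can appear as several mount entries, the helper reports only the first mount point per device.

```shell
#!/bin/bash
# Print one mount point per Btrfs device from a /proc/mounts-style
# table read from standard input.
btrfs_mount_points() {
    awk '$3 == "btrfs" && !seen[$1]++ { print $2 }'
}

# Start a background scrub on each mounted Btrfs file system (needs root).
scrub_all() {
    local mp
    btrfs_mount_points < /proc/mounts | while read -r mp; do
        btrfs scrub start "$mp"
    done
}
```

Calling scrub_all from a weekly or monthly cron job is one possible schedule; btrfs scrub status followed by the mount point reports the progress of a running scrub.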
You can create Btrfs on Multiple Devices (MD) and Device Mapper (DM) storage configurations by using the YaST Partitioner.
You can migrate data volumes from existing Ext file systems (Ext2, Ext3, or Ext4) to the Btrfs file system. The conversion process occurs offline and in place on the device. The file system needs at least 15% of available free space on the device.
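The free-space requirement can be checked before the file system is taken offline. The following is a sketch only: the function names are hypothetical, and the check compares the available blocks reported by df against 15% of the total, which is one reasonable reading of the requirement.

```shell
#!/bin/bash
# Succeed if at least 15% of the file system's blocks are available.
# Arguments: total blocks, available blocks (same unit for both).
has_convert_headroom() {
    local total=$1 avail=$2
    [ "$total" -gt 0 ] && [ $((avail * 100)) -ge $((total * 15)) ]
}

# Read the values for a mounted file system from df (POSIX output format:
# column 2 is the total size, column 4 the available space).
check_mount() {
    local total avail
    read -r total avail < <(df -P "$1" | awk 'NR == 2 { print $2, $4 }')
    has_convert_headroom "$total" "$avail"
}
```

For example, check_mount /data && echo "enough free space" reports whether the file system mounted at /data (an assumed mount point) meets the threshold.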
To convert the Ext file system to Btrfs, take the file system offline, then enter:
btrfs-convert <device>
To roll back the migration to the original Ext file system, take the file system offline, then enter:
btrfs-convert -r <device>
When rolling back to the original Ext file system, all data that you added after the conversion to Btrfs will be lost. That is, only the original data is converted back to the Ext file system.
Btrfs is integrated in the YaST Partitioner and AutoYaST. It is available during the installation to allow you to set up a solution for the root file system. You can use the YaST Partitioner after the install to view and manage Btrfs volumes.
Btrfs administration tools are provided in the
btrfsprogs package. For information about using
Btrfs commands, see the btrfs(8),
btrfsck(8), mkfs.btrfs(8), and
btrfsctl(8) man pages. For information about Btrfs
features, see the
Btrfs
wiki.
The fsck.btrfs(8) tool will soon be available in the
SUSE Linux Enterprise update repositories.
The Btrfs root file system subvolumes
/var/log, /var/crash and
/var/cache can use all of the available disk space
during normal operation, and cause a system malfunction. To help avoid
this situation, SUSE Linux Enterprise now offers Btrfs quota support for
subvolumes. See the btrfs(8) manual page for more
details.
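As an illustration of limiting one of these subvolumes, the sketch below prints the two commands involved so that they can be reviewed before being run as root. It assumes that your version of the Btrfs tools provides the btrfs quota enable and btrfs qgroup limit subcommands; the 5G limit and the function name are arbitrary choices for this example.

```shell
#!/bin/bash
# Print the commands that would enable quotas on a Btrfs file system and
# cap the space used by one subvolume.
quota_commands() {
    local fs=$1 subvol=$2 limit=$3
    printf 'btrfs quota enable %s\n' "$fs"
    printf 'btrfs qgroup limit %s %s\n' "$limit" "$subvol"
}

quota_commands / /var/log 5G
```

Printing instead of executing keeps the example safe to try; the output can be pasted into a root shell once the paths and the limit have been checked.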
The origins of Ext2 go back to the early days of Linux history. Its predecessor, the Extended File System, was implemented in April 1992 and integrated in Linux 0.96c. The Extended File System underwent a number of modifications and, as Ext2, became the most popular Linux file system for years. With the creation of journaling file systems and their short recovery times, Ext2 became less important.
A brief summary of Ext2’s strengths might help understand why it was—and in some areas still is—the favorite Linux file system of many Linux users.
Being quite an “old-timer,” Ext2 underwent many improvements
and was heavily tested. This might be the reason why people often refer
to it as rock-solid. After a system outage when the file system could not
be cleanly unmounted, e2fsck starts to analyze the file system data.
Metadata is brought into a consistent state and pending files or data
blocks are written to a designated directory (called
lost+found). In contrast to journaling file systems,
e2fsck analyzes the entire file system and not just the recently modified
bits of metadata. This takes significantly longer than checking the log
data of a journaling file system. Depending on file system size, this
procedure can take half an hour or more. Therefore, it is not desirable
to choose Ext2 for any server that needs high availability. However,
because Ext2 does not maintain a journal and uses significantly less
memory, it is sometimes faster than other file systems.
Because Ext3 is based on the Ext2 code and shares its on-disk format as well as its metadata format, upgrades from Ext2 to Ext3 are very easy.
Ext3 was designed by Stephen Tweedie. Unlike all other next-generation file systems, Ext3 does not follow a completely new design principle. It is based on Ext2. These two file systems are very closely related to each other. An Ext3 file system can be easily built on top of an Ext2 file system. The most important difference between Ext2 and Ext3 is that Ext3 supports journaling. In summary, Ext3 has three major advantages to offer:
The code for Ext2 is the strong foundation on which Ext3 could become a highly-acclaimed next-generation file system. Its reliability and solidity are elegantly combined in Ext3 with the advantages of a journaling file system. Unlike transitions to other journaling file systems, such as ReiserFS or XFS, which can be quite tedious (making backups of the entire file system and recreating it from scratch), a transition to Ext3 is a matter of minutes. It is also very safe, because re-creating an entire file system from scratch might not work flawlessly. Considering the number of existing Ext2 systems that await an upgrade to a journaling file system, you can easily see why Ext3 might be of some importance to many system administrators. Downgrading from Ext3 to Ext2 is as easy as the upgrade. Just perform a clean unmount of the Ext3 file system and remount it as an Ext2 file system.
Some other journaling file systems follow the
“metadata-only” journaling approach. This means your
metadata is always kept in a consistent state, but this cannot be
automatically guaranteed for the file system data itself. Ext3 is
designed to take care of both metadata and data. The degree of
“care” can be customized. Enabling Ext3 in the
data=journal mode offers maximum security (data
integrity), but can slow down the system because both metadata and data
are journaled. A relatively new approach is to use the
data=ordered mode, which ensures both data and metadata
integrity, but uses journaling only for metadata. The file system driver
collects all data blocks that correspond to one metadata update. These
data blocks are written to disk before the metadata is updated. As a
result, consistency is achieved for metadata and data without sacrificing
performance. A third option to use is data=writeback,
which allows data to be written into the main file system after its
metadata has been committed to the journal. This option is often
considered the best in performance. It can, however, allow old data to
reappear in files after a crash and recovery, while internal file system
integrity is maintained. Ext3 uses the data=ordered
option as the default.
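The journaling mode is selected with the data= mount option. The following sketch builds an /etc/fstab line for a given mode and rejects anything other than the three modes described above. The device /dev/sdb1, the mount point /data, and the function name are assumptions for illustration.

```shell
#!/bin/bash
# Print an /etc/fstab line for an Ext3 file system with an explicit
# journaling mode. Only journal, ordered, and writeback are accepted.
ext3_fstab_line() {
    local device=$1 mountpoint=$2 mode=$3
    case $mode in
        journal|ordered|writeback) ;;
        *) echo "unknown data mode: $mode" >&2; return 1 ;;
    esac
    printf '%s %s ext3 defaults,data=%s 1 2\n' "$device" "$mountpoint" "$mode"
}

ext3_fstab_line /dev/sdb1 /data journal
```

The same option can be passed directly when mounting by hand, for example mount -o data=journal /dev/sdb1 /data.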
To convert an Ext2 file system to Ext3:
Create an Ext3 journal by running tune2fs -j as the
root user.
This creates an Ext3 journal with the default parameters.
To specify how large the journal should be and on which device it
should reside, run tune2fs -J
instead together with the desired journal options
size= and device=. More information
about the tune2fs program is available in the
tune2fs man page.
Edit the file /etc/fstab as the
root user to change the file system type
specified for the corresponding partition from ext2
to ext3, then save the changes.
This ensures that the Ext3 file system is recognized as such. The change takes effect after the next reboot.
To boot a root file system that is set up as an Ext3 partition, include
the modules ext3 and jbd in the
initrd.
Edit /etc/sysconfig/kernel as
root, adding ext3 and
jbd to the INITRD_MODULES variable,
then save the changes.
Run the mkinitrd command.
This builds a new initrd and prepares it for use.
Reboot the system.
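The /etc/fstab edit in the procedure above can also be scripted. The sketch below changes the type from ext2 to ext3 for a single device and passes all other lines through unchanged; the function name and the device /dev/sda2 are assumptions. Note that awk rewrites the modified line with single spaces between fields.

```shell
#!/bin/bash
# Rewrite the file system type from ext2 to ext3 for one device in an
# fstab-style table read from standard input. Other lines pass through.
fstab_ext2_to_ext3() {
    awk -v dev="$1" '$1 == dev && $3 == "ext2" { $3 = "ext3" } { print }'
}

# Possible sequence as root (review the new file before replacing fstab):
#   tune2fs -j /dev/sda2
#   fstab_ext2_to_ext3 /dev/sda2 < /etc/fstab > /etc/fstab.new
#   for a root file system: add ext3 and jbd to INITRD_MODULES, run mkinitrd
```

Writing to a separate file first, as in the commented sequence, leaves the original /etc/fstab untouched until the result has been inspected.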
An inode stores information about the file and its block location in the file system. To allow space in the inode for extended attributes and ACLs, the default inode size for Ext3 was increased from 128 bytes on SLES 10 to 256 bytes on SLES 11. As compared to SLES 10, when you make a new Ext3 file system on SLES 11, the default amount of space pre-allocated for the same number of inodes is doubled, and the usable space for files in the file system is reduced by that amount. Thus, you must use larger partitions to accommodate the same number of inodes and files as were possible for an Ext3 file system on SLES 10.
When you create a new Ext3 file system, the space in the inode table is pre-allocated for the total number of inodes that can be created. The bytes-per-inode ratio and the size of the file system determine how many inodes are possible. When the file system is made, an inode is created for every bytes-per-inode bytes of space:
number of inodes = total size of the file system divided by the number of bytes per inode
The number of inodes controls the number of files you can have in the file system: one inode for each file. To address the increased inode size and reduced usable space available, the default for the bytes-per-inode ratio was increased from 8192 bytes on SLES 10 to 16384 bytes on SLES 11. The doubled ratio means that the number of files that can be created is one-half of the number of files possible for an Ext3 file system on SLES 10.
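The effect of the two defaults can be worked through with shell arithmetic. This is an approximation based only on the formula above; the real tools round the count per block group, so actual numbers differ slightly. The 100 GiB size and the function names are chosen for illustration.

```shell
#!/bin/bash
# number of inodes = file system size / bytes-per-inode ratio
inode_count() {
    local fs_bytes=$1 ratio=$2
    echo $((fs_bytes / ratio))
}

# Space pre-allocated for the inode table, in bytes.
inode_table_bytes() {
    local fs_bytes=$1 ratio=$2 inode_size=$3
    echo $(( $(inode_count "$fs_bytes" "$ratio") * inode_size ))
}

fs=$((100 * 1024 * 1024 * 1024))   # a 100 GiB file system

echo "SLES 10 defaults: $(inode_count "$fs" 8192) inodes," \
     "$(inode_table_bytes "$fs" 8192 128) bytes of inode table"
echo "SLES 11 defaults: $(inode_count "$fs" 16384) inodes," \
     "$(inode_table_bytes "$fs" 16384 256) bytes of inode table"
```

With both values doubled, the inode table occupies the same space as before, while the number of inodes, and therefore the number of files, is halved.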
After the inodes are allocated, you cannot change the settings for the inode size or bytes-per-inode ratio. No new inodes are possible without recreating the file system with different settings, or unless the file system gets extended. When you exceed the maximum number of inodes, no new files can be created on the file system until some files are deleted.
When you make a new Ext3 file system, you can specify the inode size and bytes-per-inode ratio to control inode space usage and the number of files possible on the file system. If the block size, inode size, and bytes-per-inode ratio values are not specified, the default values in the /etc/mke2fs.conf file are applied. For information, see the mke2fs.conf(5) man page.
Use the following guidelines:
Inode size: The default inode size is 256 bytes. Specify a value in bytes that is a power of 2, at least 128, and no larger than the block size, such as 128, 256, 512, and so on. Use 128 bytes only if you do not use extended attributes or ACLs on your Ext3 file systems.
Bytes-per-inode ratio: The default bytes-per-inode ratio is 16384 bytes. Valid bytes-per-inode ratio values must be a power of 2 equal to 1024 or greater in bytes, such as 1024, 2048, 4096, 8192, 16384, 32768, and so on. This value should not be smaller than the block size of the file system, because the block size is the smallest chunk of space used to store data. The default block size for the Ext3 file system is 4 KB.
In addition, you should consider the number of files and the size of files you need to store. For example, if your file system will have many small files, you can specify a smaller bytes-per-inode ratio, which increases the number of inodes. If your file system will have very large files, you can specify a larger bytes-per-inode ratio, which reduces the number of possible inodes.
Generally, it is better to have too many inodes than to run out of them. If you have too few inodes and very small files, you could reach the maximum number of files on a disk that is practically empty. If you have too many inodes and very large files, you might have free space reported but be unable to use it because you cannot create new files in space reserved for inodes.
If you do not use extended attributes or ACLs on your Ext3 file systems, you can restore the SLES 10 behavior by specifying 128 bytes as the inode size and 8192 bytes as the bytes-per-inode ratio when you make the file system. Use any of the following methods to set the inode size and bytes-per-inode ratio:
Modifying the default settings for all new Ext3 file systems:
In a text editor, modify the defaults section of
the /etc/mke2fs.conf file to set the
inode_size and inode_ratio to
the desired default values. The values apply to all new Ext3 file
systems. For example:
blocksize = 4096
inode_size = 128
inode_ratio = 8192
At the command line:
Pass the inode size (-I 128) and the
bytes-per-inode ratio (-i 8192) to the
mkfs.ext3(8) command or the
mke2fs(8) command when you create a new Ext3 file
system. For example, use either of the following commands:
mkfs.ext3 -b 4096 -i 8192 -I 128 /dev/sda2
mke2fs -t ext3 -b 4096 -i 8192 -I 128 /dev/sda2
During installation with YaST: Pass the inode size and bytes-per-inode ratio values when you create a new Ext3 file system during the installation. In the YaST Partitioner on the page under, select , then click . In the dialog box, select the desired values from the , , and drop-down lists.
For example, select 4096 for the drop-down list, select 8192 from the drop-down list, select 128 from the drop-down list, then click .
During installation with AutoYaST:
In an AutoYaST profile, you can use the fs_options tag to set the opt_bytes_per_inode value to 8192 for -i and the opt_inode_density value to 128 for -I:
<partitioning config:type="list">
  <drive>
    <device>/dev/sda</device>
    <initialize config:type="boolean">true</initialize>
    <partitions config:type="list">
      <partition>
        <filesystem config:type="symbol">ext3</filesystem>
        <format config:type="boolean">true</format>
        <fs_options>
          <opt_bytes_per_inode>
            <option_str>-i</option_str>
            <option_value>8192</option_value>
          </opt_bytes_per_inode>
          <opt_inode_density>
            <option_str>-I</option_str>
            <option_value>128</option_value>
          </opt_inode_density>
        </fs_options>
        <mount>/</mount>
        <partition_id config:type="integer">131</partition_id>
        <partition_type>primary</partition_type>
        <size>25G</size>
      </partition>
    </partitions>
  </drive>
</partitioning>
For information, see SLES 11 ext3 partitions can only store 50% of the files that can be stored on SLES10 [Technical Information Document 7009075].
Officially one of the key features of the 2.4 kernel release, ReiserFS has been available as a kernel patch for 2.2.x SUSE kernels since version 6.4. ReiserFS was designed by Hans Reiser and the Namesys development team. It has proven itself to be a powerful alternative to Ext2. Its key assets are better disk space utilization, better disk access performance, faster crash recovery, and reliability through data journaling.
The ReiserFS file system is fully supported for the lifetime of SUSE Linux Enterprise Server 11 specifically for migration purposes. SUSE plans to remove support for creating new ReiserFS file systems starting with SUSE Linux Enterprise Server 12.
In ReiserFS, all data is organized in a structure called a B*-balanced tree. The tree structure contributes to better disk space utilization because small files can be stored directly in the B* tree leaf nodes instead of being stored elsewhere and just maintaining a pointer to the actual disk location. In addition to that, storage is not allocated in chunks of 1 or 4 KB, but in portions of the exact size needed. Another benefit lies in the dynamic allocation of inodes. This keeps the file system more flexible than traditional file systems, like Ext2, where the inode density must be specified at file system creation time.
For small files, file data and “stat_data” (inode) information are often stored next to each other. They can be read with a single disk I/O operation, meaning that only one access to disk is required to retrieve all the information needed.
Using a journal to keep track of recent metadata changes makes a file system check a matter of seconds, even for huge file systems.
ReiserFS also supports data journaling and ordered data modes similar to
the concepts outlined in
Section 1.2.3, “Ext3”.
The default mode is data=ordered, which ensures both
data and metadata integrity, but uses journaling only for metadata.
Originally intended as the file system for their IRIX OS, SGI started XFS development in the early 1990s. The idea behind XFS was to create a high-performance 64-bit journaling file system to meet extreme computing challenges. XFS is very good at manipulating large files and performs well on high-end hardware. However, even XFS has a drawback. Like ReiserFS, XFS takes great care of metadata integrity, but less care of data integrity.
A quick review of XFS’s key features explains why it might prove to be a strong competitor for other journaling file systems in high-end computing.
At the creation time of an XFS file system, the block device underlying the file system is divided into eight or more linear regions of equal size. Those are referred to as allocation groups. Each allocation group manages its own inodes and free disk space. Practically, allocation groups can be seen as file systems in a file system. Because allocation groups are rather independent of each other, more than one of them can be addressed by the kernel simultaneously. This feature is the key to XFS’s great scalability. Naturally, the concept of independent allocation groups suits the needs of multiprocessor systems.
Free space and inodes are handled by B+ trees inside the allocation groups. The use of B+ trees greatly contributes to XFS’s performance and scalability. XFS uses delayed allocation, which handles allocation by breaking the process into two pieces. A pending transaction is stored in RAM and the appropriate amount of space is reserved. XFS still does not decide where exactly (in file system blocks) the data should be stored. This decision is delayed until the last possible moment. Some short-lived temporary data might never make its way to disk, because it is obsolete by the time XFS decides where actually to save it. In this way, XFS increases write performance and reduces file system fragmentation. Because delayed allocation results in less frequent write events than in other file systems, it is likely that data loss after a crash during a write is more severe.
Before writing the data to the file system, XFS reserves (preallocates) the free space needed for a file. Thus, file system fragmentation is greatly reduced. Performance is increased because the contents of a file are not distributed all over the file system.
For a side-by-side feature comparison of the major file systems in SUSE Linux Enterprise Server, see File System Support and Sizes on the SUSE Linux Enterprise Server Technical Information Web site.
Table 1.2, “File System Types in Linux” summarizes some other file systems supported by Linux. They are supported mainly to ensure compatibility and interchange of data with different kinds of media or foreign operating systems.
| File System Type | Description |
|---|---|
| | Compressed ROM file system: A compressed read-only file system for ROMs. |
| | High Performance File System: The IBM OS/2 standard file system. Only supported in read-only mode. |
| | Standard file system on CD-ROMs. |
| | This file system originated from academic projects on operating systems and was the first file system used in Linux. Today, it is used as a file system for floppy disks. |
| | |
| | Network File System: Here, data can be stored on any machine in a network and access might be granted via a network. |
| | Windows NT file system; read-only. |
| | Server Message Block is used by products such as Windows to enable file access over a network. |
| | Used on SCO UNIX, Xenix, and Coherent (commercial UNIX systems for PCs). |
| | Used by BSD, SunOS, and NextStep. Only supported in read-only mode. |
| | UNIX on MS-DOS: Applied on top of a standard |
| | Virtual FAT: Extension of the |
Originally, Linux supported a maximum file size of 2 GiB (2^31 bytes). Unless a file system comes with large file support, the maximum file size on a 32-bit system is 2 GiB.
Currently, all of our standard file systems have LFS (large file support), which gives a maximum file size of 2^63 bytes in theory. Table 1.3, “Maximum Sizes of Files and File Systems (On-Disk Format, 4 KiB Block Size)” offers an overview of the current on-disk format limitations of Linux files and file systems. The numbers in the table assume that the file systems are using 4 KiB block size, which is a common standard. When using different block sizes, the results are different. The maximum file sizes in Table 1.3, “Maximum Sizes of Files and File Systems (On-Disk Format, 4 KiB Block Size)” can be larger than the file system's actual size when using sparse blocks.
In this document: 1024 Bytes = 1 KiB; 1024 KiB = 1 MiB; 1024 MiB = 1 GiB; 1024 GiB = 1 TiB; 1024 TiB = 1 PiB; 1024 PiB = 1 EiB (see also NIST: Prefixes for Binary Multiples).
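The byte counts quoted as powers of two can be cross-checked against these units with a few lines of shell. This is a small sketch with a made-up function name; it splits the exponent into a multiple of ten (the unit) and a remainder (the factor), so it works for exponents from 0 through 69.

```shell
#!/bin/bash
# Express 2^exp bytes in the binary units defined above.
# Each step of 10 in the exponent moves to the next unit.
pow2_to_unit() {
    local exp=$1
    local units=(Bytes KiB MiB GiB TiB PiB EiB)
    echo "$((1 << (exp % 10))) ${units[exp / 10]}"
}

pow2_to_unit 31   # file size limit without large file support
pow2_to_unit 63   # theoretical file size limit with large file support
```

The first call corresponds to the 2 GiB limit and the second to the 8 EiB value that appears in the size table.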
| File System (4 KiB Block Size) | Maximum File System Size | Maximum File Size |
|---|---|---|
| Btrfs | 16 EiB | 16 EiB |
| Ext3 | 16 TiB | 2 TiB |
| OCFS2 (a cluster-aware file system available in the High Availability Extension) | 16 TiB | 1 EiB |
| ReiserFS v3.6 | 16 TiB | 1 EiB |
| XFS | 8 EiB | 8 EiB |
| NFSv2 (client side) | 8 EiB | 2 GiB |
| NFSv3 (client side) | 8 EiB | 8 EiB |
Table 1.3, “Maximum Sizes of Files and File Systems (On-Disk Format, 4 KiB Block Size)” describes the limitations regarding the on-disk format. The Linux kernel imposes its own limits on the size of files and file systems handled by it. These are as follows:
On 32-bit systems, files cannot exceed 2 TiB (2^41 bytes).
File systems can be up to 2^73 bytes in size. However, this limit is still out of reach for the currently available hardware.
Table 1.4, “Storage Limitations” summarizes the kernel limits for storage associated with SUSE Linux Enterprise 11 Service Pack 3.
| Storage Feature | Limitation |
|---|---|
| Maximum number of LUNs supported | 16384 LUNs per target. |
| Maximum number of paths per single LUN | No limit per se. Each path is treated as a normal LUN. The actual limit is given by the number of LUNs per target and the number of targets per HBA (16777215 for a Fibre Channel HBA). |
| Maximum number of HBAs | Unlimited. The actual limit is determined by the number of PCI slots of the system. |
| Maximum number of paths with device-mapper-multipath (in total) per operating system | Approximately 1024. The actual number depends on the length of the device number strings. It is a compile-time variable within multipath-tools, which can be raised if this limit proves to be a problem. |
| Maximum size per block device | For x86, up to 16 TiB. For x86_64, ia64, s390x, and ppc64, up to 8 EiB. |
You can use the YaST Partitioner to create and manage file systems and RAID devices. For information, see “Advanced Disk Setup” in the SUSE Linux Enterprise Server 11 SP4 Deployment Guide.
Each of the file system projects described above maintains its own home page on which to find mailing list information, further documentation, and FAQs:
A comprehensive multipart tutorial about Linux file systems can be found at IBM developerWorks in the Advanced File System Implementor’s Guide.
An in-depth comparison of file systems (not only Linux file systems) is available from the Wikipedia project in Comparison of File Systems.
On solid state drives (SSDs) and thinly provisioned volumes it is useful
to trim blocks not in use by the file system. SUSE Linux Enterprise Server fully supports
unmap or trim operations on all file
systems supporting these methods.
The recommended way to trim a supported file system on SUSE Linux Enterprise Server is to run /sbin/wiper.sh. Make sure to read /usr/share/doc/packages/hdparm/README.wiper before running this script. For most desktop and server systems, trimming once a week is sufficient. Mounting a file system with -o discard is not recommended, because it comes with a performance penalty and may negatively affect the lifetime of SSDs.
The features and behavior changes noted in this section were made for SUSE Linux Enterprise Server 11.
With regard to storage, SUSE Linux Enterprise Server 11 SP4 is a bug fix release; no new features were added.
In addition to bug fixes, the features and behavior changes in this section were made for the SUSE Linux Enterprise Server 11 SP3 release:
Btrfs quota support for subvolumes on the root
file system has been added in the btrfs(8) command.
YaST supports the iSCSI LIO Target Server software. For information, see Chapter 15, Mass Storage over IP Networks: iSCSI LIO Target Server.
The following enhancements were added for Linux software RAIDs:
The software RAID provides improved support on the Intel RSTe+ (Rapid Storage Technology Enterprise) platform to support RAID level 0, 1, 4, 5, 6, and 10.
The LEDMON utility supports PCIe-SSD enclosure LEDs for MD software RAIDs. For information, see Chapter 12, Storage Enclosure LED Utilities for MD Software RAIDs.
In the wizard in the YaST Partitioner, the option allows you to specify the order in which the selected devices in a Linux software RAID will be used to ensure that one half of the array resides on one disk subsystem and the other half of the array resides on a different disk subsystem. For example, if one disk subsystem fails, the system keeps running from the second disk subsystem. For information, see Step 4.d in Section 10.3.3, “Creating a Complex RAID10 with the YaST Partitioner”.
The following enhancements were added for LVM2:
LVM logical volumes can be thinly provisioned. For information, see Section 4.5, “Configuring Logical Volumes”.
Thin pool: The logical volume is a pool of space that is reserved for use with thin volumes. The thin volumes can allocate their needed space from it on demand.
Thin volume: The volume is created as a sparse volume. The volume allocates needed space on demand from a thin pool.
LVM logical volume snapshots can be thinly provisioned. Thin provisioning is assumed if you create a snapshot without a specified size. For information, see Section 17.2, “Creating Linux Snapshots with LVM”.
The following changes and enhancements were made for multipath I/O:
The mpathpersist(8) utility is new. It can be used to
manage SCSI persistent reservations on Device Mapper Multipath devices.
For information, see
Section 7.3.5, “Linux mpathpersist(8) Utility”.
The following enhancement was added to the
multipath(8) command:
The -r option allows you to force a device map
reload.
The Device Mapper - Multipath tool added the following enhancements for
the /etc/multipath.conf file:
udev_dir.
The udev_dir attribute is deprecated. After you
upgrade to SLES 11 SP3 or a later version, you can remove the
following line from the defaults section of your
/etc/multipath.conf file:
udev_dir /dev
getuid_callout.
In the defaults section of the
/etc/multipath.conf file, the
getuid_callout attribute is deprecated and replaced
by the uid_attribute parameter. This parameter is a
udev attribute that provides a unique path identifier. The default
value is ID_SERIAL.
After you upgrade to SLES 11 SP3 or a later version, you can modify
the attributes in the defaults section of your
/etc/multipath.conf file:
Remove the following line from the defaults
section:
getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
Add the following line to the defaults section:
uid_attribute "ID_SERIAL"
path_selector.
In the defaults section of the
/etc/multipath.conf file, the default value for
the path_selector attribute was changed from
"round-robin 0" to "service-time
0". The service-time option chooses the
path for the next batch of I/O based on the amount of outstanding I/O
to the path and its relative throughput.
After you upgrade to SLES 11 SP3 or a later version, you can modify
the attribute value in the defaults section of your
/etc/multipath.conf file to use the recommended
default:
path_selector "service-time 0"
user_friendly_names.
The user_friendly_names attribute can be configured
in the devices section and in the
multipaths section.
max_fds.
The default setting for the max_fds attribute was
changed to max. This allows the multipath daemon to
open as many file descriptors as the system allows when it is
monitoring paths.
After you upgrade to SLES 11 SP3 or a later version, you can modify
the attribute value in your /etc/multipath.conf
file:
max_fds "max"
reservation_key.
In the defaults section or
multipaths section of the
/etc/multipath.conf file, the
reservation_key attribute can be used to assign a
Service Action Reservation Key that is used with the
mpathpersist(8) utility to manage persistent
reservations for Device Mapper Multipath devices. The attribute is not
used by default. If it is not set, the multipathd
daemon does not check for persistent reservation for newly discovered
paths or reinstated paths.
reservation_key <reservation key>
For example:
multipaths {
    multipath {
        wwid             XXXXXXXXXXXXXXXX
        alias            yellow
        reservation_key  0x123abc
    }
}
For information about setting persistent reservations, see Section 7.3.5, “Linux mpathpersist(8) Utility”.
hardware_handler.
Four SCSI hardware handlers were added in the SCSI layer that can be used with DM-Multipath:
scsi_dh_alua |
scsi_dh_rdac |
scsi_dh_hp_sw |
scsi_dh_emc |
These handlers are modules created under the SCSI directory in the Linux kernel. Previously, the hardware handler in the Device Mapper layer was used.
Add the modules to the initrd image, then specify
them in the /etc/multipath.conf file as hardware
handler types alua, rdac,
hp_sw, and emc. For information
about adding the device drivers to the initrd
image, see
Section 7.4.3, “Configuring the Device Drivers in initrd for Multipathing”.
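The changes to the defaults section described above (removing udev_dir, replacing getuid_callout with uid_attribute, and setting path_selector and max_fds) can be sketched as a text filter. This is an illustration only, not a SUSE-provided migration tool. The function name migrate_multipath_defaults is hypothetical, and the sketch assumes that the opening "defaults {" line and its closing brace each stand on a line of their own.

```shell
# migrate_multipath_defaults
# Read multipath.conf text on standard input and write an updated copy to
# standard output. Only lines inside the defaults section are changed:
#   - the deprecated udev_dir line is dropped
#   - getuid_callout is replaced by uid_attribute "ID_SERIAL"
#   - path_selector is set to "service-time 0"
#   - max_fds is set to "max"
# Assumption: "defaults {" and the closing "}" are each on their own line.
migrate_multipath_defaults() {
    sed -E '/^[[:space:]]*defaults[[:space:]]*\{/,/^[[:space:]]*\}/{
        /^[[:space:]]*udev_dir[[:space:]]/d
        s/^([[:space:]]*)getuid_callout[[:space:]].*$/\1uid_attribute "ID_SERIAL"/
        s/^([[:space:]]*)path_selector[[:space:]].*$/\1path_selector "service-time 0"/
        s/^([[:space:]]*)max_fds[[:space:]].*$/\1max_fds "max"/
    }'
}
```

A cautious way to use it is to write the result to a separate file, for example migrate_multipath_defaults < /etc/multipath.conf > /tmp/multipath.conf.new, and compare the two files before replacing the original. Attributes that are absent from the defaults section are not added.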
In addition to bug fixes, the features and behavior changes in this section were made for the SUSE Linux Enterprise Server 11 SP2 release:
Btrfs File System. See Section 1.2.1, “Btrfs”.
Open Fibre Channel over Ethernet. See Chapter 16, Fibre Channel Storage over Ethernet Networks: FCoE.
Tagging for LVM storage objects. See Section 4.7, “Tagging LVM2 Storage Objects”.
NFSv4 ACLs tools. See Chapter 18, Managing Access Control Lists over NFSv4.
--assume-clean option for mdadm
resize command. See
Section 11.2.2, “Increasing the Size of the RAID Array”.
In the defaults section of the
/etc/multipath.conf file, the
default_getuid parameter was obsoleted and replaced by
the getuid_callout parameter:
The line changed from this:
default_getuid "/sbin/scsi_id -g -u -s /block/%n"
to this:
getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
In addition to bug fixes, the features and behavior changes noted in this section were made for the SUSE Linux Enterprise Server 11 SP1 release.
Section 2.4.2, “Modifying Authentication Parameters in the iSCSI Initiator”
Section 2.4.3, “Allowing Persistent Reservations for MPIO Devices”
Section 2.4.5, “Boot Loader Support for MDRAID External Metadata”
Section 2.4.6, “YaST Install and Boot Support for MDRAID External Metadata”
Section 2.4.7, “Improved Shutdown for MDRAID Arrays that Contain the Root File System”
Section 2.4.11, “Updating Storage Drivers for Adapters on IBM Servers”
In the function (Section 14.2.2, “Creating iSCSI Targets with YaST”), a option was added that allows you to export the iSCSI target information. This makes it easier to provide information to consumers of the resources.
In the function (Section 14.2.2, “Creating iSCSI Targets with YaST”), you can modify the authentication parameters for connecting to a target device. Previously, you needed to delete the entry and re-create it in order to change the authentication information.
A SCSI initiator can issue SCSI reservations for a shared storage device, which locks out SCSI initiators on other servers from accessing the device. These reservations persist across SCSI resets that might happen as part of the SCSI exception handling process.
The following are possible scenarios where SCSI reservations would be useful:
In a simple SAN environment, persistent SCSI reservations help protect against administrator errors where an attempt is made to add a LUN to one server although it is already in use by another server, which might result in data corruption. SAN zoning is typically used to prevent this type of error.
In a high-availability environment with failover set up, persistent SCSI reservations help protect against errant servers connecting to SCSI devices that are reserved by other servers.
Use the latest version of the Multiple Devices Administration (MDADM,
mdadm) utility to take advantage of bug fixes and
improvements.
Support was added to use the external metadata capabilities of the MDADM utility version 3.0 to install and run the operating system from RAID volumes defined by the Intel Matrix Storage Technology metadata format. This moves the functionality from the Device Mapper RAID (DMRAID) infrastructure to the Multiple Devices RAID (MDRAID) infrastructure, which offers the more mature RAID 5 implementation and offers a wider feature set of the MD kernel infrastructure. It allows a common RAID driver to be used across all metadata formats, including Intel, DDF (common RAID disk data format), and native MD metadata.
The YaST installer tool added support for MDRAID External Metadata for
RAID 0, 1, 10, 5, and 6. The installer can detect RAID arrays and whether
the platform RAID capabilities are enabled. If multipath RAID is enabled
in the platform BIOS for Intel Matrix Storage Manager, it offers options
for DMRAID, MDRAID (recommended), or none. The initrd
was also modified to support assembling BIOS-based RAID arrays.
Shutdown scripts were modified to wait until all of the MDRAID arrays are marked clean. The operating system shutdown process now waits until all MDRAID volumes have finished their write operations and the dirty-bit has been cleared.
Changes were made to the startup script, shutdown script, and the
initrd to consider whether the root
(/) file system (the system volume that contains the
operating system and application files) resides on a software RAID array.
The metadata handler for the array is started early in the shutdown
process to monitor the final root file system environment during the
shutdown. The handler is excluded from the general
killall events. The process also allows for writes to
be quiesced and for the array’s metadata dirty-bit (which indicates
whether an array needs to be resynchronized) to be cleared at the end of
the shutdown.
The YaST installer now allows MD to be configured over iSCSI devices.
If RAID arrays are needed on boot, the iSCSI initiator software is loaded
before boot.md so that the iSCSI targets are
available to be auto-configured for the RAID.
For a new install, Libstorage creates an
/etc/mdadm.conf file and adds the line AUTO
-all. During an update, the line is not added. If
/etc/mdadm.conf contains the line
AUTO -all
then no RAID arrays are auto-assembled unless they are explicitly listed
in /etc/mdadm.conf.
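To find out whether a system is affected by this behavior, you can check the configuration file for the line. The following is a minimal sketch; the function name md_auto_assembly_disabled is hypothetical, and the check recognizes only the exact form AUTO -all shown above, not other variants of the AUTO keyword.

```shell
# md_auto_assembly_disabled FILE
# Succeed (exit status 0) if FILE contains an uncommented "AUTO -all" line.
md_auto_assembly_disabled() {
    grep -Eq '^[[:space:]]*AUTO[[:space:]]+-all([[:space:]]|$)' "$1"
}
```

For example, md_auto_assembly_disabled /etc/mdadm.conf && echo "only listed arrays are assembled" reports the restricted mode.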
The MD-SGPIO utility is a standalone application that monitors RAID arrays
via sysfs(2). Events trigger an LED change request that
controls blinking for LED lights that are associated with each slot in an
enclosure or a drive bay of a storage subsystem. It supports two types of
LED systems:
2-LED systems (Activity LED, Status LED)
3-LED systems (Activity LED, Locate LED, Fail LED)
The lvresize, lvextend, and
lvreduce commands that are used to resize logical
volumes were modified to allow the resizing of LVM 2 mirrors. Previously,
these commands reported errors if the logical volume was a mirror.
Update the following storage drivers to use the latest available versions to support storage adapters on IBM servers:
Adaptec: aacraid, aic94xx
Emulex: lpfc
LSI: mptsas, megaraid_sas
The mptsas driver now supports native EEH (Enhanced
Error Handler) recovery, which is a key feature for all of the IO
devices for Power platform customers.
QLogic: qla2xxx, qla3xxx,
qla4xxx
The features and behavior changes noted in this section were made for the SUSE Linux Enterprise Server 11 release.
Section 2.5.5, “OCFS2 File System Is in the High Availability Release”
Section 2.5.7, “Device Name Persistence in the /dev/disk/by-id Directory”
Section 2.5.9, “User-Friendly Names for Multipathed Devices”
Section 2.5.10, “Advanced I/O Load-Balancing Options for Multipath”
Section 2.5.11, “Location Change for Multipath Tool Callouts”
Section 2.5.12, “Change from mpath to multipath for the mkinitrd -f Option”
The Enterprise Volume Management Systems (EVMS2) storage management solution is deprecated. All EVMS management modules have been removed from the SUSE Linux Enterprise Server 11 packages. Your non-system EVMS-managed devices should be automatically recognized and managed by Linux Volume Manager 2 (LVM2) when you upgrade your system. For more information, see Evolution of Storage and Volume Management in SUSE Linux Enterprise.
If you have EVMS managing the system device (any device that contains the
root (/), /boot, or
swap), try these things to prepare the SLES 10
server before you reboot the server to upgrade:
In the /etc/fstab file, modify the boot and swap disks to the default
/dev/system/sys_lx directory:
Remove /evms/lvm2 from the path for the
swap and root (/)
partitions.
Remove /evms from the path for the
/boot partition.
In the /boot/grub/menu.lst file, remove
/evms/lvm2 from the path.
In the /etc/sysconfig/bootloader file, verify that
the path for the boot device is the /dev directory.
Ensure that boot.lvm and
boot.md are enabled:
In YaST, click .
Select boot.lvm.
Click .
Select boot.md.
Click .
Click , then click .
Reboot and start the upgrade.
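The path changes for the /etc/fstab file in the procedure above can be sketched as a text filter. This is an illustration under assumptions, not a supported migration script: the function name strip_evms_paths is hypothetical, and the device names in the comments are examples only. Review the output before you replace the original file.

```shell
# strip_evms_paths
# Read fstab-style text on standard input and write it to standard output
# with the EVMS components removed from device paths, for example:
#   /dev/evms/lvm2/system/sys_lx  ->  /dev/system/sys_lx
#   /dev/evms/sda1                ->  /dev/sda1
strip_evms_paths() {
    sed -e 's#/dev/evms/lvm2/#/dev/#g' -e 's#/dev/evms/#/dev/#g'
}
```

For example, strip_evms_paths < /etc/fstab > /tmp/fstab.new writes a candidate file that you can compare with the original.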
For information about managing storage with EVMS2 on SUSE Linux Enterprise Server 10, see the SUSE Linux Enterprise Server 10 SP3: Storage Administration Guide.
The Ext3 file system has replaced ReiserFS as the default file system recommended by the YaST tools at installation time and when you create file systems. ReiserFS is still supported. For more information, see File System Support on the SUSE Linux Enterprise 11 Tech Specs Web page.
To allow space for extended attributes and ACLs for a file on Ext3 file systems, the default inode size for Ext3 was increased from 128 bytes on SLES 10 to 256 bytes on SLES 11. For information, see Section 1.2.3.4, “Ext3 File System Inode Size and Number of Inodes”.
The JFS file system is no longer supported. The JFS utilities were removed from the distribution.
The OCFS2 file system is fully supported as part of the SUSE Linux Enterprise High Availability Extension.
The /dev/disk/by-name path is deprecated in
SUSE Linux Enterprise Server 11 packages.
In SUSE Linux Enterprise Server 11, the default multipath setup relies on
udev to overwrite the existing symbolic links in the
/dev/disk/by-id directory when multipathing is
started. Before you start multipathing, the link points to the SCSI device
by using its scsi-xxx name. When multipathing is
running, the symbolic link points to the device by using its
dm-uuid-xxx name. This ensures that the symbolic
links in the /dev/disk/by-id path persistently point
to the same device regardless of whether multipathing is started or not.
The configuration files (such as lvm.conf and
md.conf) do not need to be modified because they
automatically point to the correct device.
See the following sections for more information about how this behavior change affects other features:
The deprecation of the /dev/disk/by-name directory
(as described in Section 2.5.6, “/dev/disk/by-name Is Deprecated”)
affects how you set up filters for multipathed devices in the
configuration files. If you used the
/dev/disk/by-name device name path for the multipath
device filters in the /etc/lvm/lvm.conf file, you
need to modify the file to use the /dev/disk/by-id
path. Consider the following when setting up filters that use the
by-id path:
The /dev/disk/by-id/scsi-* device names are
persistent and created for exactly this purpose.
Do not use the /dev/disk/by-id/dm-* name in the
filters. These are symbolic links to the Device-Mapper devices, and
result in reporting duplicate PVs in response to a
pvscan command. The names appear to change from
LVM-pvuuid to dm-uuid and back
to LVM-pvuuid.
For information about setting up filters, see Section 7.2.4, “Using LVM2 on Multipath Devices”.
A change in how multipathed device names are handled in the
/dev/disk/by-id directory (as described in
Section 2.5.7, “Device Name Persistence in the /dev/disk/by-id Directory”) affects your setup
for user-friendly names because the two names for the device differ. You
must modify the configuration files to scan only the device mapper names
after multipathing is configured.
For example, you need to modify the lvm.conf file to
scan using the multipathed device names by specifying the
/dev/disk/by-id/dm-uuid-.*-mpath-.* path instead of
/dev/disk/by-id.
The following advanced I/O load-balancing options are available for Device Mapper Multipath, in addition to round-robin:
Least-pending
Length-load-balancing
Service-time
For information, see path_selector in Section 7.11.2.1, “Understanding Priority Groups and Attributes”.
The mpath_* prio_callouts for the Device Mapper
Multipath tool have been moved to shared libraries
in /lib/libmultipath/lib*. By using shared libraries,
the callouts are loaded into memory on daemon startup. This helps avoid a
system deadlock in an all-paths-down scenario where the programs need to
be loaded from the disk, which might not be available at this point.
The option for adding Device Mapper Multipath services to the
initrd has changed from -f mpath
to -f multipath.
To make a new initrd, the command is now:
mkinitrd -f multipath
The default setting for the path_grouping_policy in the
/etc/multipath.conf file has changed from
multibus to failover.
For information about configuring the path_grouping_policy, see Section 7.11, “Configuring Path Failover Policies and Priorities”.
Consider what your storage needs are and how you can effectively manage and divide your storage space to best meet your needs. Use the information in this section to help plan your storage deployment for file systems on your SUSE Linux Enterprise Server 11 server.
For information about using the YaST Expert Partitioner, see “Using the YaST Partitioner” in the SUSE Linux Enterprise Server 11 Installation and Administration Guide.
Linux supports using multiple I/O paths for fault-tolerant connections between the server and its storage devices. Linux multipath support is disabled by default. If you use a multipath solution that is provided by your storage subsystem vendor, you do not need to configure the Linux multipath separately.
Linux supports hardware and software RAID devices. If you use hardware RAID devices, software RAID devices are unnecessary. You can use both hardware and software RAID devices on the same server.
To maximize the performance benefits of software RAID devices, partitions used for the RAID should come from different physical devices. For software RAID 1 devices, the mirrored partitions cannot share any disks in common.
Linux supports file system snapshots.
Open source tools for backing up data on Linux include
tar, cpio, and
rsync. See the man pages for these tools for more
information.
PAX: POSIX File System Archiver. It supports cpio and
tar, which are the two most common forms of standard
archive (backup) files. See the man page for more information.
Amanda: The Advanced Maryland Automatic Network Disk Archiver. See www.amanda.org.
Novell Open Enterprise Server (OES) 2 for Linux is a product that includes SUSE Linux Enterprise Server (SLES) 10. Antivirus and backup software vendors who support OES 2 also support SLES 10. You can visit the vendor Web sites to find out about their scheduled support of SLES 11.
For a current list of possible backup and antivirus software vendors, see Novell Open Enterprise Server Partner Support: Backup and Antivirus Support. This list is updated quarterly.
This section briefly describes the principles behind Logical Volume Manager (LVM) and its basic features that make it useful under many circumstances. The YaST LVM configuration can be reached from the YaST Expert Partitioner. This partitioning tool enables you to edit and delete existing partitions and create new ones that should be used with LVM.
Using LVM might be associated with increased risk, such as data loss. Risks also include application crashes, power failures, and faulty commands. Save your data before implementing LVM or reconfiguring volumes. Never work without a backup.
LVM enables flexible distribution of hard disk space over several file systems. It was developed because the need to change the segmentation of hard disk space might arise only after the initial partitioning has already been done during installation. Because it is difficult to modify partitions on a running system, LVM provides a virtual pool (volume group or VG) of storage space from which logical volumes (LVs) can be created as needed. The operating system accesses these LVs instead of the physical partitions. Volume groups can span more than one disk, so that several disks or parts of them can constitute one single VG. In this way, LVM provides a kind of abstraction from the physical disk space that allows its segmentation to be changed in a much easier and safer way than through physical repartitioning.
Figure 4.1, “Physical Partitioning versus LVM” compares physical partitioning (left) with LVM segmentation (right). On the left side, one single disk has been divided into three physical partitions (PART), each with a mount point (MP) assigned so that the operating system can access them. On the right side, two disks have been divided into two and three physical partitions each. Two LVM volume groups (VG 1 and VG 2) have been defined. VG 1 contains two partitions from DISK 1 and one from DISK 2. VG 2 contains the remaining two partitions from DISK 2.
In LVM, the physical disk partitions that are incorporated in a volume group are called physical volumes (PVs). Within the volume groups in Figure 4.1, “Physical Partitioning versus LVM”, four logical volumes (LV 1 through LV 4) have been defined, which can be used by the operating system via the associated mount points. The border between different logical volumes need not be aligned with any partition border. See the border between LV 1 and LV 2 in this example.
LVM features:
Several hard disks or partitions can be combined in a large logical volume.
Provided the configuration is suitable, an LV (such as
/usr) can be enlarged when the free space is
exhausted.
Using LVM, it is possible to add hard disks or LVs in a running system. However, this requires hot-swappable hardware that is capable of such actions.
It is possible to activate a striping mode that distributes the data stream of a logical volume over several physical volumes. If these physical volumes reside on different disks, this can improve the reading and writing performance just like RAID 0.
The snapshot feature enables consistent backups (especially for servers) in the running system.
With these features, using LVM already makes sense for heavily used home PCs or small servers. If you have a growing data stock, as in the case of databases, music archives, or user directories, LVM is especially useful. It allows file systems that are larger than the physical hard disk. Another advantage of LVM is that up to 256 LVs can be added. However, keep in mind that working with LVM is different from working with conventional partitions.
Starting from kernel version 2.6, LVM version 2 is available, which is downward-compatible with the previous LVM and enables the continued management of old volume groups. When creating new volume groups, decide whether to use the new format or the downward-compatible version. LVM 2 does not require any kernel patches. It makes use of the device mapper integrated in kernel 2.6. This kernel only supports LVM version 2. Therefore, when talking about LVM, this section always refers to LVM version 2.
You can manage new or existing LVM storage objects by using the YaST Partitioner. Instructions and further information about configuring LVM are available in the official LVM HOWTO.
If you add multipath support after you have configured LVM, you must
modify the /etc/lvm/lvm.conf file to scan only the
multipath device names in the /dev/disk/by-id
directory as described in
Section 7.2.4, “Using LVM2 on Multipath Devices”, then reboot
the server.
For each disk, partition the free space that you want to use for LVM as
0x8E Linux LVM. You can create one or multiple LVM
partitions on a single device. It is not necessary for all of the
partitions on a device to be LVM partitions.
You can use the Volume Group function to group one or more LVM partitions into a logical pool of space called a volume group, then carve out one or more logical volumes from the space in the volume group.
In the YaST Partitioner, only the free space on the disk is made available to you as you are creating LVM partitions. If you want to use the entire disk for a single LVM partition and other partitions already exist on the disk, you must first remove all of the existing partitions to free the space before you can use that space in an LVM partition.
Deleting a partition destroys all of the data in the partition.
Launch YaST as the root user.
In YaST, open the .
(Optional) Remove one or more existing partitions to free that space and make it available for the LVM partition you want to create.
For information, see Section 4.12, “Deleting an LVM Partition (Physical Volume)”.
On the Partitions page, click .
Under , select or, then click .
Specify the , then click .
Maximum Size: Use all of the free available space on the disk.
Custom Size: Specify a size up to the amount of free available space on the disk.
Custom Region: Specify the start and end cylinder of the free available space on the disk.
Configure the partition format:
Under , select .
From the drop-down list, select as the partition identifier.
Under , select .
Click .
The partitions are not actually created until you click and to exit the partitioner.
Repeat Step 4 through Step 8 for each Linux LVM partition you want to add.
Click , verify that the new Linux LVM partitions are listed, then click to exit the partitioner.
(Optional) Continue with the Volume Group configuration as described in Section 4.3, “Creating Volume Groups”.
An LVM volume group organizes the Linux LVM partitions into a logical pool of space. You can carve out logical volumes from the available space in the group. The Linux LVM partitions in a group can be on the same or different disks. You can add LVM partitions from the same or different disks to expand the size of the group. Assign all partitions reserved for LVM to a volume group. Otherwise, the space on the partition remains unused.
Launch YaST as the root user.
In YaST, open the .
In the left panel, click .
The existing volume groups are listed in the right panel.
At the lower left of the Volume Management page, click .
Specify the .
If you are creating a volume group at install time, the name
system is suggested for a volume group that will
contain the SUSE Linux Enterprise Server system files.
Specify the .
The defines the size of a physical block in the volume group. All the disk space in a volume group is handled in chunks of this size. Values can be from 1 KB to 16 GB in powers of 2. This value is normally set to 4 MB.
In LVM1, a 4 MB physical extent allowed a maximum LV size of 256 GB because it supports only up to 65534 extents per LV. LVM2 does not restrict the number of physical extents. Having a large number of extents has no impact on I/O performance to the logical volume, but it slows down the LVM tools.
Different physical extent sizes should not be mixed in a single VG. The extent should not be modified after the initial setup.
In the list, select the Linux LVM partitions that you want to make part of this volume group, then click to move them to the list.
Click .
The new group appears in the list.
On the Volume Management page, click , verify that the new volume group is listed, then click .
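The LVM1 size limit mentioned in the physical extent step follows from multiplying the number of extents by the extent size. The following sketch only illustrates that arithmetic; the function name max_lv_size_mib is hypothetical.

```shell
# max_lv_size_mib EXTENTS EXTENT_SIZE_MIB
# Print the largest logical volume size, in MiB, that can be addressed with
# EXTENTS physical extents of EXTENT_SIZE_MIB MiB each.
max_lv_size_mib() {
    echo $(( $1 * $2 ))
}

# LVM1 example from the procedure: 65534 extents of 4 MB each.
max_lv_size_mib 65534 4    # prints 262136, just under 256 GiB (262144 MiB)
```

Doubling the extent size doubles the limit, which is why larger physical extents were needed for larger volumes in LVM1.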
When the Linux LVM partitions are assigned to a volume group, the partitions are then referred to as physical volumes.
To add more physical volumes to an existing volume group:
Launch YaST as the root user.
In YaST, open the .
In the left panel, select Volume Management and expand the list of groups.
Under Volume Management, select the volume group, then click the tab.
At the bottom of the page, click .
Select a physical volume (LVM partition) from the list, then click Add to move it to the list.
Click .
Click , verify that the changes are listed, then click .
After a volume group has been filled with physical volumes, use the Logical Volumes dialog box (see Figure 4.3, “Logical Volume Management”) to define and manage the logical volumes that the operating system should use. This dialog lists all of the logical volumes in that volume group. You can use , , and options to manage the logical volumes. Assign at least one logical volume to each volume group. You can create new logical volumes as needed until all free space in the volume group has been exhausted.
Beginning in SLES 11 SP3, an LVM logical volume can be thinly provisioned. Thin provisioning allows you to create logical volumes with sizes that overbook the available free space. You create a thin pool that contains unused space reserved for use with an arbitrary number of thin volumes. A thin volume is created as a sparse volume and space is allocated from a thin pool as needed. The thin pool can be expanded dynamically when needed for cost-effective allocation of storage space.
To use thinly provisioned volumes in a cluster, the thin pool and the thin volumes that use it must be managed in a single cluster resource. This allows the thin volumes and thin pool to always be mounted exclusively on the same node.
It is possible to distribute the data stream in the logical volume among
several physical volumes (striping). If these physical volumes reside on
different hard disks, this generally results in a better reading and
writing performance (like RAID 0). However, a striping LV with
n stripes can only be created correctly if the hard disk
space required by the LV can be distributed evenly to n
physical volumes. For example, if only two physical volumes are available,
a logical volume with three stripes is impossible.
Launch YaST as the root user.
In YaST, open the .
In the left panel, select Volume Management and expand it to see the list of volume groups.
Under Volume Management, select the volume group, then click the tab.
In the lower left, click to open the Add Logical Volume dialog box.
Specify the for the logical volume, then click .
Specify the type of LVM volume:
Normal volume: (Default) The volume’s space is allocated immediately.
Thin pool: The logical volume is a pool of space that is reserved for use with thin volumes. The thin volumes can allocate their needed space from it on demand.
Thin volume: The volume is created as a sparse volume. The volume allocates needed space on demand from a thin pool.
Specify the size of the volume and whether to use multiple stripes.
Specify the size of the logical volume, up to the maximum size available.
The amount of free space in the current volume group is shown next to the option.
Specify the number of stripes.
YaST cannot verify the correctness of your striping entries at this point. Any mistake made here becomes apparent only later, when the LVM configuration is implemented on disk.
Specify the formatting options for the logical volume:
Under , select , then select the format type from the drop-down list, such as Ext3.
Under , select , then select the mount point.
The files stored on this logical volume can be found at this mount point on the installed system.
Click to add special mounting options for the volume.
Click .
Click , verify that the changes are listed, then click .
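The three volume types in this procedure can also be created with the lvcreate(8) command. The following sketch prints example commands instead of running them, so that you can review them first. The function name print_lvcreate_examples, the volume names, and the sizes are placeholders chosen for illustration; the options shown (-L, -i, -n, -T, -V) are assumed to be available in the LVM2 version installed on your system.

```shell
# print_lvcreate_examples VG
# Print (but do not run) one lvcreate command for each volume type.
print_lvcreate_examples() {
    vg=$1
    # Normal volume: space is allocated immediately; -i sets the stripe count.
    echo "lvcreate -L 10G -i 2 -n lv_data $vg"
    # Thin pool: a pool of space reserved for use with thin volumes.
    echo "lvcreate -L 50G -T $vg/pool1"
    # Thin volume: a sparse volume with a virtual size, backed by the pool.
    echo "lvcreate -V 100G -T $vg/pool1 -n thin1"
}
```

For example, print_lvcreate_examples vg1 prints the three commands for a volume group named vg1. In this example, the thin volume is larger than its pool, which is the overbooking that thin provisioning allows.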
Activation behavior for non-root LVM volume groups is controlled by
parameter settings in the /etc/sysconfig/lvm file.
By default, non-root LVM volume groups are automatically activated on
system restart by /etc/rc.d/boot.lvm, according to the
setting for the LVM_VGS_ACTIVATED_ON_BOOT parameter in
the /etc/sysconfig/lvm file. This parameter allows you
to activate all volume groups on system restart, or to activate only
specified non-root LVM volume groups.
To activate all non-root LVM volume groups on system restart, ensure that
the value for the LVM_VGS_ACTIVATED_ON_BOOT parameter in
the /etc/sysconfig/lvm file is empty
(""). This is the default setting. For almost all
standard LVM installations, it can safely stay empty.
LVM_VGS_ACTIVATED_ON_BOOT=""
To activate only a specified non-root LVM volume group on system restart,
specify the volume group name as the value for the
LVM_VGS_ACTIVATED_ON_BOOT parameter:
LVM_VGS_ACTIVATED_ON_BOOT="vg1"
By default, newly discovered LVM volume groups are not automatically
activated. The LVM_ACTIVATED_ON_DISCOVERED parameter is
disabled in the /etc/sysconfig/lvm file:
LVM_ACTIVATED_ON_DISCOVERED="disable"
You can enable the LVM_ACTIVATED_ON_DISCOVERED parameter
to allow newly discovered LVM volume groups to be activated via udev rules:
LVM_ACTIVATED_ON_DISCOVERED="enable"
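To see how the LVM_VGS_ACTIVATED_ON_BOOT setting is interpreted, the following sketch reads the parameter from a sysconfig-style file and reports the result according to the rules described above. The function name boot_activation_mode is hypothetical. The boot scripts do their own parsing, so treat this only as a way to check a file by hand.

```shell
#!/bin/sh
# boot_activation_mode FILE
# Reads LVM_VGS_ACTIVATED_ON_BOOT from a sysconfig-style FILE and prints
# "all" when the value is empty, or "only: <names>" when it lists volume groups.
# Hypothetical helper for illustration; not part of the LVM boot scripts.
boot_activation_mode() {
    # Source the file in a subshell so its variables do not leak to the caller.
    (
        LVM_VGS_ACTIVATED_ON_BOOT=""
        . "$1"
        if [ -z "$LVM_VGS_ACTIVATED_ON_BOOT" ]; then
            echo "all"
        else
            echo "only: $LVM_VGS_ACTIVATED_ON_BOOT"
        fi
    )
}

# Usage (path taken from the text above):
#   boot_activation_mode /etc/sysconfig/lvm
```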
A tag is an unordered keyword or term assigned to the metadata of a storage object. Tagging allows you to classify collections of LVM storage objects in ways that you find useful by attaching an unordered list of tags to their metadata.
After you tag the LVM2 storage objects, you can use the tags in commands to accomplish the following tasks:
Select LVM objects for processing according to the presence or absence of specific tags.
Use tags in the configuration file to control which volume groups and logical volumes are activated on a server.
Override settings in a global configuration file by specifying tags in the command.
A tag can be used in place of any command line LVM object reference that accepts:
a list of objects
a single object as long as the tag expands to a single object
Replacing the object name with a tag is not supported everywhere yet. After the arguments are expanded, duplicate arguments in a list are resolved by removing the duplicate arguments, and retaining the first instance of each argument.
Wherever there might be ambiguity of argument type, you must prefix a tag
with the commercial at sign (@) character, such as
@mytag. Elsewhere, using the “@” prefix is
optional.
Consider the following requirements when using tags with LVM:
An LVM tag word can contain the ASCII uppercase characters A to Z, lowercase characters a to z, numbers 0 to 9, underscore (_), plus (+), hyphen (-), and period (.). The word cannot begin with a hyphen. The maximum length is 128 characters.
You can tag LVM2 physical volumes, volume groups, logical volumes, and logical volume segments. The tags of a physical volume are stored in the metadata of its volume group. Deleting a volume group also deletes the tags in the orphaned physical volume. Snapshots cannot be tagged, but their origin can be tagged.
LVM1 objects cannot be tagged because the disk format does not support it.
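The naming rules for a tag word can be checked before a tag is passed to an LVM command. The following is a minimal sketch, assuming a hypothetical helper named is_valid_lvm_tag. It takes the tag word without the @ prefix. LVM performs its own validation, so this is only a convenience check.

```shell
#!/bin/sh
# is_valid_lvm_tag WORD
# Succeeds when WORD follows the tag rules listed above:
#   - only A-Z, a-z, 0-9, underscore, plus, hyphen, and period
#   - does not begin with a hyphen
#   - is at most 128 characters long
# Hypothetical helper for illustration only.
is_valid_lvm_tag() {
    tag=$1
    [ -n "$tag" ] || return 1
    [ "${#tag}" -le 128 ] || return 1
    case $tag in
        -*) return 1 ;;                  # must not begin with a hyphen
        *[!A-Za-z0-9_+.-]*) return 1 ;;  # contains a disallowed character
    esac
    return 0
}

is_valid_lvm_tag db1 && echo "db1 is a valid tag word"
```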
Add a tag to (or tag) an LVM2 storage object.
Example.
vgchange --addtag @db1 vg1
Remove a tag from (or untag) an LVM2 storage object.
Example.
vgchange --deltag @db1 vg1
Specify the tag to use to narrow the list of volume groups or logical volumes to be activated or deactivated.
Example.
Enter the following to activate the volume if it has a tag that matches the tag provided:
lvchange -ay --tag @db1 vg1/vol2
Add the following code to the /etc/lvm/lvm.conf file
to enable host tags that are defined separately on each host in an
/etc/lvm/lvm_<hostname>.conf
file.
tags {
# Enable hostname tags
hosttags = 1
}
You place the activation code in the
/etc/lvm/lvm_<hostname>.conf
file on the host. See
Section 4.7.4.3, “Defining Activation”.
tags {
tag1 { }
# Tag does not require a match to be set.
tag2 {
# If no exact match, tag is not set.
host_list = [ "hostname1", "hostname2" ]
}
}
You can modify the /etc/lvm/lvm.conf file to
activate LVM logical volumes based on tags.
In a text editor, add the following code to the file:
activation {
volume_list = [ "vg1/lvol0", "@database" ]
}
Replace @database with your tag. Use
"@*" to match the tag against any tag set on the host.
The activation command matches against vgname, vgname/lvname, or @tag set in the metadata of volume groups and logical volumes. A volume group or logical volume is activated only if a metadata tag matches. The default if there is no match is not to activate.
If volume_list is not present and any tags are defined
on the host, then LVM activates the volume group or logical volumes only
if a host tag matches a metadata tag.
If volume_list is not present and no tags are defined
on the host, then LVM activates the volume group or logical volumes.
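The matching rules above can be modeled in a few lines of shell. This is a simplified sketch, not LVM's own logic: the function should_activate and its helpers are hypothetical, lists are passed as space-separated strings, and the tags of the volume group and the logical volume are treated as one combined list.

```shell
#!/bin/sh
# has_word WORD LIST: succeeds when WORD is one of the space-separated words in LIST.
has_word() {
    for w in $2; do
        [ "$w" = "$1" ] && return 0
    done
    return 1
}

# share_tag LIST_A LIST_B: succeeds when the two lists have a word in common.
share_tag() {
    for t in $1; do
        has_word "$t" "$2" && return 0
    done
    return 1
}

# should_activate VOLUME_LIST HOST_TAGS VG LV OBJECT_TAGS
# Simplified, hypothetical model of the activation rules described above.
# The body runs in a subshell with globbing disabled so that "@*" stays literal.
should_activate() (
    set -f
    volume_list=$1 host_tags=$2 vg=$3 lv=$4 object_tags=$5
    if [ -z "$volume_list" ]; then
        # No volume_list: activate, unless host tags exist and none matches.
        [ -z "$host_tags" ] && exit 0
        share_tag "$host_tags" "$object_tags"
        exit $?
    fi
    for entry in $volume_list; do
        case $entry in
            "@*") share_tag "$host_tags" "$object_tags" && exit 0 ;;
            @*)   has_word "${entry#@}" "$object_tags" && exit 0 ;;
            */*)  [ "$entry" = "$vg/$lv" ] && exit 0 ;;
            *)    [ "$entry" = "$vg" ] && exit 0 ;;
        esac
    done
    exit 1   # default: no match, do not activate
)

# volume_list = [ "vg1/lvol0", "@database" ] from the example above
should_activate "vg1/lvol0 @database" "" vg1 lvol0 "" && echo "vg1/lvol0 would be activated"
```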
You can use the activation code in a host’s configuration file
(/etc/lvm/lvm_<host_tag>.conf)
when host tags are enabled in the lvm.conf file. For
example, a server has two configuration files in the
/etc/lvm/ folder:
lvm.conf
lvm_<host_tag>.conf
At startup, LVM loads the /etc/lvm/lvm.conf file and
processes any tag settings in the file. If any host tags were defined, it
loads the related
/etc/lvm/lvm_<host_tag>.conf
file.
When LVM searches for a specific configuration file entry, it
searches the host tag file first, then the lvm.conf
file, and stops at the first match.
Within the
lvm_<host_tag>.conf
files, LVM uses the reverse of the order in which the tags were set, so that the file for
the last tag set is searched first. New tags set in the host tag file
trigger additional configuration file loads.
You can set up a simple hostname activation control by enabling the
hosttags option in the
/etc/lvm/lvm.conf file. Use the same file on every
machine in a cluster so that it is a global setting.
In a text editor, add the following code to the
/etc/lvm/lvm.conf file:
tags {
hosttags = 1
}
Replicate the file to all hosts in the cluster.
From any machine in the cluster, add db1 to the list
of machines that activate vg1/lvol2:
lvchange --addtag @db1 vg1/lvol2
On the db1 server, enter the following to activate
it:
lvchange -ay vg1/lvol2
The examples in this section demonstrate two methods to accomplish the following:
Activate volume group vg1 only on the database
hosts db1 and db2.
Activate volume group vg2 only on the file server
host fs1.
Activate nothing initially on the file server backup host
fsb1, but be prepared for it to take over from the
file server host fs1.
In the following solution, the single configuration file is replicated among multiple hosts.
Add the @database tag to the metadata of volume
group vg1. In a terminal console, enter
vgchange --addtag @database vg1
Add the @fileserver tag to the metadata of volume
group vg2. In a terminal console, enter
vgchange --addtag @fileserver vg2
In a text editor, modify the /etc/lvm/lvm.conf
file with the following code to define the
@database, @fileserver,
@fileserverbackup tags.
tags {
database {
host_list = [ "db1", "db2" ]
}
fileserver {
host_list = [ "fs1" ]
}
fileserverbackup {
host_list = [ "fsb1" ]
}
}
activation {
# Activate only if host has a tag that matches a metadata tag
volume_list = [ "@*" ]
}
Replicate the modified /etc/lvm/lvm.conf file to
the four hosts: db1, db2,
fs1, and fsb1.
If the file server host goes down, vg2 can be
brought up on fsb1 by entering the following
commands in a terminal console on any node:
vgchange --addtag @fileserverbackup vg2
vgchange -ay vg2
In the following solution, each host holds locally the information about which classes of volume to activate.
Add the @database tag to the metadata of volume
group vg1. In a terminal console, enter
vgchange --addtag @database vg1
Add the @fileserver tag to the metadata of volume
group vg2. In a terminal console, enter
vgchange --addtag @fileserver vg2
Enable host tags in the /etc/lvm/lvm.conf file:
In a text editor, modify the /etc/lvm/lvm.conf
file with the following code to enable host tag configuration files.
tags {
hosttags = 1
}
Replicate the modified /etc/lvm/lvm.conf file to
the four hosts: db1, db2,
fs1, and fsb1.
On host db1, create an activation configuration
file for the database host db1. In a text editor,
create a /etc/lvm/lvm_db1.conf file and add the
following code:
activation {
volume_list = [ "@database" ]
}
On host db2, create an activation configuration
file for the database host db2. In a text editor,
create a /etc/lvm/lvm_db2.conf file and add the
following code:
activation {
volume_list = [ "@database" ]
}
On host fs1, create an activation configuration file for the file
server host fs1. In a text editor, create
a /etc/lvm/lvm_fs1.conf file and add the following
code:
activation {
volume_list = [ "@fileserver" ]
}
If the file server host fs1 goes down, to bring up
a spare file server host fsb1 as a file server:
On host fsb1, create an activation configuration
file for the host fsb1. In a text editor, create
a /etc/lvm/lvm_fsb1.conf file and add the
following code:
activation {
volume_list = [ "@fileserver" ]
}
In a terminal console, enter one of the following commands:
vgchange -ay vg2
vgchange -ay @fileserver
You can add and remove Linux LVM partitions from a volume group to expand or reduce its size.
Removing a partition can result in data loss if the partition is in use by a logical volume.
Launch YaST as the root user.
In YaST, open the Partitioner.
In the left panel, select Volume Management and expand it to see the list of volume groups.
Under Volume Management, select the volume group, then click the tab.
At the bottom of the page, click .
Do one of the following:
Add: Expand the size of the volume group by moving one or more physical volumes (LVM partitions) from the list to the list.
Remove: Reduce the size of the volume group by moving one or more physical volumes (LVM partitions) from the list to the list.
Click .
Click , verify that the changes are listed, then click .
Launch YaST as the root user.
In YaST, open the Partitioner.
In the left panel, select Volume Management and expand it to see the list of volume groups.
Under Volume Management, select the volume group, then click the tab.
At the bottom of the page, click to open the Resize Logical Volume dialog box.
Use the slider to expand or reduce the size of the logical volume.
Reducing the size of a logical volume that contains data can cause data corruption.
Click .
Click , verify that the change is listed, then click .
The lvresize, lvextend, and
lvreduce commands are used to resize logical volumes.
See the man pages for each of these commands for syntax and options
information.
You can also increase the size of a logical volume by using the YaST
Partitioner. YaST uses parted(8) to grow the
partition.
To extend an LV there must be enough unallocated space available on the VG.
LVs can be extended or shrunk while they are being used, but this may not be true for a file system on them. Extending or shrinking the LV does not automatically modify the size of file systems in the volume. You must use a different command to grow the file system afterwards. For information about resizing file systems, see Chapter 5, Resizing File Systems.
Ensure that you use the right sequence:
If you extend an LV, you must extend the LV before you attempt to grow the file system.
If you shrink an LV, you must shrink the file system before you attempt to shrink the LV.
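The ordering rule can be captured in a small reminder function. This is a minimal sketch, assuming a hypothetical helper named resize_plan that only prints the order of operations and does not run any LVM or file system command.

```shell
#!/bin/sh
# resize_plan grow|shrink
# Prints the order of operations for the two cases described above.
# Hypothetical helper for illustration; it changes nothing on disk.
resize_plan() {
    case $1 in
        grow)
            printf '%s\n' "1. extend the logical volume" \
                          "2. grow the file system"
            ;;
        shrink)
            printf '%s\n' "1. shrink the file system" \
                          "2. shrink the logical volume"
            ;;
        *)
            echo "usage: resize_plan grow|shrink" >&2
            return 2
            ;;
    esac
}

resize_plan shrink
```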
To extend the size of a logical volume:
Open a terminal console, then log in as the root
user.
If the logical volume contains file systems that are hosted for a virtual machine (such as a Xen VM), shut down the VM.
Dismount the file systems on the logical volume.
At the terminal console prompt, enter the following command to grow the size of the logical volume:
lvextend -L +size /dev/vgname/lvname
For size, specify the amount of space you want
to add to the logical volume, such as 10GB. Replace
/dev/vgname/lvname with
the Linux path to the logical volume, such as
/dev/vg1/v1. For example:
lvextend -L +10GB /dev/vg1/v1
For example, to extend an LV with a (mounted and active) ReiserFS on it by 10 GB:
lvextend -L +10G /dev/vgname/lvname
resize_reiserfs -s +10G -f /dev/vgname/lvname
For example, to shrink an LV with a ReiserFS on it by 5 GB:
umount /mountpoint-of-LV
resize_reiserfs -s -5G /dev/vgname/lvname
lvreduce -L -5G /dev/vgname/lvname
mount /dev/vgname/lvname /mountpoint-of-LV
Deleting a volume group destroys all of the data in each of its member partitions.
Launch YaST as the root user.
In YaST, open the Partitioner.
In the left panel, select Volume Management and expand the list of groups.
Under Volume Management, select the volume group, then click the tab.
At the bottom of the page, click , then click to confirm the deletion.
Click , verify that the deleted volume group is listed (deletion is indicated by a red colored font), then click .
Deleting a partition destroys all of the data in the partition.
Launch YaST as the root user.
In YaST, open the Partitioner.
If the Linux LVM partition is in use as a member of a volume group, remove the partition from the volume group, or delete the volume group (Section 4.11, “Deleting a Volume Group”).
In the YaST Partitioner under , select the
device (such as sdc).
On the Partitions page, select a partition that you want to remove, click , then click to confirm the deletion.
Click , verify that the deleted partition is listed (deletion is indicated by a red colored font), then click .
For information about using LVM commands, see the man pages for the
commands described in Table 4.1, “LVM Commands”.
Perform the commands as the root user.
| Command | Description |
|---|---|
| pvcreate <device> | Initializes a device for use by LVM as a physical volume. |
| pvdisplay <device> | Displays information about the LVM physical volume, such as whether it is currently being used in a logical volume. |
| vgcreate -c y <vg_name> <dev1> [dev2...] | Creates a clustered volume group with one or more specified devices. |
| vgchange -a [ey \| n] <vg_name> | Activates (-a ey) or deactivates (-a n) a volume group and its logical volumes for input/output. Important: Ensure that you use the ey option to exclusively activate a volume group on a cluster node. |
| vgremove <vg_name> | Removes a volume group. Before using this command, remove the logical volumes, then deactivate the volume group. |
| vgdisplay <vg_name> | Displays information about a specified volume group. To find the total physical extent of a volume group, enter vgdisplay vg_name \| grep "Total PE" |
| lvcreate -L <size> -n <lv_name> <vg_name> | Creates a logical volume of the specified size. |
| lvcreate -L <size> <-T\|--thinpool> <vg_name/thin_pool_name> | Creates a thin pool of the specified size. Specifying the optional argument --size (-L) causes the creation of the thin pool logical volume. |
| lvcreate --virtualsize <size> <-T\|--thin> <vg_name/thin_pool_name> -n <thin_lv_name> | Creates a thin logical volume. Specifying the optional argument --virtualsize causes the creation of the thin logical volume from the given thin pool volume. |
| lvcreate -L <size> -T <vg_name/thin_pool_name> --virtualsize <size> -n <thin_lv_name> | Creates both a thin pool and a thin logical volume that uses this pool. Specifying both arguments causes the creation of both volumes. |
| lvcreate -s [-L size] -n <snap_volume> <source_volume_path> | Creates a snapshot volume for the specified logical volume. If the size option (-L, --size) is not included, the snapshot is created as a thin snapshot. |
| lvremove </dev/vg_name/lv_name> | Removes a logical volume, such as /dev/vg_name/lv_name. Before using this command, close the logical volume by dismounting it with the umount command. |
| lvremove snap_volume_path | Removes a snapshot volume. |
| lvconvert --merge [-b] [-i <seconds>] [<snap_volume_path>[...<snapN>]\|@<volume_tag>] | Reverts (rolls back or merges) the snapshot data into the original volume. |
| vgextend <vg_name> <device> | Adds a specified physical volume to an existing volume group. |
| vgreduce <vg_name> <device> | Removes a specified physical volume from an existing volume group. Important: Ensure that the physical volume is not currently being used by a logical volume. If it is, you must move the data to another physical volume by using the pvmove command. |
| lvextend -L size </dev/vg_name/lv_name> | Extends the size of a specified logical volume. Afterwards, you must also expand the file system to take advantage of the newly available space. |
| lvreduce -L size </dev/vg_name/lv_name> | Reduces the size of a specified logical volume. Important: Ensure that you reduce the size of the file system first before shrinking the volume, otherwise you risk losing data. |
| lvrename </dev/vg_name/old_lv_name> </dev/vg_name/new_lv_name> | Renames an existing LVM logical volume in a volume group from the old volume name to the new volume name. It does not change the volume group name. |
When your data needs grow for a volume, you might need to increase the amount of space allocated to its file system.
Resizing any partition or file system involves some risks that can potentially result in losing data.
To avoid data loss, ensure that you back up your data before you begin any resizing task.
Consider the following guidelines when planning to resize a file system.
The file system must support resizing in order to take advantage of increases in available space for the volume. In SUSE Linux Enterprise Server 11, file system resizing utilities are available for file systems Ext2, Ext3, Ext4, and ReiserFS. The utilities support increasing and decreasing the size as follows:
| File System | Utility | Increase Size (Grow) | Decrease Size (Shrink) |
|---|---|---|---|
| Ext2 | resize2fs | Offline only | Offline only |
| Ext3 | resize2fs | Online or offline | Offline only |
| Ext4 | resize2fs | Offline only | Offline only |
| ReiserFS | resize_reiserfs | Online or offline | Offline only |
You can grow a file system to the maximum space available on the device, or specify an exact size. Ensure that you grow the size of the device or logical volume before you attempt to increase the size of the file system.
When specifying an exact size for the file system, ensure that the new size satisfies the following conditions:
The new size must be greater than the size of the existing data; otherwise, data loss occurs.
The new size must be equal to or less than the current device size because the file system size cannot extend beyond the space available.
When decreasing the size of the file system on a device, ensure that the new size satisfies the following conditions:
The new size must be greater than the size of the existing data; otherwise, data loss occurs.
The new size must be equal to or less than the current device size because the file system size cannot extend beyond the space available.
If you plan to also decrease the size of the logical volume that holds the file system, ensure that you decrease the size of the file system before you attempt to decrease the size of the device or logical volume.
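The size conditions above amount to a simple range check: the new size must be larger than the existing data and no larger than the device. The following is a minimal sketch, assuming a hypothetical helper named check_fs_size whose three arguments are given in the same unit (for example, kilobytes).

```shell
#!/bin/sh
# check_fs_size NEW USED DEVICE
# All three values must be integers in the same unit (for example, kilobytes).
# Prints "ok" and succeeds when USED < NEW <= DEVICE; otherwise prints the
# reason and fails. Hypothetical helper for illustration only.
check_fs_size() {
    new=$1
    used=$2
    device=$3
    if [ "$new" -le "$used" ]; then
        echo "too small: the new size would not hold the existing data"
        return 1
    fi
    if [ "$new" -gt "$device" ]; then
        echo "too large: the new size exceeds the device size"
        return 1
    fi
    echo "ok"
}

check_fs_size 500 200 1000
```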
The size of Ext2, Ext3, and Ext4 file systems can be increased by using the
resize2fs command when the file system is unmounted. The
size of an Ext3 file system can also be increased by using the
resize2fs command when the file system is mounted.
Open a terminal console, then log in as the root
user or equivalent.
If the file system is Ext2 or Ext4, you must unmount the file system. The Ext3 file system can be mounted or unmounted.
Increase the size of the file system using one of the following methods:
To extend the file system size to the maximum available size of the
device called /dev/sda1, enter
resize2fs /dev/sda1
If a size parameter is not specified, the size defaults to the size of the partition.
To extend the file system to a specific size, enter
resize2fs /dev/sda1 size
The size parameter specifies the requested new size of the file system. If no units are specified, the unit of the size parameter is the block size of the file system. Optionally, the size parameter can be suffixed by one of the following unit designators: s for 512-byte sectors; K for kilobytes (1 kilobyte is 1024 bytes); M for megabytes; or G for gigabytes.
Wait until the resizing is completed before continuing.
If the file system is not mounted, mount it now.
For example, to mount an Ext2 file system for a device named
/dev/sda1 at mount point /home,
enter
mount -t ext2 /dev/sda1 /home
Check the effect of the resize on the mounted file system by entering
df -h
The Disk Free (df) command shows the total size of the
disk, the number of blocks used, and the number of blocks available on
the file system. The -h option prints sizes in human-readable format, such
as 1K, 234M, or 2G.
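The unit designators for the resize2fs size parameter can be made concrete with a small conversion sketch. The helper size_to_bytes is hypothetical. It assumes that M and G are powers of 1024, like K, and it uses a block size of 4096 bytes when none is given, which is an illustrative default and not necessarily the block size of your file system.

```shell
#!/bin/sh
# size_to_bytes SIZE [BLOCK_SIZE]
# Converts a resize2fs-style size value to bytes.
#   s = 512-byte sectors, K = 1024 bytes,
#   M and G = assumed to be 1024*1024 and 1024*1024*1024 bytes,
#   no suffix = file system blocks (BLOCK_SIZE bytes each).
# Hypothetical helper for illustration only.
size_to_bytes() {
    size=$1
    block_size=${2:-4096}      # assumed default, for illustration only
    number=${size%[sKMG]}      # strip one trailing unit designator, if any
    case $size in
        *s) echo $((number * 512)) ;;
        *K) echo $((number * 1024)) ;;
        *M) echo $((number * 1024 * 1024)) ;;
        *G) echo $((number * 1024 * 1024 * 1024)) ;;
        *)  echo $((number * block_size)) ;;
    esac
}

size_to_bytes 2K
```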
A ReiserFS file system can be increased in size while mounted or unmounted.
Open a terminal console, then log in as the root
user or equivalent.
Increase the size of the file system on the device called
/dev/sda2, using one of the following methods:
To extend the file system size to the maximum available size of the device, enter
resize_reiserfs /dev/sda2
When no size is specified, this increases the volume to the full size of the partition.
To extend the file system to a specific size, enter
resize_reiserfs -s size /dev/sda2
Replace size with the desired size in bytes.
You can also specify units on the value, such as 50000K (kilobytes),
250M (megabytes), or 2G (gigabytes). Alternatively, you can specify an
increase to the current size by prefixing the value with a plus (+)
sign. For example, the following command increases the size of the file
system on /dev/sda2 by 500 MB:
resize_reiserfs -s +500M /dev/sda2
Wait until the resizing is completed before continuing.
If the file system is not mounted, mount it now.
For example, to mount a ReiserFS file system for device
/dev/sda2 at mount point /home,
enter
mount -t reiserfs /dev/sda2 /home
Check the effect of the resize on the mounted file system by entering
df -h
The Disk Free (df) command shows the total size of the
disk, the number of blocks used, and the number of blocks available on
the file system. The -h option prints sizes in human-readable format, such
as 1K, 234M, or 2G.
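The effect of an absolute or relative size value for resize_reiserfs -s can be previewed with a short calculation. This is a sketch under stated assumptions: the helper new_size_bytes is hypothetical, the current size is given in bytes, and the K, M, and G suffixes are assumed to be powers of 1024.

```shell
#!/bin/sh
# new_size_bytes CURRENT_BYTES SPEC
# Prints the size in bytes that results from applying SPEC to CURRENT_BYTES.
# SPEC is an absolute size (such as 2G) or a relative one (such as +500M or -500M).
# Suffixes K, M, and G are assumed to be powers of 1024.
# Hypothetical helper for illustration; it does not touch any file system.
new_size_bytes() {
    current=$1
    spec=$2
    sign=""
    case $spec in
        +*) sign="+"; spec=${spec#+} ;;
        -*) sign="-"; spec=${spec#-} ;;
    esac
    number=${spec%[KMG]}
    case $spec in
        *K) value=$((number * 1024)) ;;
        *M) value=$((number * 1024 * 1024)) ;;
        *G) value=$((number * 1024 * 1024 * 1024)) ;;
        *)  value=$number ;;
    esac
    case $sign in
        +) echo $((current + value)) ;;
        -) echo $((current - value)) ;;
        *) echo "$value" ;;
    esac
}

# A 1 GiB file system grown by 500M:
new_size_bytes 1073741824 +500M
```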
You can shrink the size of the Ext2, Ext3, or Ext4 file systems when the volume is unmounted.
Open a terminal console, then log in as the root
user or equivalent.
Unmount the file system.
Decrease the size of the file system on a device such as
/dev/sda1 by entering
resize2fs /dev/sda1 <size>
Replace size with an integer value in kilobytes for the desired size. (A kilobyte is 1024 bytes.)
Wait until the resizing is completed before continuing.
Mount the file system. For example, to mount an Ext2 file system for a
device named /dev/sda1 at mount point
/home, enter
mount -t ext2 /dev/sda1 /home
Check the effect of the resize on the mounted file system by entering
df -h
The Disk Free (df) command shows the total size of the
disk, the number of blocks used, and the number of blocks available on
the file system. The -h option prints sizes in human-readable format, such
as 1K, 234M, or 2G.
Reiser file systems can be reduced in size only if the volume is unmounted.
Open a terminal console, then log in as the root
user or equivalent.
Unmount the device by entering
umount /mnt/point
If the partition you are attempting to decrease in size contains system
files (such as the root (/) volume), unmounting is
possible only when booting from a bootable CD or floppy.
Decrease the size of the file system on a device called
/dev/sda2 by entering
resize_reiserfs -s size /dev/sda2
Replace size with the desired size in bytes.
You can also specify units on the value, such as 50000K (kilobytes), 250M
(megabytes), or 2G (gigabytes). Alternatively, you can specify a decrease
to the current size by prefixing the value with a minus (-) sign. For
example, the following command reduces the size of the file system on
/dev/sda2 by 500 MB:
resize_reiserfs -s -500M /dev/sda2
Wait until the resizing is completed before continuing.
Mount the file system by entering
mount -t reiserfs /dev/sda2 /mnt/point
Check the effect of the resize on the mounted file system by entering
df -h
The Disk Free (df) command shows the total size of the
disk, the number of blocks used, and the number of blocks available on
the file system. The -h option prints sizes in human-readable format, such
as 1K, 234M, or 2G.
This section describes the optional use of UUIDs instead of device names to
identify file system devices in the boot loader file and the
/etc/fstab file.
In the Linux 2.6 and later kernel, udev provides a
userspace solution for the dynamic /dev directory,
with persistent device naming. As part of the hotplug system,
udev is executed if a device is added to or removed from
the system.
A list of rules is used to match against specific device attributes. The
udev rules infrastructure (defined in the
/etc/udev/rules.d directory) provides stable names for
all disk devices, regardless of their order of recognition or the
connection used for the device. The udev tools examine
every appropriate block device that the kernel creates to apply naming
rules based on certain buses, drive types, or file systems. For information
about how to define your own rules for udev, see
Writing
udev Rules.
Along with the dynamic kernel-provided device node name,
udev maintains classes of persistent symbolic links
pointing to the device in the /dev/disk directory,
which is further categorized by the by-id,
by-label, by-path, and
by-uuid subdirectories.
Other programs besides udev, such as LVM or
md, might also generate UUIDs, but they are not listed
in /dev/disk.
A UUID (Universally Unique Identifier) is a 128-bit number for a file system that is unique on both the local system and across other systems. It is randomly generated, with system hardware information and time stamps as part of its seed. UUIDs are commonly used to uniquely tag devices.
The UUID is always unique to the partition and does not depend on the
order in which it appears or where it is mounted. With certain SAN devices
attached to the server, the system partitions are renamed and moved to be
the last device. For example, if root (/) is assigned
to /dev/sda1 during the install, it might be assigned
to /dev/sdg1 after the SAN is connected. One way to
avoid this problem is to use the UUID in the boot loader and
/etc/fstab files for the boot device.
The device ID assigned by the manufacturer for a drive never changes, no
matter where the device is mounted, so it can always be found at boot. The
UUID is a property of the file system and can change if you reformat the
drive. In a boot loader file, you typically specify the location of the
device (such as /dev/sda1) to mount it at system
boot. The boot loader can also mount devices by their UUIDs and
administrator-specified volume labels. However, if you use a label and
file location, you cannot change the label name when the partition is
mounted.
You can use the UUID as a criterion for assembling and activating software
RAID devices. When a RAID is created, the md driver
generates a UUID for the device, and stores the value in the
md superblock.
You can find the UUID for any block device in the
/dev/disk/by-uuid directory. For example, a UUID
looks like this:
e014e482-1c2d-4d09-84ec-61b3aefde77a
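A value can be checked against the format shown above before it is pasted into a configuration file. This is a minimal sketch, assuming a hypothetical helper named is_uuid that tests only for the 8-4-4-4-12 hexadecimal layout of the example. Some file systems might use identifiers of a different form, which this check rejects.

```shell
#!/bin/sh
# is_uuid STRING
# Succeeds when STRING has the 8-4-4-4-12 hexadecimal layout of the example
# above. Hypothetical helper for illustration; it does not check whether
# a device with that UUID exists.
is_uuid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

is_uuid e014e482-1c2d-4d09-84ec-61b3aefde77a && echo "looks like a UUID"
```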
After the install, you can optionally use the following procedure to
configure the UUID for the system device in the boot loader and
/etc/fstab files for your x86 system.
Before you begin, make a copy of the /boot/grub/menu.lst
file and the /etc/fstab file.
Install the SUSE Linux Enterprise Server for x86 with no SAN devices connected.
After the install, boot the system.
Open a terminal console as the root user or
equivalent.
Navigate to the /dev/disk/by-uuid directory to find
the UUID for the device where you installed /boot,
/root, and swap.
At the terminal console prompt, enter
cd /dev/disk/by-uuid
List all partitions by entering
ll
Find the UUID, such as
e014e482-1c2d-4d09-84ec-61b3aefde77a -> /dev/sda1
Edit the /boot/grub/menu.lst file, using the Boot Loader
option in YaST or using a text editor.
For example, change
kernel /boot/vmlinuz root=/dev/sda1
to
kernel /boot/vmlinuz root=/dev/disk/by-uuid/e014e482-1c2d-4d09-84ec-61b3aefde77a
If you make a mistake, you can boot the server without the SAN
connected, and fix the error by using the backup copy of the
/boot/grub/menu.lst file as a guide.
If you use the Boot Loader option in YaST, there is a defect where it adds some duplicate lines to the boot loader file when you change a value. Use an editor to remove the following duplicate lines:
color white/blue black/light-gray
default 0
timeout 8
gfxmenu (sd0,1)/boot/message
When you use YaST to change the way that the root
(/) device is mounted (such as by UUID or by label),
the boot loader configuration needs to be saved again to make the change
effective for the boot loader.
As the root user or equivalent, do one of the
following to place the UUID in the /etc/fstab file:
Launch YaST as the root user, select
› ,
select the device of interest, then modify .
Edit the /etc/fstab file to modify the system
device from the location to the UUID.
For example, if the root (/) volume has a device
path of /dev/sda1 and its UUID is
e014e482-1c2d-4d09-84ec-61b3aefde77a, change the line
entry from
/dev/sda1 / reiserfs acl,user_xattr 1 1
to
UUID=e014e482-1c2d-4d09-84ec-61b3aefde77a / reiserfs acl,user_xattr 1 1
Do not leave stray characters or spaces in the file.
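The /etc/fstab change described above can be previewed without editing the file. The following is a minimal sketch, assuming a hypothetical helper named fstab_use_uuid. It prints the rewritten content to standard output so that you can review it before replacing the file yourself.

```shell
#!/bin/sh
# fstab_use_uuid FILE DEVICE UUID
# Prints FILE with the device field replaced by UUID=<UUID> on every line
# whose first field is exactly DEVICE. All other lines are printed unchanged.
# Hypothetical helper for illustration; FILE itself is not modified.
fstab_use_uuid() {
    awk -v dev="$2" -v uuid="$3" '
        $1 == dev { sub(/^[ \t]*[^ \t]+/, "UUID=" uuid) }
        { print }
    ' "$1"
}

# Usage, with the values from the example above:
#   fstab_use_uuid /etc/fstab /dev/sda1 e014e482-1c2d-4d09-84ec-61b3aefde77a
```

Printing the result instead of editing in place keeps the backup copy of /etc/fstab as the only file you need to restore if something looks wrong.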
After the install, use the following procedure to configure the UUID for
the system device in the boot loader and /etc/fstab
files for your IA64 system. IA64 uses the EFI BIOS. Its file system
configuration file is /boot/efi/SuSE/elilo.conf
instead of /etc/fstab.
Before you begin, make a copy of the
/boot/efi/SuSE/elilo.conf file.
Install the SUSE Linux Enterprise Server for IA64 with no SAN devices connected.
After the install, boot the system.
Open a terminal console as the root user or
equivalent.
Navigate to the /dev/disk/by-uuid directory to find
the UUID for the device where you installed /boot,
/root, and swap.
At the terminal console prompt, enter
cd /dev/disk/by-uuid
List all partitions by entering
ll
Find the UUID, such as
e014e482-1c2d-4d09-84ec-61b3aefde77a -> /dev/sda1
Edit the boot loader file, using the Boot Loader option in YaST.
For example, change
root=/dev/sda1
to
root=/dev/disk/by-uuid/e014e482-1c2d-4d09-84ec-61b3aefde77a
Edit the /boot/efi/SuSE/elilo.conf file to modify
the system device from the location to the UUID.
For example, change
/dev/sda1 / reiserfs acl,user_xattr 1 1
to
UUID=e014e482-1c2d-4d09-84ec-61b3aefde77a / reiserfs acl,user_xattr 1 1
Do not leave stray characters or spaces in the file.
For more information about using udev(8) for managing
devices, see
“Dynamic
Kernel Device Management with udev” in the
SUSE Linux Enterprise Server 11 Administration Guide.
For more information about udev(8) commands, see its man
page. Enter the following at a terminal console prompt:
man 8 udev
This section describes how to manage failover and path load balancing for multiple paths between the servers and block storage devices.
Section 7.6, “Creating or Modifying the /etc/multipath.conf File”
Section 7.7, “Configuring Default Policies for Polling, Queueing, and Failback”
Section 7.9, “Configuring User-Friendly Names or Alias Names”
Section 7.10, “Configuring Default Settings for zSeries Devices”
Section 7.11, “Configuring Path Failover Policies and Priorities”
Section 7.12, “Configuring Multipath I/O for the Root Device”
Section 7.13, “Configuring Multipath I/O for an Existing Software RAID”
Section 7.15, “Scanning for New Partitioned Devices without Rebooting”
Multipathing is the ability of a server to communicate with the same physical or logical block storage device across multiple physical paths between the host bus adapters in the server and the storage controllers for the device, typically in Fibre Channel (FC) or iSCSI SAN environments. You can also achieve multiple connections with direct attached storage when multiple channels are available.
Linux multipathing provides connection fault tolerance and can provide load balancing across the active connections. When multipathing is configured and running, it automatically isolates and identifies device connection failures, and reroutes I/O to alternate connections.
Typical connection problems involve faulty adapters, cables, or controllers. When you configure multipath I/O for a device, the multipath driver monitors the active connection between devices. When the multipath driver detects I/O errors for an active path, it fails over the traffic to the device’s designated secondary path. When the preferred path becomes healthy again, control can be returned to the preferred path.
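The failover behavior described above can be pictured with a toy model. This sketch is not how the multipath driver is implemented: the helper pick_path is hypothetical, paths are listed in order of preference as name:state pairs, and the state names active and failed are assumptions made for the illustration.

```shell
#!/bin/sh
# pick_path NAME:STATE ...
# Prints the name of the first path, in the order given, whose state is
# "active". Fails when no path is active.
# Toy model for illustration only; not the multipath driver's logic.
pick_path() {
    for p in "$@"; do
        case $p in
            *:active)
                echo "${p%%:*}"
                return 0
                ;;
        esac
    done
    return 1
}

# The preferred path sda has failed, so I/O moves to the secondary path sdb.
pick_path sda:failed sdb:active
```

When the preferred path is listed as active again, the same call returns it first, which corresponds to control being returned to the preferred path.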
Use the guidelines in this section when planning your multipath I/O solution.
Multipathing is managed at the device level.
The storage array you use for the multipathed device must support multipathing. For more information, see Section 7.2.11, “Supported Storage Arrays for Multipathing”.
You need to configure multipathing only if multiple physical paths exist between host bus adapters in the server and host bus controllers for the block storage device. You configure multipathing for the logical device as seen by the server.
For some storage arrays, the vendor provides its own multipathing software to manage multipathing for the array’s physical and logical devices. In this case, you should follow the vendor’s instructions for configuring multipathing for those devices.
Perform the following disk management tasks before you attempt to configure multipathing for a physical or logical device that has multiple paths:
Use third-party tools to carve physical disks into smaller logical disks.
Use third-party tools to partition physical or logical disks. If you change the partitioning in the running system, the Device Mapper Multipath (DM-MP) module does not automatically detect and reflect these changes. DM-MPIO must be re-initialized, which usually requires a reboot.
Use third-party SAN array management tools to create and configure hardware RAID devices.
Use third-party SAN array management tools to create logical devices such as LUNs. Logical device types that are supported for a given array depend on the array vendor.
The Linux software RAID management software runs on top of multipathing. For each device that has multiple I/O paths and that you plan to use in a software RAID, you must configure the device for multipathing before you attempt to create the software RAID device. Automatic discovery of multipathed devices is not available. The software RAID is not aware of the multipathing management running underneath.
For information about setting up multipathing for existing software RAIDs, see Section 7.13, “Configuring Multipath I/O for an Existing Software RAID”.
High-availability solutions for clustering storage resources run on top
of the multipathing service on each node. Ensure that the configuration
settings in the /etc/multipath.conf file on each
node are consistent across the cluster.
Ensure that multipath devices have the same name across all nodes by doing the following:
Use UUID and alias names to ensure that multipath device names are
consistent across all nodes in the cluster. Alias names must be unique
across all nodes. Copy the /etc/multipath.conf
file from the node to the /etc/
directory on all of the other nodes in the cluster.
When using links to multipath-mapped devices, ensure that you specify
the dm-uuid* name or alias name in the
/dev/disk/by-id directory, and not a fixed path
instance of the device. For information, see
Section 7.2.3, “Using WWID, User-Friendly, and Alias Names for Multipathed Devices”.
Set the user_friendly_names configuration option to
no to disable it. A user-friendly name is unique to a node, but a
device might not be assigned the same user-friendly name on every node
in the cluster.
You can force the system-defined user-friendly names to be consistent across all nodes in the cluster by doing the following:
In the /etc/multipath.conf file on one node:
Set the user_friendly_names configuration option
to yes to enable it.
Multipath uses the /var/lib/multipath/bindings
file to assign a persistent and unique name to the device in the
form of
mpath<n> in
the /dev/mapper directory.
(Optional) Set the bindings_file option in the
defaults section of the
/etc/multipath.conf file to specify an alternate
location for the bindings file.
The default location is
/var/lib/multipath/bindings.
Set up all of the multipath devices on the node.
Copy the /etc/multipath.conf file from the node
to the /etc/ directory on all of the other nodes in
the cluster.
Copy the bindings file from the node to the
bindings_file path on all of the other nodes in
the cluster.
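The two copy steps above can be sketched as a small shell helper. This is only a minimal sketch, not a supported tool: the function name print_sync_commands and the node names node2 and node3 are hypothetical, and it assumes that scp with root SSH access is available between the nodes. It prints the commands instead of running them, so you can review them first.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the scp commands that copy the
# multipath configuration and the bindings file to the other cluster nodes.
# $1 is the bindings file path; the remaining arguments are node names.
print_sync_commands() {
    bindings=$1
    shift
    for node in "$@"; do
        printf 'scp /etc/multipath.conf root@%s:/etc/multipath.conf\n' "$node"
        printf 'scp %s root@%s:%s\n' "$bindings" "$node" "$bindings"
    done
}

# Example with assumed node names and the default bindings location.
print_sync_commands /var/lib/multipath/bindings node2 node3
```

If you set bindings_file to an alternate location, pass that path as the first argument so that the file lands in the same place on every node.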
The Distributed Replicated Block Device (DRBD) high-availability solution for mirroring devices across a LAN runs on top of multipathing. For each device that has multiple I/O paths and that you plan to use in a DRBD solution, you must configure the device for multipathing before you configure DRBD.
Volume managers such as LVM2 and Clustered LVM2 run on top of multipathing. You must configure multipathing for a device before you use LVM2 or cLVM2 to create segment managers and file systems on it. For information, see Section 7.2.4, “Using LVM2 on Multipath Devices”.
When using multipathing in a virtualization environment, the multipathing is controlled in the host server environment. Configure multipathing for the device before you assign it to a virtual guest machine.
SLES 11 SP2 upgrades the multipath-tools from 0.4.8 to 0.4.9. Some
changes in PRIO syntax require that you manually modify the
/etc/multipath.conf file as needed to comply with the
new syntax.
The syntax for the prio keyword in the
/etc/multipath.conf file is changed in
multipath-tools-0.4.9. The prio
line specifies the prioritizer. If the prioritizer requires an argument,
you specify the argument by using the prio_args keyword
on a second line. Previously, the prioritizer and its arguments were
included on the prio line.
Multipath Tools 0.4.9 and later use the prio setting
in the defaults{} or devices{}
section of the /etc/multipath.conf file. They silently
ignore the keyword prio when it is specified for an
individual multipath definition in the
multipaths{} section. Multipath Tools 0.4.8 for SLES
11 SP1 and earlier allows the prio setting in the individual
multipath definition in the
multipaths{} section to override the
prio settings in the defaults{} or
devices{} section.
A multipath device can be uniquely identified by its WWID, by a
user-friendly name, or by an alias that you assign for it. Device node
names in the form of /dev/sdn and
/dev/dm-n can change on reboot and might be assigned
to different devices each time. A device’s WWID, user-friendly name, and
alias name persist across reboots, and are the preferred way to identify
the device.
If you want to use the entire LUN directly (for example, if you are using
the SAN features to partition your storage), you can use the
/dev/disk/by-id/xxx names for
mkfs, fstab, your application, and
so on. Partitioned devices have _part<n>
appended to the device name, such as
/dev/disk/by-id/xxx_part1.
In the /dev/disk/by-id directory, the
multipath-mapped devices are represented by the device’s
dm-uuid* name or alias name (if you assign an alias
for it in the /etc/multipath.conf file). The
scsi- and wwn- device names
represent physical paths to the devices.
When using links to multipath-mapped devices, ensure that you specify the
dm-uuid* name or alias name in the
/dev/disk/by-id directory, and not a fixed path
instance of the device.
When you define device aliases in the
/etc/multipath.conf file, ensure that you use each
device’s WWID (such as
3600508e0000000009e6baa6f609e7908) and not its WWN,
which replaces the first character of a device ID with
0x, such as
0x600508e0000000009e6baa6f609e7908.
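If a tool reports only the WWN, the relationship described above can be applied in a small shell function. This is a minimal sketch: the function name wwn_to_wwid is hypothetical, and it assumes that the identifier follows the pattern shown in the example above, where the WWID is the WWN with the leading 0x replaced by 3. Verify the result against the names in /dev/disk/by-id before you use it in an alias definition.

```shell
#!/bin/sh
# Convert a WWN of the form 0x<hex> into the WWID form 3<hex>,
# following the relationship described in the text above.
wwn_to_wwid() {
    case $1 in
        0x*) printf '3%s\n' "${1#0x}" ;;
        *)   printf '%s\n' "$1" ;;   # not in 0x form: return it unchanged
    esac
}

wwn_to_wwid 0x600508e0000000009e6baa6f609e7908
# prints 3600508e0000000009e6baa6f609e7908
```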
For information about using user-friendly names and aliases for multipathed devices, see Section 7.9, “Configuring User-Friendly Names or Alias Names”.
Ensure that the lvm.conf configuration file
points to the multipath-device names instead of fixed path names. This
should happen automatically if boot.multipath is
enabled and loads before boot.lvm.
By default, LVM2 does not recognize multipathed devices. To make LVM2
recognize the multipathed devices as possible physical volumes, you must
modify /etc/lvm/lvm.conf to scan multipathed devices
through the multipath I/O layer.
Adding a multipath filter prevents LVM from scanning and using the
physical paths for raw device nodes that represent individual paths to
the SAN (/dev/sd*). Ensure that you specify the filter path so that LVM
scans only the device mapper names for the device
(/dev/disk/by-id/dm-uuid-.*-mpath-.*) after
multipathing is configured.
To modify /etc/lvm/lvm.conf for multipath use:
Open the /etc/lvm/lvm.conf file in a text editor.
If /etc/lvm/lvm.conf does not exist, you can
create one based on your current LVM configuration by entering the
following at a terminal console prompt:
lvm dumpconfig > /etc/lvm/lvm.conf
Change the filter and types
entries in /etc/lvm/lvm.conf as follows:
filter = [ "a|/dev/disk/by-id/.*|", "r|.*|" ]
types = [ "device-mapper", 1 ]
This allows LVM2 to scan only the by-id paths and reject everything else.
If you are using user-friendly names, specify the filter path so that only the Device Mapper names are scanned after multipathing is configured. The following filter path accepts only partitions on a multipathed device:
filter = [ "a|/dev/disk/by-id/dm-uuid-.*-mpath-.*|", "r|.*|" ]
To accept both raw disks and partitions for Device Mapper names,
specify the path as follows, with no hyphen (-) before
mpath:
filter = [ "a|/dev/disk/by-id/dm-uuid-.*mpath-.*|", "r|.*|" ]
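The difference between the two filter patterns can be checked outside of LVM. The following is a rough sketch that uses grep -E as a stand-in for the LVM filter matcher, which is an approximation for these simple patterns. The function name filter_matches is hypothetical, and the two device names are assumed examples of how a whole multipath device and its first partition might appear in /dev/disk/by-id.

```shell
#!/bin/sh
# Return success if the unanchored extended regex $1 matches the path $2.
filter_matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Assumed example names: a whole multipath device and its first partition.
disk=/dev/disk/by-id/dm-uuid-mpath-3600508e0000000009e6baa6f609e7908
part=/dev/disk/by-id/dm-uuid-part1-mpath-3600508e0000000009e6baa6f609e7908

for pattern in '/dev/disk/by-id/dm-uuid-.*-mpath-.*' \
               '/dev/disk/by-id/dm-uuid-.*mpath-.*'; do
    for path in "$disk" "$part"; do
        if filter_matches "$pattern" "$path"; then
            echo "accept: $pattern  $path"
        else
            echo "reject: $pattern  $path"
        fi
    done
done
```

With these assumed names, the pattern with the hyphen before mpath accepts only the partition, because the whole-device name has nothing between dm-uuid- and mpath-. The pattern without the hyphen accepts both.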
If you are also using LVM2 on non-multipathed devices, make the
necessary adjustments in the filter and
types entries to suit your setup. Otherwise, the
other LVM devices are not visible with a pvscan
after you modify the lvm.conf file for
multipathing.
You want only those devices that are configured with LVM to be included in the LVM cache, so ensure that you are specific about which other non-multipathed devices are included by the filter.
For example, if your local disk is /dev/sda and
all SAN devices are /dev/sdb and above, specify
the local and multipathing paths in the filter as follows:
filter = [ "a|/dev/sda.*|", "a|/dev/disk/by-id/.*|", "r|.*|" ]
types = [ "device-mapper", 253 ]
Save the file.
Add dm-multipath to
/etc/sysconfig/kernel:INITRD_MODULES.
Make a new initrd to ensure that the Device Mapper
Multipath services are loaded with the changed settings. Running
mkinitrd is needed only if the root (/) device or
any parts of it (such as /var,
/etc, /log) are on the SAN
and multipath is needed to boot.
Enter the following at a terminal console prompt:
mkinitrd -f multipath
Reboot the server to apply the changes.
Multipath must be loaded before LVM to ensure that multipath maps are built correctly. Loading multipath after LVM can result in incomplete device maps for a multipath device because LVM locks the device, and MPIO cannot create the maps properly.
If the system device is a local device that does not use MPIO and LVM,
you can disable both boot.multipath and
boot.lvm. After the server starts, you can manually
start multipath before you start LVM, then run a
pvscan command to recognize the LVM objects.
Timing is important for starting the LVM process. If LVM starts before
MPIO maps are done, LVM might use a fixed path for the device instead of
its multipath. The device works, so you might not be aware that the
device’s MPIO map is incomplete until that fixed path fails. You can
help prevent the problem by enabling boot.multipath
and following the instructions in
Section 7.2.4.1, “Adding a Multipath Device Filter in the /etc/lvm/lvm.conf File”.
To troubleshoot a mapping problem, you can use dmsetup
to check that the expected number of paths are present for each multipath
device. As the root user, enter the following at
a command prompt:
dmsetup ls --tree
In the following sample response, the first device has four paths. The second device is a local device with a single path. The third device has two paths. The distinction between active and passive paths is not reported through this tool.
vg910-lv00 (253:23)
 └─ 360a980006465576657346d4b6c593362 (253:10)
    ├─ (65:96)
    ├─ (8:128)
    ├─ (8:240)
    └─ (8:16)
vg00-lv08 (253:9)
 └─ (8:3)
system_vg-data_lv (253:1)
 └─ 36006016088d014007e0d0d2213ecdf11 (253:0)
    ├─ (8:32)
    └─ (8:48)
An incorrect mapping typically returns too few paths and does not have a major number of 253. For example, the following shows what an incorrect mapping looks like for the third device:
system_vg-data_lv (8:31)
 └─ (8:32)
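Counting the paths by eye becomes tedious on systems with many devices. The following is a minimal sketch of a filter that summarizes the tree output as one line per top-level device with its number of paths. The function name count_paths is hypothetical. It assumes that path entries appear as lines that contain only tree-drawing characters followed by (major:minor), as in the sample response above.

```shell
#!/bin/sh
# Read "dmsetup ls --tree" output on stdin and print each top-level device
# name followed by the number of leaf path entries listed under it.
count_paths() {
    awk '
        # A line that starts with a name character begins a new top-level device.
        /^[A-Za-z0-9_]/ {
            if (name != "") print name, n
            name = $1; n = 0
            next
        }
        # A leaf path line has only tree-drawing characters before "(major:minor)".
        /^[^A-Za-z0-9]*\([0-9]+:[0-9]+\)[ \t]*$/ { n++ }
        END { if (name != "") print name, n }
    '
}

# Typical use on a live system, as the root user:
#   dmsetup ls --tree | count_paths
```

A device that reports fewer paths than you expect is a candidate for the incomplete mapping described above.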
The mdadm tool requires that the devices be accessed by
the ID rather than by the device node path. Therefore, the
DEVICE entry in the
/etc/mdadm.conf file should be set as follows to
ensure that only device mapper names are scanned after multipathing is
configured:
DEVICE /dev/disk/by-id/dm-uuid-.*mpath-.*
If you are using user-friendly names or multipath aliases, specify the path as follows:
DEVICE /dev/disk/by-id/dm-name-.*
When using multipath for NetApp devices, we recommend the following
settings in the /etc/multipath.conf file:
Set the default values for the following parameters globally for NetApp devices:
max_fds max
queue_without_daemon no
Set the default values for the following parameters for NetApp devices in the hardware table:
dev_loss_tmo infinity
fast_io_fail_tmo 5
features "3 queue_if_no_path pg_init_retries 50"
The --noflush option should always be used when running
on multipath devices.
For example, in scripts where you perform a table reload, you use the
--noflush option on resume to ensure that any
outstanding I/O is not flushed, because you need the multipath topology
information.
load
resume --noflush
A system with root (/) on a multipath device might
stall when all paths have failed and are removed from the system because a
dev_loss_tmo time-out is received from the storage
subsystem (such as Fibre Channel storage arrays).
If the system device is configured with multiple paths and the multipath
no_path_retry setting is active, you should modify the
storage subsystem’s dev_loss_tmo setting accordingly
to ensure that no devices are removed during an all-paths-down scenario.
We strongly recommend that you set the dev_loss_tmo
value to be equal to or higher than the no_path_retry
setting from multipath.
The recommended setting for the storage subsystem’s
dev_loss_tmo is:
<dev_loss_tmo> = <no_path_retry> * <polling_interval>
where the following definitions apply for the multipath values:
no_path_retry is the number of retries for multipath
I/O until the path is considered to be lost, and queuing of I/O is
stopped.
polling_interval is the time in seconds between path
checks.
Each of these multipath values should be set from the
/etc/multipath.conf configuration file. For
information, see Section 7.6, “Creating or Modifying the /etc/multipath.conf File”.
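The recommended value can be computed from the two multipath settings. The following is a minimal sketch of the formula above. The function name recommended_dev_loss_tmo is hypothetical, and the input values are chosen for illustration only, not taken from a tested configuration.

```shell
#!/bin/sh
# Recommended dev_loss_tmo in seconds = no_path_retry * polling_interval.
recommended_dev_loss_tmo() {
    no_path_retry=$1
    polling_interval=$2
    echo $((no_path_retry * polling_interval))
}

# Example values chosen for illustration only:
# 5 retries with a 5-second polling interval.
recommended_dev_loss_tmo 5 5
# prints 25
```

Because dev_loss_tmo should be equal to or higher than the value that results from the multipath settings, treat the result as a lower bound.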
Behavior changes for how multipathed devices are partitioned might affect your configuration if you are upgrading.
In SUSE Linux Enterprise Server 11, the default multipath setup relies on
udev to overwrite the existing symbolic links in the
/dev/disk/by-id directory when multipathing is
started. Before you start multipathing, the link points to the SCSI
device by using its scsi-xxx name. When multipathing
is running, the symbolic link points to the device by using its
dm-uuid-xxx name. This ensures that the symbolic
links in the /dev/disk/by-id path persistently point
to the same device regardless of whether multipath is started or not.
Ensure that the configuration files for lvm.conf and
md.conf point to the multipath-device names. This
should happen automatically if boot.multipath is
enabled and loads before boot.lvm and
boot.md. Otherwise, the LVM and MD configuration
files might contain fixed paths for multipath-devices, and you must
correct those paths to use the multipath-device names. For LVM2 and cLVM
information, see
Section 7.2.4, “Using LVM2 on Multipath Devices”. For software
RAID information, see
Section 7.13, “Configuring Multipath I/O for an Existing Software RAID”.
In SUSE Linux Enterprise Server 10, the kpartx software is used in
the /etc/init.d/boot.multipath script to add symlinks to
the /dev/dm-* line in the
multipath.conf configuration file for any newly
created partitions without requiring a reboot. This triggers
udevd to fill in the
/dev/disk/by-* symlinks. The main benefit is that
you can call kpartx with the new parameters without
rebooting the server.
In SUSE Linux Enterprise Server 9, it is not possible to partition multipath I/O devices
themselves. If the underlying physical device is already partitioned, the
multipath I/O device reflects those partitions and the layer provides
/dev/disk/by-id/<name>p1 ... pN devices so you
can access the partitions through the multipath I/O layer. As a
consequence, the devices need to be partitioned prior to enabling
multipath I/O. If you change the partitioning in the running system,
DM-MPIO does not automatically detect and reflect these changes. The
device must be re-initialized, which usually requires a reboot.
The multipathing drivers and tools support all seven of the supported processor architectures: IA32, AMD64/EM64T, IPF/IA64, p-Series (32-bit and 64-bit), and z-Series (31-bit and 64-bit).
The multipathing drivers and tools support most storage arrays. The storage array that houses the multipathed device must support multipathing in order to use the multipathing drivers and tools. Some storage array vendors provide their own multipathing management tools. Consult the vendor’s hardware documentation to determine what settings are required.
The multipath-tools package automatically detects
the following storage arrays:
| 3PARdata VV |
| AIX NVDISK |
| AIX VDASD |
| APPLE Xserve RAID |
| COMPELNT Compellent Vol |
| COMPAQ/HP HSV101, HSV111, HSV200, HSV210, HSV300, HSV400, HSV 450 |
| COMPAQ/HP MSA, HSV |
| COMPAQ/HP MSA VOLUME |
| DataCore SANmelody |
| DDN SAN DataDirector |
| DEC HSG80 |
| DELL MD3000 |
| DELL MD3000i |
| DELL MD32xx |
| DELL MD32xxi |
| DGC |
| EMC Clariion |
| EMC Invista |
| EMC SYMMETRIX |
| EUROLOGC FC2502 |
| FSC CentricStor |
| FUJITSU ETERNUS_DX, DXL, DX400, DX8000 |
| HITACHI DF |
| HITACHI/HP OPEN |
| HP A6189A |
| HP HSVX700 |
| HP LOGICAL VOLUME |
| HP MSA2012fc, MSA 2212fc, MSA2012i |
| HP MSA2012sa, MSA2312 fc/i/sa, MSA2324 fc/i/sa, MSA2000s VOLUME |
| HP P2000 G3 FC|P2000G3 FC/iSCSI|P2000 G3 SAS|P2000 G3 iSCSI |
| IBM 1722-600 |
| IBM 1724 |
| IBM 1726 |
| IBM 1742 |
| IBM 1745, 1746 |
| IBM 1750500 |
| IBM 1814 |
| IBM 1815 |
| IBM 1818 |
| IBM 1820N00 |
| IBM 2105800 |
| IBM 2105F20 |
| IBM 2107900 |
| IBM 2145 |
| IBM 2810XIV |
| IBM 3303 NVDISK |
| IBM 3526 |
| IBM 3542 |
| IBM IPR |
| IBM Nseries |
| IBM ProFibre 4000R |
| IBM S/390 DASD ECKD |
| IBM S/390 DASD FBA |
| Intel Multi-Flex |
| LSI/ENGENIO INF-01-00 |
| NEC DISK ARRAY |
| NETAPP LUN |
| NEXENTA COMSTAR |
| Pillar Axiom |
| PIVOT3 RAIGE VOLUME |
| SGI IS |
| SGI TP9100, TP 9300 |
| SGI TP9400, TP9500 |
| STK FLEXLINE 380 |
| STK OPENstorage D280 |
| SUN CSM200_R |
| SUN LCSM100_[IEFS] |
| SUN STK6580, STK6780 |
| SUN StorEdge 3510, T4 |
| SUN SUN_6180 |
In general, most other storage arrays should work. When storage arrays
are automatically detected, the default settings for multipathing apply.
If you want non-default settings, you must manually create and configure
the /etc/multipath.conf file. For information, see
Section 7.6, “Creating or Modifying the /etc/multipath.conf File”.
Testing of the IBM zSeries device with multipathing has shown that the
dev_loss_tmo parameter should be set to 90 seconds,
and the fast_io_fail_tmo parameter should be set to 5
seconds. If you are using zSeries devices, you must manually create and
configure the /etc/multipath.conf file to specify
the values. For information, see
Section 7.10, “Configuring Default Settings for zSeries Devices”.
Hardware that is not automatically detected requires an appropriate entry
for configuration in the devices section of the
/etc/multipath.conf file. In this case, you must
manually create and configure the configuration file. For information,
see Section 7.6, “Creating or Modifying the /etc/multipath.conf File”.
Consider the following caveats:
Not all of the storage arrays that are automatically detected have been tested on SUSE Linux Enterprise Server. For information, see Section 7.2.11.2, “Tested Storage Arrays for Multipathing Support”.
Some storage arrays might require specific hardware handlers. A hardware handler is a kernel module that performs hardware-specific actions when switching path groups and dealing with I/O errors. For information, see Section 7.2.11.3, “Storage Arrays that Require Specific Hardware Handlers”.
After you modify the /etc/multipath.conf file, you
must run mkinitrd to re-create the INITRD on your
system, then reboot in order for the changes to take effect.
The following storage arrays have been tested with SUSE Linux Enterprise Server:
| EMC |
| Hitachi |
| Hewlett-Packard/Compaq |
| IBM |
| NetApp |
| SGI |
Most other vendor storage arrays should also work. Consult your
vendor’s documentation for guidance. For a list of the default storage
arrays recognized by the multipath-tools package,
see Section 7.2.11.1, “Storage Arrays That Are Automatically Detected for Multipathing”.
Storage arrays that require special commands on failover from one path to the other or that require special nonstandard error handling might require more extensive support. Therefore, the Device Mapper Multipath service has hooks for hardware handlers. For example, one such handler for the EMC CLARiiON CX family of arrays is already provided.
Consult the hardware vendor’s documentation to determine if its hardware handler must be installed for Device Mapper Multipath.
The multipath -t command shows an internal table of
storage arrays that require special handling with specific hardware
handlers. The displayed list is not an exhaustive list of supported
storage arrays. It lists only those arrays that require special handling
and that the multipath-tools developers had access
to during the tool development.
Arrays with true active/active multipath support do not require special
handling, so they are not listed for the multipath -t
command.
A listing in the multipath -t table does not
necessarily mean that SUSE Linux Enterprise Server was tested on that specific hardware.
For a list of tested storage arrays, see
Section 7.2.11.2, “Tested Storage Arrays for Multipathing Support”.
The multipathing support in SUSE Linux Enterprise Server 10 and later is based on the
Device Mapper Multipath module of the Linux 2.6 kernel and the
multipath-tools userspace package. You can use the
Multiple Devices Administration utility (MDADM, mdadm)
to view the status of multipathed devices.
The Device Mapper Multipath (DM-MP) module provides the multipathing capability for Linux. DM-MPIO is the preferred solution for multipathing on SUSE Linux Enterprise Server 11. It is the only multipathing option shipped with the product that is completely supported by Novell and SUSE.
DM-MPIO features automatic configuration of the multipathing subsystem for a large variety of setups. Configurations of up to 8 paths to each device are supported. Configurations are supported for active/passive (one path active, others passive) or active/active (all paths active with round-robin load balancing).
The DM-MPIO framework is extensible in two ways:
Using specific hardware handlers. For information, see Section 7.2.11.3, “Storage Arrays that Require Specific Hardware Handlers”.
Using load-balancing algorithms that are more sophisticated than the round-robin algorithm
The user-space component of DM-MPIO takes care of automatic path discovery and grouping, as well as automated path retesting, so that a previously failed path is automatically reinstated when it becomes healthy again. This minimizes the need for administrator attention in a production environment.
DM-MPIO protects against failures in the paths to the device, and not failures in the device itself. If one of the active paths is lost (for example, a network adapter breaks or a fiber-optic cable is removed), I/O is redirected to the remaining paths. If the configuration is active/passive, then the path fails over to one of the passive paths. If you are using the round-robin load-balancing configuration, the traffic is balanced across the remaining healthy paths. If all active paths fail, inactive secondary paths must be woken up, so failover occurs with a delay of approximately 30 seconds.
If a disk array has more than one storage processor, ensure that the SAN switch has a connection to the storage processor that owns the LUNs you want to access. On most disk arrays, all LUNs belong to both storage processors, so both connections are active.
On some disk arrays, the storage array manages the traffic through storage processors so that it presents only one storage processor at a time. One processor is active and the other one is passive until there is a failure. If you are connected to the wrong storage processor (the one with the passive path) you might not see the expected LUNs, or you might see the LUNs but get errors when you try to access them.
| Features of Storage Arrays | Description |
|---|---|
| Active/passive controllers | One controller is active and serves all LUNs. The second controller acts as a standby. The second controller also presents the LUNs to the multipath component so that the operating system knows about redundant paths. If the primary controller fails, the second controller takes over, and it serves all LUNs. In some arrays, the LUNs can be assigned to different controllers. A given LUN is assigned to one controller to be its active controller. One controller does the disk I/O for any given LUN at a time, and the second controller is the standby for that LUN. The second controller also presents the paths, but disk I/O is not possible. Servers that use that LUN are connected to the LUN’s assigned controller. If the primary controller for a set of LUNs fails, the second controller takes over, and it serves all LUNs. |
| Active/active controllers | Both controllers share the load for all LUNs, and can process disk I/O for any given LUN. If one controller fails, the second controller automatically handles all traffic. |
| Load balancing | The Device Mapper Multipath driver automatically load balances traffic across all active paths. |
| Controller failover | When the active controller fails over to the passive, or standby, controller, the Device Mapper Multipath driver automatically activates the paths between the host and the standby, making them the primary paths. |
| Boot/Root device support | Multipathing is supported for the root ( Multipathing is supported for the |
Device Mapper Multipath detects every path for a multipathed device as a
separate SCSI device. The SCSI device names take the form
/dev/sdN, where
N is an autogenerated
letter for the device, beginning with a and issued sequentially as the
devices are created, such as /dev/sda,
/dev/sdb, and so on. If the number of devices exceeds
26, a second letter is added so that the devices after
/dev/sdz are named
/dev/sdaa, /dev/sdab, and so on.
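The naming sequence can be reproduced with a short shell function, which is useful when you need to predict how many device nodes a given number of LUNs and paths produces. This is a minimal sketch: the function name sd_name is hypothetical, and it assumes that the sequence continues past /dev/sdab in the same letter-by-letter way.

```shell
#!/bin/sh
# Map a zero-based device index to its /dev/sd* name:
# 0 -> sda, 25 -> sdz, 26 -> sdaa, 27 -> sdab.
sd_name() {
    letters=abcdefghijklmnopqrstuvwxyz
    n=$1
    suffix=
    while [ "$n" -ge 0 ]; do
        # Pick the letter for the lowest position, then move one position left.
        pos=$((n % 26 + 1))
        suffix=$(printf '%s' "$letters" | cut -c "$pos")$suffix
        n=$((n / 26 - 1))
    done
    printf '/dev/sd%s\n' "$suffix"
}

sd_name 26
# prints /dev/sdaa
```

For example, 8 LUNs with 4 paths each appear as 32 SCSI devices, so the last one in this sequence would be sd_name 31, which is /dev/sdaf.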
If multiple paths are not automatically detected, you can configure them
manually in the /etc/multipath.conf file. The
multipath.conf file does not exist until you create
and configure it. For information, see
Section 7.6, “Creating or Modifying the /etc/multipath.conf File”.
The multipath-tools user-space package takes care of
automatic path discovery and grouping. It automatically tests the path
periodically, so that a previously failed path is automatically reinstated
when it becomes healthy again. This minimizes the need for administrator
attention in a production environment.
| Tool | Description |
|---|---|
| multipath | Scans the system for multipathed devices and assembles them. |
| multipathd | Waits for maps events, then executes |
| devmap-name | Provides a meaningful device name to |
| kpartx | Maps linear devmaps to partitions on the multipathed device, which makes it possible to create multipath monitoring for partitions on the device. |
| mpathpersist | Manages SCSI persistent reservations on Device Mapper Multipath devices. |
The file list for a package can vary for different server architectures. For a list of files included in the multipath-tools package, go to the SUSE Linux Enterprise Server Technical Specifications > Package Descriptions Web page, find your architecture and select , then search on “multipath-tools” to find the package list for that architecture.
You can also determine the file list for an RPM file by querying the
package itself with the rpm -ql or rpm
-qpl command options.
To query an installed package, enter
rpm -ql <package_name>
To query a package not installed, enter
rpm -qpl <URL_or_path_to_package>
To check that the multipath-tools package is
installed, do the following:
Enter the following at a terminal console prompt:
rpm -q multipath-tools
If it is installed, the response repeats the package name and provides the version information, such as:
multipath-tools-04.7-34.23
If it is not installed, the response reads:
package multipath-tools is not installed
Udev is the default device handler, and devices are automatically known to
the system by the Worldwide ID instead of by the device node name. This
resolves problems in previous releases of MDADM and LVM where the
configuration files (mdadm.conf and
lvm.conf) did not properly recognize multipathed
devices.
Just as for LVM2, MDADM requires that the devices be accessed by the ID
rather than by the device node path. Therefore, the
DEVICE entry in
/etc/mdadm.conf should be set as follows:
DEVICE /dev/disk/by-id/*
If you are using user-friendly names, specify the path as follows so that only the device mapper names are scanned after multipathing is configured:
DEVICE /dev/disk/by-id/dm-uuid-.*-mpath-.*
To verify that MDADM is installed:
Ensure that the mdadm package is installed by
entering the following at a terminal console prompt:
rpm -q mdadm
If it is installed, the response repeats the package name and provides the version information. For example:
mdadm-2.6-0.11
If it is not installed, the response reads:
package mdadm is not installed
For information about modifying the /etc/lvm/lvm.conf
file, see
Section 7.2.4, “Using LVM2 on Multipath Devices”.
Use the Linux multipath(8) command to configure and
manage multipathed devices.
General syntax for the multipath(8) command:
multipath [-v verbosity_level] [-b bindings_file] [-d] [-h|-l|-ll|-f|-F|-B|-c|-q|-r|-w|-W] [-p failover|multibus|group_by_serial|group_by_prio|group_by_node_name] [devicename]
Prints all paths and multipaths.
No output.
Prints only the created or updated multipath names. Used to feed
other tools like kpartx.
Print all information: Detected paths, multipaths, and device maps.
Print usage text.
Dry run; do not create or update device maps.
Show the current multipath topology from information fetched in
sysfs and the device mapper.
Show the current multipath topology from all available information (sysfs, the device mapper, path checkers, and so on).
Flush a multipath device map specified as a parameter, if unused.
Flush all unused multipath device maps.
Print internal hardware table to stdout.
Force device map reload.
Treat the bindings file as read only.
Set the user_friendly_names bindings file location. The default is
/etc/multipath/bindings.
Check if a block device should be a path in a multipath device.
Allow device tables with queue_if_no_path when
multipathd is not running.
Remove the WWID for the specified device from the WWIDs file.
Reset the WWIDs file to only include the current multipath devices.
Force new maps to use the specified policy. Existing maps are not modified.
One path per priority group.
All paths in one priority group.
One priority group per serial.
One priority group per priority value. Priorities are determined by callout programs specified as a global, per-controller or per-multipath option in the configuration file.
One priority group per target node name. Target node names are
fetched in
/sys/class/fc_transport/target*/node_name.
Update only the device map for the specified device. Specify the name
as the device path such as /dev/sdb, or in
major:minor
format, or the multipath map name.
multipath: Configures all multipath devices.
multipath devicename: Configures a specific multipath device.
Replace devicename with the device node name such as /dev/sdb (as shown by udev in the $DEVNAME variable), or in the major:minor format. The device may alternatively be a multipath map name.
multipath -f: Selectively suppresses a multipath map, and its device-mapped partitions.
multipath -d: Dry run. Displays potential multipath devices, but does not create any devices and does not update device maps.
multipath -v2 -d: Displays multipath map information for potential multipath devices in a dry run. The -v2 option shows only local disks. This verbosity level prints only the created or updated multipath names, for use in feeding other tools like kpartx.
There is no output if the devices already exist and there are no changes. Use multipath -ll to see the status of configured multipath devices.
multipath -v2 devicename: Configures a specific potential multipath device and displays multipath map information for it. This verbosity level prints only the created or updated multipath names, for use in feeding other tools like kpartx.
There is no output if the device already exists and there are no changes. Use multipath -ll to see the status of configured multipath devices.
Replace devicename with the device node name such as /dev/sdb (as shown by udev in the $DEVNAME variable), or in the major:minor format. The device may alternatively be a multipath map name.
multipath -v3: Configures potential multipath devices and displays multipath map information for them. This verbosity level prints all detected paths, multipaths, and device maps. Both wwid and devnode blacklisted devices are displayed.
multipath -v3 devicename: Configures a specific potential multipath device and displays information for it. The -v3 option shows the full path list. This verbosity level prints all detected paths, multipaths, and device maps. Both wwid and devnode blacklisted devices are displayed.
Replace devicename with the device node name such as /dev/sdb (as shown by udev in the $DEVNAME variable), or in the major:minor format. The device may alternatively be a multipath map name.
multipath -ll: Displays the status of all multipath devices.
multipath -ll devicename: Displays the status of a specified multipath device.
Replace devicename with the device node name such as /dev/sdb (as shown by udev in the $DEVNAME variable), or in the major:minor format. The device may alternatively be a multipath map name.
multipath -F: Flushes all unused multipath device maps. This unresolves the multiple paths; it does not delete the devices.
multipath -F devicename: Flushes unused multipath device maps for a specified multipath device. This unresolves the multiple paths; it does not delete the device.
Replace devicename with the device node name such as /dev/sdb (as shown by udev in the $DEVNAME variable), or in the major:minor format. The device may alternatively be a multipath map name.
multipath -p [failover|multibus|group_by_serial|group_by_prio|group_by_node_name]: Sets the group policy by specifying one of the group policy options that are described in Table 7.3, “Group Policy Options for the multipath -p Command”:
| Policy Option | Description |
|---|---|
| failover | (Default) One path per priority group. You can use only one path at a time. |
| multibus | All paths in one priority group. |
| group_by_serial | One priority group per detected SCSI serial number (the controller node worldwide number). |
| group_by_prio | One priority group per path priority value. Paths with the same priority are in the same priority group. Priorities are determined by callout programs specified as a global, per-controller, or per-multipath option in the configuration file. |
| group_by_node_name | One priority group per target node name. Target node names are fetched in /sys/class/fc_transport/target*/node_name. |
The mpathpersist(8) utility can be used to manage SCSI
persistent reservations on Device Mapper Multipath devices.
General syntax for the mpathpersist(8) command:
mpathpersist [options] [device]
Use this utility with the service action reservation key
(reservation_key attribute) in the
/etc/multipath.conf file to set persistent
reservations for SCSI devices. The attribute is not used by default. If it
is not set, the multipathd daemon does not check for
persistent reservation for newly discovered paths or reinstated paths.
reservation_key <reservation key>
You can add the attribute to the defaults section or
the multipaths section. For example:
multipaths {
multipath {
wwid XXXXXXXXXXXXXXXX
alias yellow
reservation_key 0x123abc
}
}
Set the reservation_key parameter for all mpath devices
applicable for persistent management, then restart the
multipathd daemon. After it is set up, you can specify
the reservation key in the mpathpersist commands.
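-h, --help: Outputs the command usage information, then exits.
-d device, --device=device: Query or change the device.
-H, --hex: Display the output response in hexadecimal.
-X tids, --transportID=tids: Transport IDs can be mentioned in several forms.
-v level, --verbose level: Specifies the verbosity level.
  0: Critical and error messages.
  1: Warning messages.
  2: Informational messages.
  3: Informational messages with trace enabled.
-i, --in: Request PR In command.
-k, --read-keys: PR In: Read Keys.
-s, --read-status: PR In: Read Full Status.
-r, --read-reservation: PR In: Read Reservation.
-c, --report-capabilities: PR In: Report Capabilities.
-o, --out: Request PR Out command.
-C, --clear: PR Out: Clear.
-Z, --param-aptpl: PR Out parameter APTPL.
-S SARK, --param-sark=SARK: PR Out parameter Service Action Reservation Key (SARK) in hexadecimal.
-P, --preempt: PR Out: Preempt.
-A, --preempt-abort: PR Out: Preempt and Abort.
-T type, --prout-type=type: PR Out command type.
-G, --register: PR Out: Register.
-I, --register-ignore: PR Out: Register and Ignore.
-L, --release: PR Out: Release.
-R, --reserve: PR Out: Reserve.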
Register the Service Action Reservation Key for the
/dev/mapper/mpath9 device.
mpathpersist --out --register --param-sark=123abc --prout-type=5 -d /dev/mapper/mpath9
Read the Service Action Reservation Key for the
/dev/mapper/mpath9 device.
mpathpersist -i -k -d /dev/mapper/mpath9
Reserve the Service Action Reservation Key for the
/dev/mapper/mpath9 device.
mpathpersist --out --reserve --param-sark=123abc --prout-type=8 -d /dev/mapper/mpath9
Read the reservation status of the /dev/mapper/mpath9
device.
mpathpersist -i -s -d /dev/mapper/mpath9
Before configuring multipath I/O for your SAN devices, prepare the SAN devices, as necessary, by doing the following:
Configure and zone the SAN with the vendor’s tools.
Configure permissions for host LUNs on the storage arrays with the vendor’s tools.
Install the Linux HBA driver module. Upon module installation, the driver automatically scans the HBA to discover any SAN devices that have permissions for the host. It presents them to the host for further configuration.
Ensure that the HBA driver you are using does not have native multipathing enabled.
See the vendor’s specific instructions for more details.
After the driver module is loaded, discover the device nodes assigned to specific array LUNs or partitions.
If the SAN device will be used as the root device on the server, modify the timeout settings for the device as described in Section 7.2.8, “SAN Timeout Settings When the Root Device Is Multipathed”.
If the LUNs are not seen by the HBA driver, use lsscsi
to check whether the SCSI devices are seen correctly by the
operating system, then check the zoning setup of the SAN.
In particular, check whether LUN masking is
active and whether the LUNs are correctly assigned to the server.
If the LUNs are seen by the HBA driver, but there are no corresponding block devices, additional kernel parameters are needed to change the SCSI device scanning behavior, such as to indicate that LUNs are not numbered consecutively. For information, see TID 3955167: Troubleshooting SCSI (LUN) Scanning Issues in the Novell Support Knowledgebase.
Partitioning devices that have multiple paths is not recommended, but it
is supported. You can use the kpartx tool to create
partitions on multipath devices without rebooting. You can also partition
the device before you attempt to configure multipathing by using the
Partitioner function in YaST, or by using a third-party partitioning tool.
Multipath devices are device-mapper devices. Modifying device-mapper devices with command line tools (such as parted, kpartx, or fdisk) works, but it does not necessarily generate the udev events that are required to update other layers. After you partition the device-mapper device, you should check the multipath map to make sure the device-mapper devices were mapped. If they are missing, you can remap the multipath devices or reboot the server to pick up all of the new partitions in the multipath map.
The device-mapper device for a partition on a multipath device is not the same as an independent device. When you create an LVM logical volume using the whole device, you must specify a device that contains no partitions. If you specify a multipath partition as the target device for the LVM logical volume, LVM recognizes that the underlying physical device is partitioned and the create fails. If you need to subdivide a SAN device, you can carve LUNs on the SAN device and present each LUN as a separate multipath device to the server.
The server must be manually configured to automatically load the device
drivers for the controllers to which the multipath I/O devices are
connected within the initrd. You need to add the
necessary driver module to the variable INITRD_MODULES in the file
/etc/sysconfig/kernel.
For example, if your system contains a RAID controller accessed by the
cciss driver and multipathed devices connected to a
QLogic controller accessed by the driver qla2xxx, this entry would look
like:
INITRD_MODULES="cciss"
Because the QLogic driver is not automatically loaded on startup, add it here:
INITRD_MODULES="cciss qla2xxx"
After changing /etc/sysconfig/kernel, you must
re-create the initrd on your system with the
mkinitrd command, then reboot in order for the changes
to take effect.
When you are using LILO as a boot manager, reinstall it with the
/sbin/lilo command. No further action is required if
you are using GRUB.
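The edit to /etc/sysconfig/kernel described above can also be scripted. The following is a minimal sketch rather than a supported tool: the function name add_initrd_module is hypothetical, and the code assumes that the file contains exactly one line of the form INITRD_MODULES="...". If no such line exists, the file is left unchanged.

```sh
# add_initrd_module FILE MODULE
# Appends MODULE to the INITRD_MODULES="..." line in FILE unless it is
# already listed. Hypothetical helper; assumes one such line per file.
add_initrd_module() {
    file=$1
    module=$2
    # Extract the current value between the double quotes.
    current=$(sed -n 's/^INITRD_MODULES="\(.*\)"$/\1/p' "$file")
    case " $current " in
        *" $module "*) return 0 ;;   # already present, nothing to do
    esac
    # Build the new value, avoiding a leading space when the list is empty.
    new=${current:+$current }$module
    sed "s/^INITRD_MODULES=\".*\"\$/INITRD_MODULES=\"$new\"/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

Calling add_initrd_module /etc/sysconfig/kernel qla2xxx a second time leaves the line unchanged. You still need to run mkinitrd and reboot afterwards.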
In SUSE Linux Enterprise Server 11 SP3 and later, four SCSI hardware handlers were added in the SCSI layer that can be used with DM-Multipath:
scsi_dh_alua
scsi_dh_rdac
scsi_dh_hp_sw
scsi_dh_emc
Add the modules to the initrd image, then specify
them in the /etc/multipath.conf file as hardware
handler types alua, rdac,
hp_sw, and emc. For example, add one
of these lines for a device definition:
hardware_handler "1 alua"
hardware_handler "1 rdac"
hardware_handler "1 hp_sw"
hardware_handler "1 emc"
To include the modules in the initrd image:
Add the device handler modules to the INITRD_MODULES variable in
/etc/sysconfig/kernel.
Create a new initrd:
mkinitrd -k /boot/vmlinux-<flavour> \
-i /boot/initrd-<flavour>-scsi-dh \
-M /boot/System.map-<flavour>
Update the boot configuration file (grub.conf,
lilo.conf, yaboot.conf) with
the newly built initrd.
Restart the server.
Use either of the methods in this section to add multipath I/O services
(multipathd) to the boot sequence.
In YaST, click > > .
Select , then click .
Click to acknowledge the service startup message.
Click , then click .
The changes do not take effect until the server is restarted.
Open a terminal console, then log in as the
root user or equivalent.
At the terminal console prompt, enter
insserv multipathd
To start multipath services and enable them to start at reboot:
Open a terminal console, then log in as the root
user or equivalent.
At the terminal console prompt, enter
chkconfig multipathd on
chkconfig boot.multipath on
If the boot.multipath service does not start automatically on system boot, do the following to start the multipath services manually:
Open a terminal console, then log in as the root
user or equivalent.
Enter
/etc/init.d/boot.multipath start
/etc/init.d/multipathd start
The /etc/multipath.conf file does not exist unless you
create it. Default multipath device settings are applied automatically when
the multipathd daemon runs unless you create the
multipath configuration file and personalize the settings. The
/usr/share/doc/packages/multipath-tools/multipath.conf.synthetic
file contains a sample /etc/multipath.conf file that
you can use as a guide for multipath settings.
Whenever you create or modify the /etc/multipath.conf
file, the changes are not automatically applied when you save the file.
This allows you time to perform a dry run to verify your changes before
they are committed. When you are satisfied with the revised settings, you
can update the multipath maps for the running multipathd daemon to use, or
the changes will be applied the next time that the multipathd daemon is
restarted, such as on a system restart.
If the /etc/multipath.conf file does not exist, copy
the example to create the file:
In a terminal console, log in as the root user.
Enter the following command (all on one line) to copy the template:
cp /usr/share/doc/packages/multipath-tools/multipath.conf.synthetic /etc/multipath.conf
Use the
/usr/share/doc/packages/multipath-tools/multipath.conf.annotated
file as a reference to determine how to configure multipathing for your
system.
Ensure that there is an appropriate device entry for
your SAN. Most vendors provide documentation on the proper setup of the
device section.
The /etc/multipath.conf file requires a different
device section for different SANs. If you are using a
storage subsystem that is automatically detected (see
Section 7.2.11.2, “Tested Storage Arrays for Multipathing Support”), the default
entry for that device can be used; no further configuration of the
/etc/multipath.conf file is required.
Save the file.
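The copy step above can be guarded so that an existing configuration is never overwritten by the template. This is only a sketch; the function name install_multipath_conf is hypothetical, and the paths are passed in as arguments.

```sh
# install_multipath_conf TEMPLATE TARGET
# Copies TEMPLATE to TARGET only when TARGET does not exist yet.
# Returns non-zero if TARGET is already present or the copy fails.
install_multipath_conf() {
    template=$1
    target=$2
    if [ -e "$target" ]; then
        echo "$target already exists; leaving it untouched" >&2
        return 1
    fi
    cp "$template" "$target"
}
```

With the file names used in this section, the call would be install_multipath_conf /usr/share/doc/packages/multipath-tools/multipath.conf.synthetic /etc/multipath.conf.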
The /etc/multipath.conf file is organized in the
following sections. See
/usr/share/doc/packages/multipath-tools/multipath.conf.annotated
for a template with extensive comments for each of the attributes and
their options.
General default settings for multipath I/O. These values are used if no values are given in the appropriate device or multipath sections. For information, see Section 7.7, “Configuring Default Policies for Polling, Queueing, and Failback”.
Lists the device names to discard as not multipath candidates. Devices
can be identified by their device node name
(devnode), their WWID (wwid), or
their vendor or product strings (device). For
information, see Section 7.8, “Blacklisting Non-Multipath Devices”.
You typically ignore non-multipathed devices, such as
cciss, fd,
hd, md,
dm, sr,
scd, st,
ram, raw, and
loop.
Values.
For an example, see Section 7.8, “Blacklisting Non-Multipath Devices”.
Lists the device names of devices to be treated as multipath candidates
even if they are on the blacklist. Devices can be identified by their
device node name (devnode), their WWID
(wwid), or their vendor or product strings
(device). You must specify the excepted devices by
using the same keyword that you used in the blacklist. For example, if
you used the devnode keyword for devices in the blacklist, you use the
devnode keyword to exclude some of the devices in the blacklist
exceptions. It is not possible to blacklist devices by using the
devnode keyword and then to exclude some of those devices
by using the wwid keyword.
Values.
For examples, see Section 7.8, “Blacklisting Non-Multipath Devices”
and the
/usr/share/doc/packages/multipath-tools/multipath.conf.annotated
file.
Specifies settings for individual multipath devices. Except for
settings that do not support individual settings, these values
overwrite what is specified in the defaults and
devices sections of the configuration file.
Specifies settings for individual storage controllers. These values
overwrite values specified in the defaults section
of the configuration file. If you use a storage array that is not
supported by default, you can create a devices
subsection to specify the default settings for it. These values can be
overwritten by settings for individual multipath devices if the keyword
allows it.
For information, see the following:
Whenever you create or modify the /etc/multipath.conf
file, the changes are not automatically applied when you save the file.
You can perform a “dry run” of the setup to verify the
multipath setup before you update the multipath maps.
At the server command prompt, enter
multipath -v2 -d
This command scans the devices, then displays what the setup would look
like if you commit the changes. It is assumed that the
multipathd daemon is already running with the old (or
default) multipath settings when you modify the
/etc/multipath.conf file and perform the dry run. If
the changes are acceptable, continue with
Section 7.6.4, “Applying the /etc/multipath.conf File Changes to Update the Multipath Maps”.
The output is similar to the following:
26353900f02796769
[size=127 GB]
[features="0"]
[hwhandler="1 emc"]
\_ round-robin 0 [first]
  \_ 1:0:1:2 sdav 66:240 [ready ]
  \_ 0:0:1:2 sdr 65:16 [ready ]
\_ round-robin 0
  \_ 1:0:0:2 sdag 66:0 [ready ]
  \_ 0:0:0:2 sdc 8:32 [ready ]
Paths are grouped into priority groups. Only one priority group is in active use at a time. To model an active/active configuration, all paths end in the same group. To model active/passive configuration, the paths that should not be active in parallel are placed in several distinct priority groups. This normally happens automatically on device discovery.
The output shows the order, the scheduling policy used to balance I/O within the group, and the paths for each priority group. For each path, its physical address (host:bus:target:lun), device node name, major:minor number, and state is shown.
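To check a dry run quickly, the priority groups and paths in the output can be counted. The following sketch assumes the output layout described above, with each priority group (a line containing "\_ round-robin") and each path (a line containing "\_ " followed by a host:bus:target:lun address) on its own line. The function name summarize_dry_run is hypothetical, and other multipath-tools versions may print a different layout.

```sh
# summarize_dry_run
# Reads dry-run output on stdin and prints how many priority groups
# and paths it contains. Assumes one group or path per line.
summarize_dry_run() {
    awk '
        /\\_ round-robin/                    { groups++; next }
        /\\_ [0-9]+:[0-9]+:[0-9]+:[0-9]+ /   { paths++ }
        END { printf "%d priority groups, %d paths\n", groups, paths }
    '
}
```

A typical call would be multipath -v2 -d | summarize_dry_run. A device with two priority groups of two paths each is reported as 2 priority groups, 4 paths.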
By using a verbosity level of -v3 in the dry run, you can see all detected paths, multipaths, and device maps. Both WWID and device node blacklisted devices are displayed.
multipath -v3 -d
The following is an example of -v3 output on a 64-bit SLES 11 SP2 server with two QLogic HBAs connected to a Xiotech Magnitude 3000 SAN. Some multiple entries have been omitted to shorten the example.
dm-22: device node name blacklisted
< content omitted >
loop7: device node name blacklisted
< content omitted >
md0: device node name blacklisted
< content omitted >
dm-0: device node name blacklisted
sdf: not found in pathvec
sdf: mask = 0x1f
sdf: dev_t = 8:80
sdf: size = 105005056
sdf: subsystem = scsi
sdf: vendor = XIOtech
sdf: product = Magnitude 3D
sdf: rev = 3.00
sdf: h:b:t:l = 1:0:0:2
sdf: tgt_node_name = 0x202100d0b2028da
sdf: serial = 000028DA0014
sdf: getuid= "/lib/udev/scsi_id --whitelisted --device=/dev/%n" (config file default)
sdf: uid = 200d0b2da28001400 (callout)
sdf: prio = const (config file default)
sdf: const prio = 1
< content omitted >
ram15: device node name blacklisted
< content omitted >
===== paths list =====
uuid hcil dev dev_t pri dm_st chk_st vend/prod/rev
200d0b2da28001400 1:0:0:2 sdf 8:80 1 [undef][undef] XIOtech,Magnitude 3D
200d0b2da28005400 1:0:0:1 sde 8:64 1 [undef][undef] XIOtech,Magnitude 3D
200d0b2da28004d00 1:0:0:0 sdd 8:48 1 [undef][undef] XIOtech,Magnitude 3D
200d0b2da28001400 0:0:0:2 sdc 8:32 1 [undef][undef] XIOtech,Magnitude 3D
200d0b2da28005400 0:0:0:1 sdb 8:16 1 [undef][undef] XIOtech,Magnitude 3D
200d0b2da28004d00 0:0:0:0 sda 8:0 1 [undef][undef] XIOtech,Magnitude 3D
params = 0 0 2 1 round-robin 0 1 1 8:80 1000 round-robin 0 1 1 8:32 1000
status = 2 0 0 0 2 1 A 0 1 0 8:80 A 0 E 0 1 0 8:32 A 0
sdf: mask = 0x4
sdf: path checker = directio (config file default)
directio: starting new request
directio: async io getevents returns 1 (errno=Success)
directio: io finished 4096/0
sdf: state = 2
< content omitted >
Changes to the /etc/multipath.conf file cannot take
effect when multipathd is running. After you make
changes, save and close the file, then do the following to apply the
changes and update the multipath maps:
Stop the multipathd service.
Clear old multipath bindings by entering
/sbin/multipath -F
Create new multipath bindings by entering
/sbin/multipath -v2 -l
Start the multipathd service.
Run mkinitrd to re-create the
initrd image on your system, then reboot in order
for the changes to take effect.
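The steps above can be collected in a small script. The sketch below uses the commands exactly as they are given in the steps, and it assumes that the multipathd service is controlled through the /etc/init.d/multipathd init script used elsewhere in this guide. The function names and the DRY_RUN variable are hypothetical. When DRY_RUN is set, the commands are only printed.

```sh
# run CMD [ARGS...]
# Prints the command when DRY_RUN is set, otherwise executes it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# apply_multipath_conf
# Runs the documented steps in order and stops at the first failure.
# The reboot is deliberately left to the administrator.
apply_multipath_conf() {
    run /etc/init.d/multipathd stop &&
    run /sbin/multipath -F &&
    run /sbin/multipath -v2 -l &&
    run /etc/init.d/multipathd start &&
    run mkinitrd
}
```

Running the function once with DRY_RUN=1 set shows the command sequence before anything is changed on the system.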
The goal of multipath I/O is to provide connectivity fault tolerance between the storage system and the server. The desired default behavior depends on whether the server is a standalone server or a node in a high-availability cluster.
When you configure multipath I/O for a stand-alone server, the
no_path_retry setting protects the server operating
system from receiving I/O errors as long as possible. It queues messages
until a multipath failover occurs and provides a healthy connection.
When you configure multipath I/O for a node in a high-availability cluster,
you want multipath to report the I/O failure in order to trigger the
resource failover instead of waiting for a multipath failover to be
resolved. In cluster environments, you must modify the
no_path_retry setting so that the cluster node receives
an I/O error in relation to the cluster verification process (recommended
to be 50% of the heartbeat tolerance) if the connection is lost to the
storage system. In addition, you want the multipath I/O fail back to be set
to manual in order to avoid a ping-pong of resources because of path
failures.
The /etc/multipath.conf file should contain a
defaults section where you can specify default behaviors
for polling, queueing, and failback. If the field is not otherwise
specified in a device section, the default setting is
applied for that SAN configuration.
The following are the compiled-in default settings. They are used
unless you overwrite these values by creating and configuring a
personalized /etc/multipath.conf file.
defaults {
verbosity 2
# udev_dir is deprecated in SLES 11 SP3 and later
# udev_dir /dev
polling_interval 5
# path_selector default value is service-time in SLES 11 SP3 and later
# path_selector "round-robin 0"
path_selector "service-time 0"
path_grouping_policy failover
# getuid_callout is deprecated in SLES 11 SP3 and later and replaced with
# uid_attribute
# getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
# uid_attribute is new in SLES 11 SP3
uid_attribute "ID_SERIAL"
prio "const"
prio_args ""
features "0"
path_checker "directio"
alias_prefix "mpath"
rr_min_io_rq 1
max_fds "max"
rr_weight "uniform"
queue_without_daemon "yes"
flush_on_last_del "no"
user_friendly_names "no"
fast_io_fail_tmo 5
bindings_file "/etc/multipath/bindings"
wwids_file "/etc/multipath/wwids"
log_checker_err "always"
retain_attached_hw_handler "no"
detect_prio "no"
failback "manual"
no_path_retry "fail"
}
For information about setting the polling, queuing, and failback policies, see the following parameters in Section 7.11, “Configuring Path Failover Policies and Priorities”:
If you modify the settings in the defaults section, the
changes are not applied until you update the multipath maps, or until the
multipathd daemon is restarted, such as at system restart.
The /etc/multipath.conf file should contain a
blacklist section where all non-multipath devices are
listed. You can blacklist devices by WWID (wwid
keyword), device name (devnode keyword), or device type
(device section). You can also use the
blacklist_exceptions section to enable multipath for
some devices that are blacklisted by the regular expressions used in the
blacklist section.
You typically ignore non-multipathed devices, such as
cciss, fd,
hd, md, dm,
sr, scd, st,
ram, raw, and
loop. For example, local IDE hard drives and floppy
drives do not normally have multiple paths. If you want
multipath to ignore single-path devices, put them in the
blacklist section.
The keyword devnode_blacklist has been deprecated and
replaced with the keyword blacklist.
For example, to blacklist local devices and all arrays from the
cciss driver from being managed by multipath, the
blacklist section looks like this:
blacklist {
wwid "26353900f02796769"
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st|sda)[0-9]*"
devnode "^hd[a-z][0-9]*"
devnode "^cciss.c[0-9]d[0-9].*"
}
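Before committing a blacklist, it can help to test the devnode patterns against the device names present on the system. The following sketch applies the three devnode patterns from the example with grep -E. It assumes that multipath evaluates devnode values as unanchored POSIX extended regular expressions, so grep is only an approximation of the real matching. The function name is_blacklisted_devnode is hypothetical.

```sh
# is_blacklisted_devnode NAME
# Returns 0 if NAME matches one of the devnode patterns from the
# blacklist example, 1 otherwise.
is_blacklisted_devnode() {
    printf '%s\n' "$1" | grep -Eq \
        -e '^(ram|raw|loop|fd|md|dm-|sr|scd|st|sda)[0-9]*' \
        -e '^hd[a-z][0-9]*' \
        -e '^cciss.c[0-9]d[0-9].*'
}
```

For example, is_blacklisted_devnode sdb returns 1, whereas sda, loop7, and dm-22 return 0. Because the first pattern has no end anchor, a name such as sdaa also matches it under this assumption. This is worth checking on servers with many SCSI devices.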
You can also blacklist only the partitions from a driver instead of the
entire array. For example, you can use the following regular expression to
blacklist only partitions from the cciss driver and
not the entire array:
blacklist {
devnode "^cciss.c[0-9]d[0-9]*p[0-9]*"
}
You can blacklist by specific device types by adding a
device section in the blacklist, and using the
vendor and product keywords.
blacklist {
device {
vendor "DELL"
product "*"
}
}
You can use a blacklist_exceptions section to enable
multipath for some devices that were blacklisted by the regular expressions
used in the blacklist section. You add exceptions by
WWID (wwid keyword), device name
(devnode keyword), or device type
(device section). You must specify the exceptions in the
same way that you blacklisted the corresponding devices. That is,
wwid exceptions apply to a wwid
blacklist, devnode exceptions apply to a
devnode blacklist, and device type exceptions apply to a
device type blacklist.
For example, you can enable multipath for a desired device type when you
have different device types from the same vendor. Blacklist all of the
vendor’s device types in the blacklist section, and
then enable multipath for the desired device type by adding a
device section in a
blacklist_exceptions section.
blacklist {
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st|sda)[0-9]*"
device {
vendor "DELL"
product "*"
}
}
blacklist_exceptions {
device {
vendor "DELL"
product "MD3220i"
}
}
You can also use the blacklist_exceptions to enable multipath only for specific devices. For example:
blacklist {
wwid "*"
}
blacklist_exceptions {
wwid "3600d0230000000000e13955cc3751234"
wwid "3600d0230000000000e13955cc3751235"
}
After you modify the /etc/multipath.conf file, you
must run mkinitrd to re-create the
initrd on your system, then restart the server in
order for the changes to take effect.
After you do this, the local devices should no longer be listed in the
multipath maps when you issue the multipath -ll command.
A multipath device can be identified by its WWID, by a user-friendly name, or by an alias that you assign for it. Before you begin, review the requirements in Section 7.2.3, “Using WWID, User-Friendly, and Alias Names for Multipathed Devices”.
Because device node names in the form of /dev/sdn and
/dev/dm-n can change on reboot, referring to
multipath devices by their WWID is preferred. You can also use a
user-friendly name or alias that is mapped to the WWID in order to
identify the device uniquely across reboots.
Table 7.4, “Comparison of Multipath Device Name Types” describes the types of
device names that can be used for a device in the
/etc/multipath.conf file. For an example of
multipath.conf settings, see the
/usr/share/doc/packages/multipath-tools/multipath.conf.synthetic
file.
| Name Types | Description |
|---|---|
| WWID (default) | The serial WWID (Worldwide Identifier) is an identifier for the multipath device that is guaranteed to be globally unique and unchanging. The default name used in multipathing is the ID of the logical unit as found in the |
| User-friendly | The Device Mapper Multipath device names in the |
| Alias | An alias name is a globally unique name that the administrator provides for a multipath device. Alias names override the WWID and the user-friendly names. If you are using user_friendly_names, do not set the alias to mpathN format. This may conflict with an automatically assigned user-friendly name, and give you incorrect device node names. |
The global multipath user_friendly_names option in the
/etc/multipath.conf file is used to enable or disable
the use of user-friendly names for multipath devices. If it is set to
“no” (the default), multipath uses the WWID as the name of the device.
If it is set to “yes”, multipath uses the
/var/lib/multipath/bindings file to assign a
persistent and unique name to the device in the form of
mpath<n> in the
/dev/mapper directory. The bindings_file
option in the /etc/multipath.conf file
can be used to specify an alternate location for the
bindings file.
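To see which user-friendly name is bound to a given WWID, the bindings file can be read directly. The sketch below assumes the usual layout of that file: one binding per line in the form "name WWID", with comment lines starting with #. The function name friendly_name_for is hypothetical. The default path is the one named in this section; pass a second argument if your bindings_file setting points elsewhere.

```sh
# friendly_name_for WWID [BINDINGS_FILE]
# Prints the user-friendly name bound to WWID and returns 0,
# or prints nothing and returns 1 if there is no binding.
friendly_name_for() {
    wwid=$1
    file=${2:-/var/lib/multipath/bindings}
    awk -v wwid="$wwid" '
        /^[ \t]*#/ { next }                  # skip comment lines
        # Appending "" forces a string comparison, never a numeric one.
        ($2 "") == (wwid "") { print $1; found = 1; exit }
        END { exit !found }
    ' "$file"
}
```

For example, friendly_name_for 36006048000028350131253594d303030 prints the mpath name assigned to that WWID, if one exists.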
The global multipath alias option in the
/etc/multipath.conf file is used to explicitly assign
a name to the device. If an alias name is set up for a multipath device,
the alias is used instead of the WWID or the user-friendly name.
Using the user_friendly_names option can be problematic
in the following situations:
Root Device Is Using Multipath:
If the system root device is using multipath and you use the
user_friendly_names option, the user-friendly
settings in the /var/lib/multipath/bindings file
are included in the initrd. If you later change the
storage setup, such as by adding or removing devices, there is a
mismatch between the bindings setting inside the
initrd and the bindings settings in
/var/lib/multipath/bindings.
A bindings mismatch between initrd and
/var/lib/multipath/bindings can lead to a wrong
assignment of mount points to devices, which can result in file system
corruption and data loss.
To avoid this problem, we recommend that you use the default WWID settings for the system root device. You should not use aliases for the system root device. Because the device name would differ, using an alias causes you to lose the ability to seamlessly switch off multipathing via the kernel command line.
Mounting /var from Another Partition:
The default location of the user_friendly_names
configuration file is /var/lib/multipath/bindings.
If the /var data is not located on the system root
device but mounted from another partition, the
bindings file is not available when setting up
multipathing.
Ensure that the /var/lib/multipath/bindings file is
available on the system root device and multipath can find it. For
example, this can be done as follows:
Move the /var/lib/multipath/bindings file to
/etc/multipath/bindings.
Set the bindings_file option in the
defaults section of
/etc/multipath.conf to this new location. For
example:
defaults {
user_friendly_names yes
bindings_file "/etc/multipath/bindings"
}
Multipath Is in the initrd:
Even if the system root device is not on multipath, it is possible for
multipath to be included in the initrd. For
example, this can happen if the system root device is on LVM. If you use
the user_friendly_names option and multipath is in
the initrd, you should boot with the parameter
multipath=off to avoid problems.
This disables multipath only in the initrd during
system boots. After the system boots, the
boot.multipath and multipathd
boot scripts are able to activate multipathing.
To enable user-friendly names or to specify aliases:
In a terminal console, log in as the root user.
Open the /etc/multipath.conf file in a text editor.
(Optional) Modify the location of the
/var/lib/multipath/bindings file.
The alternate path must be available on the system root device where multipath can find it.
Move the /var/lib/multipath/bindings file to
/etc/multipath/bindings.
Set the bindings_file option in the
defaults section of
/etc/multipath.conf to this new location. For
example:
defaults {
user_friendly_names yes
bindings_file "/etc/multipath/bindings"
}
(Optional, not recommended) Enable user-friendly names:
Uncomment the defaults section and its ending
bracket.
Uncomment the user_friendly_names option, then
change its value from No to Yes.
For example:
## Use user friendly names, instead of using WWIDs as names.
defaults {
user_friendly_names yes
}
(Optional) Specify your own names for devices by using the
alias option in the multipath
section.
For example:
## Use alias names, instead of using WWIDs as names.
multipaths {
multipath {
wwid 36006048000028350131253594d303030
alias blue1
}
multipath {
wwid 36006048000028350131253594d303041
alias blue2
}
multipath {
wwid 36006048000028350131253594d303145
alias yellow1
}
multipath {
wwid 36006048000028350131253594d303334
alias yellow2
}
}
Save your changes, then close the file.
The changes are not applied until you update the multipath maps, or until the multipathd daemon is restarted, such as at system restart.
Testing of the IBM zSeries device with multipathing has shown that the
dev_loss_tmo parameter should be set to 90 seconds, and the
fast_io_fail_tmo parameter should be set to 5 seconds. If you are using
zSeries devices, modify the /etc/multipath.conf file
to specify the values as follows:
defaults {
dev_loss_tmo 90
fast_io_fail_tmo 5
}
The dev_loss_tmo parameter sets the number of seconds to wait before marking a multipath link as bad. When the path fails, any current I/O on that failed path fails. The default value varies according to the device driver being used. The valid range of values is 0 to 600 seconds. To use the driver’s internal timeouts, set the value to zero (0) or to any value greater than 600.
The fast_io_fail_tmo parameter sets the length of time to wait before failing I/O when a link problem is detected. I/O that reaches the driver fails. If I/O is in a blocked queue, the I/O does not fail until the dev_loss_tmo time elapses and the queue is unblocked.
If you modify the /etc/multipath.conf file, the
changes are not applied until you update the multipath maps, or until the
multipathd daemon is restarted, such as at system restart.
In a Linux host, when there are multiple paths to a storage controller,
each path appears as a separate block device, and results in multiple block
devices for a single LUN. The Device Mapper Multipath service detects
multiple paths with the same LUN ID, and creates a new multipath device
with that ID. For example, a host with two HBAs attached to a storage
controller with two ports via a single unzoned Fibre Channel switch sees
four block devices: /dev/sda,
/dev/sdb, /dev/sdc, and
/dev/sdd. The Device Mapper Multipath service creates
a single block device, /dev/mpath/mpath1, that reroutes
I/O through those four underlying block devices.
This section describes how to specify policies for failover and configure priorities for the paths.
Use the multipath command with the -p option to set the
path failover policy:
multipath devicename -p policy
Replace policy with one of the following policy options:
| Policy Option | Description |
|---|---|
| failover | (Default) One path per priority group. |
| multibus | All paths in one priority group. |
| group_by_serial | One priority group per detected serial number. |
| group_by_prio | One priority group per path priority value. Priorities are determined by callout programs specified as a global, per-controller, or per-multipath option in the |
| group_by_node_name | One priority group per target node name. Target node names are fetched in the |
You must manually enter the failover priorities for the device in the
/etc/multipath.conf file. Examples for all settings
and options can be found in the
/usr/share/doc/packages/multipath-tools/multipath.conf.annotated
file.
If you modify the /etc/multipath.conf file, the
changes are not automatically applied when you save the file. For
information, see
Section 7.6.3, “Verifying the Multipath Setup in the /etc/multipath.conf File” and
Section 7.6.4, “Applying the /etc/multipath.conf File Changes to Update the Multipath Maps”.
A priority group is a collection of paths that go to
the same physical LUN. By default, I/O is distributed in a round-robin
fashion across all paths in the group. The multipath
command automatically creates priority groups for each LUN in the SAN
based on the path_grouping_policy setting for that
SAN. The multipath command multiplies the number of
paths in a group by the group’s priority to determine which group is
the primary. The group with the highest calculated value is the primary.
When all paths in the primary group are failed, the priority group with
the next highest value becomes active.
A path priority is an integer value assigned to a path. The higher the value, the higher the priority is. An external program is used to assign priorities for each path. For a given device, the paths with the same priorities belong to the same priority group.
Multipath Tools 0.4.9 for SLES 11 SP2 uses the prio
setting in the defaults{} or
devices{} section of the
/etc/multipath.conf file. It silently ignores the
keyword prio when it is specified for an individual
multipath definition in the
multipaths{} section. Multipath Tools 0.4.8 for SLES
11 SP1 and earlier allows the prio setting in the individual
multipath definition in the
multipaths{} section to override the
prio settings in the defaults{} or
devices{} section.
The syntax for the prio keyword in the
/etc/multipath.conf file is changed in
multipath-tools-0.4.9. The prio
line specifies the prioritizer. If the prioritizer requires an argument,
you specify the argument by using the prio_args
keyword on a second line. Previously, the prioritizer and its arguments
were included on the prio line.
Specifies the prioritizer program to call to obtain a path priority value. Weights are summed for each path group to determine the next path group to use in case of failure.
Use the prio_args keyword to specify arguments if
the specified prioritizer requires arguments.
Values.
If no prio keyword is specified, all paths are
equal. The default setting is “const” with a
prio_args setting with no value.
prio "const"
prio_args ""
Example prioritizer programs include:
| Prioritizer Program | Description |
|---|---|
| alua | Generates path priorities based on the SCSI-3 ALUA settings. |
| const | Generates the same priority for all paths. |
| emc | Generates the path priority for EMC arrays. |
| hds | Generates the path priority for Hitachi HDS Modular storage arrays. |
| hp_sw | Generates the path priority for Compaq/HP controller in active/standby mode. |
| ontap | Generates the path priority for NetApp arrays. |
| random | Generates a random priority for each path. |
| rdac | Generates the path priority for LSI/Engenio RDAC controller. |
| weightedpath | Generates the path priority based on the weighted values you specify in the arguments, in the form <hbtl\|devname> <regex1> <prio1> <regex2> <prio2>... The hbtl regex argument format matches the SCSI host:bus:target:LUN address of a path, with a weight value for each pattern. For example: prio "weightedpath" prio_args "hbtl 1:.:.:. 2 4:.:.:. 4" The devname regex argument format uses a device node name with a weight value for each device. For example: prio "weightedpath" prio_args "devname sda 50 sde 10 sdc 50 sdf 10" |
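The weightedpath matching described in the table can be sketched in Python as follows. This is an illustration only, not the multipath-tools implementation: the function name weighted_priority, the whole-string regex match, and the fallback priority of 0 for unmatched paths are assumptions made for the example.

```python
import re


def weighted_priority(prio_args, hbtl, devname, default=0):
    """Return the weight of the first regex in prio_args that matches the path.

    prio_args follows "<hbtl|devname> <regex1> <prio1> <regex2> <prio2> ...".
    Assumptions: a regex must match the whole identifier, and a path that
    matches nothing gets `default`. The real prioritizer may differ.
    """
    fields = prio_args.split()
    if not fields:
        raise ValueError("empty prio_args")
    mode, pairs = fields[0], fields[1:]
    if mode not in ("hbtl", "devname") or len(pairs) % 2 != 0:
        raise ValueError("malformed prio_args: %r" % prio_args)
    # Pick the identifier that the regexes are matched against.
    subject = hbtl if mode == "hbtl" else devname
    for regex, weight in zip(pairs[0::2], pairs[1::2]):
        if re.fullmatch(regex, subject):
            return int(weight)
    return default


# The two prio_args strings are the examples from the table above.
print(weighted_priority("hbtl 1:.:.:. 2 4:.:.:. 4", "4:0:0:1", "sdc"))  # 4
print(weighted_priority("devname sda 50 sde 10 sdc 50 sdf 10", "4:0:0:1", "sde"))  # 10
```

In this sketch, a path on SCSI host 4 receives a weight of 4 and a path on host 1 receives a weight of 2, so paths through host 4 are preferred.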
Specifies the arguments for the specified prioritizer program that
requires arguments. Most prio programs do not need
arguments.
Values.
There is no default. The value depends on the prio
setting and whether the prioritizer requires arguments.
prio "const"
prio_args ""
Multipath attributes are used to control the behavior of multipath I/O
for devices. You can specify attributes as defaults for all multipath
devices. You can also specify attributes that apply only to a given
multipath device by creating an entry for that device in the
multipaths section of the multipath configuration
file.
Specifies whether to use world-wide IDs (WWIDs) or to use the
/var/lib/multipath/bindings file to assign a
persistent and unique alias to the multipath devices in the form of
/dev/mapper/mpathN.
This option can be used in the devices section and
the multipaths section.
Values.
| Value | Description |
|---|---|
| no | (Default) Use the WWIDs shown in the |
| yes | Autogenerate user-friendly names as aliases for the multipath devices instead of the actual ID. |
Specifies whether to monitor the failed path recovery, and indicates the timing for group failback after failed paths return to service.
When the failed path recovers, the path is added back into the multipath enabled path list based on this setting. Multipath evaluates the priority groups, and changes the active priority group when the priority of the primary group exceeds that of the secondary group.
Values.
| Value | Description |
|---|---|
| manual | (Default) The failed path is not monitored for recovery. The administrator runs the |
| immediate | When a path recovers, enable the path immediately. |
| n | When the path recovers, wait n seconds before enabling the path. Specify an integer value greater than 0. |
We recommend a failback setting of “manual” for multipath in cluster environments in order to prevent multipath failover ping-pong.
failback "manual"
Ensure that you verify the failback setting with your storage system vendor. Different storage systems can require different settings.
The default program and arguments to call to obtain a unique path identifier. Specify the location with an absolute Linux path.
This attribute is deprecated in SLES 11 SP3 and later. It is
replaced by the uid_attribute attribute.
Values.
The default location and arguments are:
/lib/udev/scsi_id -g -u -s
Example:
getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
Specifies the behaviors to use on path failure.
Values.
| Value | Description |
|---|---|
| n | Specifies the number of retries until In a cluster, you can specify a value of “0” to prevent queuing and allow resources to fail over. |
| fail | Specifies immediate failure (no queuing). |
| queue | Never stop queuing (queue forever until the path comes alive). |
We recommend a retry setting of “fail” or “0” in the
/etc/multipath.conf file when working in a
cluster. This causes the resources to fail over when the connection is
lost to storage. Otherwise, the messages queue and the resource
failover cannot occur.
no_path_retry "fail"
no_path_retry "0"
Ensure that you verify the retry settings with your storage system vendor. Different storage systems can require different settings.
Determines the state of the path.
Values.
| Value | Description |
|---|---|
| directio | (Default in |
| readsector0 | (Default in |
| tur | Issues a SCSI test unit ready command to the device. This is the preferred setting if the LUN supports it. On failure, the command does not fill up |
| custom_vendor_value | Some SAN vendors provide custom path_checker options: |
Specifies the path grouping policy for a multipath device hosted by a given controller.
Values.
| Value | Description |
|---|---|
| failover | (Default) One path is assigned per priority group so that only one path at a time is used. |
| multibus | All valid paths are in one priority group. Traffic is load-balanced across all active paths in the group. |
| group_by_prio | One priority group exists for each path priority value. Paths with the same priority are in the same priority group. Priorities are assigned by an external program. |
| group_by_serial | Paths are grouped by the SCSI target serial number (controller node WWN). |
| group_by_node_name | One priority group is assigned per target node name. Target node names are fetched in |
Specifies the path-selector algorithm to use for load balancing.
Values.
| Value | Description |
|---|---|
| round-robin 0 | (Default in SLES 11 SP2 and earlier) The load-balancing algorithm used to balance traffic across all active paths in a priority group. |
| queue-length 0 | A dynamic load balancer that balances the number of in-flight I/O on paths similar to the least-pending option. |
| service-time 0 | (Default in SLES 11 SP3 and later) A service-time oriented load balancer that balances I/O on paths according to the latency. |
Specifies path group timeout handling.
Values.
NONE (internal default)
Specifies the time in seconds between the end of one path checking cycle and the beginning of the next path checking cycle.
Values.
Specify an integer value greater than 0. The default value is 5. Ensure that you verify the polling_interval setting with your storage system vendor. Different storage systems can require different settings.
Specifies the program and arguments to use to determine the layout of the multipath map.
Multipath prio_callout programs are located in shared libraries in
/lib/libmultipath/lib*. By using shared
libraries, the callout programs are loaded into memory on daemon
startup.
When queried by the multipath command, the
specified mpath_prio_* callout program returns the priority for a
given path in relation to the entire multipath layout.
When it is used with the path_grouping_policy of group_by_prio, all paths with the same priority are grouped into one multipath group. The group with the highest aggregate priority becomes the active group.
When all paths in a group fail, the group with the next highest aggregate priority becomes active. Additionally, a failover command (as determined by the hardware handler) might be sent to the target.
The mpath_prio_* program can also be a custom script created by a vendor or administrator for a specified setup.
A %n in the command line expands to the device
name in the /dev directory.
A %b in the command line expands to the device
number in major:minor format in the
/dev directory.
A %d in the command line expands to the device ID
in the /dev/disk/by-id directory.
If devices are hot-pluggable, use the %d flag
instead of %n. This addresses the short time that
elapses between the time when devices are available and when
udev creates the device nodes.
Values.
| Value | Description |
|---|---|
| (No value) | If no |
| /bin/true | Specify this value when the group_by_prio is not being used. |
The prioritizer programs generate path priorities
when queried by the multipath command. The program
names must begin with mpath_prio_ and are named
by the device type or balancing method used. Current prioritizer
programs include the following:
| Prioritizer Program | Description |
|---|---|
| mpath_prio_alua %n | Generates path priorities based on the SCSI-3 ALUA settings. |
| mpath_prio_balance_units | Generates the same priority for all paths. |
| mpath_prio_emc %n | Generates the path priority for EMC arrays. |
| mpath_prio_hds_modular %b | Generates the path priority for Hitachi HDS Modular storage arrays. |
| mpath_prio_hp_sw %n | Generates the path priority for Compaq/HP controller in active/standby mode. |
| mpath_prio_netapp %n | Generates the path priority for NetApp arrays. |
| mpath_prio_random %n | Generates a random priority for each path. |
| mpath_prio_rdac %n | Generates the path priority for LSI/Engenio RDAC controller. |
| mpath_prio_tpc %n | You can optionally use a script created by a vendor or administrator that gets the priorities from a file where you specify priorities to use for each path. |
| mpath_prio_spec.sh %n | Provides the path of a user-created script that generates the priorities for multipathing based on information contained in a second data file. (This path and filename are provided as an example. Specify the location of your script instead.) The script can be created by a vendor or administrator. The script’s target file identifies each path for all multipathed devices and specifies a priority for each path. For an example, see Section 7.11.3, “Using a Script to Set Path Priorities”. |
Specifies the number of I/O transactions to route to a path before
switching to the next path in the same path group, as determined by
the specified algorithm in the path_selector
setting.
The rr_min_io attribute is used only for kernels 2.6.31 and earlier.
It is obsoleted in SLES 11 SP2 and replaced by the
rr_min_io_rq attribute.
Values.
Specify an integer value greater than 0. The default value is 1000.
rr_min_io "1000"
Specifies the number of I/O requests to route to a path before switching to the next path in the current path group, using request-based device-mapper-multipath.
This attribute is available for systems running SLES 11 SP2 and later. It replaces the rr_min_io attribute.
Values.
Specify an integer value greater than 0. The default value is 1.
rr_min_io_rq "1"
Specifies the weighting method to use for paths.
Values.
| Value | Description |
|---|---|
| uniform | (Default) All paths have the same round-robin weights. |
| priorities | Each path’s weight is determined by the path’s priority times the rr_min_io_rq setting (or the rr_min_io setting for kernels 2.6.31 and earlier). |
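The effect of the two rr_weight values can be sketched with a few lines of Python. The function name path_weight is hypothetical, and treating the uniform weight as equal to rr_min_io_rq is an assumption made for the illustration; only the "priority times rr_min_io_rq" rule comes from the table above.

```python
def path_weight(rr_weight, priority, rr_min_io_rq=1):
    """Number of I/O requests sent to a path before switching to the next.

    Assumption: with "uniform", every path simply gets rr_min_io_rq.
    With "priorities", the weight is priority * rr_min_io_rq.
    """
    if rr_weight == "uniform":
        return rr_min_io_rq
    if rr_weight == "priorities":
        return priority * rr_min_io_rq
    raise ValueError("unknown rr_weight: %r" % rr_weight)


# Two paths with priorities 50 and 10 and the default rr_min_io_rq of 1.
print(path_weight("uniform", 50), path_weight("uniform", 10))        # 1 1
print(path_weight("priorities", 50), path_weight("priorities", 10))  # 50 10
```

Under these assumptions, the priorities setting sends five times as many requests to the priority-50 path as to the priority-10 path before the selector moves on.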
A udev attribute that provides a unique path identifier. The default
value is ID_SERIAL.
All paths are active. I/O is configured for some number of seconds or some number of I/O transactions before moving to the next open path in the sequence.
A single path with the highest priority (lowest value setting) is active for traffic. Other paths are available for failover, but are not used unless failover occurs.
Multiple paths with the same priority fall into the active group. When all paths in that group fail, the device fails over to the next highest priority group. All paths in the group share the traffic load in a round-robin load balancing fashion.
You can create a script that interacts with Device Mapper Multipath
(DM-MPIO) to provide priorities for paths to the LUN when set as a
resource for the prio_callout setting.
First, set up a text file that lists information about each device and the
priority values you want to assign to each path. For example, name the
file /usr/local/etc/primary-paths. Enter one line for
each path in the following format:
host_wwpn target_wwpn scsi_id priority_value
Return a priority value for each path on the device. Ensure that the variable FILE_PRIMARY_PATHS resolves to a real file with appropriate data (host wwpn, target wwpn, scsi_id and priority value) for each device.
The contents of the primary-paths file for a single
LUN with eight paths each might look like this:
0x10000000c95ebeb4 0x200200a0b8122c6e 2:0:0:0 sdb 3600a0b8000122c6d00000000453174fc 50
0x10000000c95ebeb4 0x200200a0b8122c6e 2:0:0:1 sdc 3600a0b80000fd6320000000045317563 2
0x10000000c95ebeb4 0x200200a0b8122c6e 2:0:0:2 sdd 3600a0b8000122c6d0000000345317524 50
0x10000000c95ebeb4 0x200200a0b8122c6e 2:0:0:3 sde 3600a0b80000fd6320000000245317593 2
0x10000000c95ebeb4 0x200300a0b8122c6e 2:0:1:0 sdi 3600a0b8000122c6d00000000453174fc 5
0x10000000c95ebeb4 0x200300a0b8122c6e 2:0:1:1 sdj 3600a0b80000fd6320000000045317563 51
0x10000000c95ebeb4 0x200300a0b8122c6e 2:0:1:2 sdk 3600a0b8000122c6d0000000345317524 5
0x10000000c95ebeb4 0x200300a0b8122c6e 2:0:1:3 sdl 3600a0b80000fd6320000000245317593 51
To continue the example mentioned in
prio_callout, create a script
named /usr/local/sbin/path_prio.sh. You can use any
path and filename. The script does the following:
On query from multipath, grep the device and its path from the
/usr/local/etc/primary-paths file.
Return to multipath the priority value in the last column for that entry in the file.
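The two steps above can be sketched as follows. This is a Python illustration of the lookup logic, not the path_prio.sh script itself: the function name lookup_priority, the inline sample data (two lines copied from the example file above), and the fallback priority of 0 for unknown devices are assumptions.

```python
# Two lines copied from the example primary-paths file shown earlier.
SAMPLE_PRIMARY_PATHS = """\
0x10000000c95ebeb4 0x200200a0b8122c6e 2:0:0:0 sdb 3600a0b8000122c6d00000000453174fc 50
0x10000000c95ebeb4 0x200300a0b8122c6e 2:0:1:1 sdj 3600a0b80000fd6320000000045317563 51
"""


def lookup_priority(table_text, device, default=0):
    """Return the last-column priority of the first line that names `device`.

    `device` may be any whole field on the line except the priority itself,
    for example a device node name (sdb) or an H:B:T:L address (2:0:1:1).
    Assumption: an unknown device gets `default`.
    """
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        if device in fields[:-1]:
            return int(fields[-1])
    return default


print(lookup_priority(SAMPLE_PRIMARY_PATHS, "sdb"))      # 50
print(lookup_priority(SAMPLE_PRIMARY_PATHS, "2:0:1:1"))  # 51
```

Matching on whole fields rather than substrings avoids confusing a device such as sdb with one such as sdbb.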
The mpath_prio_alua(8) command is used as a priority
callout for the Linux multipath(8) command. It returns
a number that is used by DM-MPIO to group SCSI devices with the same
priority together. This path priority tool is based on ALUA (Asymmetric
Logical Unit Access).
mpath_prio_alua [-d directory] [-h] [-v] [-V] device [device...]
SCSI devices.
Specifies the Linux directory path where the listed device node names
can be found. The default directory is /dev. When
you use this option, specify the device node name only (such as
sda) for the device or devices you want to
manage.
Displays help for this command, then exits.
Turns on verbose output to display status in human-readable format. Output includes information about which port group the specified device is in and its current state.
Displays the version number of this tool, then exits.
Specifies the SCSI device (or multiple devices) that you want to
manage. The device must be a SCSI device that supports the Report
Target Port Groups (sg_rtpg(8)) command. Use one of
the following formats for the device node name:
The full Linux directory path, such as
/dev/sda. Do not use with the -d option.
The device node name only, such as sda. Specify
the directory path by using the -d option.
The major and minor number of the device separated by a colon (:)
with no spaces, such as 8:0. This creates a
temporary device node in the /dev directory
with a name in the format of
tmpdev-<major>:<minor>-<pid>.
For example, /dev/tmpdev-8:0-<pid>.
On success, returns a value of 0 and the priority value for the group.
Table 7.6, “ALUA Priorities for Device Mapper Multipath” shows the priority values
returned by the mpath_prio_alua command.
| Priority Value | Description |
|---|---|
| 50 | The device is in the active, optimized group. |
| 10 | The device is in an active but non-optimized group. |
| 1 | The device is in the standby group. |
| 0 | All other groups. |
Values are widely spaced because of the way the
multipath command handles them. It multiplies the
number of paths in a group with the priority value for the group, then
selects the group with the highest result. For example, if a
non-optimized path group has six paths (6 x 10 = 60) and the optimized
path group has a single path (1 x 50 = 50), the non-optimized group has
the highest score, so multipath chooses the non-optimized group. Traffic
to the device uses all six paths in the group in a round-robin fashion.
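The selection rule in this example can be worked through in a short Python sketch. The names ALUA_PRIORITY, group_score, and pick_primary are hypothetical; the priority values come from Table 7.6 and the path counts from the example above.

```python
# Priority values from Table 7.6.
ALUA_PRIORITY = {
    "active/optimized": 50,
    "active/non-optimized": 10,
    "standby": 1,
}


def group_score(path_count, priority):
    """Score of a path group: number of paths times the group priority."""
    return path_count * priority


def pick_primary(groups):
    """groups maps a group name to (path_count, priority); highest score wins."""
    return max(groups, key=lambda name: group_score(*groups[name]))


groups = {
    "non-optimized": (6, ALUA_PRIORITY["active/non-optimized"]),  # 6 x 10 = 60
    "optimized": (1, ALUA_PRIORITY["active/optimized"]),          # 1 x 50 = 50
}
print(pick_primary(groups))  # non-optimized
```

With two optimized paths instead of one, the optimized group would score 100 and become the primary group.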
On failure, returns a value of 1 to 5 indicating the cause for the
command’s failure. For information, see the man page for
mpath_prio_alua.
Use the SCSI Report Target Port Groups (sg_rtpg(8))
command. For information, see the man page for
sg_rtpg(8).
Device Mapper Multipath I/O (DM-MPIO) is available and supported for
/boot and /root in SUSE Linux Enterprise Server
11. In addition, the YaST Partitioner in the YaST installer supports
enabling multipath during the install.
The multipath software must be running at install time if you want to install the operating system on a multipath device. The multipathd daemon is not automatically active during the system installation. You can start it by using the option in the YaST Partitioner.
During the install on the YaST Installation Settings page, click on to open the YaST Partitioner.
Select .
Select the main icon, click the button, then select .
Start multipath.
YaST starts to rescan the disks and shows available multipath devices
(such as
/dev/disk/by-id/dm-uuid-mpath-3600a0b80000f4593000012ae4ab0ae65).
This is the device that should be used for all further processing.
Click to continue with the installation.
The multipathd daemon is not automatically active during the system installation. You can start it by using the option in the YaST Partitioner.
To enable multipath I/O at install time for an active/passive multipath storage LUN:
During the install on the YaST Installation Settings page, click on to open the YaST Partitioner.
Select .
Select the main icon, click the button, then select .
Start multipath.
YaST starts to rescan the disks and shows available multipath devices
(such as
/dev/disk/by-id/dm-uuid-mpath-3600a0b80000f4593000012ae4ab0ae65).
This is the device that should be used for all further processing.
Write down the device path and UUID; you need them later.
Click to continue with the installation.
After all settings are done and the installation finished, YaST starts to write the boot loader information, and displays a countdown to restart the system. Stop the counter by clicking the button and press CTRL+ALT+F5 to access a console.
Use the console to determine if a passive path was entered in the
/boot/grub/device.map file for the
hd0 entry.
This is necessary because the installation does not distinguish between active and passive paths.
Mount the root device to /mnt by entering
mount /dev/disk/by-id/<UUID>_part2 /mnt
For example, enter
mount /dev/disk/by-id/dm-uuid-mpath-3600a0b80000f4593000012ae4ab0ae65_part2 /mnt
Mount the boot device to /mnt/boot by entering
mount /dev/disk/by-id/<UUID>_part1 /mnt/boot
For example, enter
mount /dev/disk/by-id/dm-uuid-mpath-3600a0b80000f4593000012ae4ab0ae65_part1 /mnt/boot
Open the /mnt/boot/grub/device.map file by entering
less /mnt/boot/grub/device.map
In the /mnt/boot/grub/device.map file, determine
if the hd0 entry points to a passive path, then
do one of the following:
If the hd0 entry points to a passive path, change the configuration and reinstall the boot loader:
At the console prompt, enter the following commands:
mount -o bind /dev /mnt/dev
mount -o bind /sys /mnt/sys
mount -o bind /proc /mnt/proc
chroot /mnt
At the console, run multipath -ll, then check the
output to find the active path.
Passive paths are flagged as ghost.
In the /mnt/boot/grub/device.map file, change
the hd0 entry to an active path, save the changes,
and close the file.
In case the selection was to boot from MBR,
/etc/grub.conf should look like the following:
setup --stage2=/boot/grub/stage2 (hd0) (hd0,0)
quit
Reinstall the boot loader by entering
grub < /etc/grub.conf
Enter the following commands:
exit
umount /mnt/*
umount /mnt
Return to the YaST graphical environment by pressing CTRL+ALT+F7.
Click to continue with the installation reboot.
Install Linux with only a single path active, preferably one where the
by-id symlinks are listed in the partitioner.
Mount the devices by using the /dev/disk/by-id path
used during the install.
After installation, add dm-multipath to
/etc/sysconfig/kernel:INITRD_MODULES.
For System Z, before running mkinitrd, edit the
/etc/zipl.conf file to change the by-path
information in zipl.conf with the same by-id
information that was used in the /etc/fstab.
Re-run /sbin/mkinitrd to update the
initrd image.
For System Z, after running mkinitrd, run
zipl.
Reboot the server.
Add multipath=off to the kernel command line.
This affects only the root device. All other devices are not affected.
Ideally, you should configure multipathing for devices before you use them
as components of a software RAID device. If you add multipathing after
creating any software RAID devices, the DM-MPIO service might start
after the RAID service on reboot, which makes
multipathing appear not to be available for RAIDs. You can use the
procedure in this section to get multipathing running for a previously
existing software RAID.
For example, you might need to configure multipathing for devices in a software RAID under the following circumstances:
If you create a new software RAID as part of the Partitioning settings during a new install or upgrade.
If you did not configure the devices for multipathing before using them in the software RAID as a member device or spare.
If you grow your system by adding new HBA adapters to the server or expanding the storage subsystem in your SAN.
The following instructions assume the software RAID device is
/dev/mapper/mpath0, which is its device name as
recognized by the kernel. It assumes you have enabled user-friendly names
in the /etc/multipath.conf file as described in
Section 7.9, “Configuring User-Friendly Names or Alias Names”.
Ensure that you modify the instructions for the device name of your software RAID.
Open a terminal console, then log in as the root
user or equivalent.
Except where otherwise directed, use this console to enter the commands in the following steps.
If any software RAID devices are currently mounted or running, enter the following commands for each device to unmount the device and stop it.
umount /dev/mapper/mpath0
mdadm --misc --stop /dev/mapper/mpath0
Stop the boot.md service by entering
/etc/init.d/boot.md stop
Start the boot.multipath and
multipathd services by entering the following
commands:
/etc/init.d/boot.multipath start
/etc/init.d/multipathd start
After the multipathing services are started, verify that the software
RAID’s component devices are listed in the
/dev/disk/by-id directory. Do one of the following:
Devices Are Listed:
The device names should now have symbolic links to their Device Mapper
Multipath device names, such as /dev/dm-1.
Devices Are Not Listed: Force the multipath service to recognize them by flushing and rediscovering the devices.
To do this, enter the following commands:
multipath -F
multipath -v0
The devices should now be listed in
/dev/disk/by-id, and have symbolic links to their
Device Mapper Multipath device names. For example:
lrwxrwxrwx 1 root root 10 2011-01-06 11:42 dm-uuid-mpath-36006016088d014007e0d0d2213ecdf11 -> ../../dm-1
Restart the boot.md service and the RAID device by
entering
/etc/init.d/boot.md start
Check the status of the software RAID by entering
mdadm --detail /dev/mapper/mpath0
The RAID’s component devices should match their Device Mapper Multipath
device names that are listed as the symbolic links of devices in the
/dev/disk/by-id directory.
Make a new initrd to ensure that the Device Mapper
Multipath services are loaded before the RAID services on reboot. Running
mkinitrd is needed only if the root (/) device or any
parts of it (such as /var,
/etc, /log) are on the SAN and
multipath is needed to boot.
Enter
mkinitrd -f multipath
Reboot the server to apply these post-install configuration settings.
Verify that the software RAID array comes up properly on top of the multipathed devices by checking the RAID status. Enter
mdadm --detail /dev/mapper/mpath0
For example:
Number Major Minor RaidDevice State
0 253 0 0 active sync /dev/dm-0
1 253 1 1 active sync /dev/dm-1
2 253 2 2 active sync /dev/dm-2
If your system has already been configured for multipathing and you later
need to add more storage to the SAN, you can use the
rescan-scsi-bus.sh script to scan for the new devices.
By default, this script scans all HBAs with typical LUN ranges.
In EMC PowerPath environments, do not use the
rescan-scsi-bus.sh utility provided with the
operating system or the HBA vendor scripts for scanning the SCSI buses. To
avoid potential file system corruption, EMC requires that you follow the
procedure provided in the vendor documentation for EMC PowerPath for
Linux.
rescan-scsi-bus.sh [options] [host [host ...]]
You can specify hosts on the command line (deprecated), or use the
--hosts=LIST option (recommended).
For most storage subsystems, the script can be run successfully without
options. However, some special cases might need to use one or more of the
following parameters for the rescan-scsi-bus.sh script:
| Option | Description |
|---|---|
| -l | Activates scanning for LUNs 0-7. [Default: 0] |
| -L NUM | Activates scanning for LUNs 0 to NUM. [Default: 0] |
| -w | Scans for target device IDs 0 to 15. [Default: 0 to 7] |
| -c | Enables scanning of channels 0 or 1. [Default: 0] |
| -r --remove | Enables removing of devices. [Default: Disabled] |
| -i --issueLip | Issues a Fibre Channel LIP reset. [Default: Disabled] |
| --forcerescan | Rescans existing devices. |
| --forceremove | Removes and re-adds every device. Warning: Use with caution; this option is dangerous. |
| --nooptscan | Don’t stop looking for LUNs if 0 is not found. |
| --color | Use colored prefixes OLD/NEW/DEL. |
| --hosts=LIST | Scans only hosts in LIST, where LIST is a comma-separated list of single values and ranges. No spaces are allowed. --hosts=A[-B][,C[-D]] |
| --channels=LIST | Scans only channels in LIST, where LIST is a comma-separated list of single values and ranges. No spaces are allowed. --channels=A[-B][,C[-D]] |
| --ids=LIST | Scans only target IDs in LIST, where LIST is a comma-separated list of single values and ranges. No spaces are allowed. --ids=A[-B][,C[-D]] |
| --luns=LIST | Scans only LUNs in LIST, where LIST is a comma-separated list of single values and ranges. No spaces are allowed. --luns=A[-B][,C[-D]] |
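The LIST syntax A[-B][,C[-D]] used by the --hosts, --channels, --ids, and --luns options can be illustrated with a small Python sketch. It shows how such a list expands to individual numbers; it is not the parsing code of rescan-scsi-bus.sh, and the function name expand_list is hypothetical.

```python
def expand_list(spec):
    """Expand a LIST such as "0-2,5,7-8" into the individual numbers.

    Each comma-separated item is a single value (A) or an inclusive
    range (A-B). Spaces are rejected, as the option description requires.
    """
    if " " in spec:
        raise ValueError("no spaces are allowed in LIST")
    values = []
    for item in spec.split(","):
        start, sep, end = item.partition("-")
        low = int(start)
        high = int(end) if sep else low
        if high < low:
            raise ValueError("descending range: %r" % item)
        values.extend(range(low, high + 1))
    return values


print(expand_list("0-2,5,7-8"))  # [0, 1, 2, 5, 7, 8]
```

For example, --hosts=0-2,5 would limit the scan to hosts 0, 1, 2, and 5.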
Use the following procedure to scan the devices and make them available to multipathing without rebooting the system.
On the storage subsystem, use the vendor’s tools to allocate the device and update its access control settings to allow the Linux system access to the new storage. Refer to the vendor’s documentation for details.
Scan all targets for a host to make its new device known to the middle layer of the Linux kernel’s SCSI subsystem. At a terminal console prompt, enter
rescan-scsi-bus.sh [options]
Check for scanning progress in the system log (the
/var/log/messages file). At a terminal console
prompt, enter
tail -30 /var/log/messages
This command displays the last 30 lines of the log. For example:
# tail -30 /var/log/messages
. . .
Feb 14 01:03 kernel: SCSI device sde: 81920000
Feb 14 01:03 kernel: SCSI device sdf: 81920000
Feb 14 01:03 multipathd: sde: path checker registered
Feb 14 01:03 multipathd: sdf: path checker registered
Feb 14 01:03 multipathd: mpath4: event checker started
Feb 14 01:03 multipathd: mpath5: event checker started
Feb 14 01:03:multipathd: mpath4: remaining active paths: 1
Feb 14 01:03 multipathd: mpath5: remaining active paths: 1
Repeat Step 2 through Step 3 to add paths through other HBA adapters on the Linux system that are connected to the new device.
Run the multipath command to recognize the devices for
DM-MPIO configuration. At a terminal console prompt, enter
multipath
You can now configure the new device for multipathing.
Use the example in this section to detect a newly added multipathed LUN without rebooting.
In EMC PowerPath environments, do not use the
rescan-scsi-bus.sh utility provided with the
operating system or the HBA vendor scripts for scanning the SCSI buses. To
avoid potential file system corruption, EMC requires that you follow the
procedure provided in the vendor documentation for EMC PowerPath for
Linux.
Open a terminal console, then log in as the root
user.
Scan all targets for a host to make its new device known to the middle layer of the Linux kernel’s SCSI subsystem. At a terminal console prompt, enter
rescan-scsi-bus.sh [options]
For syntax and options information for the
rescan-scsi-bus.sh script, see
Section 7.14, “Scanning for New Devices without Rebooting”.
Verify that the device is seen (for example, check whether the link has a new time stamp) by entering
ls -lrt /dev/dm-*
You can also verify the devices in /dev/disk/by-id
by entering
ls -l /dev/disk/by-id/
Verify the new device appears in the log by entering
tail -33 /var/log/messages
Use a text editor to add a new alias definition for the device in the
/etc/multipath.conf file, such as
data_vol3.
For example, if the UUID is
36006016088d014006e98a7a94a85db11, make the
following changes:
defaults {
    user_friendly_names yes
}
multipaths {
    multipath {
        wwid 36006016088d014006e98a7a94a85db11
        alias data_vol3
    }
}
Create a partition table for the device by entering
fdisk /dev/disk/by-id/dm-uuid-mpath-<UUID>
Replace UUID with the device WWID, such as
36006016088d014006e98a7a94a85db11.
Trigger udev by entering
echo 'add' > /sys/block/<dm_device>/uevent
For example, to generate the device-mapper devices for the partitions on
dm-8, enter
echo 'add' > /sys/block/dm-8/uevent
Create a file system and label for the new partition by entering the following commands:
mke2fs -j /dev/disk/by-id/dm-uuid-mpath-<UUID_partN>
tune2fs -L data_vol3 /dev/disk/by-id/dm-uuid-mpath-<UUID_partN>
Replace UUID_partN with the actual UUID and
partition number, such as 36006016088d014006e98a7a94a85db11_part1.
Restart DM-MPIO to let it read the aliases by entering
/etc/init.d/multipathd restart
Verify that the device is recognized by multipathd by
entering
multipath -ll
Use a text editor to add a mount entry in the
/etc/fstab file.
At this point, the alias you created in
Step 5 is not yet in the
/dev/disk/by-label directory. Add the mount entry
using the /dev/dm-9 path, then change the entry before the
next time you reboot to
LABEL=data_vol3
Create a directory to use as the mount point, then mount the device by entering
mkdir /data_vol3
mount /data_vol3
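The change of the mount entry from the device path to the label can be sketched as a small filter. This is only an illustration: fstab_use_label is a hypothetical helper, and the file system type (ext3, as created by mke2fs -j) and the mount options in the sample line are assumptions. The filter reads fstab content on standard input and writes the modified content to standard output, so the real /etc/fstab file is not touched.

```shell
# Replace the device field of matching fstab lines with LABEL=<label>.
# Usage: fstab_use_label DEVICE LABEL < fstab-content
# Note: rewritten lines are re-joined with single spaces between fields.
fstab_use_label() {
    local device=$1 label=$2
    awk -v dev="$device" -v label="$label" '
        $1 == dev { $1 = "LABEL=" label }
        { print }
    '
}

# Sample entry (file system type and options are assumed for illustration).
printf '%s\n' '/dev/dm-9 /data_vol3 ext3 defaults 1 2' |
    fstab_use_label /dev/dm-9 data_vol3
```

Review the output before copying it into the /etc/fstab file.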
Querying the multipath I/O status outputs the current status of the multipath maps.
The multipath -l option displays the current path status
as of the last time that the path checker was run. It does not run the path
checker.
The multipath -ll option runs the path checker, updates
the path information, then displays the current status information. This
option always displays the latest information about the path status.
At a terminal console prompt, enter
multipath -ll
This displays information for each multipathed device. For example:
3600601607cf30e00184589a37a31d911
[size=127 GB][features="0"][hwhandler="1 emc"]

\_ round-robin 0 [active][first]
  \_ 1:0:1:2 sdav 66:240 [ready ][active]
  \_ 0:0:1:2 sdr  65:16  [ready ][active]

\_ round-robin 0 [enabled]
  \_ 1:0:0:2 sdag 66:0   [ready ][active]
  \_ 0:0:0:2 sdc  8:32   [ready ][active]
For each device, it shows the device’s ID, size, features, and hardware handlers.
Paths to the device are automatically grouped into priority groups on device discovery. Only one priority group is active at a time. For an active/active configuration, all paths are in the same group. For an active/passive configuration, the passive paths are placed in separate priority groups.
The following information is displayed for each group:
Scheduling policy used to balance I/O within the group, such as round-robin
Whether the group is active, disabled, or enabled
Whether the group is the first (highest priority) group
Paths contained within the group
The following information is displayed for each path:
The physical address as host:bus:target:lun, such as 1:0:1:2
Device node name, such as sda
Major:minor numbers
Status of the device
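The per-path fields described above can be pulled out of the status output with a short filter. The following is a sketch under the assumption that each path line contains a host:bus:target:lun address followed by the device node name and the major:minor numbers, as in the example output in this section. The function name list_multipath_paths is hypothetical, and the output format of other Multipath Tools versions might differ.

```shell
# Print "H:B:T:L devnode major:minor" for every path line on standard input.
list_multipath_paths() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # A path line is recognized by its host:bus:target:lun field.
            if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
                print $i, $(i + 1), $(i + 2)
            }
        }
    }'
}

# Sample input copied from the example output in this section.
list_multipath_paths <<'EOF'
3600601607cf30e00184589a37a31d911
[size=127 GB][features="0"][hwhandler="1 emc"]
\_ round-robin 0 [active][first]
  \_ 1:0:1:2 sdav 66:240 [ready ][active]
  \_ 0:0:1:2 sdr  65:16  [ready ][active]
\_ round-robin 0 [enabled]
  \_ 1:0:0:2 sdag 66:0   [ready ][active]
  \_ 0:0:0:2 sdc  8:32   [ready ][active]
EOF
```

On a live system, the same filter could be fed with the output of multipath -ll.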
You might need to configure multipathing to queue I/O if all paths fail concurrently by enabling queue_if_no_path. Otherwise, I/O fails immediately if all paths are gone. In certain scenarios, where the driver, the HBA, or the fabric experience spurious errors, DM-MPIO should be configured to queue all I/O where those errors lead to a loss of all paths, and never propagate errors upward.
When you use multipathed devices in a cluster, you might choose to disable queue_if_no_path. This automatically fails the path instead of queuing the I/O, and escalates the I/O error to cause a failover of the cluster resources.
Because enabling queue_if_no_path leads to I/O being queued indefinitely
unless a path is reinstated, ensure that multipathd is
running and works for your scenario. Otherwise, I/O might be stalled
indefinitely on the affected multipathed device until reboot or until you
manually return to failover instead of queuing.
To test the scenario:
In a terminal console, log in as the root user.
Activate queuing instead of failover for the device I/O by entering:
dmsetup message device_ID 0 queue_if_no_path
Replace the device_ID with the ID for your device. The 0 value represents the sector and is used when sector information is not needed.
For example, enter:
dmsetup message 3600601607cf30e00184589a37a31d911 0 queue_if_no_path
Return to failover for the device I/O by entering:
dmsetup message device_ID 0 fail_if_no_path
This command immediately causes all queued I/O to fail.
Replace the device_ID with the ID for your device. For example, enter:
dmsetup message 3600601607cf30e00184589a37a31d911 0 fail_if_no_path
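When testing, it can help to generate both commands from one place so that the map name is typed only once. The following sketch only prints the dmsetup commands shown above and does not execute them. The function name no_path_policy_command and its queue and fail keywords are hypothetical.

```shell
# Print the dmsetup command that switches a multipath map between
# queuing I/O and failing I/O when no paths remain.
# Usage: no_path_policy_command MAP queue|fail
no_path_policy_command() {
    local map=$1 policy=$2
    case "$policy" in
        queue) echo "dmsetup message $map 0 queue_if_no_path" ;;
        fail)  echo "dmsetup message $map 0 fail_if_no_path" ;;
        *)
            echo "usage: no_path_policy_command MAP queue|fail" >&2
            return 1
            ;;
    esac
}

no_path_policy_command 3600601607cf30e00184589a37a31d911 queue
no_path_policy_command 3600601607cf30e00184589a37a31d911 fail
```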
To set up queuing I/O for scenarios where all paths fail:
In a terminal console, log in as the root user.
Open the /etc/multipath.conf file in a text editor.
Uncomment the defaults section and its ending bracket, then add the
default_features setting, as follows:
defaults {
default_features "1 queue_if_no_path"
}
After you modify the /etc/multipath.conf file, you
must run mkinitrd to re-create the
initrd on your system, then reboot in order for the
changes to take effect.
When you are ready to return to failover for the device I/O, enter:
dmsetup message mapname 0 fail_if_no_path
Replace the mapname with the mapped alias name or the device ID for the device. The 0 value represents the sector and is used when sector information is not needed.
This command immediately causes all queued I/O to fail and propagates the error to the calling application.
If all paths fail concurrently and I/O is queued and stalled, do the following:
Enter the following command at a terminal console prompt:
dmsetup message mapname 0 fail_if_no_path
Replace mapname with the
correct device ID or mapped alias name for the device. The 0 value
represents the sector and is used when sector information is not needed.
This command immediately causes all queued I/O to fail and propagates the error to the calling application.
Reactivate queuing by entering the following command at a terminal console prompt:
dmsetup message mapname 0 queue_if_no_path
This section describes some known issues and possible solutions for MPIO.
Multipath Tools 0.4.9 for SLES 11 SP2 uses the prio
setting in the defaults{} or
devices{} section of the
/etc/multipath.conf file. It silently ignores the
keyword prio when it is specified for an individual
multipath definition in the
multipaths{} section.
Multipath Tools 0.4.8 for SLES 11 SP1 and earlier allows the prio
setting in the individual multipath definition in the
multipaths{} section to override the
prio settings in the defaults{} or
devices{} section.
When you upgrade from multipath-tools-0.4.8 to
multipath-tools-0.4.9, the prio
settings in the /etc/multipath.conf file are broken
for prioritizers that require an argument. In multipath-tools-0.4.9, the
prio keyword is used to specify the prioritizer, and
the prio_args keyword is used to specify the argument
for prioritizers that require an argument. Previously, both the
prioritizer and its argument were specified on the same
prio line.
For example, in multipath-tools-0.4.8, the following line was used to specify a prioritizer and its arguments on the same line.
prio "weightedpath hbtl [1,3]:.:.+:.+ 260 [0,2]:.:.+:.+ 20"
After upgrading to multipath-tools-0.4.9, this line
causes an error. The message is similar to the following:
<Month day hh:mm:ss> | Prioritizer 'weightedpath hbtl [1,3]:.:.+:.+ 260 [0,2]:.:.+:.+ 20' not found in /lib64/multipath
To resolve this problem, use a text editor to modify the
prio line in the
/etc/multipath.conf file. Create two lines with the
prioritizer specified on the prio line, and the
prioritizer argument specified on the prio_args line
below it:
prio "weightedpath"
prio_args "hbtl [1,3]:.:.+:.+ 260 [0,2]:.:.+:.+ 20"
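If a configuration contains several old-style lines, the split can be sketched as a filter. This is an illustration only: split_prio_line is a hypothetical helper, and it assumes that the prioritizer name is the first word inside the quotes and that everything after it is the argument. Lines that do not match, such as a prio line without arguments, pass through unchanged.

```shell
# Rewrite old-style 'prio "<name> <args>"' lines as separate
# prio and prio_args lines; all other lines are printed as they are.
split_prio_line() {
    awk '
    /^[ \t]*prio[ \t]+"[^" ]+ [^"]+"[ \t]*$/ {
        indent = $0
        sub(/prio.*/, "", indent)      # keep only the leading whitespace
        value = $0
        sub(/^[^"]*"/, "", value)      # drop everything up to the opening quote
        sub(/"[ \t]*$/, "", value)     # drop the closing quote
        name = value
        sub(/ .*/, "", name)           # first word: the prioritizer
        args = value
        sub(/^[^ ]+ +/, "", args)      # remainder: the prioritizer argument
        printf "%sprio \"%s\"\n%sprio_args \"%s\"\n", indent, name, indent, args
        next
    }
    { print }
    '
}

printf '%s\n' 'prio "weightedpath hbtl [1,3]:.:.+:.+ 260 [0,2]:.:.+:.+ 20"' |
    split_prio_line
```

Run the filter on a copy of the /etc/multipath.conf file and compare the result before replacing the original.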
For information about troubleshooting multipath I/O issues on SUSE Linux Enterprise Server, see the following Technical Information Documents (TIDs) in the Novell Support Knowledgebase:
If you want to use software RAIDs, create and configure them before you create file systems on the devices. For information, see the following:
The purpose of RAID (redundant array of independent disks) is to combine several hard disk partitions into one large virtual hard disk to optimize performance, data security, or both. Most RAID controllers use the SCSI protocol because it can address a larger number of hard disks in a more effective way than the IDE protocol and is more suitable for parallel processing of commands. There are some RAID controllers that support IDE or SATA hard disks. Software RAID provides the advantages of RAID systems without the additional cost of hardware RAID controllers. However, this requires some CPU time and has memory requirements that make it unsuitable for truly high-performance computers.
Software RAID is not supported underneath clustered file systems such as OCFS2, because RAID does not support concurrent activation. If you want RAID for OCFS2, you need the RAID to be handled by the storage subsystem.
SUSE Linux Enterprise offers the option of combining several hard disks into one soft RAID system. RAID implies several strategies for combining several hard disks in a RAID system, each with different goals, advantages, and characteristics. These variations are commonly known as RAID levels.
This section describes common RAID levels 0, 1, 2, 3, 4, 5, and nested RAID levels.
This level improves the performance of your data access by spreading out blocks of each file across multiple disk drives. Actually, this is not really a RAID, because it does not provide data backup, but the name RAID 0 for this type of system has become the norm. With RAID 0, two or more hard disks are pooled together. The performance is very good, but the RAID system is destroyed and your data lost if even one hard disk fails.
This level provides adequate security for your data, because the data is copied to another hard disk 1:1. This is known as hard disk mirroring. If a disk is destroyed, a copy of its contents is available on another mirrored disk. All disks except one could be damaged without endangering your data. However, if damage is not detected, the damaged data might be mirrored to the intact disk and corrupt the data there as well. Write performance suffers a little in the copying process compared to single disk access (10 to 20 percent slower), but read access is significantly faster than on a single physical hard disk, because the data is duplicated and can be read in parallel. RAID 1 generally provides nearly twice the read transaction rate of single disks and almost the same write transaction rate as single disks.
These are not typical RAID implementations. Level 2 stripes data at the bit level rather than the block level. Level 3 provides byte-level striping with a dedicated parity disk and cannot service simultaneous multiple requests. Both levels are rarely used.
Level 4 provides block-level striping just like Level 0 combined with a dedicated parity disk. If a data disk fails, the parity data is used to create a replacement disk. However, the parity disk might create a bottleneck for write access. Nevertheless, Level 4 is sometimes used.
RAID 5 is an optimized compromise between Level 0 and Level 1 in terms of performance and redundancy. The hard disk space equals the number of disks used minus one. The data is distributed over the hard disks as with RAID 0. Parity blocks, created on one of the partitions, are there for security reasons. They are linked to each other with XOR, enabling the contents to be reconstructed by the corresponding parity block in case of system failure. With RAID 5, no more than one hard disk can fail at the same time. If one hard disk fails, it must be replaced as soon as possible to avoid the risk of losing data.
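The XOR relationship between data blocks and their parity block can be illustrated with shell arithmetic. This is a toy sketch that uses one-byte values in place of real blocks; the function name xor_blocks and the sample values are made up for the illustration.

```shell
# XOR all arguments together and print the result as a decimal number.
xor_blocks() {
    local acc=0 block
    for block in "$@"; do
        acc=$(( acc ^ $block ))
    done
    echo "$acc"
}

# Three data "blocks" and the parity computed from them.
d0=0x5A
d1=0x3C
d2=0xF0
parity=$(xor_blocks "$d0" "$d1" "$d2")

# Simulate the loss of d1: XOR the parity with the surviving blocks.
rebuilt=$(xor_blocks "$parity" "$d0" "$d2")

printf 'parity=0x%02X rebuilt_d1=0x%02X\n' "$parity" "$rebuilt"
# prints: parity=0x96 rebuilt_d1=0x3C
```

Because only one parity value exists per stripe, the same calculation cannot recover two missing blocks, which is why a second failure before the replacement disk is rebuilt leads to data loss.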
Several other RAID levels have been developed, such as RAIDn, RAID 10, RAID 0+1, RAID 30, and RAID 50, some of them proprietary implementations created by hardware vendors. These levels are not very widespread, and are not explained here.
The YaST soft RAID configuration can be reached from the YaST Expert Partitioner. This partitioning tool enables you to edit and delete existing partitions and create new ones that should be used with soft RAID.
You can create RAID partitions by first clicking › then selecting as the partition identifier. For RAID 0 and RAID 1, at least two partitions are needed—for RAID 1, usually exactly two and no more. If RAID 5 is used, at least three partitions are required. It is recommended to use only partitions of the same size because each segment can contribute only the same amount of space as the smallest sized partition. The RAID partitions should be stored on different hard disks to decrease the risk of losing data if one is defective (RAID 1 and 5) and to optimize the performance of RAID 0. After creating all the partitions to use with RAID, click › to start the RAID configuration.
On the next page of the wizard, choose among RAID levels 0, 1, and 5, then
click . The following dialog (see
Figure 8.1, “RAID Partitions”) lists all
partitions with either the or type. No swap or DOS partitions are shown. If a partition
is already assigned to a RAID volume, the name of the RAID device (for
example, /dev/md0) is shown in the list. Unassigned
partitions are indicated with “--”.
To add a previously unassigned partition to the selected RAID volume, first select the partition then click . At this point, the name of the RAID device is displayed next to the selected partition. Assign all partitions reserved for RAID. Otherwise, the space on the partition remains unused. After assigning all partitions, click to proceed to the settings dialog where you can fine-tune the performance (see Figure 8.2, “File System Settings”).
As with conventional partitioning, set the file system to use as well as
encryption and the mount point for the RAID volume. After completing the
configuration with , see the
/dev/md0 device and others indicated with
RAID in the Expert Partitioner.
Check the /proc/mdstat file to find out whether a RAID
partition has been damaged. In the event of a system failure, shut down
your Linux system and replace the defective hard disk with a new one
partitioned the same way. Then restart your system and enter the command
mdadm /dev/mdX --add /dev/sdX. Replace
X with your particular device identifiers. This
integrates the hard disk automatically into the RAID system and fully
reconstructs it.
Although you can access all data during the rebuild, you might encounter some performance issues until the RAID has been fully rebuilt.
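Checking the /proc/mdstat file for damaged arrays can be sketched as a filter. The sketch assumes the usual layout of that file, in which each array starts with a line such as "md0 : active raid1 ..." and a status field such as [UU] follows, where an underscore marks a missing component. The function name degraded_arrays and the sample content are illustrative only.

```shell
# Print the name of every array whose status field contains an underscore,
# for example [_U] or [U_].
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    '
}

# Sample content (made up for this illustration).
degraded_arrays <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 sdd1[1]
      208640 blocks [2/1] [_U]

unused devices: <none>
EOF
```

On a running system, the function could be called as degraded_arrays < /proc/mdstat.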
Configuration instructions and more details for soft RAID can be found in the HOWTOs at:
The Software RAID HOWTO in the
/usr/share/doc/packages/mdadm/Software-RAID.HOWTO.html
file
Linux RAID mailing lists are also available, such as linux-raid.
In SUSE Linux Enterprise Server 11, the Device Mapper RAID tool has been integrated into
the YaST Partitioner. You can use the partitioner at install time to
create a software RAID1 for the system device that contains your root
(/) partition. The /boot partition
must be created on a device separate from the MD RAID1.
Ensure that your configuration meets the following requirements:
You need two hard drives to create the RAID1 mirror device. The hard drives should be similarly sized. The RAID assumes the size of the smaller drive. The block storage devices can be any combination of local (in or directly attached to the machine), Fibre Channel storage subsystems, or iSCSI storage subsystems.
You need a third device to use for the /boot
partition. The boot device should be a local device.
If you are using hardware RAID devices, do not attempt to run software RAIDs on top of them.
If you are using iSCSI target devices, you must enable the iSCSI initiator support before you create the RAID device.
If your storage subsystem provides multiple I/O paths between the server and its directly attached local devices, Fibre Channel devices, or iSCSI devices that you want to use in the software RAID, you must enable the multipath support before you create the RAID device.
If there are iSCSI target devices that you want to use for the root (/) partition, you must enable the iSCSI Initiator software to make those devices available to you before you create the software RAID1 device.
Proceed with the YaST install of SUSE Linux Enterprise 11 until you reach the Installation Settings page.
Click to open the Preparing Hard Disk page, click , then click .
On the Expert Partitioner page, expand in the panel to view the default proposal.
On the page, select , then click when prompted to continue with initializing the iSCSI initiator configuration.
If there are multiple I/O paths to the devices that you want to use for the root (/) partition, you must enable multipath support before you create the software RAID1 device.
Proceed with the YaST install of SUSE Linux Enterprise 11 until you reach the Installation Settings page.
Click to open the Preparing Hard Disk page, click , then click .
On the Expert Partitioner page, expand in the panel to view the default proposal.
On the page, select , then click when prompted to activate multipath.
This re-scans the devices and resolves the multiple paths so that each device is listed only once in the list of hard disks.
Proceed with the YaST install of SUSE Linux Enterprise 11 until you reach the Installation Settings page.
Click to open the Preparing Hard Disk page, click , then click .
On the Expert Partitioner page, expand in the panel to view the default proposal, select the proposed partitions, then click .
Create a /boot partition.
On the Expert Partitioner page under , select the device you want to use for the /boot partition, then click on the tab.
Under , select , then click .
Under , specify the size to use, then click .
Under , select , then select your preferred file system type (such as Ext2 or Ext3) from the drop-down list.
Under , select , then select from the drop-down list.
Click .
Create a swap partition.
On the Expert Partitioner page under , select the device you want to use for the swap partition, then click on the tab.
Under , select , then click .
Under , specify the size to use, then click .
Under , select , then select from the drop-down list.
Under , select , then select from the drop-down list.
Click .
Set up the format for each of the devices you want to use for the software RAID1.
On the Expert Partitioner page under , select the device you want to use in the RAID1, then click on the tab.
Under , select , then click .
Under , specify to use the maximum size, then click .
Under , select , then select from the drop-down list.
Under , select .
Click .
Repeat Step 6.a to Step 6.f for each device that you plan to use in the software RAID1.
Create the RAID device.
In the panel, select , then click on the RAID page.
The devices that you prepared in Step 6 are listed in .
Under , select .
In the panel, select the devices you want to use for the RAID, then click to move the devices to the panel.
Specify two or more devices for a RAID1.
To continue the example, two devices are selected for RAID1.
Click .
Under , select the chunk size from the drop-down list.
The default chunk size for a RAID1 (Mirroring) is 4 KB.
Available chunk sizes are 4 KB, 8 KB, 16 KB, 32 KB, 64 KB, 128 KB, 256 KB, 512 KB, 1 MB, 2 MB, or 4 MB.
Under , select , then select the file system type (such as Ext3) from the drop-down list.
Under , select , then select / from the
drop-down list.
Click .
The software RAID device is managed by Device Mapper, and creates a
device under the /dev/md0 path.
On the Expert Partitioner page, click .
The new proposal appears under on the Installation Settings page.
For example, the setup for the
Continue with the install.
Whenever you reboot your server, Device Mapper is started at boot time so that the software RAID is automatically recognized, and the operating system on the root (/) partition can be started.
This section describes how to create software RAID 6 and 10 devices, using
the Multiple Devices Administration (mdadm(8)) tool. You
can also use mdadm to create RAIDs 0, 1, 4, and 5. The
mdadm tool provides the functionality of legacy programs
mdtools and raidtools.
RAID 6 is essentially an extension of RAID 5 that allows for additional fault tolerance by using a second independent distributed parity scheme (dual parity). Even if two of the hard disk drives fail during the data recovery process, the system continues to be operational, with no data loss.
RAID 6 provides for extremely high data fault tolerance by sustaining multiple simultaneous drive failures. It handles the loss of any two devices without data loss. Accordingly, it requires N+2 drives to store N drives worth of data. It requires a minimum of 4 devices.
The performance for RAID 6 is slightly lower but comparable to RAID 5 in normal mode and single disk failure mode. It is very slow in dual disk failure mode.
| Feature | RAID 5 | RAID 6 |
|---|---|---|
| Number of devices | N+1, minimum of 3 | N+2, minimum of 4 |
| Parity | Distributed, single | Distributed, dual |
| Performance | Medium impact on write and rebuild | More impact on sequential write than RAID 5 |
| Fault-tolerance | Failure of one component device | Failure of two component devices |
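The device counts in the table translate directly into usable capacity. The following sketch computes how many devices' worth of data an array can hold; raid_usable_devices is a hypothetical helper that covers only RAID 5 and RAID 6 and assumes component devices of equal size.

```shell
# Print the number of devices' worth of data that fits in the array.
# Usage: raid_usable_devices LEVEL DEVICES   (LEVEL is 5 or 6)
raid_usable_devices() {
    local level=$1 devices=$2 parity min
    case "$level" in
        5) parity=1; min=3 ;;
        6) parity=2; min=4 ;;
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    if [ "$devices" -lt "$min" ]; then
        echo "RAID $level needs at least $min devices" >&2
        return 1
    fi
    echo $(( devices - parity ))
}

raid_usable_devices 5 4   # prints 3
raid_usable_devices 6 4   # prints 2
```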
The procedure in this section creates a RAID 6 device
/dev/md0 with four devices:
/dev/sda1, /dev/sdb1,
/dev/sdc1, and /dev/sdd1. Ensure
that you modify the procedure to use your actual device nodes.
Open a terminal console, then log in as the
root user or equivalent.
Create a RAID 6 device. At the command prompt, enter
mdadm --create /dev/md0 --run --level=raid6 --chunk=128 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
The default chunk size is 64 KB.
Create a file system on the RAID 6 device /dev/md0,
such as a Reiser file system (reiserfs). For example, at the command
prompt, enter
mkfs.reiserfs /dev/md0
Modify the command if you want to use a different file system.
Edit the /etc/mdadm.conf file to add entries for
the component devices and the RAID device /dev/md0.
Edit the /etc/fstab file to add an entry for the
RAID 6 device /dev/md0.
Reboot the server.
The RAID 6 device is mounted to /local.
(Optional) Add a hot spare to service the RAID array. For example, at the command prompt enter:
mdadm /dev/md0 -a /dev/sde1
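The entries for the /etc/mdadm.conf and /etc/fstab files in the procedure above might look like the output of the following sketch. The helper names mdadm_conf_entries and fstab_entry are hypothetical, and the mount options and the trailing "1 2" fields in the fstab line are assumptions. The functions only print text and do not modify any file.

```shell
# Print mdadm.conf entries for an array and its component devices.
# Usage: mdadm_conf_entries ARRAY DEVICE...
mdadm_conf_entries() {
    local array=$1 list
    shift
    list=$(printf '%s,' "$@")          # comma-separated, with trailing comma
    echo "DEVICE $*"
    echo "ARRAY $array devices=${list%,}"
}

# Print an fstab entry; options and dump/pass fields are assumed values.
# Usage: fstab_entry DEVICE MOUNT_POINT FS_TYPE
fstab_entry() {
    local device=$1 mount_point=$2 fs_type=$3
    echo "$device $mount_point $fs_type defaults 1 2"
}

mdadm_conf_entries /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
fstab_entry /dev/md0 /local reiserfs
```

On a system where the array already exists, mdadm --detail --scan prints ARRAY lines that can be used as a cross-check.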
A nested RAID device consists of a RAID array that uses another RAID array as its basic element, instead of using physical disks. The goal of this configuration is to improve the performance and fault tolerance of the RAID.
Linux supports nesting of RAID 1 (mirroring) and RAID 0 (striping) arrays. Generally, this combination is referred to as RAID 10. To distinguish the order of the nesting, this document uses the following terminology:
RAID 1+0: RAID 1 (mirror) arrays are built first, then combined to form a RAID 0 (stripe) array.
RAID 0+1: RAID 0 (stripe) arrays are built first, then combined to form a RAID 1 (mirror) array.
The following table describes the advantages and disadvantages of RAID 10 nesting as 1+0 versus 0+1. It assumes that the storage objects you use reside on different disks, each with a dedicated I/O capability.
| RAID Level | Description | Performance and Fault Tolerance |
|---|---|---|
| 10 (1+0) | RAID 0 (stripe) built with RAID 1 (mirror) arrays | RAID 1+0 provides high levels of I/O performance, data redundancy, and disk fault tolerance. Because each member device in the RAID 0 is mirrored individually, multiple disk failures can be tolerated and data remains available as long as the disks that fail are in different mirrors. You can optionally configure a spare for each underlying mirrored array, or configure a spare to serve a spare group that serves all mirrors. |
| 10 (0+1) | RAID 1 (mirror) built with RAID 0 (stripe) arrays | RAID 0+1 provides high levels of I/O performance and data redundancy, but slightly less fault tolerance than a 1+0. If multiple disks fail on one side of the mirror, then the other mirror is available. However, if disks are lost concurrently on both sides of the mirror, all data is lost. This solution offers less disk fault tolerance than a 1+0 solution, but if you need to perform maintenance or maintain the mirror on a different site, you can take an entire side of the mirror offline and still have a fully functional storage device. Also, if you lose the connection between the two sites, either site operates independently of the other. That is not true if you stripe the mirrored segments, because the mirrors are managed at a lower level. If a device fails, the mirror on that side fails because RAID 0 is not fault-tolerant. Create a new RAID 0 to replace the failed side, then resynchronize the mirrors. |
A nested RAID 1+0 is built by creating two or more RAID 1 (mirror) devices, then using them as component devices in a RAID 0.
If you need to manage multiple connections to the devices, you must configure multipath I/O before configuring the RAID devices. For information, see Chapter 7, Managing Multipath I/O for Devices.
The procedure in this section uses the device names shown in the following table. Ensure that you modify the device names with the names of your own devices.
| Raw Devices | RAID 1 (mirror) | RAID 1+0 (striped mirrors) |
|---|---|---|
| /dev/sdb1, /dev/sdc1 | /dev/md0 | /dev/md2 |
| /dev/sdd1, /dev/sde1 | /dev/md1 | /dev/md2 |
Open a terminal console, then log in as the
root user or equivalent.
Create two software RAID 1 devices, using two different devices for each RAID 1 device. At the command prompt, enter these two commands:
mdadm --create /dev/md0 --run --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm --create /dev/md1 --run --level=1 --raid-devices=2 /dev/sdd1 /dev/sde1
Create the nested RAID 1+0 device. At the command prompt, enter the following command using the software RAID 1 devices you created in Step 2:
mdadm --create /dev/md2 --run --level=0 --chunk=64 --raid-devices=2 /dev/md0 /dev/md1
The default chunk size is 64 KB.
Create a file system on the RAID 1+0 device
/dev/md2, such as a Reiser file system (reiserfs).
For example, at the command prompt, enter
mkfs.reiserfs /dev/md2
Modify the command if you want to use a different file system.
Edit the /etc/mdadm.conf file to add entries for
the component devices and the RAID device /dev/md2.
Edit the /etc/fstab file to add an entry for the
RAID 1+0 device /dev/md2.
Reboot the server.
The RAID 1+0 device is mounted to /local.
A nested RAID 0+1 is built by creating two to four RAID 0 (striping) devices, then mirroring them as component devices in a RAID 1.
If you need to manage multiple connections to the devices, you must configure multipath I/O before configuring the RAID devices. For information, see Chapter 7, Managing Multipath I/O for Devices.
In this configuration, spare devices cannot be specified for the underlying RAID 0 devices because RAID 0 cannot tolerate a device loss. If a device fails on one side of the mirror, you must create a replacement RAID 0 device, then add it into the mirror.
The procedure in this section uses the device names shown in the following table. Ensure that you modify the device names with the names of your own devices.
| Raw Devices | RAID 0 (stripe) | RAID 0+1 (mirrored stripes) |
|---|---|---|
| /dev/sdb1, /dev/sdc1 | /dev/md0 | /dev/md2 |
| /dev/sdd1, /dev/sde1 | /dev/md1 | /dev/md2 |
Open a terminal console, then log in as the root user or equivalent.
Create two software RAID 0 devices, using two different devices for each RAID 0 device. At the command prompt, enter these two commands:
mdadm --create /dev/md0 --run --level=0 --chunk=64 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm --create /dev/md1 --run --level=0 --chunk=64 --raid-devices=2 /dev/sdd1 /dev/sde1
The default chunk size is 64 KB.
Create the nested RAID 0+1 device. At the command prompt, enter the following command using the software RAID 0 devices you created in Step 2:
mdadm --create /dev/md2 --run --level=1 --raid-devices=2 /dev/md0 /dev/md1
Create a file system on the RAID 0+1 device
/dev/md2, such as a Reiser file system (reiserfs).
For example, at the command prompt, enter
mkfs.reiserfs /dev/md2
Modify the command if you want to use a different file system.
Edit the /etc/mdadm.conf file to add entries for
the component devices and the RAID device /dev/md2.
Edit the /etc/fstab file to add an entry for the
RAID 0+1 device /dev/md2.
Reboot the server.
The RAID 0+1 device is mounted to /local.
In mdadm, the RAID10 level creates a single complex
software RAID that combines features of both RAID 0 (striping) and RAID 1
(mirroring). Multiple copies of all data blocks are arranged on multiple
drives following a striping discipline. Component devices should be the
same size.
The complex RAID 10 is similar in purpose to a nested RAID 10 (1+0), but differs in the following ways:
| Feature | Complex RAID10 | Nested RAID 10 (1+0) |
|---|---|---|
| Number of devices | Allows an even or odd number of component devices | Requires an even number of component devices |
| Component devices | Managed as a single RAID device | Managed as a nested RAID device |
| Striping | Striping occurs in the near or far layout on component devices. The far layout provides sequential read throughput that scales by number of drives, rather than number of RAID 1 pairs. | Striping occurs consecutively across component devices |
| Multiple copies of data | Two or more copies, up to the number of devices in the array | Copies on each mirrored segment |
| Hot spare devices | A single spare can service all component devices | Configure a spare for each underlying mirrored array, or configure a spare to serve a spare group that serves all mirrors. |
When configuring a complex RAID10 array, you must specify the number of replicas of each data block that are required. The default number of replicas is 2, but the value can range from 2 to the number of devices in the array.
You must use at least as many component devices as the number of replicas you specify. However, the number of component devices in a RAID10 array does not need to be a multiple of the number of replicas of each data block. The effective storage size is the number of devices divided by the number of replicas.
For example, if you specify 2 replicas for an array created with 5 component devices, a copy of each block is stored on two different devices. The effective storage size for one copy of all data is 5/2 or 2.5 times the size of a component device.
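The calculation can be written as a one-line division. The following sketch uses a hypothetical helper named raid10_effective_size and expresses the result in multiples of the size of one component device, assuming that all component devices are the same size.

```shell
# Print the effective storage size, in units of one component device.
# Usage: raid10_effective_size DEVICES REPLICAS
raid10_effective_size() {
    local devices=$1 replicas=$2
    if [ "$replicas" -lt 2 ] || [ "$replicas" -gt "$devices" ]; then
        echo "replicas must be between 2 and the number of devices" >&2
        return 1
    fi
    # LC_ALL=C keeps the decimal point independent of the locale.
    LC_ALL=C awk -v n="$devices" -v r="$replicas" 'BEGIN { printf "%.1f\n", n / r }'
}

raid10_effective_size 5 2   # prints 2.5
```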
With the near layout, copies of a block of data are striped near each other on different component devices. That is, multiple copies of one data block are at similar offsets in different devices. Near is the default layout for RAID10. For example, if you use an odd number of component devices and two copies of data, some copies are perhaps one chunk further into the device.
The near layout for the mdadm RAID10 yields read and
write performance similar to RAID 0 over half the number of drives.
Near layout with an even number of disks and two replicas:
sda1 sdb1 sdc1 sde1
  0    0    1    1
  2    2    3    3
  4    4    5    5
  6    6    7    7
  8    8    9    9
Near layout with an odd number of disks and two replicas:
sda1 sdb1 sdc1 sde1 sdf1
  0    0    1    1    2
  2    3    3    4    4
  5    5    6    6    7
  7    8    8    9    9
 10   10   11   11   12
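The two near-layout diagrams above follow a simple rule: the copies of each chunk are written one after another, and the sequence wraps around the component devices. The following sketch prints that pattern. The function name near_layout is made up for illustration, and the script only models the numbering shown in the diagrams; it does not query or configure a real array.

```shell
# Print the chunk number stored in each slot for the near layout.
# Arguments: number of devices, number of replicas, number of rows to print.
near_layout() (
    devices=$1 replicas=$2 rows=$3
    pos=0
    row=0
    while [ "$row" -lt "$rows" ]; do
        line=""
        col=0
        while [ "$col" -lt "$devices" ]; do
            # Slot number pos holds a copy of chunk (pos / replicas).
            line="$line$(( pos / replicas )) "
            pos=$(( pos + 1 ))
            col=$(( col + 1 ))
        done
        echo "${line% }"
        row=$(( row + 1 ))
    done
)

# First three rows for five devices and two replicas.
near_layout 5 2 3
```

With an odd number of devices, the second copy of a chunk sometimes wraps to the next row, which is why some copies sit one chunk further into the device.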
The far layout stripes data over the early part of all drives, then stripes a second copy of the data over the later part of all drives, making sure that all copies of a block are on different drives. The second set of values starts halfway through the component drives.
With a far layout, the read performance of the mdadm
RAID10 is similar to a RAID 0 over the full number of drives, but write
performance is substantially slower than a RAID 0 because there is more
seeking of the drive heads. It is best used for read-intensive operations
such as for read-only file servers.
The write speed of raid10 is similar to that of other mirrored RAID types, such as raid1 and raid10 using the near layout, because the elevator of the file system schedules the writes in a more optimal way than raw writing. Using raid10 in the far layout is well suited for mirrored writing applications.
Far layout with an even number of disks and two replicas:
sda1 sdb1 sdc1 sde1
  0    1    2    3
  4    5    6    7
  . . .
  3    0    1    2
  7    4    5    6
Far layout with an odd number of disks and two replicas:
sda1 sdb1 sdc1 sde1 sdf1
  0    1    2    3    4
  5    6    7    8    9
  . . .
  4    0    1    2    3
  9    5    6    7    8
The offset layout duplicates stripes so that the multiple copies of a given chunk are laid out on consecutive drives and at consecutive offsets. Effectively, each stripe is duplicated and the copies are offset by one device. This should give similar read characteristics to a far layout if a suitably large chunk size is used, but without as much seeking for writes.
Offset layout with an even number of disks and two replicas:
sda1 sdb1 sdc1 sde1
  0    1    2    3
  3    0    1    2
  4    5    6    7
  7    4    5    6
  8    9   10   11
 11    8    9   10
Offset layout with an odd number of disks and two replicas:
sda1 sdb1 sdc1 sde1 sdf1
  0    1    2    3    4
  4    0    1    2    3
  5    6    7    8    9
  9    5    6    7    8
 10   11   12   13   14
 14   10   11   12   13
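The offset diagrams above can be generated by writing each stripe once and then writing it again rotated by one device. The sketch below reproduces that numbering for two replicas only. The function name offset_layout is made up for illustration, and the script models the diagrams rather than inspecting a real array.

```shell
# Print the chunk numbers for the offset layout with two replicas.
# Arguments: number of devices, number of stripes to print.
offset_layout() (
    devices=$1 stripes=$2
    stripe=0
    while [ "$stripe" -lt "$stripes" ]; do
        first=""
        second=""
        base=$(( stripe * devices ))
        dev=0
        while [ "$dev" -lt "$devices" ]; do
            first="$first$(( base + dev )) "
            # The copy is rotated by one device: device d holds the chunk
            # that device d-1 holds in the row above.
            second="$second$(( base + (dev + devices - 1) % devices )) "
            dev=$(( dev + 1 ))
        done
        echo "${first% }"
        echo "${second% }"
        stripe=$(( stripe + 1 ))
    done
)

# First two stripes (four rows) for four devices.
offset_layout 4 2
```

The same one-device rotation appears in the far-layout diagrams; the difference is that the far layout places the rotated copies in the second half of the drives instead of directly below each stripe.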
The RAID10 option for mdadm creates a RAID 10 device
without nesting. For information about RAID10, see
Section 10.3.1, “Understanding the Complex RAID10”.
The procedure in this section uses the device names shown in the following table. Ensure that you modify the device names with the names of your own devices.
| Raw Devices | RAID10 (near or far striping scheme) |
|---|---|
| /dev/sdf1, /dev/sdg1, /dev/sdh1, /dev/sdi1 | /dev/md3 |
In YaST, create a 0xFD Linux RAID partition on the devices you want to
use in the RAID, such as /dev/sdf1,
/dev/sdg1, /dev/sdh1, and
/dev/sdi1.
Open a terminal console, then log in as the root user or equivalent.
Create a RAID 10 device. At the command prompt, enter (all on the same line):
mdadm --create /dev/md3 --run --level=10 --chunk=4 --raid-devices=4 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1
Create a Reiser file system on the RAID 10 device
/dev/md3. At the command prompt, enter
mkfs.reiserfs /dev/md3
Edit the /etc/mdadm.conf file to add entries for
the component devices and the RAID device /dev/md3.
For example:
DEVICE /dev/md3
Edit the /etc/fstab file to add an entry for the
RAID 10 device /dev/md3.
Reboot the server.
The RAID10 device is mounted to /raid10.
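The command-line steps above can be collected into a small script. The following is a sketch, not a tested deployment script. It prints the commands instead of running them unless DRY_RUN is set to 0, because both commands destroy existing data on the listed partitions. The run and create_raid10 helpers are made up for illustration, and the --layout option is an addition that the procedure does not show: mdadm accepts n, f, or o followed by the number of replicas, for example n2 or f2.

```shell
# Dry-run helper: print each command instead of executing it unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the RAID 10 device and put a Reiser file system on it.
# Argument: layout (assumed default n2 = near layout, two replicas).
create_raid10() {
    layout=${1:-n2}
    run mdadm --create /dev/md3 --run --level=10 --layout="$layout" \
        --chunk=4 --raid-devices=4 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1
    run mkfs.reiserfs /dev/md3
}

# Show the commands for a far layout with two replicas.
create_raid10 f2
```

Review the printed commands and the device names before you run the script with DRY_RUN=0 as the root user.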
Launch YaST as the root user, then open the
Partitioner.
Select to view the available disks, such as sdb, sdc, sdd, and sde.
For each disk that you will use in the software RAID, create a RAID partition on the device. Each partition should be the same size. For a RAID 10 device, you need
Under , select the device, then select the tab in the right panel.
Click to open the wizard.
Under , select , then click .
For , specify the desired size of the RAID partition on this disk, then click .
Under , select , then select from the drop-down list.
Under , select , then click .
Repeat these steps until you have defined a RAID partition on the disks you want to use in the RAID 10 device.
Create a RAID 10 device:
Select , then select in the right panel to open the wizard.
Under , select .
In the list, select the desired Linux RAID partitions, then click to move them to the list.
(Optional) Click , specify the preferred order of the disks in the RAID array.
For RAID types where the order of added disks matters, you can specify the order in which the devices will be used to ensure that one half of the array resides on one disk subsystem and the other half of the array resides on a different disk subsystem. For example, if one disk subsystem fails, the system keeps running from the second disk subsystem.
Select each disk in turn and click one of the buttons, where X is the letter you want to assign to the disk. Available classes are A, B, C, D and E but for many cases fewer classes are needed (e.g. only A and B). Assign all available RAID disks this way.
You can press the Ctrl or Shift key to select multiple devices. You can also right-click a selected device and choose the appropriate class from the context menu.
Specify the order of the devices by selecting one of the sorting options:
Sorted:
Sorts all devices of class A before all devices of class B and so
on. For example: AABBCC.
Interleaved:
Sorts devices by the first device of class A, then first device of
class B, then all the following classes with assigned devices. Then
the second device of class A, the second device of class B, and so
on follows. All devices without a class are sorted to the end of
devices list. For example, ABCABC.
Pattern File:
Select an existing file that contains multiple lines, where each is
a regular expression and a class name ("sda.*
A"). All devices that match the regular expression are
assigned to the specified class for that line. The regular
expression is matched against the kernel name
(/dev/sda1), the udev path name
(/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0-part1)
and then the udev ID
(/dev/disk/by-id/ata-ST3500418AS_9VMN8X8L-part1). The first match
made determines the class if a device’s name matches more than
one regular expression.
At the bottom of the dialog box, click to confirm the order.
Click .
Under , specify the C and , then click .
For a RAID 10, the parity options are n (near), f (far), and o (offset). The number indicates how many replicas of each data block are required. Two is the default. For information, see Section 10.3.1, “Understanding the Complex RAID10”.
Add a file system and mount options to the RAID device, then click .
Select , select the newly created RAID device, then click to view its partitions.
Click .
Verify the changes to be made, then click to create the RAID.
A degraded array is one in which some devices are missing. Degraded arrays are supported only for RAID 1, RAID 4, RAID 5, and RAID 6. These RAID types are designed to withstand some missing devices as part of their fault-tolerance features. Typically, degraded arrays occur when a device fails. It is possible to create a degraded array on purpose.
| RAID Type | Allowable Number of Slots Missing |
|---|---|
| RAID 1 | All but one device |
| RAID 4 | One slot |
| RAID 5 | One slot |
| RAID 6 | One or two slots |
To create a degraded array in which some devices are missing, simply give
the word missing in place of a device name. This causes
mdadm to leave the corresponding slot in the array
empty.
When creating a RAID 5 array, mdadm automatically
creates a degraded array with an extra spare drive. This is because
building the spare into a degraded array is generally faster than
resynchronizing the parity on a non-degraded, but not clean, array. You can
override this feature with the --force option.
Creating a degraded array might be useful if you want to create a RAID, but one of the devices you want to use already has data on it. In that case, you create a degraded array with other devices, copy data from the in-use device to the RAID that is running in degraded mode, add the device into the RAID, then wait while the RAID is rebuilt so that the data is now across all devices. An example of this process is given in the following procedure:
To create a degraded RAID 1 device /dev/md0 using a
single drive /dev/sda1, enter the following at the
command prompt:
mdadm --create /dev/md0 -l 1 -n 2 /dev/sda1 missing
The device should be the same size or larger than the device you plan to add to it.
If the device you want to add to the mirror contains data that you want to move to the RAID array, copy it now to the RAID array while it is running in degraded mode.
Add a device to the mirror. For example, to add
/dev/sdb1 to the RAID, enter the following at the
command prompt:
mdadm /dev/md0 -a /dev/sdb1
You can add only one device at a time. You must wait for the kernel to build the mirror and bring it fully online before you add another mirror.
Monitor the build progress by entering the following at the command prompt:
cat /proc/mdstat
To see the rebuild progress while being refreshed every second, enter
watch -n 1 cat /proc/mdstat
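If you script this procedure, you need a way to wait for the rebuild instead of watching the output. The following sketch polls the status file for the words that the kernel prints while a resync, recovery, or reshape is running. The function names md_busy and wait_for_md are made up for illustration. The check is deliberately coarse: it looks at every array listed in the file, not only /dev/md0.

```shell
# Succeed (exit status 0) while the status file reports a resync, recovery,
# or reshape. The optional argument defaults to /proc/mdstat; passing another
# path is intended only for testing against a saved copy.
md_busy() {
    grep -Eq 'resync|recovery|reshape' "${1:-/proc/mdstat}"
}

# Poll every five seconds until no rebuild activity is reported.
wait_for_md() {
    while md_busy "$@"; do
        sleep 5
    done
}
```

Call wait_for_md after adding the device and before adding another one. Depending on your mdadm version, mdadm --wait /dev/md0 might provide the same behavior for a single array; check mdadm(8) before relying on it.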
This section describes how to increase or reduce the size of a software RAID
1, 4, 5, or 6 device with the Multiple Device Administration
(mdadm(8)) tool.
Before starting any of the tasks described in this section, ensure that you have a valid backup of all of the data.
Resizing an existing software RAID device involves increasing or decreasing the space contributed by each component partition.
The mdadm(8) tool supports resizing only for software
RAID levels 1, 4, 5, and 6. These RAID levels provide disk fault tolerance
so that one component partition can be removed at a time for resizing. In
principle, it is possible to perform a hot resize for RAID partitions, but
you must take extra care for your data when doing so.
The file system that resides on the RAID must also be able to be resized in order to take advantage of the changes in available space on the device. In SUSE Linux Enterprise Server 11, file system resizing utilities are available for file systems Ext2, Ext3, and ReiserFS. The utilities support increasing and decreasing the size as follows:
| File System | Utility | Increase Size | Decrease Size |
|---|---|---|---|
| Ext2 or Ext3 | resize2fs | Yes, offline only | Yes, offline only |
| ReiserFS | resize_reiserfs | Yes, online or offline | Yes, offline only |
Resizing any partition or file system involves some risks that can potentially result in losing data.
To avoid data loss, ensure that you back up your data before you begin any resizing task.
Resizing the RAID involves the following tasks. The order in which these tasks are performed depends on whether you are increasing or decreasing its size.
| Tasks | Description | Order If Increasing Size | Order If Decreasing Size |
|---|---|---|---|
| Resize each of the component partitions. | Increase or decrease the active size of each component partition. You remove only one component partition at a time, modify its size, then return it to the RAID. | 1 | 3 |
| Resize the software RAID itself. | The RAID does not automatically know about the increases or decreases you make to the underlying component partitions. You must inform it about the new size. | 2 | 2 |
| Resize the file system. | You must resize the file system that resides on the RAID. This is possible only for file systems that provide tools for resizing, such as Ext2, Ext3, and ReiserFS. | 3 | 1 |
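When scripting a resize, it can help to make the task order explicit so that the steps are never run in the wrong sequence. The following sketch encodes the order used by the procedures in this chapter: when increasing, the partitions come first and the file system last; when decreasing, the file system is reduced first, then the array, then the partitions. The function name resize_order and the task keywords are made up for illustration.

```shell
# Print the order of the resizing tasks for the given direction.
resize_order() {
    case "$1" in
        increase) echo "partitions raid filesystem" ;;
        decrease) echo "filesystem raid partitions" ;;
        *) echo "usage: resize_order increase|decrease" >&2; return 1 ;;
    esac
}

# Walk through the tasks for a shrink operation.
for task in $(resize_order decrease); do
    echo "next task: $task"
done
```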
Before you begin, review the guidelines in Section 11.1, “Understanding the Resizing Process”.
Apply the procedure in this section to increase the size of a RAID 1, 4, 5, or 6. For each component partition in the RAID, remove the partition from the RAID, modify its size, return it to the RAID, then wait until the RAID stabilizes to continue. While a partition is removed, the RAID operates in degraded mode and has no or reduced disk fault tolerance. Even for RAIDs that can tolerate multiple concurrent disk failures, do not remove more than one component partition at a time.
If a RAID does not have disk fault tolerance, or it is simply not consistent, data loss results if you remove any of its partitions. Be very careful when removing partitions, and ensure that you have a backup of your data available.
The procedure in this section uses the device names shown in the following table. Ensure that you modify the names to use the names of your own devices.
|
RAID Device |
Component Partitions |
|---|---|
|
|
|
To increase the size of the component partitions for the RAID:
Open a terminal console, then log in as the
root user or equivalent.
Ensure that the RAID array is consistent and synchronized by entering
cat /proc/mdstat
If your RAID array is still synchronizing according to the output of this command, you must wait until synchronization is complete before continuing.
Remove one of the component partitions from the RAID array. For example,
to remove /dev/sda1, enter
mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1
In order to succeed, both the fail and remove actions must be done.
Increase the size of the partition that you removed in Step 3 by doing one of the following:
Increase the size of the partition, using a disk partitioner such as
fdisk(8), cfdisk(8), or
parted(8). This option is the usual choice.
Replace the disk on which the partition resides with a higher-capacity device.
This option is possible only if no other file systems on the original disk are accessed by the system. When the replacement device is added back into the RAID, it takes much longer to synchronize the data because all of the data that was on the original device must be rebuilt.
Re-add the partition to the RAID array. For example, to add
/dev/sda1, enter
mdadm -a /dev/md0 /dev/sda1
Wait until the RAID is synchronized and consistent before continuing with the next partition.
Repeat Step 2 through Step 5 for each of the remaining component devices in the array. Ensure that you modify the commands for the correct component partition.
If you get a message that tells you that the kernel could not re-read the partition table for the RAID, you must reboot the computer after all partitions have been resized to force an update of the partition table.
Continue with Section 11.2.2, “Increasing the Size of the RAID Array”.
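Because the same steps are repeated for every component partition, it is easy to lose track of which partition is next. The following sketch prints the command sequence as a checklist rather than running anything; the partition itself still has to be resized by hand. The function name resize_plan is made up for illustration, and /dev/sdb1 stands in for whatever second component partition your array uses.

```shell
# Print the per-partition command sequence for resizing component partitions.
# Arguments: the RAID device followed by its component partitions.
resize_plan() (
    raid=$1
    shift
    for part in "$@"; do
        echo "# check /proc/mdstat: continue only when $raid is synchronized"
        echo "mdadm $raid --fail $part --remove $part"
        echo "# resize $part with fdisk, cfdisk, or parted"
        echo "mdadm -a $raid $part"
    done
    echo "# check /proc/mdstat again before resizing the array itself"
)

resize_plan /dev/md0 /dev/sda1 /dev/sdb1
```

Printing the plan first keeps the rule of removing only one partition at a time visible, because each removal is always followed by the matching re-add.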
After you have resized each of the component partitions in the RAID (see Section 11.2.1, “Increasing the Size of Component Partitions”), the RAID array configuration continues to use the original array size until you force it to be aware of the newly available space. You can specify a size for the RAID or use the maximum available space.
The procedure in this section uses the device name
/dev/md0 for the RAID device. Ensure that you modify
the name to use the name of your own device.
Open a terminal console, then log in as the
root user or equivalent.
Check the size of the array and the device size known to the array by entering
mdadm -D /dev/md0 | grep -e "Array Size" -e "Device Size"
Do one of the following:
Increase the size of the array to the maximum available size by entering
mdadm --grow /dev/md0 -z max
Increase the size of the array to the maximum available size without synchronizing the added space by entering
mdadm --grow /dev/md0 -z max --assume-clean
The array makes use of any space that has been added to the devices, but this space will not be synchronized. This is recommended for RAID1 because the sync is not needed. It can be useful for other RAID levels if the space that was added to the member devices was pre-zeroed.
Increase the size of the array to a specified value by entering
mdadm --grow /dev/md0 -z size
Replace size with an integer value in kilobytes (a kilobyte is 1024 bytes) for the desired size.
Recheck the size of your array and the device size known to the array by entering
mdadm -D /dev/md0 | grep -e "Array Size" -e "Device Size"
Do one of the following:
If your array was successfully resized, continue with Section 11.2.3, “Increasing the Size of the File System”.
If your array was not resized as you expected, you must reboot, then try this procedure again.
After you increase the size of the array (see Section 11.2.2, “Increasing the Size of the RAID Array”), you are ready to resize the file system.
You can increase the size of the file system to the maximum space available or specify an exact size. When specifying an exact size for the file system, ensure that the new size satisfies the following conditions:
The new size must be greater than the size of the existing data; otherwise, data loss occurs.
The new size must be equal to or less than the current RAID size because the file system size cannot extend beyond the space available.
Ext2 and Ext3 file systems can be resized when mounted or unmounted with
the resize2fs command.
Open a terminal console, then log in as the
root user or equivalent.
Increase the size of the file system using one of the following methods:
To extend the file system size to the maximum available size of the
software RAID device called /dev/md0, enter
resize2fs /dev/md0
If a size parameter is not specified, the size defaults to the size of the partition.
To extend the file system to a specific size, enter
resize2fs /dev/md0 size
The size parameter specifies the requested new size of the file system. If no units are specified, the unit of the size parameter is the block size of the file system. Optionally, the size parameter can be suffixed by one of the following unit designators: s for 512-byte sectors; K for kilobytes (1 kilobyte is 1024 bytes); M for megabytes; or G for gigabytes.
Wait until the resizing is completed before continuing.
If the file system is not mounted, mount it now.
For example, to mount an Ext2 file system for a RAID named
/dev/md0 at mount point
/raid, enter
mount -t ext2 /dev/md0 /raid
Check the effect of the resize on the mounted file system by entering
df -h
The Disk Free (df) command shows the total size of
the disk, the number of blocks used, and the number of blocks available
on the file system. The -h option prints sizes in human-readable format,
such as 1K, 234M, or 2G.
As with Ext2 and Ext3, a ReiserFS file system can be increased in size while mounted or unmounted. The resize is done on the block device of your RAID array.
Open a terminal console, then log in as the
root user or equivalent.
Increase the size of the file system on the software RAID device called
/dev/md0, using one of the following methods:
To extend the file system size to the maximum available size of the device, enter
resize_reiserfs /dev/md0
When no size is specified, this increases the volume to the full size of the partition.
To extend the file system to a specific size, enter
resize_reiserfs -s size /dev/md0
Replace size with the desired size in
bytes. You can also specify units on the value, such as 50000K
(kilobytes), 250M (megabytes), or 2G (gigabytes). Alternatively, you
can specify an increase to the current size by prefixing the value
with a plus (+) sign. For example, the following command increases
the size of the file system on /dev/md0 by 500
MB:
resize_reiserfs -s +500M /dev/md0
Wait until the resizing is completed before continuing.
If the file system is not mounted, mount it now.
For example, to mount a ReiserFS file system for a RAID named
/dev/md0 at mount point
/raid, enter
mount -t reiserfs /dev/md0 /raid
Check the effect of the resize on the mounted file system by entering
df -h
The Disk Free (df) command shows the total size of
the disk, the number of blocks used, and the number of blocks available
on the file system. The -h option prints sizes in human-readable format,
such as 1K, 234M, or 2G.
Before you begin, review the guidelines in Section 11.1, “Understanding the Resizing Process”.
When decreasing the size of the file system on a RAID device, ensure that the new size satisfies the following conditions:
The new size must be greater than the size of the existing data; otherwise, data loss occurs.
The new size must be equal to or less than the current RAID size because the file system size cannot extend beyond the space available.
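Both conditions can be checked before you run a resize command. The following sketch compares three values given in kilobytes. The function name check_new_fs_size is made up for illustration, and how you obtain the numbers is left to you; for example, the used space could be read from df -k and the array size from the mdadm -D output.

```shell
# Succeed only if used_kb < new_kb <= raid_kb.
# Arguments: requested new size, size of existing data, current RAID size.
check_new_fs_size() (
    new_kb=$1 used_kb=$2 raid_kb=$3
    if [ "$new_kb" -le "$used_kb" ]; then
        echo "new size must be greater than the size of the existing data" >&2
        exit 1
    fi
    if [ "$new_kb" -gt "$raid_kb" ]; then
        echo "new size must not exceed the current RAID size" >&2
        exit 1
    fi
)

# 500 MB file system holding 400 MB of data on a 1000 MB array: acceptable.
check_new_fs_size 512000 409600 1024000 && echo "size is acceptable"
```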
In SUSE Linux Enterprise Server, Ext2, Ext3, and ReiserFS provide utilities for decreasing the size of the file system. Use the appropriate procedure below for decreasing the size of your file system.
The procedures in this section use the device name
/dev/md0 for the RAID device. Ensure that you modify
commands to use the name of your own device.
Ext2 and Ext3 file systems can be decreased in size only while they are unmounted. If the file system is mounted, unmount it before you resize it.
Open a terminal console, then log in as the
root user or equivalent.
Decrease the size of the file system on the RAID by entering
resize2fs /dev/md0 <size>
Replace size with the desired size. If no unit is specified, resize2fs interprets the value in file system blocks; append K to specify the size in kilobytes. (A kilobyte is 1024 bytes.)
Wait until the resizing is completed before continuing.
If the file system is not mounted, mount it now. For example, to mount
an Ext2 file system for a RAID named /dev/md0 at
mount point /raid, enter
mount -t ext2 /dev/md0 /raid
Check the effect of the resize on the mounted file system by entering
df -h
The Disk Free (df) command shows the total size of
the disk, the number of blocks used, and the number of blocks available
on the file system. The -h option prints sizes in human-readable format,
such as 1K, 234M, or 2G.
Continue with Section 11.3.2, “Decreasing the Size of the RAID Array”.
ReiserFS file systems can be decreased in size only if the volume is unmounted.
Open a terminal console, then log in as the
root user or equivalent.
Unmount the device by entering
umount /mnt/point
If the partition you are attempting to decrease in size contains system
files (such as the root (/) volume), unmounting is
possible only when booting from a bootable CD or floppy.
Decrease the size of the file system on the software RAID device called
/dev/md0 by entering
resize_reiserfs -s size /dev/md0
Replace size with the desired size in bytes.
You can also specify units on the value, such as 50000K (kilobytes),
250M (megabytes), or 2G (gigabytes). Alternatively, you can specify a
decrease to the current size by prefixing the value with a minus (-)
sign. For example, the following command reduces the size of the file
system on /dev/md0 by 500 MB:
resize_reiserfs -s -500M /dev/md0
Wait until the resizing is completed before continuing.
Mount the file system by entering
mount -t reiserfs /dev/md0 /mnt/point
Check the effect of the resize on the mounted file system by entering
df -h
The Disk Free (df) command shows the total size of
the disk, the number of blocks used, and the number of blocks available
on the file system. The -h option prints sizes in human-readable format,
such as 1K, 234M, or 2G.
Continue with Section 11.3.2, “Decreasing the Size of the RAID Array”.
After you have resized the file system, the RAID array configuration
continues to use the original array size until you force it to reduce the
available space. Use the mdadm --grow mode to force the
RAID to use a smaller segment size. To do this, you must use the -z option
to specify the amount of space in kilobytes to use from each device in the
RAID. This size must be a multiple of the chunk size, and it must leave
about 128KB of space for the RAID superblock to be written to the device.
The procedure in this section uses the device name
/dev/md0 for the RAID device. Ensure that you modify
commands to use the name of your own device.
Open a terminal console, then log in as the
root user or equivalent.
Check the size of the array and the device size known to the array by entering
mdadm -D /dev/md0 | grep -e "Array Size" -e "Device Size"
Decrease the size of the array’s device size to a specified value by entering
mdadm --grow /dev/md0 -z <size>
Replace size with an integer value in kilobytes for the desired size. (A kilobyte is 1024 bytes.)
For example, the following command sets the segment size for each RAID device to about 40 GB where the chunk size is 64 KB. It includes 128 KB for the RAID superblock.
mdadm --grow /dev/md2 -z 41943168
Recheck the size of your array and the device size known to the array by entering
mdadm -D /dev/md0 | grep -e "Array Size" -e "Device Size"
Do one of the following:
If your array was successfully resized, continue with Section 11.3.3, “Decreasing the Size of Component Partitions”.
If your array was not resized as you expected, you must reboot, then try this procedure again.
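The example value 41943168 used in this procedure is 40 GB expressed in kilobytes plus 128 KB for the superblock, which happens to be a multiple of the 64 KB chunk size. The following sketch repeats that arithmetic for other sizes. The function name segment_size_kb is made up for illustration, and rounding up to the next chunk boundary is an assumption of this sketch, chosen so that the result is never smaller than the space requested.

```shell
# Per-device size in KB for "mdadm --grow -z".
# Arguments: data size in KB, chunk size in KB, superblock allowance in KB
# (the allowance defaults to the 128 KB mentioned in this section).
segment_size_kb() (
    data_kb=$1 chunk_kb=$2 superblock_kb=${3:-128}
    total=$(( data_kb + superblock_kb ))
    # Round up to the next multiple of the chunk size.
    echo $(( (total + chunk_kb - 1) / chunk_kb * chunk_kb ))
)

# 40 GB per device with a 64 KB chunk size.
segment_size_kb $(( 40 * 1024 * 1024 )) 64
```

For the values in the example, this prints 41943168, matching the command shown in the procedure.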
After you decrease the segment size that is used on each device in the RAID, the remaining space in each component partition is not used by the RAID. You can leave partitions at their current size to allow for the RAID to grow at a future time, or you can reclaim this now unused space.
To reclaim the space, you decrease the component partitions one at a time. For each component partition, you remove it from the RAID, reduce its partition size, return the partition to the RAID, then wait until the RAID stabilizes. To allow for metadata, you should specify a slightly larger size than the size you specified for the RAID in Section 11.3.2, “Decreasing the Size of the RAID Array”.
While a partition is removed, the RAID operates in degraded mode and has no or reduced disk fault tolerance. Even for RAIDs that can tolerate multiple concurrent disk failures, you should never remove more than one component partition at a time.
If a RAID does not have disk fault tolerance, or it is simply not consistent, data loss results if you remove any of its partitions. Be very careful when removing partitions, and ensure that you have a backup of your data available.
The procedure in this section uses the device names shown in the following table. Ensure that you modify the commands to use the names of your own devices.
|
RAID Device |
Component Partitions |
|---|---|
|
|
|
To decrease the size of the component partitions for the RAID:
Open a terminal console, then log in as the
root user or equivalent.
Ensure that the RAID array is consistent and synchronized by entering
cat /proc/mdstat
If your RAID array is still synchronizing according to the output of this command, you must wait until synchronization is complete before continuing.
Remove one of the component partitions from the RAID array. For example,
to remove /dev/sda1, enter
mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1
In order to succeed, both the fail and remove actions must be done.
Decrease the size of the partition that you removed in
Step 3 to a size that is
slightly larger than the size you set for the segment size. The size
should be a multiple of the chunk size and allow 128 KB for the RAID
superblock. Use a disk partitioner such as fdisk,
cfdisk, or parted to decrease the
size of the partition.
Re-add the partition to the RAID array. For example, to add
/dev/sda1, enter
mdadm -a /dev/md0 /dev/sda1
Wait until the RAID is synchronized and consistent before continuing with the next partition.
Repeat Step 2 through Step 5 for each of the remaining component devices in the array. Ensure that you modify the commands for the correct component partition.
If you get a message that tells you that the kernel could not re-read the partition table for the RAID, you must reboot the computer after resizing all of its component partitions.
(Optional) Expand the size of the RAID and file system to use the maximum amount of space in the now smaller component partitions:
Expand the size of the RAID to use the maximum amount of space that is now available in the reduced-size component partitions:
mdadm --grow /dev/md0 -z max
Expand the size of the file system to use all of the available space in the newly resized RAID. For information, see Section 11.2.3, “Increasing the Size of the File System”.
The Storage Enclosure LED Monitoring utility (ledmon(8)) and
the LED Control utility (ledctl(8)) are Linux user space
applications that use a broad range of interfaces and protocols to control
storage enclosure LEDs. The primary usage is to visualize the status of
Linux MD software RAID devices created with the mdadm utility. The ledmon
daemon monitors the status of the drive array and updates the status of the
drive LEDs. The ledctl utility allows you to set LED patterns for specified
devices.
These LED utilities use the SGPIO (Serial General Purpose Input/Output) specification (Small Form Factor (SFF) 8485) and the SCSI Enclosure Services (SES) 2 protocol to control LEDs. They implement the International Blinking Pattern Interpretation (IBPI) patterns of the SFF-8489 specification for SGPIO. The IBPI defines how the SGPIO standards are interpreted as states for drives and slots on a backplane and how the backplane should visualize the states with LEDs.
Some storage enclosures do not adhere strictly to the SFF-8489 specification. An enclosure processor might accept an IBPI pattern but not blink the LEDs according to the SFF-8489 specification, or the processor might support only a limited number of the IBPI patterns.
LED management (AHCI) and SAF-TE protocols are not supported by the
ledmon and ledctl utilities.
The ledmon and ledctl applications
have been verified to work with Intel storage controllers such as the Intel
AHCI controller and Intel SAS controller. Beginning in SUSE Linux Enterprise Server 11
SP3, they also support PCIe-SSD (solid state disk) enclosure LEDs to
control the storage enclosure status (OK, Fail, Rebuilding) LEDs of
PCIe-SSD devices that are part of an MD software RAID volume. The
applications might also work with the IBPI-compliant storage controllers of
other vendors (especially SAS/SCSI controllers); however, other vendors’
controllers have not been tested.
The ledmon application is a daemon process that
constantly monitors the state of MD software RAID devices or the state of
block devices in a storage enclosure or drive bay. Only a single instance
of the daemon should be running at a time. The ledmon
application is part of Intel Enclosure LED Utilities.
The state is visualized on LEDs associated with each slot in a storage array enclosure or a drive bay. The application monitors all software RAID devices and visualizes their state. It does not provide a way to monitor only selected software RAID volumes.
The ledmon application supports two types of LED
systems: A two-LED system (Activity LED and Status LED) and a three-LED
system (Activity LED, Locate LED, and Fail LED). This tool has the highest
priority when accessing the LEDs.
Sets the path to a local configuration file. If this option is specified, the global configuration file and user configuration file have no effect.
Sets the path to a local log file. If this user-defined file is specified,
the global log file /var/log/ledmon.log is not
used.
Sets the time interval between scans of sysfs. The
value is given in seconds. The minimum is 5 seconds. The maximum is not
specified.
Specifies the verbose level. The level options are specified in the order of no information to the most information. Use the --quiet option for no logging. Use the --all option to log everything. If you specify more than one verbose option, the last option in the command applies.
Prints the command information to the console, then exits.
Displays the version of ledmon and information about the
license, then exits.
Global log file, used by ledmon application. To
force logging to a user-defined file, use the -l
option.
User configuration file, shared between ledmon and
all ledctl application instances.
Global configuration file, shared between ledmon and
all ledctl application instances.
The ledmon daemon does not recognize the PFA (Predicted
Failure Analysis) state from the SFF-8489 specification. Thus, the PFA
pattern is not visualized.
The Enclosure LED Application (ledctl(8)) is a user
space application that controls LEDs associated with each slot in a storage
enclosure or a drive bay. The ledctl application is a
part of Intel Enclosure LED Utilities.
When you issue the command, the LEDs of the specified devices are set to a
specified pattern and all other LEDs are turned off. You must have root
privileges to use this application. Because the ledmon
application has the highest priority when accessing LEDs, some patterns set
by ledctl might have no effect if ledmon is running
(except the Locate pattern).
The ledctl application supports two types of LED
systems: A two-LED system (Activity LED and Status LED) and a three-LED
system (Activity LED, Fail LED, and Locate LED).
ledctl [options] pattern_name=list_of_devices
Issue the command as the root user or as a user
with root privileges.
The ledctl application accepts the following names for
the pattern_name argument, according to the SFF-8489
specification.
locate: Turns on the Locate LED associated with the specified devices or empty slots. This state is used to identify a slot or drive.
locate_off: Turns off the Locate LED associated with the specified devices or empty slots.
normal: Turns off the Status LED, Failure LED, and Locate LED associated with the specified devices.
off: Turns off only the Status LED and Failure LED associated with the specified devices.
ica, degraded: Visualizes the In a Critical Array pattern.
rebuild, rebuild_p: Visualizes the Rebuild pattern. This supports both
of the rebuild states for compatibility and legacy reasons.
ifa, failed_array: Visualizes the In a Failed Array pattern.
hotspare: Visualizes the Hotspare pattern.
pfa: Visualizes the Predicted Failure Analysis pattern.
failure, disk_failed: Visualizes the Failure pattern.
SES-2 R/R ABORT
SES-2 REBUILD/REMAP
SES-2 IN FAILED ARRAY
SES-2 IN CRITICAL ARRAY
SES-2 CONS CHECK
SES-2 HOTSPARE
SES-2 RSVD DEVICE
SES-2 OK
SES-2 IDENT
SES-2 REMOVE
SES-2 INSERT
SES-2 MISSING
SES-2 DO NOT REMOVE
SES-2 ACTIVE
SES-2 ENABLE BYP B
SES-2 ENABLE BYP A
SES-2 DEVICE OFF
SES-2 FAULT
When a non-SES-2 pattern is sent to a device in an enclosure, the pattern is automatically translated to the SCSI Enclosure Services (SES) 2 pattern as shown in Table 12.1, “Translation between Non-SES-2 Patterns and SES-2 Patterns”.
| Non-SES-2 Pattern | SES-2 Pattern |
|---|---|
| locate | ses_ident |
| locate_off | ses_ident |
| normal | ses_ok |
| off | ses_ok |
| ica | ses_ica |
| degraded | ses_ica |
| rebuild | ses_rebuild |
| rebuild_p | ses_rebuild |
| ifa | ses_ifa |
| failed_array | ses_ifa |
| hotspare | ses_hotspare |
| pfa | ses_rsvd_dev |
| failure | ses_fault |
| disk_failed | ses_fault |
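The translation in Table 12.1 amounts to a simple lookup. The following Python sketch restates the table as a dictionary so that scripts can predict which SES-2 pattern an enclosure device will show. The helper name to_ses2 is hypothetical and is not part of ledctl, which performs this translation internally.

```python
# Non-SES-2 pattern name -> SES-2 pattern name, copied from Table 12.1.
NON_SES2_TO_SES2 = {
    "locate": "ses_ident",
    "locate_off": "ses_ident",
    "normal": "ses_ok",
    "off": "ses_ok",
    "ica": "ses_ica",
    "degraded": "ses_ica",
    "rebuild": "ses_rebuild",
    "rebuild_p": "ses_rebuild",
    "ifa": "ses_ifa",
    "failed_array": "ses_ifa",
    "hotspare": "ses_hotspare",
    "pfa": "ses_rsvd_dev",
    "failure": "ses_fault",
    "disk_failed": "ses_fault",
}


def to_ses2(pattern_name):
    """Return the SES-2 pattern for a pattern name (hypothetical helper).

    Names that already start with "ses_" are returned unchanged.
    Unknown names raise KeyError.
    """
    if pattern_name.startswith("ses_"):
        return pattern_name
    return NON_SES2_TO_SES2[pattern_name]
```

Note that several distinct non-SES-2 patterns collapse onto one SES-2 pattern, so the mapping cannot be reversed.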
When you issue the ledctl command, the LEDs of the
specified devices are set to the specified pattern and all other LEDs are
turned off. The list of devices can be provided in one of two formats:
A list of devices separated by a comma and no spaces
A list in curly braces with devices separated by a space
If you specify multiple patterns in the same command, the device list for each pattern can use the same or different format. For examples that show the two list formats, see Section 12.4.7, “Examples”.
A device is a path to a file in the /dev directory or
in the /sys/block directory. The path can identify a
block device, an MD software RAID device, or a container device. For a
software RAID device or a container device, the reported LED state is set
for all of the associated block devices.
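Because a pattern set on an MD device applies to every member disk, it can help to check beforehand which block devices belong to the array. One way to do this is to read the slaves directory that the kernel exposes for an MD device in sysfs. The following Python sketch assumes that layout. The helper md_member_devices is hypothetical, it does not describe how ledctl itself resolves members, and the sysfs_root parameter exists only so that the function can be tried against a test directory.

```python
import os


def md_member_devices(md_name, sysfs_root="/sys/block"):
    """List /dev paths of the block devices that make up an MD array.

    Reads <sysfs_root>/<md_name>/slaves. Returns an empty list if that
    directory does not exist (for example, for a plain disk).
    """
    slaves_dir = os.path.join(sysfs_root, md_name, "slaves")
    if not os.path.isdir(slaves_dir):
        return []
    return sorted("/dev/" + entry for entry in os.listdir(slaves_dir))
```

For example, md_member_devices("md127") lists the devices whose LEDs would change if a pattern were set on /dev/md127.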
The LEDs of devices listed in list_of_devices are set to the given pattern pattern_name and all other LEDs are turned off.
Sets the path to a local configuration file. If this option is specified, the global configuration file and the user configuration file have no effect.
Sets the path to a local log file. If this user-defined file is specified,
the global log file /var/log/ledmon.log is not
used.
Turns off all messages sent to stdout or
stderr. The messages are still logged to the local
file and the syslog facility.
Prints the command information to the console, then exits.
Displays the version of ledctl and information about the
license, then exits.
Global log file, used by all instances of the ledctl
application. To force logging to a user-defined file, use the
-l option.
User configuration file, shared between ledmon and
all ledctl application instances.
Global configuration file, shared between ledmon and
all ledctl application instances.
To locate a single block device:
ledctl locate=/dev/sda
To turn off the Locate LED for a single block device:
ledctl locate_off=/dev/sda
To locate disks of an MD software RAID device and to set a rebuild pattern for two of its block devices at the same time:
ledctl locate=/dev/md127 rebuild={ /sys/block/sd[a-b] }
To turn off the Status LED and Failure LED for the specified devices:
ledctl off={ /dev/sda /dev/sdb }
To locate three block devices:
ledctl locate=/dev/sda,/dev/sdb,/dev/sdc
ledctl locate={ /dev/sda /dev/sdb /dev/sdc }
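When ledctl command lines are generated by a script, the two device list formats can be produced by a small helper. The following Python sketch only builds the command line as text in the pattern_name=list_of_devices form shown in the examples above; it does not run ledctl. The function names device_list and ledctl_command are hypothetical.

```python
def device_list(devices, braces=False):
    """Format devices in one of the two documented list styles."""
    devices = list(devices)
    if not devices:
        raise ValueError("at least one device is required")
    if braces:
        # Curly braces with the devices separated by spaces.
        return "{ " + " ".join(devices) + " }"
    # Devices separated by commas, without spaces.
    return ",".join(devices)


def ledctl_command(assignments):
    """Build a ledctl command line as a string.

    `assignments` is a list of (pattern_name, devices, braces) tuples.
    Each pattern can use either list style independently.
    """
    parts = ["ledctl"]
    for pattern_name, devices, braces in assignments:
        parts.append(pattern_name + "=" + device_list(devices, braces))
    return " ".join(parts)
```

For instance, ledctl_command([("locate", ["/dev/sda", "/dev/sdb", "/dev/sdc"], False)]) yields the comma-separated form of the last example.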
The ledctl.conf file is the configuration file for the
Intel Enclosure LED Utilities. The utilities currently do not use a
configuration file. The name and location of the file have been reserved for
feature improvements.
See the following resources for details about the LED patterns and monitoring tools:
Storage area networks (SANs) can contain many disk drives that are dispersed across complex networks. This can make device discovery and device ownership difficult. iSCSI initiators must be able to identify storage resources in the SAN and determine whether they have access to them.
Internet Storage Name Service (iSNS) is a standards-based service that facilitates the automated discovery, management, and configuration of iSCSI devices on a TCP/IP network. iSNS provides intelligent storage discovery and management services comparable to those found in Fibre Channel networks.
iSNS should be used only in secure internal networks.
For an iSCSI initiator to discover iSCSI targets, it needs to identify which devices in the network are storage resources and what IP addresses it needs to access them. A query to an iSNS server returns a list of iSCSI targets and the IP addresses that the initiator has permission to access.
Using iSNS, you create iSNS discovery domains and discovery domain sets. You then group or organize iSCSI targets and initiators into discovery domains and group the discovery domains into discovery domain sets. By dividing storage nodes into domains, you can limit the discovery process of each host to the most appropriate subset of targets registered with iSNS, which allows the storage network to scale by reducing the number of unnecessary discoveries and by limiting the amount of time each host spends establishing discovery relationships. This lets you control and simplify the number of targets and initiators that must be discovered.
Both iSCSI targets and iSCSI initiators use iSNS clients to initiate transactions with iSNS servers by using the iSNS protocol. They then register device attribute information in a common discovery domain, download information about other registered clients, and receive asynchronous notification of events that occur in their discovery domain.
iSNS servers respond to queries and requests that iSNS clients make by using the iSNS protocol. They also initiate iSNS protocol state change notifications and store properly authenticated information submitted in registration requests in an iSNS database.
Some of the benefits provided by iSNS for Linux include:
Provides an information facility for registration, discovery, and management of networked storage assets.
Integrates with the DNS infrastructure.
Consolidates registration, discovery, and management of iSCSI storage.
Simplifies storage management implementations.
Improves scalability compared to other discovery methods.
The following scenario illustrates the benefits that iSNS provides:
Suppose you have a company that has 100 iSCSI initiators and 100 iSCSI targets. Depending on your configuration, all iSCSI initiators could potentially try to discover and connect to any of the 100 iSCSI targets. This could create discovery and connection difficulties. By grouping initiators and targets into discovery domains, you can prevent iSCSI initiators in one department from discovering the iSCSI targets in another department. The result is that the iSCSI initiators in a specific department only discover those iSCSI targets that are part of the department’s discovery domain.
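The effect of discovery domains in this scenario can be illustrated with a toy model. The following Python sketch is not an implementation of the iSNS protocol. It only shows the rule that an initiator sees those targets that share a discovery domain with it. The class name, method names, and node names are hypothetical.

```python
from collections import defaultdict


class DiscoveryDomains:
    """Toy model of iSNS discovery domains (not the iSNS protocol)."""

    def __init__(self):
        self._members = defaultdict(set)  # domain name -> node names
        self._targets = set()             # node names that are targets

    def add_node(self, domain, node, is_target=False):
        """Add a node to a domain; a node can be in several domains."""
        self._members[domain].add(node)
        if is_target:
            self._targets.add(node)

    def discover(self, initiator):
        """Return the targets that share a domain with the initiator."""
        visible = set()
        for nodes in self._members.values():
            if initiator in nodes:
                visible |= nodes & self._targets
        visible.discard(initiator)
        return sorted(visible)
```

With one domain per department, an initiator in the engineering domain gets only the engineering targets back, no matter how many targets other departments have registered.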
iSNS Server for Linux is included with SLES 10 SP2 and later, but is not
installed or configured by default. You must install the
isns and yast2-isns
packages and configure the iSNS service.
iSNS can be installed on the same server where iSCSI target or iSCSI initiator software is installed. Installing both the iSCSI target software and iSCSI initiator software on the same server is not supported.
To install iSNS for Linux:
Start YaST and select .
When prompted to install the isns package, click
.
Follow the install instructions to provide the SUSE Linux Enterprise Server 11 installation disks.
When the installation is complete, the iSNS Service configuration dialog box opens automatically to the tab.
In , specify the DNS name or IP address of the iSNS Server.
In , select one of the following:
When Booting: The iSNS service starts automatically on server startup.
Manually (Default):
The iSNS service must be started manually by entering rcisns
start or /etc/init.d/isns start at the
server console of the server where you install it.
Specify the following firewall settings:
Open Port in Firewall: Select the check box to open the firewall and allow access to the service from remote computers. The firewall port is closed by default.
Firewall Details: If you open the firewall port, the port is open on all network interfaces by default. Click to select interfaces on which to open the port, select the network interfaces to use, then click .
Click to apply the configuration settings and complete the installation.
Continue with Section 13.3, “Configuring iSNS Discovery Domains”.
In order for iSCSI initiators and targets to use the iSNS service, they must belong to a discovery domain.
The iSNS service must be installed and running to configure iSNS discovery domains. For information, see Section 13.4, “Starting iSNS”.
A default discovery domain named is automatically created when you install the iSNS service. The existing iSCSI targets and initiators that have been configured to use iSNS are automatically added to the default discovery domain.
To create a new discovery domain:
Start YaST and under , select .
Click the tab.
The area lists all discovery domains. You can create new discovery domains, or delete existing ones. Deleting a domain removes the members from the domain, but it does not delete the iSCSI node members.
The area lists all iSCSI nodes assigned to a selected discovery domain. Selecting a different discovery domain refreshes the list with members from that discovery domain. You can add and delete iSCSI nodes from a selected discovery domain. Deleting an iSCSI node removes it from the domain, but it does not delete the iSCSI node.
Creating an iSCSI node allows a node that is not yet registered to be added as a member of the discovery domain. When the iSCSI initiator or target registers this node, then it becomes part of this domain.
When an iSCSI initiator performs a discovery request, the iSNS service returns all iSCSI node targets that are members of the same discovery domain.
Click the button.
You can also select an existing discovery domain and click the button to remove that discovery domain.
Specify the name of the discovery domain you are creating, then click .
Continue with Section 13.3.2, “Creating iSNS Discovery Domain Sets”.
Discovery domains must belong to a discovery domain set. You can create a discovery domain and add nodes to that discovery domain, but it is not active and the iSNS service does not function unless you add the discovery domain to a discovery domain set. A default discovery domain set named is automatically created when you install iSNS and the default discovery domain is automatically added to that domain set.
To create a discovery domain set:
Start YaST and under , select .
Click the tab.
The area lists all of the discovery domain sets. A discovery domain must be a member of a discovery domain set in order to be active.
In an iSNS database, a discovery domain set contains discovery domains, which in turn contain iSCSI node members.
The area lists all discovery domains that are assigned to a selected discovery domain set. Selecting a different discovery domain set refreshes the list with members from that discovery domain set. You can add and delete discovery domains from a selected discovery domain set. Removing a discovery domain removes it from the domain set, but it does not delete the discovery domain.
Adding a discovery domain to a set allows an iSNS discovery domain that is not yet registered to be added as a member of the discovery domain set.
Click the button.
You can also select an existing discovery domain set and click the button to remove that discovery domain set.
Specify the name of the discovery domain set you are creating, then click .
Continue with Section 13.3.3, “Adding iSCSI Nodes to a Discovery Domain”.
Start YaST and under , select .
Click the tab.
Review the list of nodes to ensure that the iSCSI targets and initiators that you want to use the iSNS service are listed.
If an iSCSI target or initiator is not listed, you might need to restart
the iSCSI service on the node. You can do this by running the
rcopen-iscsi restart command to restart an initiator
or the rciscsitarget restart command to restart a
target.
You can select an iSCSI node and click the button to remove that node from the iSNS database. This is useful if you are no longer using an iSCSI node or have renamed it.
The iSCSI node is automatically added to the list (iSNS database) again when you restart the iSCSI service or reboot the server unless you remove or comment out the iSNS portion of the iSCSI configuration file.
Click the tab, select the desired discovery domain, then click the button.
Click , select the node you want to add to the domain, then click .
Repeat Step 5 for as many nodes as you want to add to the discovery domain, then click when you are finished adding nodes.
An iSCSI node can belong to more than one discovery domain.
Continue with Section 13.3.4, “Adding Discovery Domains to a Discovery Domain Set”.
Start YaST and under , select .
Click the tab.
Select to add a new set to the list of discovery domain sets.
Choose a discovery domain set to modify.
Click , select the discovery domain you want to add to the discovery domain set, then click .
Repeat the last step for as many discovery domains as you want to add to the discovery domain set, then click .
A discovery domain can belong to more than one discovery domain set.
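The containment hierarchy described in this section (discovery domain sets contain discovery domains, which contain iSCSI nodes) and the rule that a domain is active only if it belongs to a set can be sketched as plain dictionaries. This is an illustrative Python model with hypothetical names, not the layout of the iSNS database.

```python
def active_domains(domain_sets):
    """Return the names of domains that belong to at least one set.

    `domain_sets` maps a domain set name to an iterable of domain names.
    """
    active = set()
    for domains in domain_sets.values():
        active.update(domains)
    return active


def nodes_in_active_domains(domain_sets, domain_members):
    """Map each active domain to the sorted list of its member nodes.

    `domain_members` maps a domain name to an iterable of node names.
    Domains that are not in any set are left out.
    """
    return {
        domain: sorted(domain_members.get(domain, ()))
        for domain in sorted(active_domains(domain_sets))
    }
```

A domain that appears in domain_members but in no domain set is absent from the result, which mirrors the statement that such a domain is not active.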
iSNS must be started at the server where you install it. Enter one of the
following commands at a terminal console as the
root user:
rcisns start
/etc/init.d/isns start
You can also use the stop, status,
and restart options with iSNS.
iSNS can also be configured to start automatically each time the server is rebooted:
Start YaST and under , select .
With the tab selected, specify the IP address of your iSNS server, then click .
In the section of the screen, select .
You can also choose to start the iSNS server manually. You must then use
the rcisns start command to start the service each
time the server is restarted.
iSNS must be stopped at the server where it is running. Enter one of the
following commands at a terminal console as the
root user:
rcisns stop
/etc/init.d/isns stop
For information, see the Linux iSNS for iSCSI project. The electronic mailing list for this project is Linux iSNS - Discussion.
General information about iSNS is available in RFC 4171: Internet Storage Name Service.
One of the central tasks in computer centers and when operating servers is providing hard disk capacity for server systems. Fibre Channel is often used for this purpose. iSCSI (Internet SCSI) solutions provide a lower-cost alternative to Fibre Channel that can leverage commodity servers and Ethernet networking equipment. Linux iSCSI provides iSCSI initiator and target software for connecting Linux servers to central storage systems.
iSCSI is a storage networking protocol that facilitates data transfers of SCSI packets over TCP/IP networks between block storage devices and servers. iSCSI target software runs on the target server and defines the logical units as iSCSI target devices. iSCSI initiator software runs on different servers and connects to the target devices to make the storage devices available on that server.
It is not supported to run iSCSI target software and iSCSI initiator software on the same server in a production environment.
The iSCSI target and initiator servers communicate by sending SCSI packets at the IP level in your LAN. When an application running on the initiator server starts an inquiry for an iSCSI target device, the operating system produces the necessary SCSI commands. The SCSI commands are then embedded in IP packets and encrypted as necessary by software that is commonly known as the iSCSI initiator. The packets are transferred across the internal IP network to the corresponding iSCSI remote station, called the iSCSI target.
Many storage solutions provide access over iSCSI, but it is also possible to run a Linux server that provides an iSCSI target. In this case, it is important to set up a Linux server that is optimized for file system services. The iSCSI target accesses block devices in Linux. Therefore, you can use RAID solutions to increase disk space, and plenty of memory to improve data caching. For more information about RAID, see Chapter 8, Software RAID Configuration.
YaST includes entries for iSCSI Target and iSCSI Initiator software, but the packages are not installed by default.
It is not supported to run iSCSI target software and iSCSI initiator software on the same server in a production environment.
Install the iSCSI target software on the server where you want to create iSCSI target devices.
Launch YaST as the root user.
Select
When you are prompted to install the iscsitarget
package, click .
Follow the on-screen install instructions, and provide the installation media as needed.
When the installation is complete, YaST opens to the iSCSI Target Overview page with the tab selected.
Continue with Section 14.2, “Setting Up an iSCSI Target”.
Install the iSCSI initiator software on each server where you want to access the target devices that you set up on the iSCSI target server.
Launch YaST as the root user.
Select
When you are prompted to install the open-iscsi
package, click .
Follow the on-screen install instructions, and provide the installation media as needed.
When the installation is complete, YaST opens to the iSCSI Initiator Overview page with the tab selected.
Continue with Section 14.3, “Configuring iSCSI Initiator”.
SUSE Linux Enterprise Server comes with an open source iSCSI target solution that evolved from the Ardis iSCSI target. A basic setup can be done with YaST, but to take full advantage of iSCSI, a manual setup is required.
The iSCSI target configuration exports existing block devices to iSCSI initiators. You must prepare the storage space you want to use in the target devices by setting up unformatted partitions or devices by using the Partitioner in YaST, or by partitioning the devices from the command line. iSCSI LIO targets can use unformatted partitions with Linux, Linux LVM, or Linux RAID file system IDs.
After you set up a device or partition for use as an iSCSI target, you never access it directly via its local path. Do not specify a mount point for it when you create it.
Launch YaST as the root user.
Select .
Click to continue through the warning about using the Partitioner.
Click to create a partition, but do not format it, and do not mount it.
Select , then click .
Specify the amount of space to use, then click .
Select , then specify the file system ID type.
iSCSI targets can use unformatted partitions with Linux, Linux LVM, or Linux RAID file system IDs.
Select .
Click .
Repeat Step 4 for each area that you want to use later as an iSCSI LUN.
Click to keep your changes, then close YaST.
You can use a Xen guest server as the iSCSI target server. You must assign the storage space you want to use for the iSCSI storage devices to the guest virtual machine, then access the space as virtual disks within the guest environment. Each virtual disk can be a physical block device, such as an entire disk, partition, or volume, or it can be a file-backed disk image where the virtual disk is a single image file on a larger physical disk on the Xen host server. For the best performance, create each virtual disk from a physical disk or a partition. After you set up the virtual disks for the guest virtual machine, start the guest server, then configure the new blank virtual disks as iSCSI target devices by following the same process as for a physical server.
File-backed disk images are created on the Xen host server, then assigned
to the Xen guest server. By default, Xen stores file-backed disk images
in the
/var/lib/xen/images/vm_name
directory, where vm_name
is the name of the virtual machine.
For example, if you want to create the disk image
/var/lib/xen/images/vm_one/xen-0 with a size of 4
GB, first ensure that the directory is there, then create the image
itself.
Log in to the host server as the root user.
At a terminal console prompt, enter the following commands:
mkdir -p /var/lib/xen/images/vm_one
dd if=/dev/zero of=/var/lib/xen/images/vm_one/xen-0 seek=1M bs=4096 count=1
Assign the file system image to the guest virtual machine in the Xen configuration file.
Log in as the root user on the guest server,
then use YaST to set up the virtual block device by using the process
in Section 14.2.1.1, “Partitioning Devices”.
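The dd command in the procedure above produces a 4 GB image without writing 4 GB of data: it skips 1M blocks of 4096 bytes in the output file and then writes a single block, so the skipped region can remain unallocated on file systems that support sparse files. The following Python sketch reproduces only the size arithmetic. The helper names are hypothetical, and only the binary suffixes K, M, and G are handled.

```python
def dd_number(value):
    """Convert a dd-style number such as '1M' to an integer.

    Only the binary suffixes K, M, and G (powers of 1024) are handled.
    """
    multipliers = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    if value[-1] in multipliers:
        return int(value[:-1]) * multipliers[value[-1]]
    return int(value)


def dd_output_size(bs, seek, count):
    """Size in bytes of a new file written by dd with these operands.

    dd skips `seek` blocks of `bs` bytes in the output, then writes
    `count` blocks, so the file ends at (seek + count) * bs.
    """
    return (dd_number(seek) + dd_number(count)) * dd_number(bs)
```

For seek=1M, bs=4096, and count=1, the result is 4 GiB plus one 4096-byte block.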
To configure the iSCSI target, run the
module in YaST (command yast2 iscsi-server). The
configuration is split into three tabs. In the
tab, select the start mode and the firewall settings. If you want to
access the iSCSI target from a remote machine, select . If an iSNS server should manage the discovery and
access control, activate and enter
the IP address of your iSNS server. You cannot use hostnames or DNS names;
you must use the IP address. For more about iSNS, read
Chapter 13, iSNS for Linux.
The tab provides settings for the iSCSI server. The authentication set here is used for the discovery of services, not for accessing the targets. If you do not want to restrict the access to the discovery, use .
If authentication is needed, there are two possibilities to consider. requires the initiator to prove that it has permission to run a discovery on the iSCSI target. requires the iSCSI target to prove that it is the expected target. Therefore, the iSCSI target can also provide a user name and password. Find more information about authentication in RFC 3720.
The targets are defined in the tab. Use to create a new iSCSI target. The first dialog box asks for information about the device to export.
The line has a fixed syntax that looks like the following:
iqn.yyyy-mm.<reversed domain name>:unique_id
The name always starts with iqn. The yyyy-mm part is the date when this target is activated, given as year and month. Find more about naming conventions in RFC 3722.
The is freely selectable. It should follow some scheme to make the whole system more structured.
It is possible to assign several LUNs to a target. To do this, select a target in the tab, then click . Then, add new LUNs to an existing target.
Add the path to the block device or file system image to export.
The next menu configures the access restrictions of the target. The configuration is very similar to the configuration of the discovery authentication. In this case, at least an incoming authentication should be set up.
finishes the configuration of the new target, and brings you back to the overview page of the tab. Activate your changes by clicking .
To create a target device:
Launch YaST as the root user.
Select
YaST opens to the iSCSI Target Overview page with the tab selected.
In the area, select one of the following:
When booting: Automatically start the target service on subsequent server reboots.
Manually (default): Start the service manually.
If you are using iSNS for target advertising, select the check box, then type the IP address.
If desired, open the firewall ports to allow access to the server from remote computers.
Select the check box.
Specify the network interfaces where you want to open the port by clicking , selecting the check box next to a network interface to enable it, then clicking to accept the settings.
If authentication is required to connect to target devices you set up on this server, select the tab, deselect to enable authentication, then specify the necessary credentials for incoming and outgoing authentication.
The option is enabled by default. For a more secure configuration, you can specify authentication for incoming, outgoing, or both incoming and outgoing. You can also specify multiple sets of credentials for incoming authentication by adding pairs of user names and passwords to the list under .
Configure the iSCSI target devices.
Select the tab.
If you have not already done so, select and delete the example iSCSI target from the list, then confirm the deletion by clicking .
Click to add a new iSCSI target.
The iSCSI target automatically presents an unformatted partition or block device and completes the Target and Identifier fields.
You can accept this, or browse to select a different space.
You can also subdivide the space to create LUNs on the device by clicking and specifying sectors to allocate to that LUN. If you need additional options for these LUNs, select .
Click
Repeat Step 7.c to Step 7.e for each iSCSI target device you want to create.
(Optional) On the tab, click to export the information about the configured iSCSI targets to a file.
This makes it easier to later provide this information to consumers of the resources.
Click to create the devices, then click to restart the iSCSI software stack.
Configure an iSCSI target in /etc/ietd.conf. All
parameters in this file before the first Target
declaration are global for the file. Authentication information in this
portion has a special meaning—it is not global, but is used for the
discovery of the iSCSI target.
If you have access to an iSNS server, you should first configure the file to tell the target about this server. The address of the iSNS server must always be given as an IP address. You cannot specify the DNS name for the server. The configuration for this functionality looks like the following:
iSNSServer 192.168.1.111
iSNSAccessControl no
This configuration makes the iSCSI target register itself with the
iSNS server, which in turn provides the discovery for
initiators. For more about iSNS, see
Chapter 13, iSNS for Linux. Access control for
iSNS discovery is not supported, so keep iSNSAccessControl
set to no.
All direct iSCSI authentication can be done in two directions. The iSCSI
target can require the iSCSI initiator to authenticate with the
IncomingUser, which can be added multiple times. The
iSCSI initiator can also require the iSCSI target to authenticate. Use
OutgoingUser for this. Both have the same syntax:
IncomingUser <username> <password>
OutgoingUser <username> <password>
The authentication is followed by one or more target definitions. For each
target, add a Target section. This section always
starts with a Target identifier, followed by
definitions of logical unit numbers:
Target iqn.yyyy-mm.<reversed domain name>[:identifier]
Lun 0 Path=/dev/mapper/system-v3
Lun 1 Path=/dev/hda4
Lun 2 Path=/var/lib/xen/images/xen-1,Type=fileio
In the Target line, yyyy-mm is the
date when this target is activated, and identifier is
freely selectable. Find more about naming conventions in
RFC
3722. Three different block devices are exported in
this example. The first block device is a logical volume (see also
Chapter 4, LVM Configuration), the second is an IDE
partition, and the third is an image available in the local file system.
All these look like block devices to an iSCSI initiator.
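A typo in the target name is easy to make, so it can be useful to check the name against the iqn.yyyy-mm.<reversed domain name>[:identifier] layout before adding it to the configuration file. The following Python sketch uses a deliberately simplified regular expression. It is an assumption-based check of the layout described here, not a complete validation against the iSCSI naming rules in the RFCs, and parse_iqn is a hypothetical helper.

```python
import re

# Simplified pattern: "iqn.", a yyyy-mm date, a reversed domain name,
# and an optional ":identifier" suffix.
IQN_RE = re.compile(
    r"^iqn\.(?P<year>\d{4})-(?P<month>0[1-9]|1[0-2])"
    r"\.(?P<domain>[a-z0-9]+(?:[.-][a-z0-9]+)*)"
    r"(?::(?P<identifier>\S+))?$"
)


def parse_iqn(name):
    """Split an IQN into its parts, or return None if it does not match."""
    match = IQN_RE.match(name)
    if match is None:
        return None
    return match.groupdict()
```

A name without an identifier is accepted, and its identifier part is reported as None.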
Before activating the iSCSI target, add at least one
IncomingUser after the Lun definitions.
It does the authentication for the use of this target.
To activate all your changes, restart the iscsitarget daemon with
rciscsitarget restart. Check your
configuration in the /proc file system:
cat /proc/net/iet/volume
tid:1 name:iqn.2006-02.com.example.iserv:systems
lun:0 state:0 iotype:fileio path:/dev/mapper/system-v3
lun:1 state:0 iotype:fileio path:/dev/hda4
lun:2 state:0 iotype:fileio path:/var/lib/xen/images/xen-1
There are many more options that control the behavior of the iSCSI target.
For more information, see the man page of ietd.conf.
Active sessions are also displayed in the /proc file
system. For each connected initiator, an extra entry is added to
/proc/net/iet/session:
cat /proc/net/iet/session
tid:1 name:iqn.2006-02.com.example.iserv:system-v3
sid:562949957419520 initiator:iqn.2005-11.de.suse:cn=rome.example.com,01.9ff842f5645
cid:0 ip:192.168.178.42 state:active hd:none dd:none
sid:281474980708864 initiator:iqn.2006-02.de.suse:01.6f7259c88b70
cid:0 ip:192.168.178.72 state:active hd:none dd:none
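For monitoring scripts, the session listing can be turned into a nested structure of targets, sessions, and connections. The following Python sketch assumes that the file consists of whitespace-separated key:value fields laid out exactly as in the sample output above. The function name parse_iet_sessions is hypothetical.

```python
def parse_iet_sessions(text):
    """Parse /proc/net/iet/session text into a list of target dicts.

    Each target dict gets a "sessions" list, and each session dict gets
    a "connections" list. All values are kept as strings.
    """
    targets = []
    for line in text.splitlines():
        fields = {}
        for token in line.split():
            # Split on the first colon only; IQNs contain colons.
            key, _, value = token.partition(":")
            fields[key] = value
        if "tid" in fields:
            fields["sessions"] = []
            targets.append(fields)
        elif "sid" in fields and targets:
            fields["connections"] = []
            targets[-1]["sessions"].append(fields)
        elif "cid" in fields and targets and targets[-1]["sessions"]:
            targets[-1]["sessions"][-1]["connections"].append(fields)
    return targets
```

On a target server, the input would typically come from open("/proc/net/iet/session").read().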
When changes to the iSCSI target configuration are necessary, you must
always restart the target to activate changes that are done in the
configuration file. Unfortunately, all active sessions are interrupted in
this process. To maintain an undisturbed operation, the changes should be
done in the main configuration file /etc/ietd.conf,
but also made manually to the current configuration with the
administration utility ietadm.
To create a new iSCSI target with a LUN, first update your configuration file. The additional entry could be:
Target iqn.2006-02.com.example.iserv:system2
Lun 0 Path=/dev/mapper/system-swap2
IncomingUser joe secret
To set up this configuration manually, proceed as follows:
Create a new target with the command ietadm --op new --tid=2
--params Name=iqn.2006-02.com.example.iserv:system2.
Add a logical unit with ietadm --op new --tid=2 --lun=0
--params Path=/dev/mapper/system-swap2.
Set the user name and password combination on this target with
ietadm --op new --tid=2 --user
--params=IncomingUser=joe,Password=secret.
Check the configuration with cat
/proc/net/iet/volume.
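The manual steps above follow a fixed pattern, so the ietadm command lines can be generated from the same data that goes into the configuration file. The following Python sketch only produces the command lines as text, using exactly the option forms shown in the steps. It does not run ietadm, and the function name ietadm_create_target is hypothetical.

```python
def ietadm_create_target(tid, name, luns, incoming_users=()):
    """Return the ietadm command lines that add a target at run time.

    luns: dict mapping a LUN number to a device or image path.
    incoming_users: iterable of (user name, password) pairs.
    """
    commands = [
        "ietadm --op new --tid={} --params Name={}".format(tid, name)
    ]
    for lun, path in sorted(luns.items()):
        commands.append(
            "ietadm --op new --tid={} --lun={} --params Path={}".format(
                tid, lun, path
            )
        )
    for user, password in incoming_users:
        commands.append(
            "ietadm --op new --tid={} --user "
            "--params=IncomingUser={},Password={}".format(tid, user, password)
        )
    return commands
```

Generating the commands from one definition makes it less likely that the running configuration and /etc/ietd.conf drift apart.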
It is also possible to delete active connections. First, check all active
connections with the command cat /proc/net/iet/session.
This might look like:
cat /proc/net/iet/session
tid:1 name:iqn.2006-03.com.example.iserv:system
sid:281474980708864 initiator:iqn.1996-04.com.example:01.82725735af5
cid:0 ip:192.168.178.72 state:active hd:none dd:none
To delete the session with the session ID 281474980708864, use the command
ietadm --op delete --tid=1 --sid=281474980708864
--cid=0. Be aware that this makes the device inaccessible on the
client system and processes accessing this device are likely to hang.
ietadm can also be used to change various configuration parameters. Obtain
a list of the global variables with ietadm --op show --tid=1
--sid=0. The output looks like:
InitialR2T=Yes
ImmediateData=Yes
MaxConnections=1
MaxRecvDataSegmentLength=8192
MaxXmitDataSegmentLength=8192
MaxBurstLength=262144
FirstBurstLength=65536
DefaultTime2Wait=2
DefaultTime2Retain=20
MaxOutstandingR2T=1
DataPDUInOrder=Yes
DataSequenceInOrder=Yes
ErrorRecoveryLevel=0
HeaderDigest=None
DataDigest=None
OFMarker=No
IFMarker=No
OFMarkInt=Reject
IFMarkInt=Reject
All of these parameters can be easily changed. For example, if you want to change the maximum number of connections to two, use
ietadm --op update --tid=1 --params=MaxConnections=2.
In the file /etc/ietd.conf, the associated line
should look like MaxConnections 2.
The changes that you make with the ietadm utility are
not permanent for the system. These changes are lost at the next reboot
if they are not added to the /etc/ietd.conf
configuration file. Depending on the usage of iSCSI in your network, this
might lead to severe problems.
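To reduce the risk of losing a run-time change at the next reboot, a script can produce the ietadm command and the matching /etc/ietd.conf line together. The following Python sketch assumes the Key=Value output format and the "Key Value" configuration syntax shown above. Both function names are hypothetical, and nothing is executed or written to disk.

```python
def parse_ietadm_show(output):
    """Turn the Key=Value tokens of `ietadm --op show` output into a dict."""
    params = {}
    for token in output.split():
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
    return params


def update_and_persist(tid, key, value):
    """Return (ietadm command, ietd.conf line) for one parameter change."""
    command = "ietadm --op update --tid={} --params={}={}".format(
        tid, key, value
    )
    conf_line = "{} {}".format(key, value)
    return command, conf_line
```

Comparing parse_ietadm_show output before and after an update is a simple way to confirm that the change took effect.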
There are several more options available for the ietadm
utility. Use ietadm -h to find an overview. The
abbreviations there are target ID (tid), session ID (sid), and connection
ID (cid). They can also be found in
/proc/net/iet/session.
The iSCSI initiator, also called an iSCSI client, can be used to connect to any iSCSI target. This is not restricted to the iSCSI target solution explained in Section 14.2, “Setting Up an iSCSI Target”. The configuration of iSCSI initiator involves two major steps: the discovery of available iSCSI targets and the setup of an iSCSI session. Both can be done with YaST.
The iSCSI Initiator Overview in YaST is divided into three tabs:
Service: The tab can be used to enable the iSCSI initiator at boot time. It also lets you set a unique initiator name and an iSNS server to use for the discovery. The default port for iSNS is 3205.
Connected Targets: The tab gives an overview of the currently connected iSCSI targets. Like the Discovered Targets tab, it also gives the option to add new targets to the system.
On this page, you can select a target device, then toggle the start-up setting for each iSCSI target device:
Automatic: This option is used for iSCSI targets that are to be connected when the iSCSI service itself starts up. This is the typical configuration.
Onboot:
This option is used for iSCSI targets that are to be connected during
boot; that is, when root (/) is on iSCSI. As
such, the iSCSI target device will be evaluated from the initrd on
server boots.
Discovered Targets: The tab lets you manually discover iSCSI targets in the network.
Launch YaST as the root user.
Open the iSCSI Initiator module (you can also run the yast2
iscsi-client command).
YaST opens to the iSCSI Initiator Overview page with the Service tab selected.
In the area, select one of the following:
When booting: Automatically start the initiator service on subsequent server reboots.
Manually (default): Start the service manually.
Specify or verify the initiator name.
Specify a well-formed iSCSI qualified name (IQN) for the iSCSI initiator on this server. The initiator name must be globally unique on your network. The IQN uses the following general format:
iqn.yyyy-mm.com.mycompany:n1:n2
where n1 and n2 are alphanumeric characters. For example:
iqn.1996-04.de.suse:01:9c83a3e15f64
The initiator name is automatically completed with
the corresponding value from the
/etc/iscsi/initiatorname.iscsi file on the server.
If the server has iBFT (iSCSI Boot Firmware Table) support, the initiator name is completed with the corresponding value in the iBFT, and you are not able to change the initiator name in this interface. Use the BIOS Setup to modify it instead. The iBFT is a block of information containing various parameters useful to the iSCSI boot process, including the iSCSI target and initiator descriptions for the server.
Use either of the following methods to discover iSCSI targets on the network.
iSNS: To use iSNS (Internet Storage Name Service) for discovering iSCSI targets, continue with Section 14.3.1.2, “Discovering iSCSI Targets by Using iSNS”.
Discovered Targets: To discover iSCSI target devices manually, continue with Section 14.3.1.3, “Discovering iSCSI Targets Manually”.
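Before entering an initiator name, it can help to check that it has the general IQN shape described above. The following is a rough sketch only: the helper name is_iqn and the regular expression are assumptions made for illustration, the check is looser than the full iSCSI naming rules, and it cannot tell whether the name is unique on your network.

```shell
# Rough shape check for an iSCSI qualified name (IQN):
#   iqn.<yyyy-mm>.<reversed domain>[:<suffix>]
# is_iqn is a hypothetical helper; it does not prove uniqueness.
is_iqn() {
    printf '%s\n' "$1" |
        grep -Eq '^iqn\.[0-9]{4}-[0-9]{2}\.[a-z0-9.-]+(:.+)?$'
}

for name in iqn.1996-04.de.suse:01:9c83a3e15f64 not-an-iqn; do
    if is_iqn "$name"; then
        echo "$name: looks like an IQN"
    else
        echo "$name: not a well-formed IQN"
    fi
done
```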
Before you can use this option, you must have already installed and configured an iSNS server in your environment. For information, see Chapter 13, iSNS for Linux.
In YaST, select iSCSI Initiator, then select the Service tab.
Specify the IP address of the iSNS server and port.
The default port is 3205.
On the iSCSI Initiator Overview page, click to save and apply your changes.
Repeat the following process for each of the iSCSI target servers that you want to access from the server where you are setting up the iSCSI initiator.
In YaST, select iSCSI Initiator, then select the Discovered Targets tab.
Click to open the iSCSI Initiator Discovery dialog box.
Enter the IP address and change the port if needed. IPv6 addresses are supported.
The default port is 3260.
If authentication is required, deselect the option that disables authentication, then specify the credentials for incoming or outgoing authentication.
Click to start the discovery and connect to the iSCSI target server.
If credentials are required, after a successful discovery, use to activate the target.
You are prompted for authentication credentials to use the selected iSCSI target.
Click to finish the configuration.
If everything went well, the target now appears in Connected Targets.
The virtual iSCSI device is now available.
On the iSCSI Initiator Overview page, click to save and apply your changes.
You can find the local device path for the iSCSI target device by using
the lsscsi command:
lsscsi
[1:0:0:0] disk IET VIRTUAL-DISK 0 /dev/sda
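If several SCSI devices are present, the device node of the iSCSI disk can be filtered out of the lsscsi listing. This sketch assumes the column layout and the IET vendor string of the sample output above; targets from other vendors, or vendor strings that contain spaces, need a different filter. The helper name iet_devices is hypothetical.

```shell
# Print the device node (last column) of lsscsi lines whose vendor
# column is IET, as in the sample output above.
iet_devices() {
    awk '$3 == "IET" { print $NF }'
}

# Sample line standing in for real lsscsi output.
printf '%s\n' '[1:0:0:0] disk IET VIRTUAL-DISK 0 /dev/sda' | iet_devices
```

On a live system, the function would be used as lsscsi | iet_devices.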
In YaST, select iSCSI Initiator, then select the Connected Targets tab to view a list of the iSCSI target devices that are currently connected to the server.
Select the iSCSI target device that you want to manage.
Click to modify the setting:
Automatic: This option is used for iSCSI targets that are to be connected when the iSCSI service itself starts up. This is the typical configuration.
Onboot:
This option is used for iSCSI targets that are to be connected
during boot; that is, when root (/) is on
iSCSI. As such, the iSCSI target device will be evaluated from the
initrd on server boots.
Click to save and apply your changes.
Both the discovery and the configuration of iSCSI connections require a
running iscsid. When running the discovery the first time, the internal
database of the iSCSI initiator is created in the directory
/var/lib/open-iscsi.
If your discovery is password protected, provide the authentication
information to iscsid. Because the internal database does not exist when
doing the first discovery, it cannot be used at this time. Instead, the
configuration file /etc/iscsid.conf must be edited to
provide the information. To add your password information for the
discovery, add the following lines to the end of
/etc/iscsid.conf:
discovery.sendtargets.auth.authmethod = CHAP
discovery.sendtargets.auth.username = <username>
discovery.sendtargets.auth.password = <password>
The discovery stores all received values in an internal persistent database. In addition, it displays all detected targets. Run this discovery with the following command:
iscsiadm -m discovery --type=st --portal=<targetip>
The output should look like the following:
10.44.171.99:3260,1 iqn.2006-02.com.example.iserv:systems
To discover the available targets on an iSNS server, use
the following command:
iscsiadm --mode discovery --type isns --portal <targetip>
For each target defined on the iSCSI target, one line appears. For more information about the stored data, see Section 14.3.3, “The iSCSI Client Databases”.
The special --login option of iscsiadm
creates all needed devices:
iscsiadm -m node -n iqn.2006-02.com.example.iserv:systems --login
The newly generated devices show up in the output of
lsscsi and can now be accessed by mount.
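When a discovery returns several targets, the login command for each one can be derived from the discovery output. The following sketch assumes the record layout shown above (portal, a comma, the portal group tag, then the target name) and only prints the commands instead of running them. The helper name login_cmds is hypothetical; the --targetname and -p options are the ones used elsewhere in this chapter.

```shell
# Turn discovery records ("<ip>:<port>,<tpgt> <target name>") into
# iscsiadm login commands. The commands are printed, not executed.
# login_cmds is a hypothetical helper name.
login_cmds() {
    awk 'NF == 2 {
        split($1, portal, ",")
        print "iscsiadm -m node --targetname " $2 " -p " portal[1] " --login"
    }'
}

# Sample record taken from the discovery output shown above.
printf '%s\n' '10.44.171.99:3260,1 iqn.2006-02.com.example.iserv:systems' |
    login_cmds
```

Printing the commands first makes it possible to review them before any session is opened.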
All information that was discovered by the iSCSI initiator is stored in
two database files that reside in
/var/lib/open-iscsi. There is one database for the
discovery of targets and one for the discovered nodes. When accessing a
database, you first must select if you want to get your data from the
discovery or from the node database. Do this with the -m
discovery and -m node parameters of
iscsiadm. Using iscsiadm just with
one of these parameters gives an overview of the stored records:
iscsiadm -m discovery
10.44.171.99:3260,1 iqn.2006-02.com.example.iserv:systems
The target name in this example is
iqn.2006-02.com.example.iserv:systems. This name is
needed for all actions that relate to this special data set. To examine
the content of the data record with the ID
iqn.2006-02.com.example.iserv:systems, use the
following command:
iscsiadm -m node --targetname iqn.2006-02.com.example.iserv:systems
node.name = iqn.2006-02.com.example.iserv:systems
node.transport_name = tcp
node.tpgt = 1
node.active_conn = 1
node.startup = manual
node.session.initial_cmdsn = 0
node.session.reopen_max = 32
node.session.auth.authmethod = CHAP
node.session.auth.username = joe
node.session.auth.password = ********
node.session.auth.username_in = <empty>
node.session.auth.password_in = <empty>
node.session.timeo.replacement_timeout = 0
node.session.err_timeo.abort_timeout = 10
node.session.err_timeo.reset_timeout = 30
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
....
To edit the value of one of these variables, use the command
iscsiadm with the update operation.
For example, if you want iscsid to log in to the iSCSI target when it
initializes, set the variable node.startup to the value
automatic:
iscsiadm -m node -n iqn.2006-02.com.example.iserv:systems -p ip:port --op=update --name=node.startup --value=automatic
Remove obsolete data sets with the delete operation. If
the target iqn.2006-02.com.example.iserv:systems is no
longer a valid record, delete this record with the following command:
iscsiadm -m node -n iqn.2006-02.com.example.iserv:systems -p ip:port --op=delete
Use this option with caution because it deletes the record without any additional confirmation prompt.
To get a list of all discovered targets, run the iscsiadm -m
node command.
Booting from an iSCSI disk on i386, x86_64, and ppc64 architectures is supported when iSCSI-enabled firmware is used.
To use iSCSI disks during installation, it is necessary to add the following parameter to the boot option line:
withiscsi=1
During installation, an additional screen appears that provides the option to attach iSCSI disks to the system and use them in the installation process.
iSCSI devices will appear asynchronously during the boot process. While
the initrd guarantees that those devices are set up correctly for the root
file system, there are no such guarantees for any other file systems or
mount points like /usr. Hence any system mount points
like /usr or /var are not
supported. If you want to use those devices, ensure correct
synchronisation of the respective services and devices.
In SLES 10, you could add the hotplug option to your
device in the /etc/fstab file to mount iSCSI targets.
For example:
/dev/disk/by-uuid-blah /oracle/db ext3 hotplug,rw 0 2
For SLES 11, the hotplug option no longer works. Use
the nofail option instead. For example:
/dev/sdb1 /mnt/mountpoint ext3 acl,user,nofail 0 0
For information, see TID 7004427: /etc/fstab entry does not mount iSCSI device on boot up.
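A quick way to review entries for iSCSI devices is to check whether the mount options field contains nofail. The following sketch assumes the usual whitespace-separated /etc/fstab layout with the options in the fourth field. The helper name has_nofail is hypothetical, and the two sample entries are the ones shown above.

```shell
# Succeed if an fstab entry read on stdin lists "nofail" among its
# mount options (field 4). has_nofail is a hypothetical helper name.
has_nofail() {
    awk '{
        n = split($4, opt, ",")
        for (i = 1; i <= n; i++)
            if (opt[i] == "nofail") found = 1
    }
    END { exit !found }'
}

for entry in \
    '/dev/sdb1 /mnt/mountpoint ext3 acl,user,nofail 0 0' \
    '/dev/disk/by-uuid-blah /oracle/db ext3 hotplug,rw 0 2'
do
    if printf '%s\n' "$entry" | has_nofail; then
        echo "ok: $entry"
    else
        echo "missing nofail: $entry"
    fi
done
```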
A firewall might drop packets if it gets too busy. The default for the SUSE Firewall is to drop packets after three minutes. If you find that iSCSI traffic packets are being dropped, consider configuring the SUSE Firewall to queue packets instead of dropping them when it gets too busy.
Use the troubleshooting tips in this section when using LVM on iSCSI targets.
When you set up the iSCSI Initiator, ensure that you enable discovery at boot time so that udev can discover the iSCSI devices at boot time and set up the devices to be used by LVM.
Remember that udev provides the default setup for
devices in SLES 11. Ensure that all of the applications that create
devices have a Runlevel setting to run at boot so that
udev can recognize and assign devices for them at
system startup. If the application or service is not started until later,
udev does not create the device automatically as it
would at boot time.
You can check your runlevel settings for LVM2 and iSCSI in YaST. The following services should be enabled at boot (B):
| boot.lvm |
| boot.open-iscsi |
| open-iscsi |
When Open-iSCSI starts, it can mount the targets even if you manually
set the node.startup option to manual in the
/etc/iscsi/iscsid.conf file.
Check the
/etc/iscsi/nodes/<target_name>/<ip_address,port>/default
file. It contains a node.startup setting that overrides
the /etc/iscsi/iscsid.conf file. Setting the mount
option to manual by using the YaST interface also sets the
node.startup = manual in the
/etc/iscsi/nodes/<target_name>/<ip_address,port>/default
files.
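To see which startup value is actually stored for a node, the record file can be read directly. This is a minimal sketch that assumes the "name = value" line format shown in the iscsiadm node output earlier in this chapter. The helper name node_startup is hypothetical, and the demonstration uses a temporary stand-in file rather than a real node record.

```shell
# Print the node.startup value stored in a node record file.
# node_startup is a hypothetical helper; pass the path of a
# .../default record file as the only argument.
node_startup() {
    awk -F ' = ' '$1 == "node.startup" { print $2 }' "$1"
}

# Demonstration on a temporary stand-in file, not a real node record.
record=$(mktemp)
printf '%s\n' 'node.name = iqn.2006-02.com.example.iserv:systems' \
              'node.startup = manual' > "$record"
node_startup "$record"
rm -f "$record"
```

On a real system, pass the path of the default file for the target and portal in question.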
The iSCSI protocol has been available for several years. There are many reviews and additional documentation comparing iSCSI with SAN solutions, doing performance benchmarks, or just describing hardware solutions. Important sources of more information about open-iscsi are:
The online documentation and manpages for iscsiadm,
iscsid, ietd.conf, and
ietd and the example configuration file
/etc/iscsid.conf.
LIO (linux-iscsi.org) is the standard open-source multiprotocol SCSI target for Linux. LIO replaced the STGT (SCSI Target) framework as the standard unified storage target in Linux with Linux kernel version 2.6.38 and later. YaST supports the iSCSI LIO Target Server software in SUSE Linux Enterprise Server 11 SP3 and later.
This section describes how to use YaST to configure an iSCSI LIO Target Server and set up iSCSI LIO target devices. You can use any iSCSI initiator software to access the target devices.
Use the YaST Software Management tool to install the iSCSI LIO Target Server software on the SUSE Linux Enterprise Server server where you want to create iSCSI LIO target devices.
Launch YaST as the root user.
Select
Select the tab, type lio,
then click .
Select the iSCSI LIO Target Server packages:
| iSCSI LIO Target Server Packages | Description |
|---|---|
| | Provides a GUI interface in YaST for the configuration of iSCSI LIO target devices. |
| lio-utils | Provides APIs for configuring and controlling iSCSI LIO target devices that are used by YaST. |
| | Provides SNMP (Simple Network Management Protocol) monitoring of iSCSI LIO target devices by using the dynamic load module. For information about Net-SNMP, see the open source Net-SNMP Project. |
| | Provides debug information for the |
In the lower right corner of the dialog box, click to install the selected packages.
When you are prompted to approve the automatic changes, click
to accept the iSCSI LIO Target Server
dependencies for the lio-utils,
perl-SNMP, and net-snmp
packages.
Close and re-launch YaST, then click and verify that the option is available in the menu.
Continue with Section 15.2, “Starting the iSCSI LIO Target Service”.
The iSCSI LIO Target service is by default configured to be started manually. You can configure the service to start automatically on system restart. If you use a firewall on the server and you want the iSCSI LIO targets to be available to other computers, you must open a port in the firewall for each adapter that you want to use for target access. TCP port 3260 is the port number for the iSCSI protocol, as defined by IANA (Internet Assigned Numbers Authority).
To configure the iSCSI LIO Target Server service settings:
Log in to the iSCSI LIO target server as the
root user, then launch a terminal console.
Ensure that the /etc/init.d/target daemon is
running. At the command prompt, enter
/etc/init.d/target start
The command returns a message to confirm that the daemon is started, or that the daemon is already running.
Launch YaST as the root user.
In the YaST Control Center, select , then select .
You can also search for “lio”, then select .
In the iSCSI LIO Target Overview dialog box, select the tab.
Under , specify how you want the iSCSI LIO target service to be started:
When Booting: The service starts automatically on server restart.
Manually: (Default) You must start the service manually after a server restart. The target devices are not available until you start the service.
If you use a firewall on the server and you want the iSCSI LIO targets to be available to other computers, open a port in the firewall for each adapter interface that you want to use for target access.
Firewall settings are disabled by default. They are not needed unless you deploy a firewall on the server. The default port number is 3260. If the port is closed for all of the network interfaces, the iSCSI LIO targets are not available to other computers.
On the tab, select the check box to enable the firewall settings.
Click to view or configure the network interfaces to use.
All available network interfaces are listed, and all are selected by default.
For each interface, specify whether to open or close a port for it:
Open: Select the interface’s check box to open the port. You can also click to open a port on all of the interfaces.
Close: Deselect the interface’s check box to close the port. You can also click to close the port on all of the interfaces.
Click to save and apply your changes.
If you are prompted to confirm the settings, click to continue, or click to return to the dialog box and make the desired changes.
Click to save and apply the iSCSI LIO Target service settings.
Log in as the root user, then launch a terminal
console.
At the command prompt, enter
/etc/init.d/target start
The command returns a message to confirm that the daemon is started, or that the daemon is already running.
The iSCSI LIO Target Server software supports the PPP-CHAP (Point-to-Point Protocol Challenge Handshake Authentication Protocol), a three-way authentication method defined in the Internet Engineering Task Force (IETF) RFC 1994. iSCSI LIO Target Server uses this authentication method for the discovery of iSCSI LIO targets and clients, not for accessing files on the targets. If you do not want to restrict access to discovery, leave authentication disabled, which is the default. If authentication for discovery is enabled, its settings apply to all iSCSI LIO target groups.
We recommend that you use authentication for target and client discovery in production environments.
If authentication is needed for a more secure configuration, you can use incoming authentication, outgoing authentication, or both. Incoming authentication requires an iSCSI initiator to prove that it has the permissions to run a discovery on the iSCSI LIO target. The initiator must provide the incoming user name and password. Outgoing authentication requires the iSCSI LIO target to prove to the initiator that it is the expected target. The iSCSI LIO target must provide the outgoing user name and password to the iSCSI initiator. The user name and password pair can be different for incoming and outgoing discovery.
To configure authentication preferences for iSCSI LIO targets:
Log in to the iSCSI LIO target server as the
root user, then launch a terminal console.
Ensure that the /etc/init.d/target daemon is
running. At the command prompt, enter
/etc/init.d/target start
The command returns a message to confirm that the daemon is started, or that the daemon is already running.
Launch YaST as the root user.
In the YaST Control Center, select , then select .
You can also search for “lio”, then select .
In the iSCSI LIO Target Overview dialog box, select the tab to configure the authentication settings. Authentication settings are disabled by default.
Specify whether to require authentication for iSCSI LIO targets:
Disable authentication: (Default) Select the check box to disable incoming and outgoing authentication for discovery on this server. All iSCSI LIO targets on this server can be discovered by any iSCSI initiator client on the same network. This server can discover any iSCSI initiator client on the same network that does not require authentication for discovery. Skip Step 7 and continue with Step 8.
Enable authentication: Deselect the check box. The check boxes for both incoming and outgoing authentication are automatically selected. Continue with Step 7.
Configure the authentication credentials needed for incoming discovery, outgoing discovery, or both. The user name and password pair can be different for incoming and outgoing discovery.
Configure incoming authentication by doing one of the following:
Disable incoming authentication: Deselect the check box. All iSCSI LIO targets on this server can be discovered by any iSCSI initiator client on the same network.
Enable incoming authentication: Select the check box, then specify an existing user name and password pair to use for incoming discovery of iSCSI LIO targets.
Configure outgoing authentication by doing one of the following:
Disable outgoing authentication: Deselect the check box. This server can discover any iSCSI initiator client on the same network that does not require authentication for discovery.
Enable outgoing authentication: Select the check box, then specify an existing user name and password pair to use for outgoing discovery of iSCSI initiator clients.
Click to save and apply the settings.
The iSCSI LIO target configuration exports existing block devices to iSCSI initiators. You must prepare the storage space you want to use in the target devices by setting up unformatted partitions or devices on the server. iSCSI LIO targets can use unformatted partitions with Linux, Linux LVM, or Linux RAID file system IDs.
After you set up a device or partition for use as an iSCSI target, you never access it directly via its local path. Do not specify a mount point for it when you create it.
Launch YaST as the root user.
In YaST, select .
Click to continue through the warning about using the Partitioner.
At the bottom of the Partitions page, click to create a partition, but do not format it, and do not mount it.
On the Expert Partitioner page, select ,
then select the leaf node name (such as sdc) of
the disk you want to configure.
Select , then click .
Specify the amount of space to use, then click .
Under , select , then select the file system ID type from the drop-down list.
iSCSI LIO targets can use unformatted partitions with Linux (0x83), Linux LVM (0x8E), or Linux RAID (0xFD) file system IDs.
Under , select .
Click .
Repeat Step 4 to create an unformatted partition for each area that you want to use later as an iSCSI LIO target.
Click to keep your changes, then close YaST.
You can use a virtual machine guest server as an iSCSI LIO Target Server. This section describes how to assign partitions to a Xen virtual machine. You can also use other virtual environments that are supported by SUSE Linux Enterprise Server 11 SP2 or later.
In a Xen virtual environment, you must assign the storage space you want to use for the iSCSI LIO target devices to the guest virtual machine, then access the space as virtual disks within the guest environment. Each virtual disk can be a physical block device, such as an entire disk, partition, or volume, or it can be a file-backed disk image where the virtual disk is a single image file on a larger physical disk on the Xen host server. For the best performance, create each virtual disk from a physical disk or a partition. After you set up the virtual disks for the guest virtual machine, start the guest server, then configure the new blank virtual disks as iSCSI target devices by following the same process as for a physical server.
File-backed disk images are created on the Xen host server, then assigned
to the Xen guest server. By default, Xen stores file-backed disk images in
the
/var/lib/xen/images/vm_name
directory, where vm_name
is the name of the virtual machine.
For example, if you want to create the disk image
/var/lib/xen/images/vm_one/xen-0 with a size of 4 GB,
first ensure that the directory is there, then create the image itself.
Log in to the host server as the root user.
At a terminal console prompt, enter the following commands:
mkdir -p /var/lib/xen/images/vm_one
dd if=/dev/zero of=/var/lib/xen/images/vm_one/xen-0 seek=1M bs=4096 count=1
Assign the file system image to the guest virtual machine in the Xen configuration file.
Log in as the root user on the guest server,
then use YaST to set up the virtual block device by using the process
in Section 15.4.1, “Partitioning Devices”.
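The dd command in the procedure above does not write 4 GB of zeros. It seeks past 1M blocks of 4096 bytes and writes a single block, so the apparent file size is (1048576 + 1) × 4096 bytes, that is, 4 GiB plus one 4 KiB block. On file systems that support sparse files, the skipped range occupies no disk space until it is written. The following scaled-down sketch illustrates the same technique in a temporary directory; the file name demo-image and the smaller seek value are choices made for this illustration.

```shell
# Create a small sparse image the same way as above, scaled down:
# seek past 1024 blocks of 4096 bytes, then write one zeroed block.
dir=$(mktemp -d)
img="$dir/demo-image"
dd if=/dev/zero of="$img" seek=1024 bs=4096 count=1 2>/dev/null

# Apparent size in bytes: (seek + count) * bs.
size=$(( $(wc -c < "$img") ))
echo "apparent size: $size bytes"
rm -rf "$dir"
```

With these values, the apparent size is (1024 + 1) × 4096 = 4198400 bytes.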
You can use YaST to configure iSCSI LIO target devices. YaST uses APIs
provided by the lio-utils software. iSCSI LIO targets
can use unformatted partitions with Linux, Linux LVM, or Linux RAID file
system IDs.
Before you begin, create the unformatted partitions that you want to use as iSCSI LIO targets as described in Section 15.4, “Preparing the Storage Space”.
Log in to the iSCSI LIO target server as the
root user, then launch a terminal console.
Ensure that the /etc/init.d/target daemon is
running. At the command prompt, enter
/etc/init.d/target start
The command returns a message to confirm that the daemon is started, or that the daemon is already running.
Launch YaST as the root user.
In the YaST Control Center, select , then select .
You can also search for “lio”, then select .
In the iSCSI LIO Target Overview dialog box, select the tab to configure the targets.
Click , then define a new iSCSI LIO target group and devices:
The iSCSI LIO Target software automatically completes the , , , , and fields. is selected by default.
If you have multiple network interfaces, use the IP address drop-down list to select the IP address of the network interface to use for this target group.
Select if you want to require client authentication for this target group.
Requiring authentication is recommended in a production environment.
Click , browse to select the device or partition, specify a name, then click .
The LUN number is automatically generated, beginning with 0. A name is automatically generated if you leave the field empty.
(Optional) Repeat Step 6.a through Step 6.c to add more targets to this target group.
After all desired targets have been added to the group, click .
On the Modify iSCSI Target Client Setup page, configure information for the clients that are permitted to access LUNs in the target group:
After you specify at least one client for the target group, the , , , and buttons are enabled. You can use or to add more clients for the target group.
Add: Add a new client entry for the selected iSCSI LIO target group.
Edit LUN: Configure which LUNs in the iSCSI LIO target group to map to a selected client. You can map each of the allocated targets to a preferred client LUN.
Edit Auth: Configure the preferred authentication method for a selected client. You can specify no authentication, or you can configure incoming authentication, outgoing authentication, or both.
Delete: Remove a selected client entry from the list of clients allocated to the target group.
Copy: Add a new client entry with the same LUN mappings and authentication settings as a selected client entry. This allows you to easily allocate the same shared LUNs, in turn, to each node in a cluster.
Click , specify the client name, select or deselect the check box, then click to save the settings.
Select a client entry, click , modify the LUN mappings to specify which LUNs in the iSCSI LIO target group to allocate to the selected client, then click to save the changes.
If the iSCSI LIO target group consists of multiple LUNs, you can allocate one or multiple LUNs to the selected client. By default, each of the available LUNs in the group is assigned to a Client LUN.
To modify the LUN allocation, perform one or more of the following actions:
Add: Click to create a new entry, then use the drop-down list to map a Target LUN to it.
Delete: Select the entry, then click to remove a Target LUN mapping.
Change: Select the entry, then use the drop-down list to select which Target LUN to map to it.
Typical allocation plans include the following:
A single server is listed as a client. All of the LUNs in the target group are allocated to it.
You can use this grouping strategy to logically group the iSCSI SAN storage for a given server.
Multiple independent servers are listed as clients. One or multiple target LUNs are allocated to each server. Each LUN is allocated to only one server.
You can use this grouping strategy to logically group the iSCSI SAN storage for a given department or service category in the data center.
Each node of a cluster is listed as a client. All of the shared target LUNs are allocated to each node. All nodes are attached to the devices, but for most file systems, the cluster software locks a device for access and mounts it on only one node at a time. Shared file systems (such as OCFS2) make it possible for multiple nodes to concurrently mount the same file structure and to open the same files with read and write access.
You can use this grouping strategy to logically group the iSCSI SAN storage for a given server cluster.
Select a client entry, click , specify the authentication settings for the client, then click to save the settings.
You can require no authentication, or you can configure incoming authentication, outgoing authentication, or both. You can specify only one user name and password pair for each client. The credentials can be different for incoming and outgoing authentication for a client. The credentials can be different for each client.
Repeat Step 7.a through Step 7.c for each iSCSI client that can access this target group.
After the client assignments are configured, click .
Click to save and apply the settings.
You can modify an existing iSCSI LIO target group as follows:
Add or remove target LUN devices from a target group
Add or remove clients for a target group
Modify the client LUN-to-target LUN mappings for a client of a target group
Modify the user name and password credentials for a client authentication (incoming, outgoing, or both)
To view or modify the settings for an iSCSI LIO target group:
Log in to the iSCSI LIO target server as the
root user, then launch a terminal console.
Ensure that the /etc/init.d/target daemon is
running. At the command prompt, enter
/etc/init.d/target start
The command returns a message to confirm that the daemon is started, or that the daemon is already running.
Launch YaST as the root user.
In the YaST Control Center, select , then select .
You can also search for “lio”, then select .
In the iSCSI LIO Target Overview dialog box, select the tab to view a list of target groups.
Select the iSCSI LIO target group to be modified, then click .
On the Modify iSCSI Target LUN Setup page, add LUNs to the target group, edit the LUN assignments, or remove target LUNs from the group. After all desired changes have been made to the group, click .
For option information, see Step 6 in Section 15.5, “Setting Up an iSCSI LIO Target Group”.
On the Modify iSCSI Target Client Setup page, configure information for the clients that are permitted to access LUNs in the target group. After all desired changes have been made to the group, click .
For option information, see Step 7 in Section 15.5, “Setting Up an iSCSI LIO Target Group”.
Click to save and apply the settings.
Deleting an iSCSI LIO target group removes the definition of the group, and the related setup for clients, including LUN mappings and authentication credentials. It does not destroy the data on the partitions. To give clients access again, you can allocate the target LUNs to a different or new target group, and configure the client access for them.
Log in to the iSCSI LIO target server as the
root user, then launch a terminal console.
Ensure that the /etc/init.d/target daemon is
running. At the command prompt, enter
/etc/init.d/target start
The command returns a message to confirm that the daemon is started, or that the daemon is already running.
Launch YaST as the root user.
In the YaST Control Center, select , then select .
You can also search for “lio”, then select .
In the iSCSI LIO Target Overview dialog box, select the tab to view a list of configured target groups.
Select the iSCSI LIO target group to be deleted, then click .
When you are prompted, click to confirm the deletion, or click to cancel it.
Click to save and apply the settings.
This section describes some known issues and possible solutions for iSCSI LIO Target Server.
When adding or editing an iSCSI LIO target group, you get an error:
Problem setting network portal <ip_address>:3260
The /var/log/YaST2/y2log log file contains the
following error:
find: `/sys/kernel/config/target/iscsi': No such file or directory
This problem occurs if the iSCSI LIO Target Server software is not currently running. To resolve this issue, exit YaST, manually start iSCSI LIO at the command line, then try again.
Open a terminal console as the root user.
At the command prompt, enter
/etc/init.d/target start
You can also enter the following to check if configfs,
iscsi_target_mod, and
target_core_mod are loaded. A sample response is shown.
lsmod | grep iscsi
iscsi_target_mod 295015 0
target_core_mod 346745 4 iscsi_target_mod,target_core_pscsi,target_core_iblock,target_core_file
configfs 35817 3 iscsi_target_mod,target_core_mod
scsi_mod 231620 16 iscsi_target_mod,target_core_pscsi,target_core_mod,sg,sr_mod,mptctl,sd_mod,scsi_dh_rdac,scsi_dh_emc,scsi_dh_alua,scsi_dh_hp_sw,scsi_dh,libata,mptspi,mptscsih,scsi_transport_spi
If you use a firewall on the target server, you must open the iSCSI port that you are using to allow other computers to see the iSCSI LIO targets. For information, see Step 7 in Section 15.2.1, “Configuring iSCSI LIO Startup Preferences”.
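The module check described above can also be scripted so that it reports only what is missing. The following sketch reads lsmod-style text on standard input and prints each of the three modules named above that does not appear in the first column. The helper name missing_modules is hypothetical, and the sample input deliberately omits one module.

```shell
# List required modules that do not appear in lsmod output (stdin).
# missing_modules is a hypothetical helper name.
missing_modules() {
    awk -v want='configfs iscsi_target_mod target_core_mod' '
        { loaded[$1] = 1 }
        END {
            n = split(want, mod, " ")
            for (i = 1; i <= n; i++)
                if (!(mod[i] in loaded)) print mod[i]
        }'
}

# Sample lsmod lines; target_core_mod is left out on purpose.
printf '%s\n' 'iscsi_target_mod 295015 0' \
              'configfs 35817 3 iscsi_target_mod' | missing_modules
```

On the target server, run lsmod | missing_modules; empty output means that all three modules are loaded.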
A physical storage object that provides the actual storage underlying an iSCSI endpoint.
The standard format for SCSI commands. CDBs are commonly 6, 10, or 12 bytes long, though they can be 16 bytes or of variable length.
A point-to-point protocol (PPP) authentication method used to confirm the identity of one computer to another. After the Link Control Protocol (LCP) connects the two computers, and the CHAP method is negotiated, the authenticator sends a random Challenge to the peer. The peer issues a cryptographically hashed Response that depends upon the Challenge and a secret key. The authenticator verifies the hashed Response against its own calculation of the expected hash value, and either acknowledges the authentication or terminates the connection. CHAP is defined in the Internet Engineering Task Force (IETF) RFC 1994.
A 16‐bit number, generated by the initiator, that uniquely identifies a connection between two iSCSI devices. This number is presented during the login phase.
The combination of an iSCSI Target Name with an iSCSI TPG (IQN + Tag).
A 64‐bit number that uniquely identifies every device in the world. The format consists of 24 bits that are unique to a given company, and 40 bits assigned by the company to each device it builds.
The originating end of a SCSI session. Typically a controlling device such as a computer.
The class of protocols or devices that use the IP protocol to move data in a storage network. FCIP (Fibre Channel over Internet Protocol), iFCP (Internet Fibre Channel Protocol), and iSCSI (Internet SCSI) are all examples of IPS protocols.
A name format for iSCSI that uniquely identifies every device in the
world (for example:
iqn.5886.com.acme.tapedrive.sn‐a12345678).
A 48‐bit number, generated by the initiator, that uniquely identifies a session between the initiator and the Target. This value is created during the login process, and is sent to the target with a Login PDU.
A part of the iSCSI specification that allows multiple TCP/IP connections between an initiator and a target.
A method by which data can take multiple redundant paths between a server and storage.
The combination of an iSCSI Endpoint with an IP address plus a TCP (Transmission Control Protocol) port. TCP port 3260 is the port number for the iSCSI protocol, as defined by IANA (Internet Assigned Numbers Authority).
A document that describes the behavior of SCSI in general terms, allowing for different types of devices communicating over various media.
The receiving end of a SCSI session, typically a device such as a disk drive, tape drive, or scanner.
A list of SCSI target ports that are all treated the same when creating views. Creating a view can help facilitate LUN (logical unit number) mapping. Each view entry specifies a target group, host group, and a LUN.
The combination of an iSCSI endpoint with one or more LUNs.
A list of IP addresses and TCP port numbers that determines which interfaces a specific iSCSI target will listen to.
A 16‐bit number, generated by the target, that uniquely identifies a session between the initiator and the target. This value is created during the login process, and is sent to the initiator with a Login Response PDU (protocol data units).
Many enterprise data centers rely on Ethernet for their LAN and data traffic, and on Fibre Channel networks for their storage infrastructure. Open Fibre Channel over Ethernet (FCoE) Initiator software allows servers with Ethernet adapters to connect to a Fibre Channel storage subsystem over an Ethernet network. This connectivity was previously reserved exclusively for systems with Fibre Channel adapters over a Fibre Channel fabric. The FCoE technology reduces complexity in the data center by aiding network convergence. This helps to preserve your existing investments in a Fibre Channel storage infrastructure and to simplify network management.
Open-FCoE allows you to run the Fibre Channel protocols on the host, instead
of on proprietary hardware on the host bus adapter. It is targeted for 10
Gbps (gigabit per second) Ethernet adapters, but can work on any Ethernet
adapter that supports pause frames. The initiator software provides a Fibre
Channel protocol processing module as well as an Ethernet based transport
module. The Open-FCoE module acts as a low-level driver for SCSI. The
Open-FCoE transport uses net_device to send and receive
packets. Data Center Bridging (DCB) drivers provide the quality of service
for FCoE.
FCoE is an encapsulation protocol that moves the Fibre Channel protocol traffic over Ethernet connections without changing the Fibre Channel frame. This allows your network security and traffic management infrastructure to work the same with FCoE as it does with Fibre Channel.
You might choose to deploy Open-FCoE in your enterprise if the following conditions exist:
Your enterprise already has a Fibre Channel storage subsystem and administrators with Fibre Channel skills and knowledge.
You are deploying 10 Gbps Ethernet in the network.
This section describes how to set up FCoE in your network.
You can set up FCoE disks in your storage infrastructure by enabling FCoE at the switch for the connections to a server. If FCoE disks are available when the SUSE Linux Enterprise Server operating system is installed, the FCoE Initiator software is automatically installed at that time.
If the FCoE Initiator software and YaST FCoE Client software are not installed, use the following procedure to manually install them on an existing system:
Log in to the server as the root user.
In YaST, select .
Search for and select the following FCoE packages:
open-fcoe
yast2-fcoe-client
For example, type fcoe in the
field, click to
locate the software packages, then select the check box next to each
software package that you want to install.
Click , then click to accept the automatic changes.
The YaST installation for SUSE Linux Enterprise Server allows you to configure FCoE
disks during the operating system installation if FCoE is enabled at the
switch for the connections between the server and the Fibre Channel storage
infrastructure. Some system BIOS types can automatically detect the FCoE
disks, and report the disks to the YaST Installation software. However,
automatic detection of FCoE disks is not supported by all BIOS types. To
enable automatic detection in this case, you can add the
withfcoe option to the kernel command line when you
begin the installation:
withfcoe=1
When the FCoE disks are detected, the YaST installation offers the option to configure FCoE instances at that time. On the Disk Activation page, select to access the FCoE configuration. For information about configuring the FCoE interfaces, see Section 16.3, “Managing FCoE Services with YaST”.
FCoE devices appear asynchronously during the boot process. While the
initrd guarantees that those devices are set up correctly for the root file
system, there are no such guarantees for any other file systems or mount
points like /usr. Hence system mount points like
/usr or /var are not supported on FCoE devices.
If you want to use those devices, ensure correct synchronisation of the
respective services and devices.
You can use the YaST FCoE Client Configuration option to create,
configure, and remove FCoE interfaces for the FCoE disks in your Fibre
Channel storage infrastructure. To use this option, the FCoE Initiator
service (the fcoemon daemon) and the Link Layer
Discovery Protocol agent daemon (lldpad) must be
installed and running, and the FCoE connections must be enabled at the
FCoE-capable switch.
Log in as the root user, then launch YaST.
In YaST, select .
The Fibre Channel over Ethernet Configuration dialog box provides three tabs:
On the tab, view or modify the FCoE service and Lldpad (Link Layer Discovery Protocol agent daemon) service start time as necessary.
FCoE Service Start:
Specifies whether to start the Fibre Channel over Ethernet service
fcoemon daemon at the server boot time or manually.
The daemon controls the FCoE interfaces and establishes a connection
with the lldpad daemon. The values are
(default) or
.
Lldpad Service Start:
Specifies whether to start the Link Layer Discovery Protocol agent
lldpad daemon at the server boot time or manually.
The lldpad daemon informs the
fcoemon daemon about the Data Center Bridging
features and the configuration of the FCoE interfaces. The values are
(default) or
.
If you modify a setting, click to save and apply the change.
On the tab, view information about all of the detected network adapters on the server, including information about VLAN and FCoE configuration. You can also create an FCoE VLAN interface, change settings for an existing FCoE interface, or remove an FCoE interface.
View FCoE Information.
The table displays the following information about each adapter:
Device Name:
Specifies the adapter name such as eth4.
Model: Specifies the adapter model information.
FCoE VLAN Interface.
Interface Name:
If a name is assigned to the interface, such as
eth4.200, FCoE is available on the switch, and
the FCoE interface is activated for the adapter.
Not Configured: If the status is , FCoE is enabled on the switch, but an FCoE interface has not been activated for the adapter. Select the adapter, then click to activate the interface on the adapter.
Not Available: If the status is , FCoE is not possible for the adapter because FCoE has not been enabled for that connection on the switch.
FCoE Enable: Specifies whether FCoE is enabled on the switch for the adapter connection. ( or )
DCB Required: Specifies whether the adapter requires Data Center Bridging. ( or )
Auto VLAN: Specifies whether automatic VLAN configuration is enabled for the adapter. ( or )
DCB Capable: Specifies whether the adapter supports Data Center Bridging. ( or )
Change FCoE Settings.
Select an FCoE VLAN interface, then click at the bottom of the page to open the Change FCoE Settings dialog box.
FCoE Enable: Enable or disable the creation of FCoE instances for the adapter. Values are or .
DCB Required: Specifies whether Data Center Bridging is required for the adapter. Values are (default) or . DCB is usually required.
Auto VLAN:
Specifies whether the fcoemon daemon creates the
VLAN interfaces automatically. Values are or
.
If you modify a setting, click to save and apply
the change. The settings are written to the
/etc/fcoe/ethX file. The
fcoemon daemon reads the configuration files for each
FCoE interface when the daemon is initialized. There is a file for every
FCoE interface.
Create FCoE VLAN Interfaces.
Select an adapter that has FCoE enabled but is not configured, then click
to configure the FCoE interface. The assigned
interface name appears in the list, such as
eth5.200.
Remove FCoE Interface.
Select the FCoE interface that you want to remove, click at the bottom of the page, then click to confirm. The FCoE Interface value changes to .
On the tab, view or modify the general settings for the FCoE system service.
Debug:
Enables or disables debugging messages from the FCoE service script
and fcoemon daemon. The values are
or (default).
Use syslog:
Specifies whether messages are sent to the system log
(/var/log/syslog). The values are
(default) or . (Data is
logged in /var/log/messages.)
If you modify a setting, click to save and apply
the change. The settings are written to the
/etc/fcoe/config file.
Click to save and apply changes.
Log in to the server as the root user, then open
a terminal console.
Use YaST to configure the Ethernet network interface card, such as
eth2.
Start the Link Layer Discovery Protocol agent daemon
(lldpad).
rclldpad start
Enable Data Center Bridging on your Ethernet adapter.
dcbtool sc eth2 dcb on
Version: 2
Command: Set Config
Feature: DCB State
Port: eth2
Status: Successful
Enable and set the Priority Flow Control (PFC) settings for Data Center Bridging.
dcbtool sc eth<x> pfc e:1 a:1 w:1
Argument setting values are:
e:<0|1>
Controls feature enable.
a:<0|1>
Controls whether the feature is advertised via Data Center Bridging Exchange protocol to the peer.
w:<0|1>
Controls whether the feature is willing to change its operational configuration based on what is received from the peer.
Enable Data Center Bridging to accept the switch’s priority setting for FCoE.
dcbtool sc eth2 app:fcoe e:1
Version: 2
Command: Set Config
Feature: Application FCoE
Port: eth2
Status: Successful
Copy the default FCoE configuration file to
/etc/fcoe/cfg-eth2.
cp /etc/fcoe/cfg-ethx /etc/fcoe/cfg-eth2
Start the FCoE Initiator service.
rcfcoe start
Starting FCoE initiator service
Set up the Link Layer Discovery Protocol agent daemon
(lldpad) and the FCoE Initiator service to start when
booting.
chkconfig boot.lldpad on
chkconfig boot.fcoe on
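To review the whole command-line procedure for a given adapter before running it, the steps above can be collected in one place. This is a minimal sketch that only prints the commands; fcoe_setup_commands is a hypothetical helper name, and the interface name is whatever you configured in YaST.

```shell
#!/bin/sh
# Sketch: print (not run) the command sequence from the procedure above
# for one Ethernet interface.
# fcoe_setup_commands is a hypothetical helper name.
fcoe_setup_commands() {
    ifname=$1
    cat <<EOF
rclldpad start
dcbtool sc $ifname dcb on
dcbtool sc $ifname pfc e:1 a:1 w:1
dcbtool sc $ifname app:fcoe e:1
cp /etc/fcoe/cfg-ethx /etc/fcoe/cfg-$ifname
rcfcoe start
chkconfig boot.lldpad on
chkconfig boot.fcoe on
EOF
}

# Show the sequence for eth2, the interface used in the procedure.
fcoe_setup_commands eth2
```

Printing the commands first lets you check the interface name in every line. Run the commands as the root user only after you have reviewed them.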
The fcoeadm utility is the Fibre Channel over Ethernet
(FCoE) management tool for the Open-FCoE project. It can be used to create,
destroy, and reset an FCoE instance on a given network interface. The
fcoeadm utility sends commands to a running
fcoemon process via a socket interface. For information
about fcoemon, see the fcoemon(8) man
page.
The fcoeadm utility allows you to query the FCoE
instances about the following:
Interfaces
Target LUNs
Port statistics
The fcoeadm utility is part of the
fcoe-utils package. It is maintained by the
Open-FCoE project.
Fiber Channel over Ethernet Administration version 1.0.12.
fcoeadm
   [-c|--create] [<ethX>]
   [-d|--destroy] [<ethX>]
   [-r|--reset] [<ethX>]
   [-S|--Scan] [<ethX>]
   [-i|--interface] [<ethX>]
   [-t|--target] [<ethX>]
   [-l|--lun] [<ethX>]
   [-s|--stats <ethX>] [<interval>]
   [-v|--version]
   [-h|--help]
-c, --create <ethX>
Creates an FCoE instance based on the specified network interface. If
an fcoemon configuration file does not exist for
the Open-FCoE service daemon interface
(/etc/fcoe/cfg-ethx; see
fcoemon(8) man page), the created FCoE instance
does not require Data Center Bridging.
Example:
To create an FCoE instance on eth2.101:
fcoeadm -c eth2.101
-d, --destroy <ethX>
Destroys an FCoE instance on the specified network interface. This
does not destroy FCoE instances created by
fipvlan.
Example:
To destroy an FCoE instance on eth2.101:
fcoeadm -d eth2.101
-h, --help
Displays the usage message of the fcoeadm command.
-i, --interface [<ethX>]
Shows information about the FCoE instance on the specified network interface. If no network interface is specified, it shows information for all FCoE instances.
Examples.
To show information about all of the adapters and their ports that have FCoE instances created:
fcoeadm -i
To show information about all of the FCoE instances on interface
eth3:
fcoeadm -i eth3
-l, --lun [<ethX>]
Shows detailed information about the discovered SCSI LUNs associated with the FCoE instance on the specified network interface. If no network interface is specified, it shows information about SCSI LUNs from all FCoE instances.
Examples.
To show detailed information about all of the LUNs discovered on all FCoE connections:
fcoeadm -l
To show detailed information about all of the LUNs discovered on a
specific connection, such as eth3.101:
fcoeadm -l eth3.101
-r, --reset <ethX>
Resets the FCoE instance on the specified network interface. This
does not reset FCoE instances created by fipvlan.
Example:
To reset the FCoE instance on eth2.101:
fcoeadm -r eth2.101
-s, --stats <ethX> [interval]
Shows the statistics (including FC4 statistics) of the FCoE instance on the specified network interface. It displays one line per given time interval. Specify the interval value in whole integers greater than 0. The interval value is the elapsed time of the interval in seconds. If an interval is not specified, the default interval is 1 second.
Examples:
You can show statistics information about a specific
eth3 port that has FCoE instances. The
statistics are displayed one line per time interval. The default
interval of one second is not specified in the command.
fcoeadm -s eth3
To show statistics information about a specific
eth3 port that has FCoE instances, at an
interval of 3 seconds. The statistics are displayed one line per time
interval.
fcoeadm -s eth3 3
-S, --Scan <ethX>
Rescans for new targets and LUNs for the specified network interface.
This does not rescan any NPIV (N_Port ID Virtualization) instances
created on the same port, and does not rescan any FCoE instances
created by fipvlan.
-t, --target [<ethX>]
Shows information about the discovered targets associated with the FCoE instance on the specified network interface. If no network interface is specified, it shows information about discovered targets from all FCoE instances.
Examples.
You can show information about all of the discovered targets from all of the ports that have FCoE instances. They might be on different adapter cards. After each discovered target, any associated LUNs are listed.
fcoeadm -t
You can show information about all of the discovered targets from a
given eth3 port that has an FCoE instance. After each
discovered target, any associated LUNs are listed.
fcoeadm -t eth3
-v, --version
Displays the version of the fcoeadm command.
fcoeadm -i eth0.201
Description: 82599EB 10-Gigabit SFI/SFP+ Network Connection
Revision: 01
Manufacturer: Intel Corporation
Serial Number: 001B219B258C
Driver: ixgbe 3.3.8-k2
Number of Ports: 1
Symbolic Name: fcoe v0.1 over eth0.201
OS Device Name: host8
Node Name: 0x1000001B219B258E
Port Name: 0x2000001B219B258E
FabricName: 0x2001000573D38141
Speed: 10 Gbit
Supported Speed: 10 Gbit
MaxFrameSize: 2112
FC-ID (Port ID): 0x790003
State: Online
fcoeadm -t eth0.201
Interface: eth0.201
Roles: FCP Target
Node Name: 0x200000D0231B5C72
Port Name: 0x210000D0231B5C72
Target ID: 0
MaxFrameSize: 2048
OS Device Name: rport-8:0-7
FC-ID (Port ID): 0x79000C
State: Online
LUN ID  Device Name  Capacity   Block Size  Description
------  -----------  ---------  ----------  ------------------------------
40      /dev/sdqi    792.84 GB  512         IFT DS S24F-R2840-4 (rev 386C)
72      /dev/sdpk    650.00 GB  512         IFT DS S24F-R2840-4 (rev 386C)
168     /dev/sdgy    1.30 TB    512         IFT DS S24F-R2840-4 (rev 386C)
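If you need the device names from the target listing for later steps such as partitioning, the LUN rows can be filtered out of the output. This is a minimal sketch, assuming the output layout matches the sample above (a numeric LUN ID followed by a /dev/ path); fcoe_lun_devices is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print "LUN-ID device capacity unit" for each LUN row found in
# fcoeadm -t style output read on standard input.
# fcoe_lun_devices is a hypothetical helper name.
fcoe_lun_devices() {
    # LUN rows start with a numeric LUN ID followed by a /dev/ path;
    # header, separator, and target attribute lines do not match.
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^\/dev\// { print $1, $2, $3, $4 }'
}

# Typical use on the initiator (not run here):
#   fcoeadm -t eth0.201 | fcoe_lun_devices
```

The filter relies only on whitespace-separated columns, so it does not depend on how the table is aligned.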
You can use the fdisk(8) command to set up partitions
for an FCoE initiator disk.
fdisk /dev/sdc
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel.
Building a new DOS disklabel with disk identifier 0xfc691889.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won’t be recoverable.
Warning: Invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 4
First cylinder (1-1017, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1017, default 1017):
Using default value 1017
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
You can use the mkfs(8) command to create a file system
on an FCoE initiator disk.
mkfs /dev/sdc
mke2fs 1.41.9 (22-Aug-2011)
/dev/sdc is entire device, not just one partition!
Proceed anyway? (y, n) y
Filesystem label=
OS type: Linux
Block size=4096 (log-2)
262144 inodes, 1048576 blocks
52428 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 27 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
For information, see the following documentation:
For information about the Open-FCoE service daemon, see the
fcoemon(8) man page.
For information about the Open-FCoE Administration tool, see the
fcoeadm(8) man page.
For information about the Data Center Bridging Configuration tool, see
the dcbtool(8) man page.
For information about the Link Layer Discovery Protocol agent daemon, see
the lldpad(8) man page.
A Logical Volume Manager (LVM) logical volume snapshot is a copy-on-write technology that monitors changes to an existing volume’s data blocks so that when a write is made to one of the blocks, the block’s value at the snapshot time is copied to a snapshot volume. In this way, a point-in-time copy of the data is preserved until the snapshot volume is deleted.
A file system snapshot contains metadata about and data blocks from a source logical volume that have changed since the snapshot was taken. When you access data via the snapshot, you see a point-in-time copy of the source logical volume. There is no need to restore data from backup media or to overwrite the changed data.
During the snapshot’s lifetime, the snapshot must be mounted before its source logical volume can be mounted.
LVM volume snapshots allow you to create a backup from a point-in-time view of the file system. The snapshot is created instantly and persists until you delete it. You can back up the file system from the snapshot while the volume itself continues to be available for users. The snapshot initially contains some metadata about the snapshot, but no actual data from the source logical volume. The snapshot uses copy-on-write technology to detect when data changes in an original data block. It copies the value the block held when the snapshot was taken to a block in the snapshot volume, then allows the new data to be stored in the source block. As more blocks change from their original value on the source logical volume, the snapshot size grows.
When you are sizing the snapshot, consider how much data is expected to change on the source logical volume and how long you plan to keep the snapshot. The amount of space that you allocate for a snapshot volume can vary, depending on the size of the source logical volume, how long you plan to keep the snapshot, and the number of data blocks that are expected to change during the snapshot’s lifetime. The snapshot volume cannot be resized after it is created. As a guide, create a snapshot volume that is about 10% of the size of the original logical volume. If you anticipate that every block in the source logical volume will change at least one time before you delete the snapshot, then the snapshot volume should be at least as large as the source logical volume plus some additional space for metadata about the snapshot volume. Less space is required if the data changes infrequently or if the expected lifetime is sufficiently brief.
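The sizing guideline can be turned into a quick calculation. This is a minimal sketch, assuming sizes in megabytes and a change rate expressed as a whole percentage; snapshot_size_mb is a hypothetical helper name, and the result does not include the additional space needed for snapshot metadata.

```shell
#!/bin/sh
# Sketch: suggest a snapshot size from the source volume size and the
# share of blocks expected to change during the snapshot's lifetime.
# snapshot_size_mb is a hypothetical helper name.
snapshot_size_mb() {
    source_mb=$1
    change_pct=${2:-10}    # default: the 10% guideline from the text
    # Round up so the suggestion never undershoots the estimate.
    echo $(( (source_mb * change_pct + 99) / 100 ))
}

snapshot_size_mb 81920        # 80 GB source volume, 10% guideline
snapshot_size_mb 81920 100    # every block is expected to change
```

Pass the result to the -L option of lvcreate with an M suffix, and add headroom for metadata when you expect most blocks to change.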
In LVM2, snapshots are read/write by default. When you write data directly to the snapshot, that block is marked in the exception table as used, and never gets copied from the source logical volume. You can mount the snapshot volume, and test application changes by writing data directly to the snapshot volume. You can easily discard the changes by dismounting the snapshot, removing the snapshot, and then re-mounting the source logical volume.
In a virtual guest environment, you can use the snapshot function for LVM logical volumes you create on the server’s disks, just as you would on a physical server.
In a virtual host environment, you can use the snapshot function to back up the virtual machine’s storage back-end, or to test changes to a virtual machine image, such as for patches or upgrades, without modifying the source logical volume. The virtual machine must be using an LVM logical volume as its storage back-end, as opposed to using a virtual disk file. You can mount the LVM logical volume and use it to store the virtual machine image as a file-backed disk, or you can assign the LVM logical volume as a physical disk to write to it as a block device.
Beginning in SLES 11 SP3, LVM logical volume snapshots can be thinly provisioned. Thin provisioning is assumed if you create a snapshot without a specified size. The snapshot is created as a thin volume that uses space as needed from a thin pool. A thin snapshot volume has the same characteristics as any other thin volume. You can independently activate the volume, extend the volume, rename the volume, remove the volume, and even snapshot the volume.
To use thinly provisioned snapshots in a cluster, the source logical volume and its snapshots must be managed in a single cluster resource. This allows the volume and its snapshots to always be mounted exclusively on the same node.
When you are done with the snapshot, it is important to remove it from the system. A snapshot eventually fills up completely as data blocks change on the source logical volume. When the snapshot is full, it is disabled, which prevents you from remounting the source logical volume.
If you create multiple snapshots for a source logical volume, remove the snapshots in a last created, first deleted order.
The Logical Volume Manager (LVM) can be used for creating snapshots of your file system.
Open a terminal console, log in as the root
user, then enter
lvcreate -s [-L <size>] -n snap_volume source_volume_path
If no size is specified, the snapshot is created as a thin snapshot.
For example:
lvcreate -s -L 1G -n linux01-snap /dev/lvm/linux01
The snapshot is created as the /dev/lvm/linux01-snap
volume.
Open a terminal console, log in as the root
user, then enter
lvdisplay snap_volume
For example:
lvdisplay /dev/vg01/linux01-snap
--- Logical volume ---
LV Name                /dev/lvm/linux01
VG Name                vg01
LV UUID                QHVJYh-PR3s-A4SG-s4Aa-MyWN-Ra7a-HL47KL
LV Write Access        read/write
LV snapshot status     active destination for /dev/lvm/linux01
LV Status              available
# open                 0
LV Size                80.00 GB
Current LE             1024
COW-table size         8.00 GB
COW-table LE           512
Allocated to snapshot  30%
Snapshot chunk size    8.00 KB
Segments               1
Allocation             inherit
Read ahead sectors     0
Block device           254:5
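Because a full snapshot is disabled, it can be useful to watch the Allocated to snapshot value over time. This is a minimal sketch that reads lvdisplay output in the format shown in the example; snapshot_used_pct and snapshot_nearly_full are hypothetical helper names, and the 80% default threshold is an assumption, not a documented limit.

```shell
#!/bin/sh
# Sketch: extract the integer part of the "Allocated to snapshot"
# percentage from lvdisplay-style output read on standard input.
# snapshot_used_pct is a hypothetical helper name.
snapshot_used_pct() {
    sed -n 's/.*Allocated to snapshot[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Succeeds if the snapshot usage is at or above the threshold given as
# the first argument (assumed default: 80).
# snapshot_nearly_full is a hypothetical helper name.
snapshot_nearly_full() {
    pct=$(snapshot_used_pct)
    [ -n "$pct" ] && [ "$pct" -ge "${1:-80}" ]
}

# Typical use (not run here):
#   lvdisplay /dev/lvm/linux01-snap | snapshot_nearly_full 90 && echo "remove or replace the snapshot"
```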
Open a terminal console, log in as the root
user, then enter
lvremove snap_volume_path
For example:
lvremove /dev/lvmvg/linux01-snap
Using an LVM logical volume for a virtual machine’s back-end storage allows flexibility in administering the underlying device, such as making it easier to move storage objects, create snapshots, and back up data. You can mount the LVM logical volume and use it to store the virtual machine image as a file-backed disk, or you can assign the LVM logical volume as a physical disk to write to it as a block device. You can create a virtual disk image on the LVM logical volume, then snapshot the LVM.
You can leverage the read/write capability of the snapshot to create different instances of a virtual machine, where the changes are made to the snapshot for a particular virtual machine instance. You can create a virtual disk image on an LVM logical volume, snapshot the source logical volume, and modify the snapshot for a particular virtual machine instance. You can create another snapshot of the source logical volume, and modify it for a different virtual machine instance. The majority of the data for the different virtual machine instances resides with the image on the source logical volume.
You can also leverage the read/write capability of the snapshot to preserve the virtual disk image while testing patches or upgrades in the guest environment. You create a snapshot of the LVM volume that contains the image, and then run the virtual machine on the snapshot location. The source logical volume is unchanged, and all changes for that machine are written to the snapshot. To return to the source logical volume of the virtual machine image, you power off the virtual machine, then remove the snapshot from the source logical volume. To start over, you re-create the snapshot, mount the snapshot, and restart the virtual machine on the snapshot image.
The following procedure uses a file-backed virtual disk image and the Xen hypervisor. You can adapt the procedure in this section for other hypervisors that run on the SUSE Linux platform, such as KVM.
To run a file-backed virtual machine image from the snapshot volume:
Ensure that the source logical volume that contains the file-backed
virtual machine image is mounted, such as at mount point
/var/lib/xen/images/<image_name>.
Create a snapshot of the LVM logical volume with enough space to store the differences that you expect.
lvcreate -s -L 20G -n myvm-snap /dev/lvmvg/myvm
If no size is specified, the snapshot is created as a thin snapshot.
Create a mount point where you will mount the snapshot volume.
mkdir -p /mnt/xen/vm/myvm-snap
Mount the snapshot volume at the mount point you created.
mount -t auto /dev/lvmvg/myvm-snap /mnt/xen/vm/myvm-snap
In a text editor, copy the configuration file for the source virtual
machine, modify the paths to point to the file-backed image file on the
mounted snapshot volume, and save the file such as
/etc/xen/myvm-snap.cfg.
Start the virtual machine using the mounted snapshot volume of the virtual machine.
xm create -c /etc/xen/myvm-snap.cfg
(Optional) Remove the snapshot, and use the unchanged virtual machine image on the source logical volume.
umount /mnt/xen/vm/myvm-snap
lvremove -f /dev/lvmvg/myvm-snap
(Optional) Repeat this process as desired.
Snapshots can be useful if you need to roll back or restore data on a volume to a previous state. For example, you might need to revert data changes that resulted from an administrator error or a failed or undesirable package installation or upgrade.
You can use the lvconvert --merge command to revert the
changes made to an LVM logical volume. The merging begins as follows:
If both the source logical volume and snapshot volume are not open, the merge begins immediately.
If the source logical volume or snapshot volume is open, the merge starts the first time either the source logical volume or snapshot volume is activated and both are closed.
If the source logical volume cannot be closed, such as the
root file system, the merge is deferred until
the next time the server reboots and the source logical volume is
activated.
If the source logical volume contains a virtual machine image, you must shut down the virtual machine, deactivate the source logical volume and snapshot volume (by dismounting them in that order), and then issue the merge command. Because the source logical volume is automatically remounted and the snapshot volume is deleted when the merge is complete, you should not restart the virtual machine until after the merge is complete. After the merge is complete, you use the resulting logical volume for the virtual machine.
After a merge begins, the merge continues automatically after server restarts until it is complete. A new snapshot cannot be created for the source logical volume while a merge is in progress.
While the merge is in progress, reads or writes to the source logical volume are transparently redirected to the snapshot that is being merged. This allows users to immediately view and access the data as it was when the snapshot was created. They do not need to wait for the merge to complete.
When the merge is complete, the source logical volume contains the same data as it did when the snapshot was taken, plus any data changes made after the merge began. The resulting logical volume has the source logical volume’s name, minor number, and UUID. The source logical volume is automatically remounted, and the snapshot volume is removed.
Open a terminal console, log in as the root
user, then enter
lvconvert --merge [-b] [-i <seconds>] [<snap_volume_path>[...<snapN>]|@<volume_tag>]
You can specify one or multiple snapshots on the command line. You can
alternatively tag multiple source logical volumes with the same volume
tag then specify
@<volume_tag> on the
command line. The snapshots for the tagged volumes are merged to their
respective source logical volumes. For information about tagging logical
volumes, see Section 4.7, “Tagging LVM2 Storage Objects”.
The options include:
-b, --background
Run the daemon in the background. This allows multiple specified snapshots to be merged concurrently.
-i, --interval <seconds>
Report progress as a percentage at regular intervals. Specify the interval in seconds.
For more information about this command, see the
lvconvert(8) man page.
For example:
lvconvert --merge /dev/lvmvg/linux01-snap
This command merges /dev/lvmvg/linux01-snap into its
source logical volume.
lvconvert --merge @mytag
If lvol1, lvol2, and
lvol3 are all tagged with mytag,
each snapshot volume is merged serially with its respective source
logical volume; that is: lvol1, then
lvol2, then lvol3. If the
--background option is specified, the snapshots for
the respective tagged logical volumes are merged concurrently.
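As a minimal sketch of how the options combine, the following shell function only builds and prints an lvconvert command line; it does not run it. The function name build_merge_cmd and the 10-second interval are illustrative assumptions, and the volume path and tag are the ones used in the examples above.

```shell
# Build (but do not run) an lvconvert merge command line.
# $1: "bg" to merge in the background (-b), or a number of seconds
#     between progress reports (-i <seconds>).
# Remaining arguments: snapshot volume paths or an @<volume_tag>.
build_merge_cmd() {
    mode=$1
    shift
    case "$mode" in
        bg) opts="-b" ;;
        *)  opts="-i $mode" ;;
    esac
    echo "lvconvert --merge $opts $*"
}

# Foreground merge of one snapshot, reporting progress every 10 seconds.
build_merge_cmd 10 /dev/lvmvg/linux01-snap

# Background merge of the snapshots of all volumes tagged with mytag.
build_merge_cmd bg @mytag
```

Review the printed command lines before entering them as the root user.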
(Optional) If both the source logical volume and snapshot volume are open and they can be closed, you can manually deactivate and activate the source logical volume to get the merge to start immediately.
umount <original_volume>
lvchange -an <original_volume>
lvchange -ay <original_volume>
mount <original_volume> <mount_point>
For example:
umount /dev/lvmvg/lvol01
lvchange -an /dev/lvmvg/lvol01
lvchange -ay /dev/lvmvg/lvol01
mount /dev/lvmvg/lvol01 /mnt/lvol01
(Optional) If both the source logical volume and snapshot volume are open
and the source logical volume cannot be closed, such as the
root file system, you can restart the server and
mount the source logical volume to get the merge to start immediately
after the restart.
There is no single standard for Access Control Lists (ACLs) in Linux beyond
the simple user-group-others read, write, and execute
(rwx) flags. One option for finer control is the
Draft POSIX ACLs, which were never formally
standardised by POSIX. Another is the NFSv4 ACLs, which were designed to be
part of the NFSv4 network file system with the goal of making something that
provided reasonable compatibility between POSIX systems on Linux and WIN32
systems on Microsoft Windows.
NFSv4 ACLs are not sufficient to correctly implement Draft POSIX ACLs so no
attempt has been made to map ACL accesses on an NFSv4 client (such as using
setfacl).
When using NFSv4, Draft POSIX ACLs cannot be used even in emulation, and
NFSv4 ACLs need to be used directly; that is, while setfacl
can work on NFSv3, it cannot work on NFSv4.
To allow NFSv4 ACLs to be used
on an NFSv4 file system, SUSE Linux Enterprise Server provides the
nfs4-acl-tools package, which contains the following:
nfs4_getfacl
nfs4_setfacl
nfs4_editfacl
These operate in a generally similar way to getfacl and
setfacl for examining and modifying NFSv4 ACLs. These
commands are effective only if the file system on the NFS server provides
full support for NFSv4 ACLs. Any limitation imposed by the server will
affect programs running on the client in that some particular combinations
of Access Control Entries (ACEs) might not be possible.
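The following sketch illustrates how an ACE might be composed and passed to the tools. It assumes the upstream command names, which are written with underscores (nfs4_getfacl, nfs4_setfacl), and the four-field ACE layout type:flags:principal:permissions described in the nfs4_acl(5) man page. The helper make_ace, the principal alice@example.com, and the path /mnt/nfs4/report.txt are hypothetical. The script only prints the commands so that they can be reviewed first.

```shell
# Compose an NFSv4 ACE of the form type:flags:principal:permissions.
make_ace() {
    printf '%s:%s:%s:%s\n' "$1" "$2" "$3" "$4"
}

# Allow (A) the hypothetical user alice@example.com to read and execute,
# with no flags set.
ace=$(make_ace A "" alice@example.com rx)

# Print the commands that would show the current ACL and then add the ACE.
echo "nfs4_getfacl /mnt/nfs4/report.txt"
echo "nfs4_setfacl -a $ace /mnt/nfs4/report.txt"
```

Whether the server accepts a given ACE depends on the NFSv4 ACL support of the exported file system, as noted above.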
Mounting NFS volumes locally on the exporting NFS server is not supported.
For information, see “Introduction to NFSv4 ACLs” on the Linux-nfs.org Web site.
This section describes how to work around known issues for devices, software RAIDs, multipath I/O, and volumes.
Device Mapper Multipath I/O (DM-MPIO) is supported for the boot partition, beginning in SUSE Linux Enterprise Server 10 Support Pack 1. For information, see Section 7.12, “Configuring Multipath I/O for the Root Device”.
The root (/) partition using the Btrfs file system
stops accepting data. You receive the error “No space left
on device”.
See the following sections for information about possible causes and prevention of this issue.
If Snapper is running for the Btrfs file system, the “No
space left on device” problem is typically caused by
having too much data stored as snapshots on your system.
You can remove some snapshots from Snapper; however, the snapshots are not deleted immediately and might not free up as much space as you need.
To delete snapshots from Snapper:
Log in as the root user, then open a terminal
console.
Gain back enough space for the system to come up.
At the command prompt, enter
btrfs filesystem show
Label: none  uuid: 40123456-cb2c-4678-8b3d-d014d1c78c78
        Total devices 1 FS bytes used 20.00GB
        devid    1 size 20.00GB used 20.00GB path /dev/sda3
Enter
btrfs fi balance start </mountpoint> -dusage=5
This command attempts to relocate data in empty or near-empty data chunks, allowing the space to be reclaimed and reassigned to metadata. This can take a while (many hours for 1 TB), although the system is otherwise usable during this time.
List the snapshots in Snapper. Enter
snapper -c root list
Delete one or more snapshots from Snapper, identifying each by the number shown in the # column of the list. Enter
snapper -c root delete <snapshot_number>
Ensure that you delete the oldest snapshots first. The older a snapshot is, the more disk space it occupies.
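In the btrfs filesystem show output in the procedure above, the device line reports the same value for size and used, which means that all space on the device is already allocated to chunks. As a rough sketch, the following shell function (the name fully_allocated_devices is an assumption) reads that output on standard input and prints the path of every device in this state. It compares the two values as text, so it does not depend on the unit suffix that the installed btrfs version prints.

```shell
# Read `btrfs filesystem show` output on stdin and print the path of each
# device whose allocated space ("used") equals its size.
fully_allocated_devices() {
    awk '$1 == "devid" {
        for (i = 1; i < NF; i++) {
            if ($i == "size") size = $(i + 1)
            if ($i == "used") used = $(i + 1)
            if ($i == "path") path = $(i + 1)
        }
        if (size == used) print path
    }'
}

# Demonstration with the device line from the sample output above.
# On a live system, use: btrfs filesystem show | fully_allocated_devices
printf 'devid    1 size 20.00GB used 20.00GB path /dev/sda3\n' |
    fully_allocated_devices
```

If a device is listed, running the balance and deleting old snapshots as described above are the steps to try.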
To help prevent this problem, you can change the Snapper cleanup defaults
to be more aggressive in the
/etc/snapper/configs/root configuration file, or for
other mount points. Snapper provides three algorithms to clean up old
snapshots. The algorithms are executed in a daily cron-job. The cleanup
frequency is defined in the Snapper configuration for the mount point.
Lower the TIMELINE_LIMIT parameters for daily, monthly, and yearly
snapshots to reduce both the number of snapshots that are retained and
how long they are kept. For information, see
“Adjusting the Config File” in the
SUSE Linux Enterprise Server
Administration Guide.
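For illustration only, a more aggressive timeline cleanup in /etc/snapper/configs/root might look like the following fragment. The values are hypothetical and should be chosen to suit the available disk space; check the parameter names against the configuration file installed on your system before changing them.

```shell
# Excerpt from /etc/snapper/configs/root (illustrative values only)
# Keep fewer daily, monthly, and yearly timeline snapshots.
TIMELINE_LIMIT_DAILY="5"
TIMELINE_LIMIT_MONTHLY="2"
TIMELINE_LIMIT_YEARLY="0"
```

The new limits take effect the next time the cleanup job runs.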
If you use Snapper with Btrfs on the file system disk, it is advisable to reserve twice as much disk space as the standard storage proposal. The YaST Partitioner automatically proposes twice the standard disk space in the Btrfs storage proposal for the root file system.
If the system disk is filling up with data, you can try deleting files
from /var/log, /var/crash, and
/var/cache.
The Btrfs root file system subvolumes
/var/log, /var/crash and
/var/cache can use all of the available disk space
during normal operation, and cause a system malfunction. To help avoid
this situation, SUSE Linux Enterprise Server offers Btrfs quota support for subvolumes.
See the btrfs(8) manual page for more details.
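As a hedged sketch of what such a quota setup could look like, the following function prints the commands that enable quota support on the root file system and set a size limit on each listed subvolume. The function name quota_plan and the 5G limit are assumptions for illustration, and the availability of the quota and qgroup subcommands depends on the installed btrfs version. Nothing is executed.

```shell
# Print (but do not run) the commands that enable Btrfs quota support on /
# and limit each given subvolume to the given size.
# $1: size limit, for example 5G. Remaining arguments: subvolume paths.
quota_plan() {
    limit=$1
    shift
    echo "btrfs quota enable /"
    for subvol in "$@"; do
        echo "btrfs qgroup limit $limit $subvol"
    done
}

quota_plan 5G /var/log /var/crash /var/cache
```

Review the printed commands, adjust the limit for each subvolume as needed, and then enter them as the root user.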
See Section 15.8, “Troubleshooting iSCSI LIO Target Server”.
This appendix contains the GNU General Public License Version 2 and the GNU Free Documentation License Version 1.2.
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc. 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to ensure that the software is free for all its users. This General Public License applies to most of the Free Software Foundation’s software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to ensure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must ensure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.
Also, for each author’s protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors’ reputations.
Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone’s free use or not licensed at all.
The precise terms and conditions for copying, distribution and modification follow.
0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The “Program”, below, refers to any such program or work, and a “work based on the Program” means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term “modification”.) Each licensee is addressed as “you”.
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program’s source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program.
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:
a). You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.
b). You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License.
c). If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.
In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following:
a). Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,
b). Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,
c). Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients’ exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License.
7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.
This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and “any later version”, you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.
10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally.
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the “copyright” line and a pointer to where the full notice is found.
one line to give the program’s name and an idea of what it does.
Copyright (C) yyyy name of author
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w’.
This is free software, and you are welcome to redistribute it under certain conditions; type `show c’ for details.
The hypothetical commands `show w’ and `show c’ should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w’ and `show c’; they could even be mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your school, if any, to sign a “copyright disclaimer” for the program, if necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision’ (which makes passes at compilers) written by James Hacker.
signature of Ty Coon, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License.
Version 1.2, November 2002
Copyright (C) 2000,2001,2002 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The purpose of this License is to make a manual, textbook, or other functional and useful document “free” in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.
This License is a kind of “copyleft”, which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.
This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The “Document”, below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as “you”. You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.
A “Modified Version” of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.
A “Secondary Section” is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.
The “Invariant Sections” are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.
A “Transparent” copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not “Transparent” is called “Opaque”.
Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.
The “Title Page” means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, “Title Page” means the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text.
A section “Entitled XYZ” means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as “Acknowledgements”, “Dedications”, “Endorsements”, or “History”.) To “Preserve the Title” of such a section when you modify the Document means that it remains a section “Entitled XYZ” according to this definition.
The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.
You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly display copies.
If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.
You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.
B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.
C. State on the Title page the name of the publisher of the Modified Version, as the publisher.
D. Preserve all the copyright notices of the Document.
E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.
F. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.
G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document’s license notice.
H. Include an unaltered copy of this License.
I. Preserve the section Entitled “History”, Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled “History” in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.
J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the “History” section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.
K. For any section Entitled “Acknowledgements” or “Dedications”, Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein.
L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.
M. Delete any section Entitled “Endorsements”. Such a section may not be included in the Modified Version.
N. Do not retitle any existing section to be Entitled “Endorsements” or to conflict in title with any Invariant Section.
O. Preserve any Warranty Disclaimers.
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles.
You may add a section Entitled “Endorsements”, provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.
You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections Entitled “History” in the various original documents, forming one section Entitled “History”; likewise combine any sections Entitled “Acknowledgements”, and any sections Entitled “Dedications”. You must delete all sections Entitled “Endorsements”.
You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.
A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an “aggregate” if the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.
Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.
If a section in the Document is Entitled “Acknowledgements”, “Dedications”, or “History”, the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.
You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License “or any later version” applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.
To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:
Copyright (c) YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the “with...Texts.” line with this:
with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.
If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.
If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.
This section contains information about documentation content changes made to the SUSE Linux Enterprise Server Storage Administration Guide since the initial release of SUSE Linux Enterprise Server 11.
This document was updated on the following dates:
Updates were made to fix the following bugs and feature requests:
Make clear that BTRFS compression is not supported (Bugzilla #864606 and #879021).
State that IPv6 is supported with iSCSI (feature request #314804).
The ncpfs file system is no longer supported (feature request #313107).
Updates were made to the following section. The changes are explained below.
| Location | Change |
|---|---|
| | Corrected examples for cciss: devnode "^cciss.c[0-9]d[0-9].*" and devnode "^cciss.c[0-9]d[0-9]*[p[0-9]*]" |
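
To see which device node names a devnode pattern such as the first cciss example would cover, you can test the expression outside of multipath. The following is a minimal sketch that assumes the pattern is treated as a POSIX extended regular expression; the helper name matches_devnode and the sample node names are illustrative and do not come from this guide.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a device node name matches a pattern.
matches_devnode() {
    # $1 = regular expression, $2 = device node name
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# First cciss pattern quoted in the change entry above.
pattern='^cciss.c[0-9]d[0-9].*'

# Sample node names (assumed for illustration only).
for name in 'cciss!c0d0' 'cciss!c0d0p1' 'sda'; do
    if matches_devnode "$pattern" "$name"; then
        echo "$name: matched by the blacklist pattern"
    else
        echo "$name: not matched"
    fi
done
```

Checking a pattern this way before placing it in a blacklist section helps confirm that it covers both whole devices and their partitions.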
Updates were made to the following section. The changes are explained below.
| Location | Change |
|---|---|
| | Added issues related to partitioning multipath devices. |
Updates were made to the following section. The changes are explained below.
| Location | Change |
|---|---|
| | Added the -p ip:port option to the following commands: iscsiadm -m node -t iqn.2006-02.com.example.iserv:systems -p ip:port --op=update --name=node.startup --value=automatic and iscsiadm -m node -t iqn.2006-02.com.example.iserv:systems -p ip:port --op=delete |
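
The two commands in the entry above can be previewed safely with a dry-run wrapper. This is a sketch only: it assumes the open-iscsi long option names (--mode, --targetname, --portal, --op, --name, --value), the portal address is a placeholder, and the run helper is hypothetical. Nothing is executed unless RUN=1 is set in the environment.

```shell
#!/bin/sh
# Placeholder values; replace them with your own target name and portal.
TARGET='iqn.2006-02.com.example.iserv:systems'
PORTAL='192.0.2.10:3260'   # assumed example address, not from the guide

# Hypothetical wrapper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Set automatic startup for this target on this portal only.
run iscsiadm --mode node --targetname "$TARGET" --portal "$PORTAL" \
    --op update --name node.startup --value automatic

# Delete the node record for this target and portal.
run iscsiadm --mode node --targetname "$TARGET" --portal "$PORTAL" --op delete
```

Naming the portal restricts each operation to a single target and portal record rather than every record for that target name.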
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| | Added a link to |
| | Added examples for using the |

| Location | Change |
|---|---|
| | For clarity, changed “original volume” to “source logical volume”. |
| | In a Xen host environment, you can use the snapshot function to back up the virtual machine’s storage back-end or to test changes to a virtual machine image, such as for patches or upgrades. |
| Section 17.5, “Using Snapshots for Virtual Machines on a Virtual Host” | This section is new. |
| | If the source logical volume contains a virtual machine image, you must shut down the virtual machine, deactivate the source logical volume and snapshot volume (by dismounting them in that order), issue the merge command, and then activate the snapshot volume and source logical volume (by mounting them in that order). Because the source logical volume is automatically remounted and the snapshot volume is deleted when the merge is complete, you should not restart the virtual machine until after the merge is complete. After the merge is complete, you use the resulting logical volume for the virtual machine. |

| Location | Change |
|---|---|
| | Added information about how to prepare for a SLES 10 SP4 to SLES 11 upgrade if the system device is managed by EVMS. |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| | Added information about thin provisioning of LVM logical volumes. |
Chapter 15, Mass Storage over IP Networks: iSCSI LIO Target Server is new.
|
Location |
Change |
|---|---|
|
This section is new. | |
|
Section 7.2.11.1, “Storage Arrays That Are Automatically Detected for Multipathing” |
Updated the list of storage arrays that have a default definition in
the
|
|
The | |
|
This section is new. | |
|
Section 7.4.3, “Configuring the Device Drivers in initrd for Multipathing” |
Added information about the SCSI hardware handlers for
|
|
Section 7.7, “Configuring Default Policies for Polling, Queueing, and Failback” |
Updated the default values list to annotate the deprecated attributes
|
|
Section 7.11.2.1, “Understanding Priority Groups and Attributes” |
The default for
The
Added |
|
Section 7.14, “Scanning for New Devices without Rebooting” Section 7.15, “Scanning for New Partitioned Devices without Rebooting” |
Warning
In EMC PowerPath environments, do not use the
|

| Location | Change |
|---|---|
| Section 10.3.3, “Creating a Complex RAID10 with the YaST Partitioner” | This section is new. |

| Location | Change |
|---|---|
| | The cleanup frequency is defined in the Snapper configuration for the mount point. Lower the TIMELINE_LIMIT parameters for daily, monthly, and yearly snapshots to reduce how long snapshots are kept and how many are retained. |
| | This section is new. |
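
The TIMELINE_LIMIT adjustment described above can be sketched as follows. The example works on a scratch copy rather than on a live configuration under /etc/snapper/configs/. The starting values, the new limits (3, 1, and 0), and the set_limit helper are assumptions chosen for illustration, not recommendations from this guide.

```shell
#!/bin/sh
# Scratch copy for illustration; a real configuration would be
# /etc/snapper/configs/<config-name>.
conf=$(mktemp)
cat > "$conf" <<'EOF'
TIMELINE_CLEANUP="yes"
TIMELINE_LIMIT_HOURLY="10"
TIMELINE_LIMIT_DAILY="10"
TIMELINE_LIMIT_MONTHLY="10"
TIMELINE_LIMIT_YEARLY="10"
EOF

# Hypothetical helper: set_limit KEY VALUE rewrites one KEY="..." line.
set_limit() {
    sed "s/^$1=.*/$1=\"$2\"/" "$conf" > "$conf.new" && mv "$conf.new" "$conf"
}

# Example values only: keep fewer daily, monthly, and yearly snapshots.
set_limit TIMELINE_LIMIT_DAILY 3
set_limit TIMELINE_LIMIT_MONTHLY 1
set_limit TIMELINE_LIMIT_YEARLY 0

cat "$conf"
```

Lower limits take effect the next time the timeline cleanup runs for that configuration.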
Chapter 12, Storage Enclosure LED Utilities for MD Software RAIDs is new.
| Location | Change |
|---|---|
| | Added information about thin provisioning for LVM logical volume snapshots. |
| | This section is new. |

| Location | Change |
|---|---|
| | This section is new. |

| Location | Change |
|---|---|
| | This section is new. |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| Section 1.2.3.4, “Ext3 File System Inode Size and Number of Inodes” | This section was updated to discuss changes to the default settings for inode size and bytes-per-inode ratio in SLES 11. |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| Section 4.6, “Automatically Activating Non-Root LVM Volume Groups” | This section is new. |

| Location | Change |
|---|---|
| | Multiple device support that allows you to grow or shrink the file system. The feature is planned to be available in a future release of the YaST Partitioner. |
Updates were made to the following sections. The changes are explained below.
This section has been modified to focus on the software RAID1 type. Software RAID0 and RAID5 are not supported. They were previously included in error. Additional important changes are noted below.
| Location | Change |
|---|---|
| Section 9.1, “Prerequisites for Using a Software RAID1 Device for the Root Partition” | You need a third device to use for the |
| Step 4 in Section 9.4, “Creating a Software RAID1 Device for the Root (/) Partition” | Create the |
| Step 7.b in Section 9.4, “Creating a Software RAID1 Device for the Root (/) Partition” | Under , select . |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| | Important: If you add multipath support after you have configured LVM, you must modify the |
|
Location |
Change |
|---|---|
|
Ensure that the configuration settings in the
| |
|
Section 7.2.3, “Using WWID, User-Friendly, and Alias Names for Multipathed Devices” |
When using links to multipath-mapped devices in the
|
|
To accept both raw disks and partitions for Device Mapper names,
specify the path as follows, with no hyphen (-) before
filter = [ "a|/dev/disk/by-id/dm-uuid-.*mpath-.*|", "r|.*|" ] | |
|
Ensure that the configuration files for | |
|
Section 7.9, “Configuring User-Friendly Names or Alias Names” |
Before you begin, review the requirements in Section 7.2.3, “Using WWID, User-Friendly, and Alias Names for Multipathed Devices”. |
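
The accept pattern in the LVM filter entry above can be checked against sample device links to see why the placement of the hyphen matters. This sketch assumes that the expression between "a|" and the closing "|" is matched anywhere in the path. The device links, the comparison pattern with the extra hyphen, and the accepted helper are hypothetical.

```shell
#!/bin/sh
# Accept pattern taken from the filter line quoted above.
accept='/dev/disk/by-id/dm-uuid-.*mpath-.*'
# Variant with an extra hyphen, shown only for comparison.
with_hyphen='/dev/disk/by-id/dm-uuid-.*-mpath-.*'

# Hypothetical links: a whole multipath disk, one of its partitions,
# and a plain SCSI link that the filter should not accept.
disk='/dev/disk/by-id/dm-uuid-mpath-3600a0b80000f4593'
part='/dev/disk/by-id/dm-uuid-part1-mpath-3600a0b80000f4593'
scsi='/dev/disk/by-id/scsi-3600a0b80000f4593'

# Hypothetical helper: succeed if the pattern matches anywhere in the path.
accepted() {
    # $1 = pattern, $2 = path
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

for path in "$disk" "$part" "$scsi"; do
    for pat in "$accept" "$with_hyphen"; do
        if accepted "$pat" "$path"; then result=accept; else result=reject; fi
        echo "$result  $pat  $path"
    done
done
```

With these sample names, the pattern from the guide accepts both the disk and the partition link, while the variant with the extra hyphen accepts only the partition link.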
Updates were made to the following section. The changes are explained below.
| Location | Change |
|---|---|
| | This section has been updated to be consistent with File System Support and Sizes on the SUSE Linux Enterprise Server Technical Information Web site. |
| | This section is new. |
Corrections for front matter and typographical errata.
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| Section 7.6, “Creating or Modifying the /etc/multipath.conf File” | Specific examples for configuring devices were moved to a higher organization level. |
| Section 7.6.3, “Verifying the Multipath Setup in the /etc/multipath.conf File” | It is assumed that the |

| Location | Change |
|---|---|
| Section 1.2.3.4, “Ext3 File System Inode Size and Number of Inodes” | To allow space for extended attributes and ACLs for a file on Ext3 file systems, the default inode size for Ext3 was increased from 128 bytes on SLES 10 to 256 bytes on SLES 11. |
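
The space cost of the larger inode size can be estimated with simple arithmetic: the number of inodes is roughly the file system size divided by the bytes-per-inode ratio, and the inode tables occupy that count multiplied by the inode size. The sketch below is an approximation only. The 100 GiB size and the ratio of 16384 are assumed example values, the inode_table_bytes helper is hypothetical, and the file system creation tools round the inode count, so real figures differ slightly.

```shell
#!/bin/sh
# Hypothetical helper: approximate bytes used by the inode tables.
inode_table_bytes() {
    # $1 = file system size in bytes
    # $2 = bytes-per-inode ratio
    # $3 = inode size in bytes
    echo $(( $1 / $2 * $3 ))
}

size=$((100 * 1024 * 1024 * 1024))   # assumed example: 100 GiB
ratio=16384                          # assumed example ratio

for isize in 128 256; do
    bytes=$(inode_table_bytes "$size" "$ratio" "$isize")
    echo "inode size $isize bytes: about $((bytes / 1024 / 1024)) MiB of inode tables"
done
```

Doubling the inode size doubles the space reserved for inode tables at a given ratio, which is why the inode size and the bytes-per-inode ratio are usually considered together.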
Updates were made to the following sections. The changes are explained below.
|
Location |
Change |
|---|---|
|
In SLES 11 SP2, the
Multipath Tools 0.4.9 and later uses the | |
|
Section 7.11.2.1, “Understanding Priority Groups and Attributes” |
Added information about using |
|
Section 7.11.2.1, “Understanding Priority Groups and Attributes” |
In SLES 11 SP2, the rr_min_io multipath attribute is obsoleted and replaced by the rr_min_io_rq attribute. |
|
Section 7.19.1, “PRIO Settings for Individual Devices Fail After Upgrading to Multipath 0.4.9” |
This section is new. |
|
Section 7.19.2, “PRIO Settings with Arguments Fail After Upgrading to multipath-tools-0.4.9” |
This section is new. |
|
Fixed broken links. |

| Location | Change |
|---|---|
| | The Btrfs tools package is |
Updates were made to the following sections. The changes are explained below.
This section was updated to use the /dev/disk/by-id directory for device paths in all examples.
| Location | Change |
|---|---|
| | Revised the order of the procedures so that you reduce the size of the RAID before you reduce the individual component partition sizes. |
Updates were made to the following sections. The changes are explained below.
This section is new. Open Fibre Channel over Ethernet (OpenFCoE) is supported beginning in SLES 11.
This section is new.
| Location | Change |
|---|---|
| | This section is new. |
This section is new.
| Location | Change |
|---|---|
| Section 7.7, “Configuring Default Policies for Polling, Queueing, and Failback” | The default |
| | In the |
| Section 7.11.2.1, “Understanding Priority Groups and Attributes” | Recommendations were added for the no_path_retry and failback settings when multipath I/O is used in a cluster environment. |
| Section 7.11.2.1, “Understanding Priority Groups and Attributes” | The path-selector option names and settings were corrected: |

| Location | Change |
|---|---|
| | This section is new. |

| Location | Change |
|---|---|
| | This section is new. |
| Section 14.5.4, “iSCSI Targets Are Mounted When the Configuration File Is Set to Manual” | This section is new. |

| Location | Change |
|---|---|
| | This section is new. With SUSE Linux Enterprise 11 SP2, the Btrfs file system is supported as the root file system, that is, the file system for the operating system, across all architectures of SUSE Linux Enterprise 11 SP2. |
| | Important: The ReiserFS file system is fully supported for the lifetime of SUSE Linux Enterprise Server 11 specifically for migration purposes. SUSE plans to remove support for creating new ReiserFS file systems starting with SUSE Linux Enterprise Server 12. |
| | This section is new. |
| | The values in this section were updated to current standards. |
| | This section is new. |

| Location | Change |
|---|---|
| Section 5.1, “Guidelines for Resizing”; Section 5.4, “Decreasing the Size of an Ext2 or Ext3 File System” | The resize2fs command allows only the Ext3 file system to be resized if mounted. The size of an Ext3 volume can be increased or decreased when the volume is mounted or unmounted. The Ext2/4 file systems must be unmounted for increasing or decreasing the volume size. |
|
Location |
Change |
|---|---|
|
The |
Updates were made to the following section. The changes are explained below.
| Location | Change |
|---|---|
| Section 7.2.4, “Using LVM2 on Multipath Devices”; Section 7.13, “Configuring Multipath I/O for an Existing Software RAID” | Running |
Updates were made to the following section. The changes are explained below.
| Location | Change |
|---|---|
| Section 7.11.2.1, “Understanding Priority Groups and Attributes” | The default setting for path_grouping_policy changed from multibus to failover in SLES 11. |

| Location | Change |
|---|---|
| | This section is new. |
This release fixes broken links and removes obsolete references.
Updates were made to the following section. The changes are explained below.
| Location | Change |
|---|---|
| | LVM2 does not restrict the number of physical extents. Having a large number of extents has no impact on I/O performance to the logical volume, but it slows down the LVM tools. |

| Location | Change |
|---|---|
| Tuning the Failover for Specific Host Bus Adapters | This section was removed. For HBA failover guidance, refer to your vendor documentation. |

| Location | Change |
|---|---|
| | Decreasing the size of the file system is supported when the file system is unmounted. |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| | The discussion and procedure were expanded to explain how to configure a partition that uses the entire disk. The procedure was modified to use the Hard Disk partitioning feature in the YaST Partitioner. |
| All LVM Management sections | Procedures throughout the chapter were modified to use Volume Management in the YaST Partitioner. |
| | This section is new. |
| | This section is new. |
| | This section is new. |
| | This section is new. |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| | Details were added to the procedure. |

| Location | Change |
|---|---|
| Section 7.9, “Configuring User-Friendly Names or Alias Names” | Using user-friendly names for the root device can result in data loss. Added alternatives from TID 7001133: Recommendations for the usage of user_friendly_names in multipath configurations. |

| Location | Change |
|---|---|
| | Errata in the example were corrected. |

| Location | Change |
|---|---|
| Section 14.5.1, “Hotplug Doesn’t Work for Mounting iSCSI Targets” | This section is new. |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| | The example in Step 3 was corrected. |
| Section 7.2.8, “SAN Timeout Settings When the Root Device Is Multipathed” | This section is new. |
| | The file list for a package can vary for different server architectures. For a list of files included in the multipath-tools package, go to the Web page, find your architecture and select , then search on “multipath-tools” to find the package list for that architecture. |
| | If the SAN device will be used as the root device on the server, modify the timeout settings for the device as described in Section 7.2.8, “SAN Timeout Settings When the Root Device Is Multipathed”. |
| Section 7.6.3, “Verifying the Multipath Setup in the /etc/multipath.conf File” | Added example output for -v3 verbosity. |
| Section 7.12.1.1, “Enabling Multipath I/O at Install Time on an Active/Active Multipath Storage LUN” | This section is new. |
| | This section is new. |

| Location | Change |
|---|---|
| Step 7.g in Section 14.2.2, “Creating iSCSI Targets with YaST” | In the function, the option allows you to export the iSCSI target information, which makes it easier to provide this information to consumers of the resources. |
| | This section is new. |

| Location | Change |
|---|---|
| | The Software RAID HOW-TO has been deprecated. Use the Linux RAID wiki instead. |

| Location | Change |
|---|---|
| | This section is new. |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| Section 9.1, “Prerequisites for Using a Software RAID1 Device for the Root Partition” | Corrected an error in the RAID 0 definition. |

| Location | Change |
|---|---|
| | Added information about using the |
| Section 7.15, “Scanning for New Partitioned Devices without Rebooting” | Added information about using the |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| Section 7.2.4, “Using LVM2 on Multipath Devices”; Section 7.13, “Configuring Multipath I/O for an Existing Software RAID” | The -f mpath option changed to -f multipath: mkinitrd -f multipath |
| prio_callout (prio_callout) in Section 7.11.2.1, “Understanding Priority Groups and Attributes” | Multipath prio_callout programs are located in shared libraries in |

| Location | Change |
|---|---|
| | The resize2fs utility supports online or offline resizing for the ext3 file system. |

| Location | Change |
|---|---|
| Section 2.5.11, “Location Change for Multipath Tool Callouts” | This section is new. |
| Section 2.5.12, “Change from mpath to multipath for the mkinitrd -f Option” | This section is new. |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| | In the YaST Control Center, select › . |

| Location | Change |
|---|---|
| | The keyword devnode_blacklist has been deprecated and replaced with the keyword blacklist. |
| Section 7.7, “Configuring Default Policies for Polling, Queueing, and Failback” | Changed getuid_callout to getuid. |
| Section 7.11.2.1, “Understanding Priority Groups and Attributes” | Changed getuid_callout to getuid. |
| Section 7.11.2.1, “Understanding Priority Groups and Attributes” | Added descriptions for path_selector of least-pending, length-load-balancing, and service-time options. |
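
When you carry a multipath configuration forward from an older release, a quick scan for keywords that this change log reports as replaced can help. The sketch below looks only for devnode_blacklist (replaced by blacklist) and rr_min_io (replaced by rr_min_io_rq). The find_deprecated helper and the sample configuration are hypothetical, and the check is a simple text match, not a parser.

```shell
#!/bin/sh
# Hypothetical helper: list lines in FILE that start with a keyword
# reported as replaced in this change log.
find_deprecated() {
    grep -nE '^[[:space:]]*(devnode_blacklist|rr_min_io)([[:space:]{]|$)' "$1"
}

# Sample configuration used only to exercise the check.
conf=$(mktemp)
cat > "$conf" <<'EOF'
devnode_blacklist {
        devnode "^hd[a-z]"
}
defaults {
        rr_min_io_rq 1
}
EOF

if find_deprecated "$conf"; then
    echo "deprecated keywords found"
else
    echo "no deprecated keywords found"
fi
```

The trailing character class keeps rr_min_io_rq from being reported as a use of rr_min_io.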

| Location | Change |
|---|---|
| Section 2.5.10, “Advanced I/O Load-Balancing Options for Multipath” | This section is new. |
Updates were made to the following section. The change is explained below.
| Location | Change |
|---|---|
| | This section is new. |
Updates were made to the following sections. The changes are explained below.
| Location | Change |
|---|---|
| Section 7.12, “Configuring Multipath I/O for the Root Device” | Added Step 4 and Step 6 for System Z. |
| Section 7.15, “Scanning for New Partitioned Devices without Rebooting” | Corrected the syntax for the command lines in Step 2. |
| Section 7.15, “Scanning for New Partitioned Devices without Rebooting” | Step 7 replaces old Step 7 and Step 8. |

| Location | Change |
|---|---|
| | To watch the rebuild progress, refreshed every second, enter watch -n 1 cat /proc/mdstat |
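
If you want only the progress figure rather than the full /proc/mdstat listing, the percentage can be extracted with a short filter. This is a sketch under assumptions: the rebuild_progress helper is hypothetical, and the sample text imitates the usual /proc/mdstat layout but is not real output from a system.

```shell
#!/bin/sh
# Hypothetical helper: print "<md device> <percent>" for each array that
# is recovering or resyncing. Reads /proc/mdstat unless a file is given.
rebuild_progress() {
    awk '
        /^md/ { dev = $1 }                  # remember the current array
        /(recovery|resync) *=/ {            # progress line for that array
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print dev, $i
        }
    ' "${1:-/proc/mdstat}"
}

# Illustrative sample in the usual layout (assumed values).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks [2/1] [U_]
      [==>..................]  recovery = 12.6% (132096/1048512) finish=0.5min speed=26419K/sec

unused devices: <none>
EOF

rebuild_progress "$sample"
```

Called without an argument, the function reads /proc/mdstat directly, so it can be used inside a watch command in place of cat.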

| Location | Change |
|---|---|
| Section 14.3.1, “Using YaST for the iSCSI Initiator Configuration” | Re-organized material for clarity. Added information about how to use the settings for the Start-up option for iSCSI target devices: |
Updates were made to the following section. The changes are explained below.
| Location | Change |
|---|---|
| Section 7.2.11.1, “Storage Arrays That Are Automatically Detected for Multipathing” | Testing of the IBM zSeries device with multipathing has shown that the dev_loss_tmo parameter should be set to 90 seconds, and the fast_io_fail_tmo parameter should be set to 5 seconds. If you are using zSeries devices, you must manually create and configure the |
| | Multipathing is supported for the |
| Section 7.10, “Configuring Default Settings for zSeries Devices” | This section is new. |
| Section 7.12, “Configuring Multipath I/O for the Root Device” | DM-MP is now available and supported for |
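
The zSeries timeout values noted in the table above (dev_loss_tmo of 90 seconds and fast_io_fail_tmo of 5 seconds) could be expressed as a configuration fragment like the one below. This is a sketch only: placing the two attributes in the defaults section of /etc/multipath.conf is an assumption, and the example writes to a scratch file so that it does not touch a live configuration.

```shell
#!/bin/sh
# Scratch file for illustration; the live file is assumed to be
# /etc/multipath.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
defaults {
        dev_loss_tmo      90
        fast_io_fail_tmo  5
}
EOF

cat "$conf"
```

Review the generated fragment against Section 7.10, “Configuring Default Settings for zSeries Devices” before merging it into an existing configuration.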