Applies to SUSE Linux Enterprise High Availability Extension 12 SP5

21 Cluster Logical Volume Manager (cLVM)

Abstract

When managing shared storage on a cluster, every node must be informed about changes that are done to the storage subsystem. The Logical Volume Manager 2 (LVM2), which is widely used to manage local storage, has been extended to support transparent management of volume groups across the whole cluster. Clustered volume groups can be managed using the same commands as local storage.

21.1 Conceptual Overview

Clustered LVM2 is coordinated with different tools:

Distributed Lock Manager (DLM)

Coordinates disk access for cLVM and mediates metadata access through locking.

Logical Volume Manager2 (LVM2)

Enables flexible distribution of one file system over several disks. LVM2 provides a virtual pool of disk space.

Clustered Logical Volume Manager (cLVM)

Coordinates access to the LVM2 metadata so every node knows about changes. cLVM does not coordinate access to the shared data itself; to enable cLVM to do so, you must configure OCFS2 or other cluster-aware applications on top of the cLVM-managed storage.

21.2 Configuration of cLVM

Depending on your scenario, you can create a RAID 1 device with cLVM on top of different storage layers, for example iSCSI on SANs or DRBD (see the scenarios in Section 21.2.2 to Section 21.2.4).

Make sure you have fulfilled the following prerequisites:

  • A shared storage device is available, such as provided by a Fibre Channel, FCoE, SCSI, iSCSI SAN, or DRBD*.

  • In the case of DRBD, both nodes must be primary (as described in Procedure 21.7, “Creating a Cluster-Aware Volume Group With DRBD”).

  • Check that the locking type of LVM2 is cluster-aware: the keyword locking_type in /etc/lvm/lvm.conf must be set to 3 (the default is 1). Copy the configuration to all nodes, if necessary.

  • Check that the lvmetad daemon is disabled, because it cannot work with cLVM: in /etc/lvm/lvm.conf, the keyword use_lvmetad must be set to 0 (the default is 1). Copy the configuration to all nodes, if necessary. See the example after this list.
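
For example, you can check both settings on one node and copy the adjusted configuration file to the other nodes (the node name node2 below is a placeholder):

    root # grep -E "^[[:space:]]*(locking_type|use_lvmetad)[[:space:]]*=" /etc/lvm/lvm.conf
        locking_type = 3
        use_lvmetad = 0
    root # scp /etc/lvm/lvm.conf node2:/etc/lvm/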

21.2.1 Creating the Cluster Resources

Preparing the cluster for use of cLVM includes the following basic steps:

Procedure 21.1: Creating a DLM Resource
  1. Start a shell and log in as root.

  2. Check the current configuration of the cluster resources:

    root # crm configure show
  3. If you have already configured a DLM resource (and a corresponding base group and base clone), continue with Procedure 21.2, “Configuring DLM, CLVM, and STONITH”.

    Otherwise, configure a DLM resource and a corresponding base group and base clone as described in Procedure 17.1, “Configuring a Base Group for DLM”.

  4. Leave the crm live configuration with exit.

21.2.2 Scenario: Configuring Cmirrord

The cmirrord daemon tracks mirror log information in a cluster. Cluster mirrors are not possible without this daemon running.

We assume that /dev/sda and /dev/sdb are the shared storage devices, for example provided by DRBD, iSCSI, or similar. Replace these with your own device names, if necessary. Proceed as follows:

Procedure 21.2: Configuring DLM, CLVM, and STONITH
  1. Create a cluster with at least two nodes as described in Installation and Setup Quick Start.

  2. Configure your cluster to run dlm, clvmd, and STONITH:

    root # crm configure
    crm(live)configure# primitive clvmd ocf:heartbeat:clvm \
            params with_cmirrord=1 \
            op stop interval=0 timeout=100 \
            op start interval=0 timeout=90 \
            op monitor interval=20 timeout=20
    crm(live)configure# primitive dlm ocf:pacemaker:controld \
            op start timeout="90" \
            op stop timeout="100" \
            op monitor interval="60" timeout="60"
    crm(live)configure# primitive sbd_stonith stonith:external/sbd \
            params pcmk_delay_max=30
    crm(live)configure# group g-storage dlm clvmd
    crm(live)configure# clone cl-storage g-storage \
            meta interleave="true" ordered=true
  3. Leave crmsh with exit and commit your changes.
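
    Before continuing, you can verify that the resources have started on all nodes with crm status. The cl-storage clone set should be reported as started on every cluster node. In the following illustrative excerpt, the node names alice and bob are placeholders:

    root # crm status
    ...
    Clone Set: cl-storage [g-storage]
        Started: [ alice bob ]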

Continue configuring your disks with Procedure 21.3.

Procedure 21.3: Configuring the Disks for cLVM
  1. Create a clustered volume group (VG):

    root # pvcreate /dev/sda /dev/sdb
    root # vgcreate -cy vg1 /dev/sda /dev/sdb
  2. Create a mirrored-log logical volume (LV) in your cluster:

    root # lvcreate -n lv1 -m1 -l10%VG vg1 --mirrorlog mirrored
  3. Use lvs to monitor the progress. When the percentage has reached 100%, the mirrored disk is successfully synchronized (see the example after this procedure).

  4. To test the clustered volume /dev/vg1/lv1, use the following steps:

    1. Read or write to /dev/vg1/lv1.

    2. Deactivate your LV with lvchange -an.

    3. Activate your LV with lvchange -ay.

    4. Use lvconvert to convert a mirrored log to a disk log.

  5. Create a mirrored-log LV in another clustered VG (a different volume group from the previous one).
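
For example, the following commands illustrate steps 3 and 4 of this procedure, using the volume group and LV names vg1 and lv1 from above; the copy_percent field reports the synchronization progress of the mirror:

    root # lvs -o lv_name,copy_percent vg1
    root # lvchange -an vg1/lv1
    root # lvchange -ay vg1/lv1
    root # lvconvert --mirrorlog disk vg1/lv1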

The current cLVM can only handle one physical volume (PV) per mirror side. If one mirror side actually consists of several PVs that need to be concatenated or striped, lvcreate does not understand this. For this reason, lvcreate and the cmirrord metadata need to understand the grouping of PVs into one side, effectively supporting RAID10.

To support RAID10 for cmirrord, use the following procedure (assuming that /dev/sda, /dev/sdb, /dev/sdc, and /dev/sdd are the shared storage devices):

  1. Create a volume group (VG):

    root # pvcreate /dev/sda /dev/sdb /dev/sdc /dev/sdd
      Physical volume "/dev/sda" successfully created
      Physical volume "/dev/sdb" successfully created
      Physical volume "/dev/sdc" successfully created
      Physical volume "/dev/sdd" successfully created
    root # vgcreate vgtest /dev/sda /dev/sdb /dev/sdc /dev/sdd
      Clustered volume group "vgtest" successfully created
  2. Open the file /etc/lvm/lvm.conf and go to the section allocation. Set the following line and save the file:

    mirror_logs_require_separate_pvs = 1
  3. Add your tags to your PVs:

    root # pvchange --addtag @a /dev/sda /dev/sdb
    root # pvchange --addtag @b /dev/sdc /dev/sdd

    A tag is a keyword or term assigned to the metadata of a storage object. Tagging allows you to classify collections of LVM2 storage objects in ways that you find useful, by attaching an unordered list of tags to their metadata.

  4. List your tags:

    root # pvs -o pv_name,vg_name,pv_tags /dev/sd{a,b,c,d}

    You should receive this output:

    PV         VG       PV Tags
    /dev/sda   vgtest   a
    /dev/sdb   vgtest   a
    /dev/sdc   vgtest   b
    /dev/sdd   vgtest   b
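
    As a sketch, you can then create a mirrored LV in vgtest as in Procedure 21.3, “Configuring the Disks for cLVM”, and use the devices field of lvs to check on which PVs the mirror images were allocated (the LV name and size below are only examples):

    root # lvcreate -n lv1 -m1 -l10%VG vgtest --mirrorlog mirrored
    root # lvs -a -o lv_name,devices vgtest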

If you need further information regarding LVM2, refer to the SUSE Linux Enterprise Server 12 SP5 Storage Administration Guide: https://documentation.suse.com/sles-12/html/SLES-all/cha-lvm.html.

21.2.3 Scenario: cLVM with iSCSI on SANs

The following scenario uses two SAN boxes which export their iSCSI targets to several clients. The general idea is displayed in Figure 21.1, “Setup of iSCSI with cLVM”.

Figure 21.1: Setup of iSCSI with cLVM
Warning: Data Loss

The following procedures will destroy any data on your disks!

Configure only one SAN box first. Each SAN box needs to export its own iSCSI target. Proceed as follows:

Procedure 21.4: Configuring iSCSI Targets (SAN)
  1. Run YaST and click Network Services › iSCSI LIO Target to start the iSCSI Server module.

  2. If you want to start the iSCSI target whenever your computer is booted, choose When Booting, otherwise choose Manually.

  3. If you have a firewall running, enable Open Port in Firewall.

  4. Switch to the Global tab. If you need authentication, enable incoming or outgoing authentication, or both. In this example, we select No Authentication.

  5. Add a new iSCSI target:

    1. Switch to the Targets tab.

    2. Click Add.

    3. Enter a target name. The name needs to be formatted like this:

      iqn.DATE.DOMAIN

      For more information about the format, refer to Section 3.2.6.3.1, “Type "iqn." (iSCSI Qualified Name)”, of RFC 3720 at http://www.ietf.org/rfc/rfc3720.txt.

    4. If you want a more descriptive name, you can change it as long as your identifier is unique for your different targets.

    5. Click Add.

    6. Enter the device name in Path and use a Scsiid.

    7. Click Next twice.

  6. Confirm the warning box with Yes.

  7. Open the configuration file /etc/iscsi/iscsid.conf and change the parameter node.startup to automatic.
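
    The relevant line in /etc/iscsi/iscsid.conf then reads:

    node.startup = automatic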

Now set up your iSCSI initiators as follows:

Procedure 21.5: Configuring iSCSI Initiators
  1. Run YaST and click Network Services › iSCSI Initiator.

  2. If you want to start the iSCSI initiator whenever your computer is booted, choose When Booting, otherwise set Manually.

  3. Change to the Discovery tab and click the Discovery button.

  4. Add the IP address and the port of your iSCSI target (see Procedure 21.4, “Configuring iSCSI Targets (SAN)”). Normally, you can leave the port as it is and use the default value.

  5. If you use authentication, insert the incoming and outgoing user name and password, otherwise activate No Authentication.

  6. Select Next. The found connections are displayed in the list.

  7. Proceed with Finish.

  8. Open a shell and log in as root.

  9. Test if the iSCSI initiator has been started successfully:

    root # iscsiadm -m discovery -t st -p 192.168.3.100
    192.168.3.100:3260,1 iqn.2010-03.de.jupiter:san1
  10. Establish a session:

    root # iscsiadm -m node -l -p 192.168.3.100 -T iqn.2010-03.de.jupiter:san1
    Logging in to [iface: default, target: iqn.2010-03.de.jupiter:san1, portal: 192.168.3.100,3260]
    Login to [iface: default, target: iqn.2010-03.de.jupiter:san1, portal: 192.168.3.100,3260]: successful

    See the device names with lsscsi:

    ...
    [4:0:0:2]    disk    IET      ...     0     /dev/sdd
    [5:0:0:1]    disk    IET      ...     0     /dev/sde

    Look for entries with IET in their third column. In this case, the devices are /dev/sdd and /dev/sde.

Procedure 21.6: Creating the LVM2 Volume Groups
  1. Open a root shell on one of the nodes on which you ran the iSCSI initiator from Procedure 21.5, “Configuring iSCSI Initiators”.

  2. Prepare the physical volumes for LVM2 with the command pvcreate on the disks /dev/sdd and /dev/sde:

    root # pvcreate /dev/sdd
    root # pvcreate /dev/sde
  3. Create the cluster-aware volume group on both disks:

    root # vgcreate --clustered y clustervg /dev/sdd /dev/sde
  4. Create logical volumes as needed:

    root # lvcreate -m1 --name clusterlv --size 500M clustervg
  5. Check the physical volume with pvdisplay:

          --- Physical volume ---
          PV Name               /dev/sdd
          VG Name               clustervg
          PV Size               509,88 MB / not usable 1,88 MB
          Allocatable           yes
          PE Size (KByte)       4096
          Total PE              127
          Free PE               127
          Allocated PE          0
          PV UUID               52okH4-nv3z-2AUL-GhAN-8DAZ-GMtU-Xrn9Kh
    
          --- Physical volume ---
          PV Name               /dev/sde
          VG Name               clustervg
          PV Size               509,84 MB / not usable 1,84 MB
          Allocatable           yes
          PE Size (KByte)       4096
          Total PE              127
          Free PE               127
          Allocated PE          0
          PV UUID               Ouj3Xm-AI58-lxB1-mWm2-xn51-agM2-0UuHFC
  6. Check the volume group with vgdisplay:

          --- Volume group ---
          VG Name               clustervg
          System ID
          Format                lvm2
          Metadata Areas        2
          Metadata Sequence No  1
          VG Access             read/write
          VG Status             resizable
          Clustered             yes
          Shared                no
          MAX LV                0
          Cur LV                0
          Open LV               0
          Max PV                0
          Cur PV                2
          Act PV                2
          VG Size               1016,00 MB
          PE Size               4,00 MB
          Total PE              254
          Alloc PE / Size       0 / 0
          Free  PE / Size       254 / 1016,00 MB
          VG UUID               UCyWw8-2jqV-enuT-KH4d-NXQI-JhH3-J24anD

After you have created the volumes and started your resources, you should have a new device named /dev/dm-*. It is recommended to use a clustered file system on top of your LVM2 resource, for example OCFS2. For more information, see Chapter 18, OCFS2.
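
For example, an OCFS2 file system could be created on the new logical volume like this (a sketch; the OCFS2 cluster resources described in Chapter 18 need to be configured first):

    root # mkfs.ocfs2 /dev/clustervg/clusterlv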

21.2.4 Scenario: cLVM with DRBD

The following scenario can be used if your data centers are located in different parts of your city, country, or continent.

Procedure 21.7: Creating a Cluster-Aware Volume Group With DRBD
  1. Create a primary/primary DRBD resource:

    1. First, set up a DRBD device as primary/secondary as described in Procedure 20.1, “Manually Configuring DRBD”. Make sure the disk state is up-to-date on both nodes. Check this with drbdadm status.

    2. Add the following options to your configuration file (usually something like /etc/drbd.d/r0.res):

      resource r0 {
        net {
           allow-two-primaries;
        }
        ...
      }
    3. Copy the changed configuration file to the other node, for example:

      root # scp /etc/drbd.d/r0.res venus:/etc/drbd.d/
    4. Run the following commands on both nodes:

      root # drbdadm disconnect r0
      root # drbdadm connect r0
      root # drbdadm primary r0
    5. Check the status of your nodes:

      root # drbdadm status r0
  2. Include the clvmd resource as a clone in the Pacemaker configuration, and make it depend on the DLM clone resource. See Procedure 21.1, “Creating a DLM Resource” for detailed instructions. Before proceeding, confirm that these resources have started successfully on your cluster. You may use crm status or the Web interface to check the running services.

  3. Prepare the physical volume for LVM2 with the command pvcreate. For example, on the device /dev/drbd_r0 the command would look like this:

    root # pvcreate /dev/drbd_r0
  4. Create a cluster-aware volume group:

    root # vgcreate --clustered y myclusterfs /dev/drbd_r0
  5. Create logical volumes as needed. You will probably want to adjust the size of the logical volume. For example, create a 4 GB logical volume with the following command:

    root # lvcreate --name testlv -L 4G myclusterfs
  6. The logical volumes within the VG are now available as file system mounts or for raw usage. Make sure that services using them have the proper constraints, so that they are colocated with the activated VG and ordered after it; see the sketch below.
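
    As a sketch, assuming a hypothetical file system resource fs1 and the cl-storage clone from Procedure 21.2, “Configuring DLM, CLVM, and STONITH”, such constraints could look like this:

    crm(live)configure# colocation col-fs-with-storage inf: fs1 cl-storage
    crm(live)configure# order o-storage-before-fs Mandatory: cl-storage fs1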

After finishing these configuration steps, the LVM2 configuration can be done like on any stand-alone workstation.

21.3 Configuring Eligible LVM2 Devices Explicitly

When several devices seemingly share the same physical volume signature (as can be the case for multipath devices or DRBD), it is recommended to explicitly configure the devices which LVM2 scans for PVs.

For example, if the command vgcreate uses the physical device instead of the mirrored block device, DRBD becomes confused, which may result in a split brain condition for DRBD.

To deactivate a single device for LVM2, do the following:

  1. Edit the file /etc/lvm/lvm.conf and search for the line starting with filter.

  2. The patterns there are handled as regular expressions. A leading a accepts devices matching the pattern for the scan; a leading r rejects devices matching the pattern.

  3. To remove a device named /dev/sdb1, add the following expression to the filter rule:

    "r|^/dev/sdb1$|"

    The complete filter line will look like the following:

    filter = [ "r|^/dev/sdb1$|", "r|/dev/.*/by-path/.*|", "r|/dev/.*/by-id/.*|", "a/.*/" ]

    A filter line that accepts DRBD and MPIO devices but rejects all other devices would look like this:

    filter = [ "a|/dev/drbd.*|", "a|/dev/.*/by-id/dm-uuid-mpath-.*|", "r/.*/" ]
  4. Write the configuration file and copy it to all cluster nodes.
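
    For example, the updated configuration could be distributed to a second node (named node2 here as a placeholder) and the LVM2 device scan refreshed like this:

    root # scp /etc/lvm/lvm.conf node2:/etc/lvm/
    root # pvscan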

21.4 For More Information

Thorough information is available from the Pacemaker mailing list; see http://www.clusterlabs.org/wiki/Help:Contents.

The official cLVM FAQ can be found at http://sources.redhat.com/cluster/wiki/FAQ/CLVM.
