Applies to SUSE Linux Enterprise High Availability Extension 15 SP3

17 Distributed Lock Manager (DLM)

The Distributed Lock Manager (DLM) in the kernel is the base component used by OCFS2, GFS2, Cluster MD, and Cluster LVM (lvmlockd) to provide active-active storage at each respective layer.

17.1 Protocols for DLM communication

To avoid single points of failure, redundant communication paths are important for High Availability clusters. This is also true for DLM communication. If network bonding (Link Aggregation Control Protocol, LACP) cannot be used for any reason, we highly recommend defining a redundant communication channel (a second ring) in Corosync. For details, see Procedure 4.3, “Defining a redundant communication channel”.

DLM communicates through port 21064 using either the TCP or SCTP protocol, depending on the configuration in /etc/corosync/corosync.conf:

  • If rrp_mode is set to none (which means redundant ring configuration is disabled), DLM automatically uses TCP. However, without a redundant communication channel, DLM communication will fail if the TCP link is down.

  • If rrp_mode is set to passive (which is the typical setting), and a second communication ring in /etc/corosync/corosync.conf is configured correctly, DLM automatically uses SCTP. In this case, DLM messaging has the redundancy capability provided by SCTP.
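
As an illustration of the second case, the following is a minimal sketch of the totem section in /etc/corosync/corosync.conf with two rings. All network addresses, multicast addresses, and ports are hypothetical placeholders, and the sketch assumes multicast transport. With unicast transport, each node entry in the nodelist section would carry a ring1_addr next to its ring0_addr instead.

```text
totem {
    version: 2
    # passive: two rings are configured, so DLM is expected to use SCTP
    rrp_mode: passive
    interface {
        # first ring (placeholder network)
        ringnumber: 0
        bindnetaddr: 192.168.1.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
    }
    interface {
        # second ring on a separate network (placeholder values)
        ringnumber: 1
        bindnetaddr: 10.0.1.0
        mcastaddr: 239.255.2.1
        mcastport: 5407
    }
}
```

After restarting the cluster stack, corosync-cfgtool -s should report the status of both rings.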

17.2 Configuring DLM cluster resources

DLM uses the cluster membership services from Pacemaker, which run in user space. Therefore, DLM needs to be configured as a clone resource that is present on each node in the cluster.

Note: DLM resource for several solutions

Because OCFS2, GFS2, Cluster MD, and Cluster LVM (lvmlockd) all use DLM, a single DLM resource is enough. Because the DLM resource runs on all nodes in the cluster, it is configured as a clone resource.

If you have a setup that includes both OCFS2 and Cluster LVM, configuring one DLM resource for both OCFS2 and Cluster LVM is enough. In this case, configure DLM using Procedure 17.1, “Configuring a base group for DLM”.

However, if you need to keep the resources that use DLM independent from one another (such as multiple OCFS2 mount points), use separate colocation and order constraints instead of a group. In this case, configure DLM using Procedure 17.2, “Configuring an independent DLM resource”.

Procedure 17.1: Configuring a base group for DLM

This configuration consists of a base group that includes several primitives and a base clone. Both base group and base clone can be used in various scenarios afterward (for both OCFS2 and Cluster LVM, for example). You only need to extend the base group with the respective primitives as needed. As the base group has internal colocation and ordering, this simplifies the overall setup as you do not need to specify several individual groups, clones and their dependencies.

  1. Log in to a node as root or equivalent.

  2. Run crm configure.

  3. Create the primitive resource for DLM:

    crm(live)configure# primitive dlm ocf:pacemaker:controld \
      op monitor interval="60" timeout="60"

  4. Create a base group for the dlm resource and further storage-related resources:

    crm(live)configure# group g-storage dlm

  5. Clone the g-storage group so that it runs on all nodes:

    crm(live)configure# clone cl-storage g-storage \
      meta interleave=true target-role=Started

  6. Review your changes with show.

  7. If everything is correct, submit your changes with commit and leave the crm live configuration with quit.
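
As a sketch of how the base group might later be extended, the following session adds an OCFS2 file system primitive to g-storage. The resource name ocfs2-1, the device path (DEVICE_ID is a placeholder), the mount point, and the monitor values are assumptions for illustration and are not part of the procedure above.

```text
# Hypothetical OCFS2 file system resource; adjust device and directory
crm(live)configure# primitive ocfs2-1 ocf:heartbeat:Filesystem \
  params device="/dev/disk/by-id/DEVICE_ID" directory="/mnt/shared" fstype="ocfs2" \
  op monitor interval="20" timeout="40"
# Append the new primitive to the existing base group
crm(live)configure# modgroup g-storage add ocfs2-1
```

Because a group starts its members in order and keeps them on the same node, ocfs2-1 starts only after dlm without further constraints.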

Note: Failure when disabling STONITH

Clusters without STONITH are not supported. If you set the global cluster option stonith-enabled to false for testing or troubleshooting purposes, the DLM resource and all services depending on it (such as Cluster LVM, GFS2, and OCFS2) will fail to start.
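
If the option was disabled for a test, a session along the following lines could be used to check it and turn it back on. This is a sketch and assumes the cluster options are stored in the default property set, cib-bootstrap-options.

```text
# Show the current global cluster options
crm(live)configure# show cib-bootstrap-options
# Re-enable STONITH so that DLM and its dependents can start
crm(live)configure# property stonith-enabled=true
crm(live)configure# commit
```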

Procedure 17.2: Configuring an independent DLM resource

This configuration consists of a primitive and a clone, but no group. By adding colocation and order constraints, you can avoid introducing dependencies between multiple resources that use DLM (such as multiple OCFS2 mount points).

  1. Log in to a node as root or equivalent.

  2. Run crm configure.

  3. Create the primitive resource for DLM:

    crm(live)configure# primitive dlm ocf:pacemaker:controld \
      op start timeout=90 interval=0 \
      op stop timeout=100 interval=0 \
      op monitor interval=60 timeout=60

  4. Clone the dlm resource so that it runs on all nodes:

    crm(live)configure# clone cl-dlm dlm meta interleave=true

  5. Review your changes with show.

  6. If everything is correct, submit your changes with commit and leave the crm live configuration with quit.
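
The colocation and order constraints mentioned for this setup could look like the following sketch for a single OCFS2 mount point. The resource and constraint names, the device path (DEVICE_ID is a placeholder), and the mount point are assumptions for illustration. Only cl-dlm comes from the procedure above.

```text
# Hypothetical OCFS2 file system resource and its clone
crm(live)configure# primitive ocfs2-1 ocf:heartbeat:Filesystem \
  params device="/dev/disk/by-id/DEVICE_ID" directory="/mnt/shared" fstype="ocfs2" \
  op monitor interval="20" timeout="40"
crm(live)configure# clone cl-ocfs2-1 ocfs2-1 meta interleave=true
# Run the file system only on nodes where DLM is running
crm(live)configure# colocation co-ocfs2-1-with-dlm inf: cl-ocfs2-1 cl-dlm
# Start DLM before the file system
crm(live)configure# order o-dlm-before-ocfs2-1 Mandatory: cl-dlm cl-ocfs2-1
```

Repeating the clone, colocation, and order lines for each additional mount point ties every file system to DLM without tying the file systems to each other.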