15 Distributed Lock Manager (DLM)
The Distributed Lock Manager (DLM) in the kernel is the base component used by OCFS2, GFS2, Cluster MD, and cLVM to provide active-active storage at each respective layer.
15.1 Protocols for DLM Communication
To avoid single points of failure, redundant communication paths are important for High Availability clusters. This is also true for DLM communication. If network bonding (Link Aggregation Control Protocol, LACP) cannot be used for any reason, we highly recommend defining a redundant communication channel (a second ring) in Corosync. For details, see Procedure 4.3, “Defining a Redundant Communication Channel”.
Depending on the configuration in /etc/corosync/corosync.conf, DLM then decides whether to use the TCP or SCTP protocol for its communication:
If rrp_mode is set to none (which means redundant ring configuration is disabled), DLM automatically uses TCP. However, without a redundant communication channel, DLM communication will fail if the TCP link is down.
If rrp_mode is set to passive (which is the typical setting), and a second communication ring in /etc/corosync/corosync.conf is configured correctly, DLM automatically uses SCTP. In this case, DLM messaging has the redundancy capability provided by SCTP.
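The protocol choice described above can be previewed with a small shell helper. This is a minimal sketch, assuming the setting in question is Corosync's rrp_mode option and that an unset value behaves like none. The function name expected_dlm_protocol is hypothetical and not part of any cluster tool. It only reads the configured mode and does not verify that a second ring is actually configured correctly.

```shell
#!/bin/sh
# Hypothetical helper: print the transport DLM is expected to choose,
# based only on the rrp_mode value in a Corosync configuration file.
expected_dlm_protocol() {
    conf="${1:-/etc/corosync/corosync.conf}"
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }

    # Take the first uncommented rrp_mode line and keep only its value.
    mode=$(sed -n 's/^[[:space:]]*rrp_mode[[:space:]]*:[[:space:]]*\([A-Za-z]*\).*$/\1/p' "$conf" | head -n 1)

    # Assumption: a missing rrp_mode line is treated like "none".
    case "${mode:-none}" in
        none)    echo "TCP" ;;      # no redundant ring: DLM uses TCP
        passive) echo "SCTP" ;;     # redundant ring: DLM uses SCTP
        *)       echo "unknown" ;;  # other modes are not covered by this section
    esac
}
```

Calling expected_dlm_protocol without arguments reads /etc/corosync/corosync.conf. Passing a path as the first argument checks a different file, which is useful for reviewing a configuration before deploying it.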
15.2 Configuring DLM Cluster Resources
DLM uses the cluster membership services from Pacemaker, which run in user space. Therefore, DLM needs to be configured as a clone resource that is present on each node in the cluster.
As OCFS2, GFS2, Cluster MD, and cLVM all use DLM, one DLM resource is enough for all of them. For example, a setup that includes both OCFS2 and cLVM needs only one DLM resource.
The configuration consists of a base group that includes several primitives and a base clone. Both the base group and the base clone can be used in various scenarios afterward (for both OCFS2 and cLVM, for example). You only need to extend the base group with the respective primitives as needed. As the base group has internal colocation and ordering, this simplifies the overall setup, because you do not need to specify several individual groups, clones, and their dependencies.
Follow the steps below on one node in the cluster:
1. Start a shell and log in as root or equivalent.
2. Run crm configure.
3. Enter the following to create the primitive resource for DLM:
    crm(live)configure# primitive dlm ocf:pacemaker:controld \
      op monitor interval="60" timeout="60"
4. Create a base group for the DLM resource and further storage-related resources:
    crm(live)configure# group g-storage dlm
5. Clone the g-storage group so that it runs on all nodes:
    crm(live)configure# clone cl-storage g-storage \
      meta interleave=true target-role=Started
6. Review your changes with show.
7. If everything is correct, submit your changes with commit and leave the crm live configuration with exit.
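For scripted setups, the same three resources can be sketched as single-shot crm commands instead of an interactive session. This is a minimal sketch, not a supported procedure. The function name configure_dlm_base and the CRM variable are hypothetical. Setting CRM=echo only prints the commands so they can be reviewed first. Unlike the interactive session, each single-shot crm configure call is expected to commit on its own, so there is no separate review and commit step.

```shell
#!/bin/sh
# Hypothetical wrapper around the crm shell. CRM can be overridden
# (for example CRM=echo) to preview the commands without running them.
configure_dlm_base() {
    crm_cmd="${CRM:-crm}"

    # Primitive resource for DLM with a 60-second monitor operation.
    $crm_cmd configure primitive dlm ocf:pacemaker:controld \
        op monitor interval=60 timeout=60 || return 1

    # Base group for DLM and further storage-related resources.
    $crm_cmd configure group g-storage dlm || return 1

    # Clone the base group so that it runs on all nodes.
    $crm_cmd configure clone cl-storage g-storage \
        meta interleave=true target-role=Started || return 1
}
```

Run CRM=echo configure_dlm_base to print the three commands, then run configure_dlm_base as root on one cluster node to apply them.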
Clusters without STONITH are not supported. If you set the global cluster option stonith-enabled to false for testing or troubleshooting purposes, the DLM resource and all services depending on it (such as cLVM, GFS2, and OCFS2) will fail to start.
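A pre-check for this condition could look like the following sketch. The function name stonith_allows_dlm is hypothetical. It assumes that an unset option falls back to Pacemaker's default of true, and it treats any value it does not recognize as disabled. The commented usage lines assume that the crm_attribute tool from Pacemaker is available on the node.

```shell
#!/bin/sh
# Hypothetical pre-check: succeed if the given stonith-enabled value
# means STONITH is on, so the DLM resource can be expected to start.
stonith_allows_dlm() {
    # Assumption: an empty value means the option is unset (default: true).
    value=$(printf '%s' "${1:-true}" | tr '[:upper:]' '[:lower:]')
    case "$value" in
        true|yes|on|y|1) return 0 ;;
        *)               return 1 ;;
    esac
}

# Usage sketch (requires a running cluster):
# value=$(crm_attribute --type crm_config --name stonith-enabled --query --quiet 2>/dev/null)
# stonith_allows_dlm "$value" || echo "stonith-enabled is off: DLM will fail to start" >&2
```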