14 QDevice and QNetd #
QDevice and QNetd participate in quorum decisions. With assistance from the arbitrator corosync-qnetd, corosync-qdevice provides a configurable number of votes, allowing a cluster to sustain more node failures than the standard quorum rules allow. We recommend deploying corosync-qnetd and corosync-qdevice for clusters with an even number of nodes, and especially for two-node clusters.
14.1 Conceptual overview #
In comparison to calculating quora among cluster nodes, the QDevice-and-QNetd approach has the following benefits:
- It provides better sustainability in case of node failures.
- You can write your own heuristics scripts to affect votes. This is especially useful for complex setups, such as SAP applications.
- It enables you to configure a QNetd server to provide votes for multiple clusters.
- It allows using diskless SBD for two-node clusters.
- It helps with quorum decisions for clusters with an even number of nodes under split-brain situations, especially for two-node clusters.
A setup with QDevice/QNetd consists of the following components and mechanisms:
- QNetd (corosync-qnetd)
  A systemd service (a daemon, the “QNetd server”) which is not part of the cluster. The systemd service provides a vote to the corosync-qdevice daemon.
  To improve security, corosync-qnetd can work with TLS for client certificate checking.
- QDevice (corosync-qdevice)
  A systemd service (a daemon) on each cluster node running together with Corosync. This is the client of corosync-qnetd. Its primary use is to allow a cluster to sustain more node failures than standard quorum rules allow.
  QDevice is designed to work with different arbitrators. However, currently, only QNetd is supported.
- Algorithms
  QDevice supports different algorithms, which determine how votes are assigned. Currently, the following exist:
  - FFSplit (“fifty-fifty split”) is the default. It is used for clusters with an even number of nodes. If the cluster splits into two similar partitions, this algorithm provides one vote to one of the partitions, based on the results of heuristics checks and other factors.
  - LMS (“last man standing”) allows the only remaining node that can see the QNetd server to get the votes. This algorithm is therefore useful when a cluster with only one active node should remain quorate.
- Heuristics
  QDevice supports a set of commands (“heuristics”). The commands are executed locally on start-up of cluster services, on cluster membership change, on successful connection to corosync-qnetd, or, optionally, at regular times. The heuristics mode can be set with the quorum.device.heuristics.mode key (in the corosync.conf file) or with the --qdevice-heuristics-mode option. Both accept the values off (default), sync, and on. The difference between sync and on is that on additionally executes the above commands at regular intervals.
  The heuristics are considered to have passed only if all commands are executed successfully; otherwise, they have failed. The heuristics' result is sent to corosync-qnetd, where it is used in calculations to determine which partition should be quorate.
- Tiebreaker
  This is used as a fallback if the cluster partitions are completely equal even with the same heuristics results. It can be configured to be the lowest, the highest, or a specific node ID.
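The effect of the extra vote can be illustrated with a little arithmetic. The following is a minimal sketch, assuming the usual majority rule (more than half of the expected votes), one vote per node, and one QDevice vote as described for FFSplit above. The function names are hypothetical, and special Corosync options for two-node clusters are deliberately ignored.

```shell
#!/bin/sh
# Majority threshold for a given number of expected votes:
# more than half, computed with integer division.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds if a partition holding $1 votes reaches the threshold
# for $2 expected votes.
partition_has_quorum() {
    [ "$1" -ge "$(quorum_threshold "$2")" ]
}

# Two-node cluster, one vote per node, plus one QDevice vote (FFSplit).
node_votes=2
qdevice_votes=1
expected=$(( node_votes + qdevice_votes ))
echo "expected votes: $expected, quorum: $(quorum_threshold "$expected")"
```

Under these assumptions, a single surviving node that also receives the QDevice vote holds 2 of 3 votes and stays quorate, whereas without QDevice it would hold only 1 of 2 votes.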
14.2 Requirements and prerequisites #
Before setting up QDevice and QNetd, you need to prepare the environment as follows:
- In addition to the cluster nodes, you need a separate machine which will become the QNetd server. See Section 14.3, “Setting up the QNetd server”.
- For QDevice to reach the QNetd server, we recommend using a different physical network than the one that Corosync uses. Ideally, the QNetd server should be in a different rack from the main cluster, or at least on a separate PSU and not in the same network segment as the Corosync ring or rings.
14.3 Setting up the QNetd server #
The QNetd server is not part of the cluster stack, and it is also not a real member of your cluster. As such, you cannot move resources to this server.
The QNetd server is almost “state free”. Usually, you do not need to change anything in the configuration file /etc/sysconfig/corosync-qnetd.
By default, the corosync-qnetd service runs the daemon as user coroqnetd in the group coroqnetd. This avoids running the daemon as root.
To create a QNetd server, proceed as follows:
1. On the machine that will become the QNetd server, install SUSE Linux Enterprise Server 15 SP4.
2. Enable SUSE Linux Enterprise High Availability using the command listed in the output of SUSEConnect --list-extensions.
3. Install the corosync-qnetd package:

   # zypper install corosync-qnetd

   You do not need to manually start the corosync-qnetd service. It will be started automatically when you configure QDevice on the cluster.

Your QNetd server is ready to accept connections from a QDevice client (corosync-qdevice). Further configuration is not needed.
14.4 Connecting QDevice clients to the QNetd server #
After you set up your QNetd server, you can set up and run the clients. You can connect the clients to the QNetd server during the installation of your cluster, or you can add them later. This procedure documents how to add them later.
1. On all nodes, install the corosync-qdevice package:

   # zypper install corosync-qdevice

2. On one of the nodes, run the following command to configure QDevice:

   # crm cluster init qdevice
   Do you want to configure QDevice (y/n)? y
   HOST or IP of the QNetd server to be used []QNETD_SERVER
   TCP PORT of QNetd server [5403]
   QNetd decision ALGORITHM (ffsplit/lms) [ffsplit]
   QNetd TIE_BREAKER (lowest/highest/valid node id) [lowest]
   Whether using TLS on QDevice/QNetd (on/off/required) [on]
   Heuristics COMMAND to run with absolute path; For multiple commands, use ";" to separate []

   Confirm with y that you want to configure QDevice, then enter the host name or IP address of the QNetd server. For the remaining fields, you can accept the default values or change them if required.

   Important: SBD_WATCHDOG_TIMEOUT for diskless SBD and QDevice
   If you use QDevice with diskless SBD, the SBD_WATCHDOG_TIMEOUT value must be greater than QDevice's sync_timeout value, or SBD will time out and fail to start.
   The default value for sync_timeout is 30 seconds. Therefore, in the file /etc/sysconfig/sbd, make sure that SBD_WATCHDOG_TIMEOUT is set to a greater value, such as 35.
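The timeout rule above can be checked with a few lines of shell before you configure QDevice. This is only a sketch: the function name sbd_timeout_ok is hypothetical, it assumes a plain SBD_WATCHDOG_TIMEOUT=VALUE line (optionally quoted) in a sysconfig-style file, and it expects the sync_timeout value to be passed in seconds.

```shell
#!/bin/sh
# Succeeds only if SBD_WATCHDOG_TIMEOUT in the given file is strictly
# greater than the given sync_timeout (both values in seconds).
# Returns 2 if the setting is missing or not a plain number.
sbd_timeout_ok() {
    sbd_file=$1
    sync_timeout_s=$2
    # Take the last uncommented assignment and strip optional quotes.
    value=$(sed -n 's/^SBD_WATCHDOG_TIMEOUT=//p' "$sbd_file" | tail -n 1 | tr -d "\"'")
    case $value in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$value" -gt "$sync_timeout_s" ]
}
```

With the default sync_timeout of 30 seconds, a call such as sbd_timeout_ok /etc/sysconfig/sbd 30 would succeed for a value of 35 and fail for a value of 30 or less.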
14.5 Setting up a QDevice with heuristics #
If you need additional control over how votes are determined, use heuristics. Heuristics are a set of commands that are executed in parallel.
For this purpose, the command crm cluster init qdevice provides the option --qdevice-heuristics. You can pass one or more commands (separated by semicolons) with absolute paths.
For example, if your own command for heuristic checks is located at /usr/sbin/my-script.sh, you can run it on one of your cluster nodes as follows:

# crm cluster init qdevice --qnetd-hostname=charlie \
    --qdevice-heuristics=/usr/sbin/my-script.sh \
    --qdevice-heuristics-mode=on
The command or commands can be written in any language such as Shell, Python, or Ruby. If they succeed, they return 0 (zero); otherwise, they return an error code.
You can also pass a set of commands. The heuristics pass only when all commands finish successfully (return code zero).
The --qdevice-heuristics-mode=on option lets the heuristics commands run regularly.
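As an illustration of the return-code contract, a heuristics script could look roughly like the following sketch. The check itself (all given paths must exist) and the function name check_required_paths are hypothetical. A real script would test whatever matters for your workload.

```shell
#!/bin/sh
# Hypothetical heuristics check: pass (return 0) only if every given
# path exists; fail (return 1) at the first missing path.
check_required_paths() {
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "heuristics: missing $path" >&2
            return 1
        fi
    done
    return 0
}
```

A script such as /usr/sbin/my-script.sh would call the function as its last command, for example check_required_paths /srv/app/data (an assumed path), so that the script's exit code is the result of the check.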
14.6 Checking and showing quorum status #
You can query the quorum status on one of your cluster nodes as shown in Example 14.1, “Status of QDevice”. It shows the status of your QDevice nodes.
# corosync-quorumtool (1)
Quorum information
------------------
Date:             ...
Quorum provider:  corosync_votequorum
Nodes:            2 (2)
Node ID:          3232235777 (3)
Ring ID:          3232235777/8
Quorate:          Yes (4)

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
3232235777          1    A,V,NMW 192.168.1.1 (local) (5)
3232235778          1    A,V,NMW 192.168.1.2 (5)
         0          1            Qdevice

(1) As an alternative with an identical result, you can also use the [...]
(2) The number of nodes we are expecting. In this example, it is a two-node cluster.
(3) As the node ID is not explicitly specified in [...]
(4) The quorum status. In this case, the cluster has quorum.
(5) The status for each cluster node means: [...]
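If you want to use the quorum status in a script, you can filter the output shown above. The following sketch assumes that each field is printed on its own line, as in the example. The function name status_is_quorate is hypothetical.

```shell
#!/bin/sh
# Reads corosync-quorumtool status output on stdin and succeeds only
# if it contains a line starting with "Quorate:" followed by "Yes".
status_is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}
```

On a cluster node, this could be used as corosync-quorumtool | status_is_quorate && echo "cluster is quorate". The Flags line does not match because it does not start with "Quorate:".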
If you query the status of the QNetd server, you get a similar output to that shown in Example 14.2, “Status of QNetd server”:
# corosync-qnetd-tool -lv (1)
Cluster "hacluster": (2)
    Algorithm:          Fifty-Fifty split (3)
    Tie-breaker:        Node with lowest node ID
    Node ID 3232235777: (4)
        Client address:         ::ffff:192.168.1.1:54732
        HB interval:            8000ms
        Configured node list:   3232235777, 3232235778
        Ring ID:                aa10ab0.8
        Membership node list:   3232235777, 3232235778
        Heuristics:             Undefined (membership: Undefined, regular: Undefined)
        TLS active:             Yes (client certificate verified)
        Vote:                   ACK (ACK)
    Node ID 3232235778:
        Client address:         ::ffff:192.168.1.2:43016
        HB interval:            8000ms
        Configured node list:   3232235777, 3232235778
        Ring ID:                aa10ab0.8
        Membership node list:   3232235777, 3232235778
        Heuristics:             Undefined (membership: Undefined, regular: Undefined)
        TLS active:             Yes (client certificate verified)
        Vote:                   No change (ACK)

(1) As an alternative with an identical result, you can also use the [...]
(2) The name of your cluster as set in the configuration file [...]
(3) The algorithm currently used. In this example, it is [...]
(4) This is the entry for the node with the IP address [...]
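To check quickly how many QDevice clients the QNetd server currently lists, you can count the node entries in the output shown above. This is a sketch that assumes each "Node ID NUMBER:" entry starts its own line. The function name count_qnetd_clients is hypothetical.

```shell
#!/bin/sh
# Reads "corosync-qnetd-tool -lv" output on stdin and prints the number
# of "Node ID <number>:" entries. The "|| true" keeps the exit status
# at zero even when grep finds no entry and prints 0.
count_qnetd_clients() {
    grep -Ec '^[[:space:]]*Node ID [0-9]+:' || true
}
```

On the QNetd server, this could be used as corosync-qnetd-tool -lv | count_qnetd_clients. If the server provides votes for several clusters, the count covers the nodes of all of them.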
14.7 For more information #
For additional information about QDevice and QNetd, see the man pages of corosync-qdevice(8) and corosync-qnetd(8).