7 Using the YaST cluster module #
The YaST cluster module allows you to set up a cluster manually or to modify options for an existing cluster.
7.1 Definition of terms #
Several key terms used in the YaST cluster module and in this chapter are defined below.
- Bind network address (bindnetaddr)
The network address the Corosync executive should bind to. To simplify sharing configuration files across the cluster, Corosync uses the network interface netmask to mask only the address bits that are used for routing the network. For example, if the local interface is 192.168.5.92 with netmask 255.255.255.0, set bindnetaddr to 192.168.5.0. If the local interface is 192.168.5.92 with netmask 255.255.255.192, set bindnetaddr to 192.168.5.64.
If nodelist with ringX_addr is explicitly configured in /etc/corosync/corosync.conf, bindnetaddr is not strictly required.
Note: Network address for all nodes
As the same Corosync configuration is used on all nodes, make sure to use a network address as bindnetaddr, not the address of a specific network interface.
- conntrack tools
Allow interaction with the in-kernel connection tracking system for enabling stateful packet inspection for iptables. Used by SUSE Linux Enterprise High Availability to synchronize the connection status between cluster nodes. For detailed information, refer to https://conntrack-tools.netfilter.org/.
- Csync2
A synchronization tool that can be used to replicate configuration files across all nodes in the cluster, and even across Geo clusters. Csync2 can handle any number of hosts, sorted into synchronization groups. Each synchronization group has its own list of member hosts and its include/exclude patterns that define which files should be synchronized in the synchronization group. The groups, the host names belonging to each group, and the include/exclude rules for each group are specified in the Csync2 configuration file, /etc/csync2/csync2.cfg.
For authentication, Csync2 uses the IP addresses and pre-shared keys within a synchronization group. You need to generate one key file for each synchronization group and copy it to all group members.
For more information about Csync2, refer to https://oss.linbit.com/csync2/paper.pdf.
- Existing cluster
The term “existing cluster” is used to refer to any cluster that consists of at least one node. Existing clusters have a basic Corosync configuration that defines the communication channels, but they do not necessarily have resource configuration yet.
- Heuristics
QDevice supports a set of commands (“heuristics”). The commands are executed locally on start-up of cluster services, cluster membership change, successful connection to the QNetd server, or, optionally, at regular times.
Only if all commands are executed successfully are the heuristics considered to have passed; otherwise, they failed. The heuristics' result is sent to the QNetd server, where it is used in calculations to determine which partition should be quorate.
- Multicast
A technology used for a one-to-many communication within a network that can be used for cluster communication. Corosync supports both multicast and unicast.
Note: Switches and multicast
To use multicast for cluster communication, make sure your switches support multicast.
- Multicast address (mcastaddr)
IP address to be used for multicasting by the Corosync executive. The IP address can either be IPv4 or IPv6. If IPv6 networking is used, node IDs must be specified. You can use any multicast address in your private network.
- Multicast port (mcastport)
The port to use for cluster communication. Corosync uses two ports: the specified mcastport for receiving multicast, and mcastport -1 for sending multicast.
- QDevice
A systemd service (a daemon) on each cluster node running together with Corosync. This is the client of the QNetd server. Its primary use is to allow a cluster to sustain more node failures than standard quorum rules allow.
QDevice is designed to work with different arbitrators. However, currently, only QNetd is supported.
- QNetd
A systemd service (a daemon, the “QNetd server”), which is not part of the cluster. The systemd service provides a vote to the QDevice daemon.
To improve security, QNetd can work with TLS for client certificate checking.
- Redundant Ring Protocol (RRP)
Allows the use of multiple redundant local area networks for resilience against partial or total network faults. This way, cluster communication can still be kept up as long as a single network is operational. Corosync supports the Totem Redundant Ring Protocol. A logical token-passing ring is imposed on all participating nodes to deliver messages in a reliable and sorted manner. A node is allowed to broadcast a message only if it holds the token.
When redundant communication channels are defined in Corosync, use RRP to tell the cluster how to use these interfaces. RRP can have three modes (rrp_mode):
If set to active, Corosync uses both interfaces actively. However, this mode is deprecated.
If set to passive, Corosync sends messages alternately over the available networks.
If set to none, RRP is disabled.
- Unicast
A technology for sending messages to a single network destination. Corosync supports both multicast and unicast. In Corosync, unicast is implemented as UDP-unicast (UDPU).
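To see how these terms map to the actual configuration, the following is a minimal sketch of the totem section in /etc/corosync/corosync.conf for a multicast setup. The cluster name, addresses, and port are placeholders only; refer to the example files in /usr/share/doc/packages/corosync/ for the full syntax.

totem {
    version: 2
    cluster_name: hacluster      # placeholder cluster name
    transport: udp               # multicast transport
    rrp_mode: passive            # only relevant if a second ring is defined
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.5.0     # network address, not a host address
        mcastaddr: 239.60.60.1       # any multicast address in your private network
        mcastport: 5405
    }
}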
7.2 Starting the YaST module #
Start YaST and select the cluster module. Alternatively, start the module from the command line:
# yast2 cluster
If you start the cluster module for the first time, it appears as a wizard, guiding you through all the steps necessary for basic setup. Otherwise, select the categories on the left panel to access the configuration options for each step.
The following list shows an overview of the available screens in the YaST cluster module. It also mentions whether the screen contains parameters that are required for successful cluster setup or whether its parameters are optional. If you are following a guided first-time setup, the screens appear in the order shown in this list.
- Communication channels (required)
Allows you to define one or two communication channels for communication between the cluster nodes. As transport protocol, either use multicast (UDP) or unicast (UDPU). For details, see Section 7.3, “Defining the communication channels”.
Important: Redundant communication paths
For a supported cluster setup, two or more redundant communication paths are required. The preferred way is to use network device bonding. If this is impossible, you must define a second communication channel in Corosync.
- Corosync QDevice (optional but recommended for clusters with an even number of nodes)
Allows you to configure QDevice as a client of a QNetd server to participate in quorum decisions. This is recommended for clusters with an even number of nodes, and especially for two-node clusters. For details, see Section 7.4, “Configuring an arbitrator for quorum decisions”.
- Security (optional but recommended)
Allows you to define the authentication settings for the cluster. HMAC/SHA1 authentication requires a shared secret used to protect and authenticate messages. For details, see Section 7.5, “Defining authentication settings”.
- Configure Csync2 (optional but recommended)
Csync2 helps you to keep track of configuration changes and to keep files synchronized across the cluster nodes. If you are using YaST to set up the cluster for the first time, we strongly recommend configuring Csync2. If you do not use Csync2, you must manually copy all configuration files from the first node to the rest of the nodes in the cluster. For details, see Section 7.6, “Configuring Csync2 to synchronize files”.
- Configure conntrackd (optional)
Allows you to configure the user space conntrackd. Use the conntrack tools for stateful packet inspection for iptables. For details, see Section 7.7, “Synchronizing connection status between cluster nodes”.
- Service (required)
Allows you to configure the service for bringing the cluster node online. Define whether to start the cluster services at boot time and whether to open the ports in the firewall that are needed for communication between the nodes. For details, see Section 7.8, “Configuring services”.
Certain settings in the YaST cluster module apply only to the current node. Other settings may automatically be transferred to all nodes with Csync2. Find detailed information about this in the following sections.
7.3 Defining the communication channels #
For successful communication between the cluster nodes, define at least one communication channel. All settings defined in the YaST cluster module are written to /etc/corosync/corosync.conf. Find example files for a multicast and a unicast setup in /usr/share/doc/packages/corosync/.
If you are using IPv4 addresses, node IDs are optional. If you are using IPv6 addresses, node IDs are required. Instead of specifying IDs manually for each node, the YaST cluster module contains an option to automatically generate a unique ID for every cluster node.
As transport protocol, either use multicast (UDP) or unicast (UDPU) as described in the following procedures:
To configure multicast, use Procedure 7.1, “Defining the first communication channel (multicast)”.
To configure unicast, use Procedure 7.2, “Defining the first communication channel (unicast)”.
If you also need to define a second, redundant channel, configure the first channel with one of the procedures above, then continue to Procedure 7.3, “Defining a redundant communication channel”.
For deploying SUSE Linux Enterprise High Availability on public cloud platforms, use unicast as the transport protocol. Multicast is generally not supported by the cloud platforms themselves.
When using multicast, the same bindnetaddr, mcastaddr, and mcastport values are used for all cluster nodes. All nodes in the cluster know each other by using the same multicast address. For different clusters, use different multicast addresses.
If you are modifying an existing cluster, switch to the Communication Channels category. If you are following the initial setup wizard, you do not need to switch categories manually.
Set the transport protocol to Multicast.
Define the bind network address. Set the value to the subnet you will use for cluster multicast.
Define the multicast address.
Define the multicast port.
To automatically generate a unique ID for every cluster node, keep the auto-generate option enabled.
Define a cluster name.
Enter the number of expected votes. This is important for Corosync to calculate quorum in case of a partitioned cluster. By default, each node has 1 vote. The number of expected votes must match the number of nodes in your cluster.
If you need to define a redundant communication channel in Corosync, continue to Procedure 7.3, “Defining a redundant communication channel”. Otherwise, confirm your changes with Next (setup wizard) or Finish (existing cluster).
If you are modifying an existing cluster, switch to the Communication Channels category. If you are following the initial setup wizard, you do not need to switch categories manually.
Set the transport protocol to Unicast.
Define the port.
For unicast communication, Corosync needs to know the IP addresses of all nodes in the cluster. For each node, select Add and enter the following details: the node's IP address and, if used, a redundant IP address and a node ID.
To modify or remove any addresses of cluster members, use the respective edit and delete buttons.
To automatically generate a unique ID for every cluster node, keep the auto-generate option enabled.
Define a cluster name.
Enter the number of expected votes. This is important for Corosync to calculate quorum in case of a partitioned cluster. By default, each node has 1 vote. The number of expected votes must match the number of nodes in your cluster.
If you need to define a redundant communication channel in Corosync, continue to Procedure 7.3, “Defining a redundant communication channel”. Otherwise, confirm your changes with Next (setup wizard) or Finish (existing cluster).
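For orientation, a unicast (UDPU) setup ends up in /etc/corosync/corosync.conf in a form similar to the following sketch for a hypothetical two-node cluster. The addresses and node IDs are placeholders.

totem {
    transport: udpu              # unicast transport
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.5.0
        mcastport: 5405          # also used as the communication port for unicast
    }
}
nodelist {
    node {
        ring0_addr: 192.168.5.92    # first node
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.5.93    # second node
        nodeid: 2
    }
}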
If network device bonding cannot be used for any reason, the second best choice is to define a redundant communication channel (a second ring) in Corosync. That way, two physically separate networks can be used for communication. If one network fails, the cluster nodes can still communicate via the other network.
The additional communication channel in Corosync forms a second token-passing ring. The first channel you configured is the primary ring and gets the ring number 0. The second ring (redundant channel) gets the ring number 1.
Note: Multiple rings and /etc/hosts
If multiple rings are configured in Corosync, each node can have multiple IP addresses. This needs to be reflected in the /etc/hosts file of all nodes.
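For example, the /etc/hosts file of a hypothetical two-node cluster with two rings could list both addresses of each node (host names and addresses are placeholders):

192.168.1.1   alice-ring0
10.0.0.1      alice-ring1
192.168.1.2   bob-ring0
10.0.0.2      bob-ring1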
Configure the first channel as described in Procedure 7.1, “Defining the first communication channel (multicast)” or Procedure 7.2, “Defining the first communication channel (unicast)”.
Activate the redundant channel. The redundant channel must use the same protocol as the first communication channel you defined.
If you use multicast, enter the following parameters: the bind network address to use, the multicast address, and the port for the redundant channel.
If you use unicast, define the following parameters: the bind network address to use, and the port. Edit each entry in the list of cluster members to add a redundant IP address for each node that will be part of the cluster.
To tell Corosync how and when to use the different channels, select the rrp_mode to use:
If only one communication channel is defined, rrp_mode is automatically disabled (value none).
If set to active, Corosync uses both interfaces actively. However, this mode is deprecated.
If set to passive, Corosync sends messages alternately over the available networks.
When RRP is used, SUSE Linux Enterprise High Availability monitors the status of the current rings and automatically re-enables redundant rings after faults.
Alternatively, check the ring status manually with corosync-cfgtool (see the example after this procedure). View the available options with -h.
Confirm your changes with Next (setup wizard) or Finish (existing cluster).
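A manual check could look similar to the following sketch; the exact output depends on the Corosync version and your network addresses:

# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 192.168.1.1
        status  = ring 0 active with no faults
RING ID 1
        id      = 10.0.0.1
        status  = ring 1 active with no faults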
7.4 Configuring an arbitrator for quorum decisions #
QDevice and QNetd participate in quorum decisions. With assistance from the arbitrator corosync-qnetd, corosync-qdevice provides a configurable number of votes, allowing a cluster to sustain more node failures than the standard quorum rules allow. We recommend deploying corosync-qnetd and corosync-qdevice for clusters with an even number of nodes, and especially for two-node clusters.
For more information, see Chapter 18, QDevice and QNetd.
Before you configure QDevice, you must set up a QNetd server. See Section 18.3, “Setting up the QNetd server”.
If you are modifying an existing cluster, switch to the Corosync QDevice category. If you are following the initial setup wizard, you do not need to switch categories manually.
Activate the option to enable QDevice.
In the QNetd server host field, enter the IP address or host name of the QNetd server.
Select the mode for TLS:
Use off if TLS is not required and should not be tried.
Use on to attempt to connect with TLS, but connect without TLS if it is not available.
Use required to make TLS mandatory. QDevice will exit with an error if TLS is not available.
Accept the default values for the remaining options. If you need to change these values, you can do so with the command crm cluster init qdevice after you finish setting up the cluster.
Select the heuristics mode:
Use off to disable heuristics.
Use on to run heuristics on a regular basis, as set by the heuristics interval.
Use sync to only run heuristics during startup, when cluster membership changes, and on connection to QNetd.
If you set the heuristics mode to on or sync, add your heuristics commands to the list:
Select Add. A new window opens.
Enter a name for the command.
Enter the command in the respective field. This can be a single command or the path to a script, and can be written in any language such as Shell, Python, or Ruby.
Select OK to close the window.
Confirm your changes with Next (setup wizard) or Finish (existing cluster).
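For reference, the resulting quorum configuration in /etc/corosync/corosync.conf is similar to the following sketch. The QNetd server address and the heuristics command are placeholders, and the exact keys written by YaST or crm cluster init qdevice may differ.

quorum {
    provider: corosync_votequorum
    device {
        model: net
        votes: 1
        net {
            host: 192.168.1.100    # IP address or host name of the QNetd server
            tls: on                # off, on, or required
        }
        heuristics {
            mode: sync             # off, on, or sync
            exec_check_gw: /usr/bin/ping -q -c 1 192.168.1.254    # example heuristics command
        }
    }
}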
7.5 Defining authentication settings #
To define the authentication settings for the cluster, you can use HMAC/SHA1 authentication. This requires a shared secret used to protect and authenticate messages. The authentication key (password) you specify is used on all nodes in the cluster.
If you are modifying an existing cluster, switch to the Security category. If you are following the initial setup wizard, you do not need to switch categories manually.
Activate the security authentication option.
For a newly created cluster, select the option to generate an authentication key file. The key is created and written to /etc/corosync/authkey.
If you want the current machine to join an existing cluster, do not generate a new key file. Instead, copy the /etc/corosync/authkey from one of the nodes to the current machine (either manually or with Csync2).
Confirm your changes with Next (setup wizard) or Finish (existing cluster). YaST writes the configuration to /etc/corosync/corosync.conf.
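If you prefer the command line, an authentication key can also be generated with the corosync-keygen tool, which writes /etc/corosync/authkey on the node where it is run. This is a sketch of the equivalent step outside YaST; copy the resulting file to the other nodes as described above.

# corosync-keygen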
7.6 Configuring Csync2 to synchronize files #
Instead of copying the configuration files to all nodes
manually, use the csync2
tool for replication across
all nodes in the cluster. Csync2 helps you to keep track of configuration changes
and to keep files synchronized across the cluster nodes:
You can define a list of files that are important for operation.
You can show changes to these files (against the other cluster nodes).
You can synchronize the configured files with a single command.
With a simple shell script in
~/.bash_logout
, you can be reminded about unsynchronized changes before logging out of the system.
Find detailed information about Csync2 at https://oss.linbit.com/csync2/ and https://oss.linbit.com/csync2/paper.pdf.
Csync2 only pushes changes. It does not continuously
synchronize files between the machines. Each time you update files that need
to be synchronized, you need to push the changes to the other machines.
Using csync2
to push changes is described later, after
the cluster configuration with YaST is complete.
If you are modifying an existing cluster, switch to the Configure Csync2 category. If you are following the initial setup wizard, you do not need to switch categories manually.
To specify the synchronization group, select Add in the Sync Host group and enter the local host names of all nodes in your cluster. For each node, you must use exactly the strings that are returned by the hostname command.
Tip: Host name resolution
If host name resolution does not work properly in your network, you can also specify a combination of host name and IP address for each cluster node. To do so, use the string HOSTNAME@IP such as alice@192.168.2.100, for example. Csync2 then uses the IP addresses when connecting.
Select Generate Pre-Shared-Keys to create a key file for the synchronization group. The key file is written to /etc/csync2/key_hagroup.
To populate the Sync File list with the files that usually need to be synchronized among all nodes, select Add Suggested Files.
To add, edit, or remove files from the list of files to be synchronized, use the respective buttons. You must enter the absolute path for each file.
Activate Csync2 by selecting Turn Csync2 ON. This enables Csync2 to start automatically at boot time.
Confirm your changes with Next (setup wizard) or Finish (existing cluster). YaST writes the Csync2 configuration to /etc/csync2/csync2.cfg.
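The generated file follows the normal Csync2 syntax. A minimal /etc/csync2/csync2.cfg for a hypothetical two-node group could look similar to the following sketch; host names, the group name, and the file list are placeholders.

group ha_group
{
    host alice;
    host bob;
    key /etc/csync2/key_hagroup;
    include /etc/corosync/corosync.conf;
    include /etc/corosync/authkey;
    include /etc/csync2/csync2.cfg;
}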
7.7 Synchronizing connection status between cluster nodes #
To enable stateful packet inspection for iptables, configure and use the conntrack tools. This requires the following basic steps:
Configuring conntrackd with YaST #
Use the YaST cluster module to configure the user space
conntrackd
(see Figure 7.6, “YaST conntrackd
”). It needs a
dedicated network interface that is not used for other communication
channels. The daemon can be started via a resource agent afterward.
If you are modifying an existing cluster, switch to the Configure conntrackd category. If you are following the initial setup wizard, you do not need to switch categories manually.
Select a dedicated interface for synchronizing the connection status. The IPv4 address of the selected interface is automatically detected and shown in YaST. It must already be configured and it must support multicast.
Define the multicast address to be used for synchronizing the connection status.
In the group number field, define a numeric ID for the group to synchronize the connection status to.
Select Generate /etc/conntrackd/conntrackd.conf to create the configuration file for conntrackd.
If you modified any options for an existing cluster, confirm your changes and close the cluster module.
Confirm your changes with Next (setup wizard) or Finish (existing cluster).
After having configured the conntrack tools, you can use them for Linux Virtual Server (see Load balancing).
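As a sketch of the resource agent step mentioned above, a cloned resource based on the ocf:heartbeat:conntrackd agent could be configured with crmsh as follows. The resource names and the monitor interval are illustrative only, not a mandatory configuration.

# crm configure primitive conntrackd-daemon ocf:heartbeat:conntrackd \
    op monitor interval=30s timeout=30s
# crm configure clone cl-conntrackd conntrackd-daemon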
7.8 Configuring services #
In the YaST cluster module, define whether to start certain services on a node at boot time. You can also use the module to start and stop the services manually. To bring the cluster nodes online and start the cluster resource manager, Pacemaker must be running as a service.
The configuration in this section only applies to the current machine, not to all cluster nodes.
If you are modifying an existing cluster, switch to the Service category. You can use the options in this section to start and stop cluster services on this node. If you are following the initial setup wizard, you do not need to switch categories manually. You will start the cluster services later, so you can skip straight to Step 5.
To start the cluster services each time this cluster node is booted, select the respective option in the booting section. If you choose not to start the services at boot time, you must start the cluster services manually each time this node is booted.
To start or stop the cluster services immediately, select the respective button.
To start or stop QDevice immediately, select the respective button.
To open the ports in the firewall that are needed for cluster communication, activate the firewall option.
Confirm your changes with Next (setup wizard) or Finish (existing cluster). If you are following the initial setup wizard, this completes the initial configuration and exits YaST. Continue to Section 7.9, “Transferring the configuration to all nodes”.
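On an already configured node, the same behavior is also available from the command line with crmsh, for example (a sketch; run the commands on the node in question):

# crm cluster enable
# crm cluster start

The corresponding counterparts are crm cluster disable and crm cluster stop.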
7.9 Transferring the configuration to all nodes #
After the cluster configuration with YaST is complete, use csync2
to copy the configuration files to the rest of the cluster nodes. To receive the files,
nodes must be included in the group you configured in
Procedure 7.6, “Configuring Csync2 with YaST”.
Before running Csync2 for the first time, you need to make the following preparations:
Make sure passwordless SSH is configured between the nodes. This is required for cluster communication.
Copy the file /etc/csync2/csync2.cfg manually to all nodes in the cluster.
Copy the file /etc/csync2/key_hagroup manually to all nodes in the cluster. It is needed for authentication by Csync2. Do not regenerate the file on the other nodes; it needs to be the same file on all nodes.
Run the following command on all nodes to enable and start the service now:
# systemctl enable --now csync2.socket
Use the following procedure to transfer the configuration files to all cluster nodes:
To synchronize all files once, run the following command on the machine that you want to copy the configuration from:
# csync2 -xv
This synchronizes all the files once by pushing them to the other nodes. If all files are synchronized successfully, Csync2 finishes with no errors.
If one or several files that are to be synchronized have been modified on other nodes (not only on the current one), Csync2 reports a conflict with an output similar to the one below:
While syncing file /etc/corosync/corosync.conf:
ERROR from peer hex-14: File is also marked dirty here!
Finished with 1 errors.
If you are sure that the file version on the current node is the “best” one, you can resolve the conflict by forcing this file and resynchronizing:
# csync2 -f /etc/corosync/corosync.conf
# csync2 -xv
For more information on the Csync2 options, run
# csync2 --help
Csync2 only pushes changes. It does not continuously synchronize files between the machines.
Each time you update files that need to be synchronized, you need to
push the changes to the other machines by running csync2 -xv
on the machine where you did the changes. If you run
the command on any of the other machines with unchanged files, nothing
happens.
7.10 Bringing the cluster online #
After the initial cluster configuration is done, start the cluster services on all cluster nodes to bring the stack online:
Log in to an existing node.
Start the cluster services on all cluster nodes:
# crm cluster start --all
This command requires passwordless SSH access between the nodes. You can also start individual nodes with crm cluster start.
Check the cluster status with the crm status command. If all nodes are online, the output should be similar to the following:
# crm status
Cluster Summary:
  * Stack: corosync
  * Current DC: alice (version ...) - partition with quorum
  * Last updated: ...
  * Last change: ... by hacluster via crmd on bob
  * 2 nodes configured
  * 1 resource instance configured

Node List:
  * Online: [ alice bob ]
...
This output indicates that the cluster resource manager is started and is ready to manage resources.
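In addition to crm status, Corosync's own view of membership and quorum can be checked with corosync-quorumtool; a Quorate: Yes line in its output indicates that the current partition has quorum. The exact output varies by version and configuration.

# corosync-quorumtool -s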
To be supported, a SUSE Linux Enterprise High Availability cluster must have STONITH (node fencing) enabled. A node fencing mechanism can be either a physical device (a power switch) or a mechanism like SBD in combination with a watchdog. Before you continue using the cluster, configure one or more STONITH devices as described in Chapter 16, Fencing and STONITH or Chapter 17, Storage protection and SBD.