7 Using the YaST cluster module #
The YaST cluster module allows you to set up a cluster manually or to modify options for an existing cluster.
7.1 Definition of terms #
Several key terms used in the YaST cluster module and in this chapter are defined below.
- Bind network address (bindnetaddr)
The network address the Corosync executive should bind to. To simplify sharing configuration files across the cluster, Corosync uses the network interface netmask to mask only the address bits that are used for routing the network. For example, if the local interface is 192.168.5.92 with netmask 255.255.255.0, set bindnetaddr to 192.168.5.0. If the local interface is 192.168.5.92 with netmask 255.255.255.192, set bindnetaddr to 192.168.5.64.
If nodelist with ringX_addr is explicitly configured in /etc/corosync/corosync.conf, bindnetaddr is not strictly required.
Note: Network address for all nodes
As the same Corosync configuration is used on all nodes, make sure to use a network address as bindnetaddr, not the address of a specific network interface.
- conntrack tools
Allow interaction with the in-kernel connection tracking system for enabling stateful packet inspection for iptables. Used by SUSE Linux Enterprise High Availability to synchronize the connection status between cluster nodes. For detailed information, refer to https://conntrack-tools.netfilter.org/.
- Csync2
A synchronization tool that can be used to replicate configuration files across all nodes in the cluster, and even across Geo clusters. Csync2 can handle any number of hosts, sorted into synchronization groups. Each synchronization group has its own list of member hosts and its include/exclude patterns that define which files should be synchronized in the synchronization group. The groups, the host names belonging to each group, and the include/exclude rules for each group are specified in the Csync2 configuration file, /etc/csync2/csync2.cfg.
For authentication, Csync2 uses the IP addresses and pre-shared keys within a synchronization group. You need to generate one key file for each synchronization group and copy it to all group members.
For more information about Csync2, refer to https://oss.linbit.com/csync2/paper.pdf.
- Existing cluster
The term “existing cluster” is used to refer to any cluster that consists of at least one node. Existing clusters have a basic Corosync configuration that defines the communication channels, but they do not necessarily have resource configuration yet.
- Heuristics
QDevice supports a set of commands (“heuristics”). The commands are executed locally on start-up of cluster services, cluster membership change, successful connection to the QNetd server, or, optionally, at regular times.
Only if all commands are executed successfully are the heuristics considered to have passed; otherwise, they failed. The heuristics' result is sent to the QNetd server, where it is used in calculations to determine which partition should be quorate.
- Multicast
A technology used for a one-to-many communication within a network that can be used for cluster communication. Corosync supports both multicast and unicast.
Note: Switches and multicast
To use multicast for cluster communication, make sure your switches support multicast.
- Multicast address (mcastaddr)
IP address to be used for multicasting by the Corosync executive. The IP address can either be IPv4 or IPv6. If IPv6 networking is used, node IDs must be specified. You can use any multicast address in your private network.
- Multicast port (mcastport)
The port to use for cluster communication. Corosync uses two ports: the specified mcastport for receiving multicast, and mcastport -1 for sending multicast.
- QDevice
A systemd service (a daemon) on each cluster node running together with Corosync. This is the client of the QNetd server. Its primary use is to allow a cluster to sustain more node failures than standard quorum rules allow.
QDevice is designed to work with different arbitrators. However, currently, only QNetd is supported.
- QNetd
A systemd service (a daemon, the “QNetd server”), which is not part of the cluster. The systemd service provides a vote to the QDevice daemon.
To improve security, QNetd can work with TLS for client certificate checking.
- Redundant Ring Protocol (RRP)
Allows the use of multiple redundant local area networks for resilience against partial or total network faults. This way, cluster communication can still be kept up as long as a single network is operational. Corosync supports the Totem Redundant Ring Protocol. A logical token-passing ring is imposed on all participating nodes to deliver messages in a reliable and sorted manner. A node is allowed to broadcast a message only if it holds the token.
When redundant communication channels are defined in Corosync, use RRP to tell the cluster how to use these interfaces. RRP can have three modes (rrp_mode):
If set to active, Corosync uses both interfaces actively. However, this mode is deprecated.
If set to passive, Corosync sends messages alternately over the available networks.
If set to none, RRP is disabled.
- Unicast
A technology for sending messages to a single network destination. Corosync supports both multicast and unicast. In Corosync, unicast is implemented as UDP-unicast (UDPU).
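To see how these terms map to the actual configuration, the following is a minimal sketch of the totem section in /etc/corosync/corosync.conf for a multicast setup. The cluster name, addresses, and port are placeholders only; refer to the example files in /usr/share/doc/packages/corosync/ for the full syntax.

totem {
    version: 2
    cluster_name: hacluster      # placeholder cluster name
    transport: udp               # multicast transport
    rrp_mode: passive            # only relevant if a second ring is defined
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.5.0     # network address, not a host address
        mcastaddr: 239.60.60.1       # any multicast address in your private network
        mcastport: 5405
    }
}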
7.2 Starting the YaST module #
Start YaST and select the cluster module. Alternatively, start the module from the command line:
# yast2 cluster
If you start the cluster module for the first time, it appears as a wizard, guiding you through all the steps necessary for basic setup. Otherwise, select the categories on the left panel to access the configuration options for each step.
The following list shows an overview of the available screens in the YaST cluster module. It also mentions whether the screen contains parameters that are required for successful cluster setup or whether its parameters are optional. If you are following a guided first-time setup, the screens appear in the order shown in this list.
- Communication channels (required)
Allows you to define one or two communication channels for communication between the cluster nodes. As transport protocol, either use multicast (UDP) or unicast (UDPU). For details, see Section 7.3, “Defining the communication channels”.
Important: Redundant communication paths
For a supported cluster setup, two or more redundant communication paths are required. The preferred way is to use network device bonding. If this is impossible, you must define a second communication channel in Corosync.
- Corosync QDevice (optional but recommended for clusters with an even number of nodes)
Allows you to configure QDevice as a client of a QNetd server to participate in quorum decisions. This is recommended for clusters with an even number of nodes, and especially for two-node clusters. For details, see Section 7.4, “Configuring an arbitrator for quorum decisions”.
- Security (optional but recommended)
Allows you to define the authentication settings for the cluster. HMAC/SHA1 authentication requires a shared secret used to protect and authenticate messages. For details, see Section 7.5, “Defining authentication settings”.
- Configure Csync2 (optional but recommended)
Csync2 helps you to keep track of configuration changes and to keep files synchronized across the cluster nodes. If you are using YaST to set up the cluster for the first time, we strongly recommend configuring Csync2. If you do not use Csync2, you must manually copy all configuration files from the first node to the rest of the nodes in the cluster. For details, see Section 7.6, “Configuring Csync2 to synchronize files”.
- Configure conntrackd (optional)
Allows you to configure the user space conntrackd. Use the conntrack tools for stateful packet inspection for iptables. For details, see Section 7.7, “Synchronizing connection status between cluster nodes”.
- Service (required)
Allows you to configure the service for bringing the cluster node online. Define whether to start the cluster services at boot time and whether to open the ports in the firewall that are needed for communication between the nodes. For details, see Section 7.8, “Configuring services”.
Certain settings in the YaST cluster module apply only to the current node. Other settings may automatically be transferred to all nodes with Csync2. Find detailed information about this in the following sections.
7.3 Defining the communication channels #
For successful communication between the cluster nodes, define at least one communication channel. All settings defined in the YaST cluster module are written to /etc/corosync/corosync.conf. Find example files for a multicast and a unicast setup in /usr/share/doc/packages/corosync/.
If you are using IPv4 addresses, node IDs are optional. If you are using IPv6 addresses, node IDs are required. Instead of specifying IDs manually for each node, the YaST cluster module contains an option to automatically generate a unique ID for every cluster node.
As transport protocol, either use multicast (UDP) or unicast (UDPU) as described in the following procedures:
To configure multicast, use Procedure 7.1, “Defining the first communication channel (multicast)”.
To configure unicast, use Procedure 7.2, “Defining the first communication channel (unicast)”.
If you also need to define a second, redundant channel, configure the first channel with one of the procedures above, then continue to Procedure 7.3, “Defining a redundant communication channel”.
For deploying SUSE Linux Enterprise High Availability on public cloud platforms, use unicast as the transport protocol. Multicast is generally not supported by the cloud platforms themselves.
When using multicast, the same bindnetaddr, mcastaddr, and mcastport values are used for all cluster nodes. All nodes in the cluster know each other by using the same multicast address. For different clusters, use different multicast addresses.
If you are modifying an existing cluster, switch to the Communication Channels category. If you are following the initial setup wizard, you do not need to switch categories manually.
Set the transport protocol to Multicast.
Define the bind network address. Set the value to the subnet you will use for cluster multicast.
Define the multicast address.
Define the multicast port.
To automatically generate a unique ID for every cluster node, keep the auto-generate option enabled.
Define a cluster name.
Enter the number of expected votes. This is important for Corosync to calculate quorum in case of a partitioned cluster. By default, each node has 1 vote. The number of expected votes must match the number of nodes in your cluster.
If you need to define a redundant communication channel in Corosync, continue to Procedure 7.3, “Defining a redundant communication channel”. Otherwise, confirm your changes with Next (setup wizard) or Finish (existing cluster).
If you are modifying an existing cluster, switch to the Communication Channels category. If you are following the initial setup wizard, you do not need to switch categories manually.
Set the transport protocol to Unicast.
Define the port.
For unicast communication, Corosync needs to know the IP addresses of all nodes in the cluster. For each node, select Add and enter the following details: the node's IP address and, if used, a redundant IP address and a node ID.
To modify or remove any addresses of cluster members, use the respective edit and delete buttons.
To automatically generate a unique ID for every cluster node, keep the auto-generate option enabled.
Define a cluster name.
Enter the number of expected votes. This is important for Corosync to calculate quorum in case of a partitioned cluster. By default, each node has 1 vote. The number of expected votes must match the number of nodes in your cluster.
If you need to define a redundant communication channel in Corosync, continue to Procedure 7.3, “Defining a redundant communication channel”. Otherwise, confirm your changes with Next (setup wizard) or Finish (existing cluster).
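For orientation, a unicast (UDPU) setup ends up in /etc/corosync/corosync.conf in a form similar to the following sketch for a hypothetical two-node cluster. The addresses and node IDs are placeholders.

totem {
    transport: udpu              # unicast transport
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.5.0
        mcastport: 5405          # also used as the communication port for unicast
    }
}
nodelist {
    node {
        ring0_addr: 192.168.5.92    # first node
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.5.93    # second node
        nodeid: 2
    }
}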
If network device bonding cannot be used for any reason, the second best choice is to define a redundant communication channel (a second ring) in Corosync. That way, two physically separate networks can be used for communication. If one network fails, the cluster nodes can still communicate via the other network.
The additional communication channel in Corosync forms a second token-passing ring. The first channel you configured is the primary ring and gets the ring number 0. The second ring (redundant channel) gets the ring number 1.
Note: Multiple rings and /etc/hosts
If multiple rings are configured in Corosync, each node can have multiple IP addresses. This needs to be reflected in the /etc/hosts file of all nodes.
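For example, the /etc/hosts file of a hypothetical two-node cluster with two rings could list both addresses of each node (host names and addresses are placeholders):

192.168.1.1   alice-ring0
10.0.0.1      alice-ring1
192.168.1.2   bob-ring0
10.0.0.2      bob-ring1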
Configure the first channel as described in Procedure 7.1, “Defining the first communication channel (multicast)” or Procedure 7.2, “Defining the first communication channel (unicast)”.
Activate the redundant channel. The redundant channel must use the same protocol as the first communication channel you defined.
If you use multicast, enter the following parameters: the bind network address to use, the multicast address, and the port for the redundant channel.
If you use unicast, define the following parameters: the bind network address to use, and the port. Edit each entry in the list of cluster members to add a redundant IP address for each node that will be part of the cluster.
To tell Corosync how and when to use the different channels, select the rrp_mode to use:
If only one communication channel is defined, rrp_mode is automatically disabled (value none).
If set to active, Corosync uses both interfaces actively. However, this mode is deprecated.
If set to passive, Corosync sends messages alternately over the available networks.
When RRP is used, SUSE Linux Enterprise High Availability monitors the status of the current rings and automatically re-enables redundant rings after faults.
Alternatively, check the ring status manually with corosync-cfgtool (see the example after this procedure). View the available options with -h.
Confirm your changes with Next (setup wizard) or Finish (existing cluster).
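A manual check could look similar to the following sketch; the exact output depends on the Corosync version and your network addresses:

# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 192.168.1.1
        status  = ring 0 active with no faults
RING ID 1
        id      = 10.0.0.1
        status  = ring 1 active with no faults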
7.4 Configuring an arbitrator for quorum decisions #
QDevice and QNetd participate in quorum decisions. With assistance from the arbitrator corosync-qnetd, corosync-qdevice provides a configurable number of votes, allowing a cluster to sustain more node failures than the standard quorum rules allow. We recommend deploying corosync-qnetd and corosync-qdevice for clusters with an even number of nodes, and especially for two-node clusters.
For more information, see Chapter 18, QDevice and QNetd.
Before you configure QDevice, you must set up a QNetd server. See Section 18.3, “Setting up the QNetd server”.
If you are modifying an existing cluster, switch to the Corosync QDevice category. If you are following the initial setup wizard, you do not need to switch categories manually.
Activate the option to enable QDevice.
In the QNetd server host field, enter the IP address or host name of the QNetd server.
Select the mode for TLS:
Use off if TLS is not required and should not be tried.
Use on to attempt to connect with TLS, but connect without TLS if it is not available.
Use required to make TLS mandatory. QDevice will exit with an error if TLS is not available.
Accept the default values for the remaining options. If you need to change these values, you can do so with the command crm cluster init qdevice after you finish setting up the cluster.
Select the heuristics mode:
Use off to disable heuristics.
Use on to run heuristics on a regular basis, as set by the heuristics interval.
Use sync to only run heuristics during startup, when cluster membership changes, and on connection to QNetd.
If you set the heuristics mode to on or sync, add your heuristics commands to the list:
Select Add. A new window opens.
Enter a name for the command.
Enter the command in the respective field. This can be a single command or the path to a script, and can be written in any language such as Shell, Python, or Ruby.
Select OK to close the window.
Confirm your changes with Next (setup wizard) or Finish (existing cluster).
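For reference, the resulting quorum configuration in /etc/corosync/corosync.conf is similar to the following sketch. The QNetd server address and the heuristics command are placeholders, and the exact keys written by YaST or crm cluster init qdevice may differ.

quorum {
    provider: corosync_votequorum
    device {
        model: net
        votes: 1
        net {
            host: 192.168.1.100    # IP address or host name of the QNetd server
            tls: on                # off, on, or required
        }
        heuristics {
            mode: sync             # off, on, or sync
            exec_check_gw: /usr/bin/ping -q -c 1 192.168.1.254    # example heuristics command
        }
    }
}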
7.5 Defining authentication settings #
To define the authentication settings for the cluster, you can use HMAC/SHA1 authentication. This requires a shared secret used to protect and authenticate messages. The authentication key (password) you specify is used on all nodes in the cluster.
If you are modifying an existing cluster, switch to the Security category. If you are following the initial setup wizard, you do not need to switch categories manually.
Activate the security authentication option.
For a newly created cluster, select the option to generate an authentication key file. The key is created and written to /etc/corosync/authkey.
If you want the current machine to join an existing cluster, do not generate a new key file. Instead, copy the /etc/corosync/authkey from one of the nodes to the current machine (either manually or with Csync2).
Confirm your changes with Next (setup wizard) or Finish (existing cluster). YaST writes the configuration to /etc/corosync/corosync.conf.
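If you prefer the command line, an authentication key can also be generated with the corosync-keygen tool, which writes /etc/corosync/authkey on the node where it is run. This is a sketch of the equivalent step outside YaST; copy the resulting file to the other nodes as described above.

# corosync-keygen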
7.6 Configuring Csync2 to synchronize files #
Instead of copying the configuration files to all nodes
manually, use the csync2
tool for replication across
all nodes in the cluster. Csync2 helps you to keep track of configuration changes
and to keep files synchronized across the cluster nodes:
You can define a list of files that are important for operation.
You can show changes to these files (against the other cluster nodes).
You can synchronize the configured files with a single command.
With a simple shell script in
~/.bash_logout
, you can be reminded about unsynchronized changes before logging out of the system.
Find detailed information about Csync2 at https://oss.linbit.com/csync2/ and https://oss.linbit.com/csync2/paper.pdf.
Csync2 only pushes changes. It does not continuously
synchronize files between the machines. Each time you update files that need
to be synchronized, you need to push the changes to the other machines.
Using csync2
to push changes is described later, after
the cluster configuration with YaST is complete.
If you are modifying an existing cluster, switch to the Configure Csync2 category. If you are following the initial setup wizard, you do not need to switch categories manually.
To specify the synchronization group, select Add in the Sync Host group and enter the local host names of all nodes in your cluster. For each node, you must use exactly the strings that are returned by the hostname command.
Tip: Host name resolution
If host name resolution does not work properly in your network, you can also specify a combination of host name and IP address for each cluster node. To do so, use the string HOSTNAME@IP such as alice@192.168.2.100, for example. Csync2 then uses the IP addresses when connecting.
Select Generate Pre-Shared-Keys to create a key file for the synchronization group. The key file is written to /etc/csync2/key_hagroup.
To populate the Sync File list with the files that usually need to be synchronized among all nodes, select Add Suggested Files.
To add, edit, or remove files from the list of files to be synchronized, use the respective buttons. You must enter the absolute path for each file.
Activate Csync2 by selecting Turn Csync2 ON. This enables Csync2 to start automatically at boot time.
Confirm your changes with Next (setup wizard) or Finish (existing cluster). YaST writes the Csync2 configuration to /etc/csync2/csync2.cfg.
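The generated file follows the normal Csync2 syntax. A minimal /etc/csync2/csync2.cfg for a hypothetical two-node group could look similar to the following sketch; host names, the group name, and the file list are placeholders.

group ha_group
{
    host alice;
    host bob;
    key /etc/csync2/key_hagroup;
    include /etc/corosync/corosync.conf;
    include /etc/corosync/authkey;
    include /etc/csync2/csync2.cfg;
}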
7.7 Synchronizing connection status between cluster nodes #
To enable stateful packet inspection for iptables, configure and use the conntrack tools. This requires the following basic steps:
Configuring conntrackd with YaST #
Use the YaST cluster module to configure the user space
conntrackd
(see Figure 7.6, “YaST conntrackd
”). It needs a
dedicated network interface that is not used for other communication
channels. The daemon can be started via a resource agent afterward.
If you are modifying an existing cluster, switch to the Configure conntrackd category. If you are following the initial setup wizard, you do not need to switch categories manually.
Select a dedicated interface for synchronizing the connection status. The IPv4 address of the selected interface is automatically detected and shown in YaST. It must already be configured and it must support multicast.
Define the multicast address to be used for synchronizing the connection status.
In the group number field, define a numeric ID for the group to synchronize the connection status to.
Select Generate /etc/conntrackd/conntrackd.conf to create the configuration file for conntrackd.
If you modified any options for an existing cluster, confirm your changes and close the cluster module.
Confirm your changes with Next (setup wizard) or Finish (existing cluster).
After having configured the conntrack tools, you can use them for Linux Virtual Server (see Load balancing).
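As a sketch of the resource agent step mentioned above, a cloned resource based on the ocf:heartbeat:conntrackd agent could be configured with crmsh as follows. The resource names and the monitor interval are illustrative only, not a mandatory configuration.

# crm configure primitive conntrackd-daemon ocf:heartbeat:conntrackd \
    op monitor interval=30s timeout=30s
# crm configure clone cl-conntrackd conntrackd-daemon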
7.8 Configuring services #
In the YaST cluster module, define whether to start certain services on a node at boot time. You can also use the module to start and stop the services manually. To bring the cluster nodes online and start the cluster resource manager, Pacemaker must be running as a service.
The configuration in this section only applies to the current machine, not to all cluster nodes.
If you are modifying an existing cluster, switch to the Service category. You can use the options in this section to start and stop cluster services on this node. If you are following the initial setup wizard, you do not need to switch categories manually. You will start the cluster services later, so you can skip straight to Step 5.
To start the cluster services each time this cluster node is booted, select the respective option in the booting section. If you choose not to start the services at boot time, you must start the cluster services manually each time this node is booted.
To start or stop the cluster services immediately, select the respective button.
To start or stop QDevice immediately, select the respective button.
To open the ports in the firewall that are needed for cluster communication, activate the firewall option.
Confirm your changes with Next (setup wizard) or Finish (existing cluster). If you are following the initial setup wizard, this completes the initial configuration and exits YaST. Continue to Section 7.9, “Transferring the configuration to all nodes”.
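On an already configured node, the same behavior is also available from the command line with crmsh, for example (a sketch; run the commands on the node in question):

# crm cluster enable
# crm cluster start

The corresponding counterparts are crm cluster disable and crm cluster stop.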
7.9 Transferring the configuration to all nodes #
After the cluster configuration with YaST is complete, use csync2
to copy the configuration files to the rest of the cluster nodes. To receive the files,
nodes must be included in the group you configured in
Procedure 7.6, “Configuring Csync2 with YaST”.
Before running Csync2 for the first time, you need to make the following preparations:
Make sure passwordless SSH is configured between the nodes. This is required for cluster communication.
Copy the file /etc/csync2/csync2.cfg manually to all nodes in the cluster.
Copy the file /etc/csync2/key_hagroup manually to all nodes in the cluster. It is needed for authentication by Csync2. Do not regenerate the file on the other nodes; it needs to be the same file on all nodes.
Run the following command on all nodes to enable and start the service now:
# systemctl enable --now csync2.socket
Use the following procedure to transfer the configuration files to all cluster nodes:
To synchronize all files once, run the following command on the machine that you want to copy the configuration from:
# csync2 -xv
This synchronizes all the files once by pushing them to the other nodes. If all files are synchronized successfully, Csync2 finishes with no errors.
If one or several files that are to be synchronized have been modified on other nodes (not only on the current one), Csync2 reports a conflict with an output similar to the one below:
While syncing file /etc/corosync/corosync.conf:
ERROR from peer hex-14: File is also marked dirty here!
Finished with 1 errors.
If you are sure that the file version on the current node is the “best” one, you can resolve the conflict by forcing this file and resynchronizing:
# csync2 -f /etc/corosync/corosync.conf
# csync2 -xv
For more information on the Csync2 options, run
# csync2 --help
Csync2 only pushes changes. It does not continuously synchronize files between the machines.
Each time you update files that need to be synchronized, you need to
push the changes to the other machines by running csync2 -xv
on the machine where you did the changes. If you run
the command on any of the other machines with unchanged files, nothing
happens.
7.10 Bringing the cluster online #
After the initial cluster configuration is done, start the cluster services on all cluster nodes to bring the stack online:
Log in to an existing node.
Start the cluster services on all cluster nodes:
# crm cluster start --all
This command requires passwordless SSH access between the nodes. You can also start individual nodes with crm cluster start.
Check the cluster status with the crm status command. If all nodes are online, the output should be similar to the following:
# crm status
Cluster Summary:
  * Stack: corosync
  * Current DC: alice (version ...) - partition with quorum
  * Last updated: ...
  * Last change: ... by hacluster via crmd on bob
  * 2 nodes configured
  * 1 resource instance configured

Node List:
  * Online: [ alice bob ]
...
This output indicates that the cluster resource manager is started and is ready to manage resources.
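In addition to crm status, Corosync's own view of membership and quorum can be checked with corosync-quorumtool; a Quorate: Yes line in its output indicates that the current partition has quorum. The exact output varies by version and configuration.

# corosync-quorumtool -s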
To be supported, a SUSE Linux Enterprise High Availability cluster must have STONITH (node fencing) enabled. A node fencing mechanism can be either a physical device (a power switch) or a mechanism like SBD in combination with a watchdog. Before you continue using the cluster, configure one or more STONITH devices as described in Chapter 16, Fencing and STONITH or Chapter 17, Storage protection and SBD.