Applies to SUSE Linux Enterprise High Availability 15 SP5

4 Using the YaST cluster module #

The YaST cluster module allows you to set up a cluster manually (from scratch) or to modify options for an existing cluster.

However, if you prefer an automated approach for setting up a cluster, refer to Installation and Setup Quick Start. It describes how to install the needed packages and leads you to a basic two-node cluster, which is set up with the bootstrap scripts provided by the crm shell.

You can also use a combination of both setup methods, for example: set up one node with YaST cluster and then use one of the bootstrap scripts to integrate more nodes (or vice versa).

4.1 Definition of terms #

Several key terms used in the YaST cluster module and in this chapter are defined below.

Bind network address (bindnetaddr)

The network address the Corosync executive should bind to. To simplify sharing configuration files across the cluster, Corosync uses the netmask of the network interface to mask only the address bits that are used for routing the network. For example, if the local interface is 192.168.5.92 with netmask 255.255.255.0, set bindnetaddr to 192.168.5.0. If the local interface is 192.168.5.92 with netmask 255.255.255.192, set bindnetaddr to 192.168.5.64.
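The masking described above can be reproduced with a few lines of shell. The following is a minimal sketch for IPv4 only; the function name bind_network_address is hypothetical and not part of any SUSE or Corosync tool.

```shell
#!/bin/sh
# Sketch: derive the network address (a candidate bindnetaddr) by ANDing
# each octet of an IPv4 address with the matching octet of the netmask.
# The function name bind_network_address is hypothetical.
bind_network_address() {
    ip=$1
    mask=$2
    # Split both dotted quads into their four octets.
    old_ifs=$IFS
    IFS=.
    set -- $ip $mask
    IFS=$old_ifs
    # $1..$4 are the address octets, $5..$8 the netmask octets.
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

bind_network_address 192.168.5.92 255.255.255.0     # prints 192.168.5.0
bind_network_address 192.168.5.92 255.255.255.192   # prints 192.168.5.64
```

Because the result depends only on the subnet, every node in the same subnet arrives at the same value, which is why the same configuration file can be shared.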

If nodelist with ringX_addr is explicitly configured in /etc/corosync/corosync.conf, bindnetaddr is not strictly required.

Note: Network address for all nodes

As the same Corosync configuration is used on all nodes, make sure to use a network address as bindnetaddr, not the address of a specific network interface.

conntrack Tools

Allow interaction with the in-kernel connection tracking system for enabling stateful packet inspection for iptables. Used by SUSE Linux Enterprise High Availability to synchronize the connection status between cluster nodes. For detailed information, refer to https://conntrack-tools.netfilter.org/.

Csync2

A synchronization tool that can be used to replicate configuration files across all nodes in the cluster, and even across Geo clusters. Csync2 can handle any number of hosts, sorted into synchronization groups. Each synchronization group has its own list of member hosts and its include/exclude patterns that define which files should be synchronized in the synchronization group. The groups, the host names belonging to each group, and the include/exclude rules for each group are specified in the Csync2 configuration file, /etc/csync2/csync2.cfg.

For authentication, Csync2 uses the IP addresses and pre-shared keys within a synchronization group. You need to generate one key file for each synchronization group and copy it to all group members.

For more information about Csync2, refer to https://oss.linbit.com/csync2/paper.pdf.

Existing cluster

The term “existing cluster” is used to refer to any cluster that consists of at least one node. Existing clusters have a basic Corosync configuration that defines the communication channels, but they do not necessarily have resource configuration yet.

Multicast

A technology for one-to-many communication within a network. It can be used for cluster communication. Corosync supports both multicast and unicast.

Note: Switches and multicast

To use multicast for cluster communication, make sure your switches support multicast.

Multicast address (mcastaddr)

IP address to be used for multicasting by the Corosync executive. The IP address can either be IPv4 or IPv6. If IPv6 networking is used, node IDs must be specified. You can use any multicast address in your private network.

Multicast port (mcastport)

The port to use for cluster communication. Corosync uses two ports: the specified mcastport for receiving multicast, and mcastport - 1 for sending multicast.
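When planning firewall rules, it can help to list both ports explicitly. The following is a small sketch; the function name corosync_ports is hypothetical, and 5405 is used only as a sample value.

```shell
# Sketch: list the two UDP ports Corosync uses for a given mcastport.
# The function name corosync_ports is hypothetical.
corosync_ports() {
    # The send port is one below mcastport; the receive port is mcastport.
    echo "$(( $1 - 1 )) $1"
}

corosync_ports 5405   # prints "5404 5405"
```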

Redundant Ring Protocol (RRP)

Allows the use of multiple redundant local area networks for resilience against partial or total network faults. This way, cluster communication can still be kept up as long as a single network is operational. Corosync supports the Totem Redundant Ring Protocol. A logical token-passing ring is imposed on all participating nodes to deliver messages in a reliable and sorted manner. A node is allowed to broadcast a message only if it holds the token.

After defining redundant communication channels in Corosync, use RRP to tell the cluster how to use these interfaces. RRP can have three modes (rrp_mode):

If set to active, Corosync uses both interfaces actively. However, this mode is deprecated.
If set to passive, Corosync sends messages alternately over the available networks.
If set to none, RRP is disabled.

Unicast

A technology for sending messages to a single network destination. Corosync supports both multicast and unicast. In Corosync, unicast is implemented as UDP-unicast (UDPU).

4.2 YaST Cluster module #

Start YaST and select High Availability › Cluster. Alternatively, start the module from the command line:

```
sudo yast2 cluster
```

The following list shows an overview of the available screens in the YaST cluster module. It also mentions whether the screen contains parameters that are required for successful cluster setup or whether its parameters are optional.

Communication channels (required): Allows you to define one or two communication channels for communication between the cluster nodes. As transport protocol, either use multicast (UDP) or unicast (UDPU). For details, see Section 4.3, “Defining the communication channels”.
Important: Redundant communication paths
For a supported cluster setup, two or more redundant communication paths are required. The preferred way is to use network device bonding as described in Chapter 16, Network device bonding.
If this is impossible, you need to define a second communication channel in Corosync.
Security (optional but recommended): Allows you to define the authentication settings for the cluster. HMAC/SHA1 authentication requires a shared secret used to protect and authenticate messages. For details, see Section 4.4, “Defining authentication settings”.
Configure Csync2 (optional but recommended): Csync2 helps you to keep track of configuration changes and to keep files synchronized across the cluster nodes. For details, see Section 4.7, “Transferring the configuration to all nodes”.
Configure conntrackd (optional): Allows you to configure the user space conntrackd. Use the conntrack tools for stateful packet inspection for iptables. For details, see Section 4.5, “Synchronizing connection status between cluster nodes”.
Service (required): Allows you to configure the service for bringing the cluster node online. Define whether to start the cluster services at boot time and whether to open the ports in the firewall that are needed for communication between the nodes. For details, see Section 4.6, “Configuring services”.

If you start the cluster module for the first time, it appears as a wizard, guiding you through all the steps necessary for basic setup. Otherwise, click the categories on the left panel to access the configuration options for each step.

Note: Settings in the YaST Cluster module

Some settings in the YaST cluster module apply only to the current node. Other settings may automatically be transferred to all nodes with Csync2. Find detailed information about this in the following sections.

4.3 Defining the communication channels #

For successful communication between the cluster nodes, define at least one communication channel. As transport protocol, either use multicast (UDP) or unicast (UDPU) as described in Procedure 4.1 or Procedure 4.2, respectively. If you define a second, redundant channel (Procedure 4.3), both communication channels must use the same protocol.

Note: Public clouds: use unicast

For deploying SUSE Linux Enterprise High Availability in public cloud platforms, use unicast as transport protocol. Multicast is generally not supported by the cloud platforms themselves.

All settings defined in the YaST Communication Channels screen are written to /etc/corosync/corosync.conf. Find example files for a multicast and a unicast setup in /usr/share/doc/packages/corosync/.

If you are using IPv4 addresses, node IDs are optional. If you are using IPv6 addresses, node IDs are required. Instead of specifying IDs manually for each node, the YaST cluster module contains an option to automatically generate a unique ID for every cluster node.

Procedure 4.1: Defining the first communication channel (multicast) #

When using multicast, the same bindnetaddr, mcastaddr, and mcastport are used for all cluster nodes. All nodes in the cluster know each other by using the same multicast address. For different clusters, use different multicast addresses.

Start the YaST cluster module and switch to the Communication Channels category.
Set the Transport protocol to Multicast.
Define the Bind Network Address. Set the value to the subnet you will use for cluster multicast.
Define the Multicast Address.
Define the Port.
To automatically generate a unique ID for every cluster node, keep Auto Generate Node ID enabled.
Define a Cluster Name.
Enter the number of Expected Votes. This is important for Corosync to calculate quorum in case of a partitioned cluster. By default, each node has 1 vote. The number of Expected Votes must match the number of nodes in your cluster.
Confirm your changes.
If needed, define a redundant communication channel in Corosync as described in Procedure 4.3, “Defining a redundant communication channel”.

Figure 4.1: YaST Cluster—multicast configuration #
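The Expected Votes value entered in the procedure above feeds the quorum calculation. As a rough illustration, the following sketch computes a simple majority threshold. It assumes one vote per node and ignores special cases such as two-node clusters; the function name quorum_threshold is hypothetical.

```shell
# Sketch: majority threshold for a given number of expected votes,
# assuming one vote per node and a simple majority (floor(n/2) + 1).
# The function name quorum_threshold is hypothetical.
quorum_threshold() {
    echo "$(( $1 / 2 + 1 ))"
}

quorum_threshold 3   # prints 2
quorum_threshold 4   # prints 3
```

If Expected Votes is set lower than the real number of nodes, a partition could be considered quorate with fewer nodes than intended, which is why the value should match the cluster size.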

To use unicast instead of multicast for cluster communication, proceed as follows.

Procedure 4.2: Defining the first communication channel (unicast) #

Start the YaST cluster module and switch to the Communication Channels category.
Set the Transport protocol to Unicast.
Define the Port.
For unicast communication, Corosync needs to know the IP addresses of all nodes in the cluster. For each node that will be part of the cluster, click Add and enter the following details:
- IP Address
- Redundant IP Address (only required if you use a second communication channel in Corosync)
- Node ID (only required if the option Auto Generate Node ID is disabled)
To modify or remove any addresses of cluster members, use the Edit or Del buttons.
To automatically generate a unique ID for every cluster node, keep Auto Generate Node ID enabled.
Define a Cluster Name.
Enter the number of Expected Votes. This is important for Corosync to calculate quorum in case of a partitioned cluster. By default, each node has 1 vote. The number of Expected Votes must match the number of nodes in your cluster.
Confirm your changes.
If needed, define a redundant communication channel in Corosync as described in Procedure 4.3, “Defining a redundant communication channel”.

Figure 4.2: YaST Cluster—unicast configuration #
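For unicast, the node addresses entered in the procedure end up in a nodelist section of /etc/corosync/corosync.conf. The following sketch prints what such a section might look like. The function name and the sample addresses are hypothetical, node IDs are simply numbered from 1, and the file that YaST actually writes may differ in detail.

```shell
# Sketch: print a corosync.conf-style nodelist for unicast (UDPU).
# Function name and sample addresses are hypothetical; node IDs are
# assigned sequentially starting at 1.
print_nodelist() {
    nodeid=1
    echo "nodelist {"
    for addr in "$@"; do
        echo "    node {"
        echo "        ring0_addr: $addr"
        echo "        nodeid: $nodeid"
        echo "    }"
        nodeid=$(( nodeid + 1 ))
    done
    echo "}"
}

print_nodelist 192.168.5.91 192.168.5.92
```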

If network device bonding cannot be used for any reason, the second-best choice is to define a redundant communication channel (a second ring) in Corosync. That way, two physically separate networks can be used for communication. If one network fails, the cluster nodes can still communicate via the other network.

The additional communication channel in Corosync forms a second token-passing ring. In /etc/corosync/corosync.conf, the first channel you configured is the primary ring and gets the ring number 0. The second ring (redundant channel) gets the ring number 1.

After defining redundant communication channels in Corosync, use RRP to tell the cluster how to use these interfaces.

RRP can have three modes:

If set to active, Corosync uses both interfaces actively. However, this mode is deprecated.
If set to passive, Corosync sends messages alternately over the available networks.
If set to none, RRP is disabled.

Procedure 4.3: Defining a redundant communication channel #

Important: Redundant rings and /etc/hosts

If multiple rings are configured in Corosync, each node can have multiple IP addresses. This needs to be reflected in the /etc/hosts file of all nodes.

Start the YaST cluster module and switch to the Communication Channels category.
Activate Redundant Channel. The redundant channel must use the same protocol as the first communication channel you defined.
If you use multicast, enter the following parameters: the Bind Network Address to use, the Multicast Address and the Port for the redundant channel.
If you use unicast, define the following parameters: the Bind Network Address to use, and the Port. Enter the IP addresses of all nodes that will be part of the cluster.
To tell Corosync how and when to use the different channels, select the rrp_mode to use:
- If only one communication channel is defined, rrp_mode is automatically disabled (value none).
- If set to active, Corosync uses both interfaces actively. However, this mode is deprecated.
- If set to passive, Corosync sends messages alternately over the available networks.
When RRP is used, SUSE Linux Enterprise High Availability monitors the status of the current rings and automatically re-enables redundant rings after faults.
Alternatively, check the ring status manually with corosync-cfgtool. View the available options with -h.
Confirm your changes.
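To make the result of this procedure more concrete, the following sketch prints a totem section with two multicast rings and passive RRP. All addresses, ports, and the cluster name are made-up sample values, and the function name is hypothetical. The two rings use ports that are two apart, on the assumption that each ring also needs mcastport - 1 for sending.

```shell
# Sketch: print a totem section with two multicast rings and passive RRP.
# All addresses, ports, and the cluster name are made-up sample values.
print_totem_two_rings() {
    cat <<'EOF'
totem {
    version: 2
    cluster_name: hacluster
    rrp_mode: passive
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.5.0
        mcastaddr: 239.60.60.1
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 10.0.0.0
        mcastaddr: 239.60.60.2
        mcastport: 5407
    }
}
EOF
}

print_totem_two_rings
```

Compare the output with the example files in /usr/share/doc/packages/corosync/ before adapting any values.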

4.4 Defining authentication settings #

To define the authentication settings for the cluster, you can use HMAC/SHA1 authentication. This requires a shared secret used to protect and authenticate messages. The authentication key (password) you specify is used on all nodes in the cluster.

Procedure 4.4: Enabling secure authentication #

Start the YaST cluster module and switch to the Security category.
Activate Enable Security Auth.
For a newly created cluster, click Generate Auth Key File. An authentication key is created and written to /etc/corosync/authkey.
If you want the current machine to join an existing cluster, do not generate a new key file. Instead, copy the /etc/corosync/authkey from one of the nodes to the current machine (either manually or with Csync2).
Confirm your changes. YaST writes the configuration to /etc/corosync/corosync.conf.

Figure 4.3: YaST Cluster—security #
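Because every node must use the same key, it can be worth checking that a copied key file is identical to the original. The following sketch compares two files byte by byte. The function name same_authkey is hypothetical, and the demonstration uses throwaway files rather than real keys.

```shell
# Sketch: succeed only if two copies of the authentication key are identical.
# The function name same_authkey is hypothetical.
same_authkey() {
    cmp -s "$1" "$2"
}

# Demonstration with throwaway files instead of real keys.
tmpdir=$(mktemp -d)
printf 'secret-a' > "$tmpdir/key_node1"
printf 'secret-a' > "$tmpdir/key_node2"
if same_authkey "$tmpdir/key_node1" "$tmpdir/key_node2"; then
    echo "keys match"
else
    echo "keys differ"
fi
rm -rf "$tmpdir"
```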

4.5 Synchronizing connection status between cluster nodes #

To enable stateful packet inspection for iptables, configure and use the conntrack tools as described in the following procedure.

Procedure 4.5: Configuring the conntrackd with YaST #

Use the YaST cluster module to configure the user space conntrackd (see Figure 4.4, “YaST Cluster—conntrackd”). It needs a dedicated network interface that is not used for other communication channels. The daemon can be started via a resource agent afterward.

Start the YaST cluster module and switch to the Configure conntrackd category.
Select a Dedicated Interface for synchronizing the connection status. The IPv4 address of the selected interface is automatically detected and shown in YaST. It must already be configured and it must support multicast.
Define the Multicast Address to be used for synchronizing the connection status.
In Group Number, define a numeric ID for the group to synchronize the connection status to.
Click Generate /etc/conntrackd/conntrackd.conf to create the configuration file for conntrackd.
If you modified any options for an existing cluster, confirm your changes and close the cluster module.
For further cluster configuration, click Next and proceed with Section 4.6, “Configuring services”.

Figure 4.4: YaST Cluster—conntrackd #

After having configured the conntrack tools, you can use them for Linux Virtual Server (see Load balancing).

4.6 Configuring services #

In the YaST cluster module, define whether to start certain services on a node at boot time. You can also use the module to start and stop the services manually. To bring the cluster nodes online and start the cluster resource manager, Pacemaker must be running as a service.

Procedure 4.6: Enabling the cluster services #

In the YaST cluster module, switch to the Service category.
To start the cluster services each time this cluster node is booted, select the respective option in the Booting group. If you select Off in the Booting group, you must start the cluster services manually each time this node is booted. To start the cluster services manually, use the command:
```
# crm cluster start
```
To start or stop the cluster services immediately, click the respective button.
To open the ports in the firewall that are needed for cluster communication on the current machine, activate Open Port in Firewall.
Confirm your changes. Note that the configuration only applies to the current machine, not to all cluster nodes.

Figure 4.5: YaST Cluster—services #

4.7 Transferring the configuration to all nodes #

Instead of copying the resulting configuration files to all nodes manually, use the csync2 tool for replication across all nodes in the cluster.

This requires two basic steps: configuring Csync2 with YaST (Section 4.7.1) and synchronizing the changes with Csync2 (Section 4.7.2).

Csync2 helps you to keep track of configuration changes and to keep files synchronized across the cluster nodes:

You can define a list of files that are important for operation.
You can show changes to these files (against the other cluster nodes).
You can synchronize the configured files with a single command.
With a simple shell script in ~/.bash_logout, you can be reminded about unsynchronized changes before logging out of the system.
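The logout reminder mentioned in the last item could look roughly like the following sketch. It assumes that csync2 -cr / re-checks the configured files and that csync2 -M exits successfully when it lists dirty (unsynchronized) files; verify both against the Csync2 documentation for your version. The function name warn_unsynced_changes is hypothetical.

```shell
# Sketch of a logout reminder, for example for ~/.bash_logout.
# Assumes csync2 -cr / re-checks the configured files and that csync2 -M
# exits successfully when dirty (unsynchronized) files are listed.
# The function name warn_unsynced_changes is hypothetical.
warn_unsynced_changes() {
    csync2 -cr /
    if csync2 -M; then
        echo "There are unsynchronized changes. Run 'csync2 -xv' to push them."
        return 1
    fi
    return 0
}

# In ~/.bash_logout, the function would be called once at the end:
#   warn_unsynced_changes
```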

Find detailed information about Csync2 at https://oss.linbit.com/csync2/ and https://oss.linbit.com/csync2/paper.pdf.

4.7.1 Configuring Csync2 with YaST #

Procedure 4.7: Configuring Csync2 with YaST #

Start the YaST cluster module and switch to the Csync2 category.
To specify the synchronization group, click Add in the Sync Host group and enter the local host names of all nodes in your cluster. For each node, you must use exactly the strings that are returned by the hostname command.
Tip: Host name resolution
If host name resolution does not work properly in your network, you can also specify a combination of host name and IP address for each cluster node. To do so, use the string HOSTNAME@IP, for example, alice@192.168.2.100. Csync2 will then use the IP addresses when connecting.
Click Generate Pre-Shared-Keys to create a key file for the synchronization group. The key file is written to /etc/csync2/key_hagroup. After it has been created, it must be copied manually to all members of the cluster.
To populate the Sync File list with the files that usually need to be synchronized among all nodes, click Add Suggested Files.
To edit, add, or remove files in the list of files to be synchronized, use the Edit, Add, and Remove buttons. You must enter the absolute path for each file.
Activate Csync2 by clicking Turn Csync2 ON. This executes the following command to start Csync2 automatically at boot time:
```
# systemctl enable csync2.socket
```
Click Finish. YaST writes the Csync2 configuration to /etc/csync2/csync2.cfg.

Figure 4.6: YaST Cluster—Csync2 #
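The resulting /etc/csync2/csync2.cfg defines one synchronization group with its key file, hosts, and include rules. The following sketch prints a group definition of that shape for a given list of host names. The group name ha_group, the two include lines, and the function name are illustrative assumptions; the file that YaST writes may contain more entries.

```shell
# Sketch: print a csync2.cfg-style group definition for the given hosts.
# The group name ha_group and the include list are illustrative assumptions.
print_csync2_group() {
    echo "group ha_group"
    echo "{"
    echo "    key /etc/csync2/key_hagroup;"
    for host in "$@"; do
        echo "    host $host;"
    done
    echo "    include /etc/corosync/corosync.conf;"
    echo "    include /etc/corosync/authkey;"
    echo "}"
}

print_csync2_group alice bob
```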

4.7.2 Synchronizing changes with Csync2 #

Before running Csync2 for the first time, you need to make the following preparations:

Procedure 4.8: Preparing for initial synchronization with Csync2 #

Copy the file /etc/csync2/csync2.cfg manually to all nodes after you have configured it as described in Section 4.7.1, “Configuring Csync2 with YaST”.
Copy the file /etc/csync2/key_hagroup that you have generated on one node in Step 3 of Section 4.7.1 to all nodes in the cluster. It is needed for authentication by Csync2. However, do not regenerate the file on the other nodes—it needs to be the same file on all nodes.
Execute the following command on all nodes to start the service now:
```
# systemctl start csync2.socket
```
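The manual copy in the first two steps can be scripted. The following sketch only prints the scp commands for each target node, so that they can be reviewed before being run. It assumes root SSH access to the other nodes; the function name and the node names are hypothetical.

```shell
# Sketch: print (not run) the scp commands that would copy the Csync2
# configuration and key file to each listed node.
# Assumes root SSH access; function and node names are hypothetical.
print_copy_commands() {
    for node in "$@"; do
        echo "scp /etc/csync2/csync2.cfg /etc/csync2/key_hagroup root@$node:/etc/csync2/"
    done
}

print_copy_commands bob charlie
```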

Procedure 4.9: Synchronizing the configuration files with Csync2 #

To initially synchronize all files once, execute the following command on the machine that you want to copy the configuration from:
```
# csync2 -xv
```
This synchronizes all the files once by pushing them to the other nodes. If all files are synchronized successfully, Csync2 finishes with no errors.
If one or several files that are to be synchronized have been modified on other nodes (not only on the current one), Csync2 reports a conflict with an output similar to the one below:
```
While syncing file /etc/corosync/corosync.conf:
ERROR from peer hex-14: File is also marked dirty here!
Finished with 1 errors.
```
If you are sure that the file version on the current node is the “best” one, you can resolve the conflict by forcing this file and resynchronizing:
```
# csync2 -f /etc/corosync/corosync.conf
# csync2 -x
```

For more information on the Csync2 options, run:

```
# csync2 -help
```

Note: Pushing synchronization after any changes

Csync2 only pushes changes. It does not continuously synchronize files between the machines.

Each time you update files that need to be synchronized, you need to push the changes to the other machines by running csync2 -xv on the machine where you made the changes. If you run the command on any of the other machines with unchanged files, nothing happens.

4.8 Bringing the cluster online #

Before starting the cluster, make sure passwordless SSH is configured between the nodes. If you did not already configure passwordless SSH before setting up the cluster, you can do so now by using the ssh stage of the bootstrap scripts:

On the first node, run crm cluster init ssh.
On the rest of the nodes, run crm cluster join ssh -c NODE1.

After the initial cluster configuration is done, start the cluster services on all cluster nodes to bring the stack online:

Procedure 4.10: Starting cluster services and checking the status #

Log in to an existing node.
Start the cluster services on all cluster nodes:
```
# crm cluster start --all
```

Check the cluster status with the crm status command. If all nodes are online, the output should be similar to the following:

```
# crm status
Cluster Summary:
  * Stack: corosync
  * Current DC: alice (version ...) - partition with quorum
  * Last updated: ...
  * Last change:  ... by hacluster via crmd on bob
  * 2 nodes configured
  * 1 resource instance configured

Node List:
  * Online: [ alice bob ]
...
```

This output indicates that the cluster resource manager is started and is ready to manage resources.
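In scripts, the Online line of this output can be used as a quick check. The following sketch counts the node names on that line. It assumes the bracketed format shown in the sample output and reads the text from standard input; the function name count_online_nodes is hypothetical.

```shell
# Sketch: count the nodes on the "Online:" line of crm status output
# read from standard input. Assumes the bracketed format shown above.
# The function name count_online_nodes is hypothetical.
count_online_nodes() {
    sed -n 's/^ *\* Online: \[ *\(.*[^ ]\) *\]$/\1/p' | head -n 1 | wc -w | tr -d ' '
}

printf '%s\n' 'Node List:' '  * Online: [ alice bob ]' | count_online_nodes   # prints 2
```

On a cluster node, the function would be fed with the real output, for example crm status | count_online_nodes, and the result compared with the number of configured nodes.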

After the basic configuration is done and the nodes are online, you can start to configure cluster resources. Use one of the cluster management tools like the crm shell (crmsh) or Hawk2. For more information, see Section 5.5, “Introduction to crmsh” or Section 5.4, “Introduction to Hawk2”.