Applies to SUSE Linux Enterprise High Availability 15 SP5

2 System requirements and recommendations

This section describes the system requirements and prerequisites for SUSE® Linux Enterprise High Availability. It also includes recommendations for cluster setup.

2.1 Hardware requirements

The following list specifies hardware requirements for a cluster based on SUSE® Linux Enterprise High Availability. These requirements represent the minimum hardware configuration. Additional hardware might be necessary, depending on how you intend to use your cluster.

Servers

1 to 32 Linux servers with software as specified in Section 2.2, “Software requirements”.

The servers can be bare metal or virtual machines. They do not require identical hardware (memory, disk space, etc.), but they must have the same architecture. Cross-platform clusters are not supported.

Using pacemaker_remote, the cluster can be extended to include additional Linux servers beyond the 32-node limit.

Communication channels

At least two TCP/IP communication media per cluster node. The network equipment must support the communication method you want to use for cluster communication: multicast or unicast. The communication media should support a data rate of 100 Mbit/s or higher. For a supported cluster setup, two or more redundant communication paths are required. You can achieve this with either of the following:

Network Device Bonding (preferred).
A second communication channel in Corosync.

For details, refer to Chapter 16, Network device bonding and Procedure 4.3, “Defining a redundant communication channel”, respectively.

Node fencing/STONITH

To avoid a “split-brain” scenario, clusters need a node fencing mechanism. In a split-brain scenario, cluster nodes are divided into two or more groups that do not know about each other (because of a hardware or software failure or because of a cut network connection). A fencing mechanism isolates the node in question (usually by resetting or powering off the node). This is also called STONITH (“Shoot the other node in the head”). A node fencing mechanism can be either a physical device (a power switch) or a mechanism like SBD (STONITH by disk) in combination with a watchdog. Using SBD requires shared storage.

Unless SBD is used, each node in the High Availability cluster must have at least one STONITH device. We strongly recommend multiple STONITH devices per node.

Important: No Support Without STONITH

You must have a node fencing mechanism for your cluster.
The global cluster options stonith-enabled and startup-fencing must be set to true. If you change either option, the cluster is no longer supported.
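The support rule above can be expressed as a small check. The following is a minimal sketch, not a SUSE tool: it assumes you have already exported the global cluster options into a Python dictionary by some means of your own, and it assumes that an option that is not set explicitly keeps its default of true. The function name and the accepted spellings of "true" are illustrative choices.

```python
# Options that must stay enabled for a supported cluster (from the text above).
REQUIRED_TRUE = ("stonith-enabled", "startup-fencing")

# Assumption: spellings treated as boolean "true" in this sketch.
TRUE_VALUES = {"true", "yes", "on", "1", "y"}


def unsupported_fencing_options(properties: dict[str, str]) -> list[str]:
    """Return the required options that are explicitly set to a non-true value.

    `properties` is a hypothetical mapping of global cluster option names to
    their string values. Missing options are assumed to keep the default (true).
    """
    violations = []
    for name in REQUIRED_TRUE:
        value = properties.get(name, "true")
        if value.strip().lower() not in TRUE_VALUES:
            violations.append(name)
    return violations


# Example: fencing was switched off, so the configuration would be unsupported.
print(unsupported_fencing_options({"stonith-enabled": "false"}))
```

An empty result means neither option has been switched off in the data you passed in. It does not prove that a working fencing device is configured.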

2.2 Software requirements

All nodes that will be part of the cluster need at least the following modules and extensions:

Basesystem Module 15 SP5
Server Applications Module 15 SP5
SUSE Linux Enterprise High Availability 15 SP5

Depending on the system role you select during installation, the following software patterns are installed by default:

HA Node system role

High Availability (sles_ha)

Enhanced Base System (enhanced_base)

HA GEO Node system role

Geo Clustering for High Availability (ha_geo)

Enhanced Base System (enhanced_base)

Note: Minimal installation

An installation via these system roles results in a minimal installation only. You might need to add more packages manually.

For machines that originally had another system role assigned, you need to manually install the sles_ha or ha_geo patterns and any further packages that you need.

2.3 Storage requirements

Some services require shared storage. If using an external NFS share, it must be reliably accessible from all cluster nodes via redundant communication paths.

To make data highly available, a shared disk system (Storage Area Network, or SAN) is recommended for your cluster. If a shared disk subsystem is used, ensure the following:

The shared disk system is properly set up and functional according to the manufacturer’s instructions.
The disks in the shared disk system are configured to use mirroring or RAID to add fault tolerance.
If you are using iSCSI for shared disk system access, ensure that you have properly configured iSCSI initiators and targets.
When using DRBD* to implement a mirroring RAID system that distributes data across two machines, make sure to only access the device provided by DRBD, never the backing device. To leverage the existing network redundancy, DRBD can use the same NICs as the rest of the cluster.

When using SBD as the STONITH mechanism, additional requirements apply to the shared storage. For details, see Section 13.3, “Requirements and restrictions”.

2.4 Other requirements and recommendations

For a supported and useful High Availability setup, consider the following recommendations:

Number of cluster nodes

For clusters with more than two nodes, it is strongly recommended to use an odd number of cluster nodes to have quorum. For more information about quorum, see Section 5.2, “Quorum determination”. A regular cluster can contain up to 32 nodes. With the pacemaker_remote service, High Availability clusters can be extended to include additional nodes beyond this limit. See Pacemaker Remote Quick Start for more details.
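The reasoning behind the odd-number recommendation can be illustrated with a little arithmetic. The following is a minimal sketch that assumes one vote per node and a simple majority rule (more than half of all votes). The function names are illustrative and not part of any cluster tool.

```python
def quorum_votes(total_nodes: int) -> int:
    """Votes needed for a strict majority, assuming one vote per node."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes: int) -> int:
    """Number of nodes that can fail while the remainder still holds a majority."""
    return total_nodes - quorum_votes(total_nodes)


# Compare cluster sizes: an even node count tolerates no more failures
# than the next smaller odd count.
for nodes in range(3, 8):
    print(f"{nodes} nodes: quorum={quorum_votes(nodes)}, "
          f"tolerated failures={tolerated_failures(nodes)}")
```

Under these assumptions, a four-node cluster tolerates one node failure, the same as a three-node cluster, and an even split (two nodes on each side) leaves neither half with a majority.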

Time synchronization

Cluster nodes must synchronize to an NTP server outside the cluster. Since SUSE Linux Enterprise High Availability 15, chrony is the default implementation of NTP. For more information, see the Administration Guide for SUSE Linux Enterprise Server 15 SP5.

The cluster might not work properly if the nodes are not synchronized, or even if they are synchronized but have different time zones configured. In addition, log files and cluster reports are very hard to analyze without synchronization. If you use the bootstrap scripts, you will be warned if NTP is not configured yet.

Network Interface Card (NIC) names

NIC names must be identical on all nodes.

Host name and IP address

Use static IP addresses.
Only the primary IP address is supported.
List all cluster nodes in the /etc/hosts file with their fully qualified host name and short host name. It is essential that members of the cluster can find each other by name. If the names are not available, internal cluster communication will fail.
For details on how Pacemaker gets the node names, see also https://clusterlabs.org/projects/pacemaker/doc/2.1/Pacemaker_Explained/html/nodes.html#where-pacemaker-gets-the-node-name.
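The /etc/hosts requirement above can be checked with a short script. The following is a minimal sketch: the node names alice and bob, the example.com domain, and the IP addresses are hypothetical, and the function works on the file content passed in as text, so you would read /etc/hosts yourself on each node.

```python
def missing_host_entries(hosts_text: str, nodes: dict[str, str]) -> list[str]:
    """Return the node names (FQDN or short) that do not appear in hosts_text.

    `nodes` maps each short host name to its fully qualified host name.
    Host names are compared case-insensitively.
    """
    names_found = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        # The first field is the IP address; the rest are host names and aliases.
        names_found.update(name.lower() for name in fields[1:])

    missing = []
    for short_name, fqdn in nodes.items():
        for name in (fqdn, short_name):
            if name.lower() not in names_found:
                missing.append(name)
    return missing


# Hypothetical example: the short name for the second node is missing.
sample_hosts = """\
192.168.1.1 alice.example.com alice
192.168.1.2 bob.example.com   # short name missing
"""
cluster_nodes = {"alice": "alice.example.com", "bob": "bob.example.com"}
print(missing_host_entries(sample_hosts, cluster_nodes))
```

The check only confirms that the names are listed. It does not verify that each name maps to the correct static IP address.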

SSH

All cluster nodes must be able to access each other via SSH. Tools like crm report (for troubleshooting) and Hawk2's History Explorer require passwordless SSH access between the nodes; otherwise, they can only collect data from the current node.
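One way to test whether passwordless SSH access works is to run a trivial remote command with password prompts disabled. The following is a minimal sketch that relies on the OpenSSH client options BatchMode and ConnectTimeout. The function names and the five-second timeout are illustrative assumptions.

```python
import subprocess


def ssh_check_command(node: str, timeout_s: int = 5) -> list[str]:
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                 # never ask for a password or passphrase
        "-o", f"ConnectTimeout={timeout_s}",   # give up quickly on unreachable nodes
        node,
        "true",                                # trivial remote command
    ]


def has_passwordless_ssh(node: str, timeout_s: int = 5) -> bool:
    """Return True if the remote command succeeded without any prompt."""
    try:
        result = subprocess.run(
            ssh_check_command(node, timeout_s),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

For example, calling has_passwordless_ssh("bob") on the node alice (both names hypothetical) reports whether alice can reach bob without a prompt. To cover the whole cluster, repeat the check from every node to every other node.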

Note: Regulatory requirements

If passwordless SSH access does not comply with regulatory requirements, you can use the workaround described in Appendix D, Running cluster reports without root access, to run crm report.

For the History Explorer, there is currently no alternative to passwordless login.