The following section informs you about system requirements and some prerequisites for SUSE® Linux Enterprise High Availability Extension. It also includes recommendations for cluster setup.
The following list specifies hardware requirements for a cluster based on SUSE® Linux Enterprise High Availability Extension. These requirements represent the minimum hardware configuration. Additional hardware might be necessary, depending on how you intend to use your cluster.
1 to 32 Linux servers with software as specified in Section 2.2, “Software Requirements”.
The servers can be bare metal or virtual machines. They do not require identical hardware (memory, disk space, etc.), but they must have the same architecture. Cross-platform clusters are not supported.
Using pacemaker_remote, the cluster can be extended to include additional Linux servers beyond the 32-node limit.
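For example, a host that runs the pacemaker_remote service and shares the cluster's authentication key could be integrated roughly as follows (a minimal sketch via the crm shell; the resource name and IP address are examples):

    # Integrate a remote host as a Pacemaker Remote node.
    # The resource name "remote1" becomes the remote node's name in the cluster.
    crm configure primitive remote1 ocf:pacemaker:remote \
      params server=192.168.1.105 \
      op monitor interval=30s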
At least two TCP/IP communication media per cluster node. The network equipment must support the communication means you want to use for cluster communication: multicast or unicast. The communication media should support a data rate of 100 Mbit/s or higher. For a supported cluster setup, two or more redundant communication paths are required. This can be done via one of the following (see the configuration sketch after this list):
Network Device Bonding (preferred).
A second communication channel in Corosync.
Network fault tolerance on infrastructure layer (for example, hypervisor).
For details, refer to Chapter 13, Network Device Bonding and Procedure 4.3, “Defining a Redundant Communication Channel”, respectively.
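As an illustration of the second option, a redundant communication channel can be sketched in /etc/corosync/corosync.conf roughly as follows (an excerpt only; the network addresses, multicast address, and ports are examples and must match your environment):

    totem {
      # ...
      rrp_mode: passive          # use both rings, one at a time
      interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
      }
      interface {
        ringnumber: 1
        bindnetaddr: 10.0.2.0
        mcastaddr: 239.255.2.1
        mcastport: 5407
      }
    }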
To avoid a “split brain” scenario, clusters need a node fencing mechanism. In a split brain scenario, cluster nodes are divided into two or more groups that do not know about each other (because of a hardware or software failure or because of a cut network connection). A fencing mechanism isolates the node in question (usually by resetting or powering off the node). This is also called STONITH (“Shoot the other node in the head”). A node fencing mechanism can be either a physical device (a power switch) or a mechanism like SBD (STONITH by disk) in combination with a watchdog. Using SBD requires shared storage.
Unless SBD is used, each node in the High Availability cluster must have at least one STONITH device. We strongly recommend multiple STONITH devices per node.
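For example, an SBD-based fencing resource can be defined via the crm shell roughly as follows (a minimal sketch; it assumes SBD and its shared device are already set up):

    # Fencing resource that uses the SBD device configured in /etc/sysconfig/sbd
    crm configure primitive stonith-sbd stonith:external/sbd \
      params pcmk_delay_max=30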
You must have a node fencing mechanism for your cluster. The global cluster options stonith-enabled and startup-fencing must be set to true. When you change them, you lose support.
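Both options can be set (or confirmed) via the crm shell, for example:

    # Global cluster options required for a supported setup
    crm configure property stonith-enabled=true
    crm configure property startup-fencing=true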
On all nodes that will be part of the cluster, the following software must be installed.
SUSE® Linux Enterprise Server 12 SP5 (with all available online updates)
SUSE Linux Enterprise High Availability Extension 12 SP5 (with all available online updates)
(Optional) For Geo clusters: Geo Clustering for SUSE Linux Enterprise High Availability Extension 12 SP5 (with all available online updates)
Some services require shared storage. If using an external NFS share, it must be reliably accessible from all cluster nodes via redundant communication paths.
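For example, an NFS share could be mounted under cluster control with the ocf:heartbeat:Filesystem resource agent, roughly as follows (a minimal sketch; the server name, export path, and mount point are examples):

    crm configure primitive fs-nfs ocf:heartbeat:Filesystem \
      params device="nfs1.example.com:/srv/share" \
             directory="/mnt/share" fstype="nfs" \
      op monitor interval=20s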
To make data highly available, a shared disk system (Storage Area Network, or SAN) is recommended for your cluster. If a shared disk subsystem is used, ensure the following:
The shared disk system is properly set up and functional according to the manufacturer’s instructions.
The disks contained in the shared disk system should be configured with mirroring or RAID to add fault tolerance.
If you are using iSCSI for shared disk system access, ensure that you have properly configured iSCSI initiators and targets.
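For example, targets can be discovered and connected with iscsiadm roughly as follows (the portal address is an example):

    # Discover the targets offered by the iSCSI portal
    iscsiadm -m discovery -t sendtargets -p 192.168.2.10
    # Log in to the discovered targets
    iscsiadm -m node --login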
When using DRBD* to implement a mirroring RAID system that distributes data across two machines, make sure to access only the device provided by DRBD, never the backing device. Use bonded NICs. To leverage the redundancy, you can use the same NICs as the rest of the cluster.
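To illustrate the distinction (device names are examples): the file system is created on and mounted from the DRBD device, not from the disk or partition that backs it.

    # Correct: create the file system on and mount the DRBD device
    mkfs.xfs /dev/drbd0
    mount /dev/drbd0 /mnt/data
    # Wrong: never access the backing device (for example, /dev/sdb1) directly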
When using SBD as STONITH mechanism, additional requirements apply for the shared storage. For details, see Section 11.3, “Requirements”.
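A minimal sketch of the SBD configuration (the device path is an example; initializing the device overwrites any data on it):

    # /etc/sysconfig/sbd (excerpt)
    SBD_DEVICE="/dev/disk/by-id/example-sbd-disk"
    SBD_WATCHDOG_DEV="/dev/watchdog"

    # Initialize the SBD device once, from one node
    sbd -d /dev/disk/by-id/example-sbd-disk create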
For a supported and useful High Availability setup, consider the following recommendations:
For clusters with more than two nodes, it is strongly recommended to use an odd number of cluster nodes to have quorum. For more information about quorum, see Section 6.2, “Quorum Determination”.
Cluster nodes must synchronize to an NTP server outside the cluster. For more information, see https://documentation.suse.com/sles-12/html/SLES-all/cha-netz-xntp.html.
If nodes are not synchronized, the cluster may not work properly. In addition, log files and cluster reports are very hard to analyze without synchronization. If you use the bootstrap scripts, you will be warned if NTP is not configured yet.
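For example, on SUSE Linux Enterprise Server 12 the nodes can be pointed to an external NTP server roughly as follows (the server name is an example):

    # /etc/ntp.conf (excerpt)
    server ntp.example.com iburst

    # Enable and start the NTP daemon
    systemctl enable ntpd
    systemctl start ntpd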
Network interface card (NIC) names must be identical on all nodes.
Use static IP addresses.
List all cluster nodes in the /etc/hosts file with their fully qualified host name and short host name. It is essential that members of the cluster can find each other by name. If the names are not available, internal cluster communication will fail.
For details on how Pacemaker gets the node names, see also http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-node-name.html.
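For example, the /etc/hosts entries could look as follows (addresses and names are examples):

    192.168.1.1    alice.example.com    alice
    192.168.1.2    bob.example.com      bob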
All cluster nodes must be able to access each other via SSH. Tools like crm report (for troubleshooting) and Hawk2's History Explorer require passwordless SSH access between the nodes, otherwise they can only collect data from the current node.
If passwordless SSH access does not comply with regulatory requirements, you can use the work-around described in Appendix D, Running Cluster Reports Without root Access for running crm report.
For the History Explorer there is currently no alternative for passwordless login.
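Passwordless SSH access for root can be set up roughly as follows (a minimal sketch; run it on each node and repeat the copy step for every other node; the node name is an example):

    # Generate a key pair without passphrase (if none exists yet)
    ssh-keygen -t rsa -N '' -f /root/.ssh/id_rsa
    # Copy the public key to another cluster node
    ssh-copy-id root@bob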