SUSE Linux Enterprise High Availability 15 SP3

Pacemaker Remote Quick Start #

Publication Date: December 12, 2024

This document guides you through the setup of a High Availability cluster with a remote node or a guest node, managed by Pacemaker and pacemaker_remote. Remote in pacemaker_remote does not refer to physical distance, but to the special status of nodes that do not run the complete cluster stack and thus are not regular members of the cluster.

Revision History: SUSE Linux Enterprise High Availability Extension Documentation

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”.

For SUSE trademarks, see https://www.suse.com/company/legal/. All third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks.

All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors nor the translators shall be held liable for possible errors or the consequences thereof.

1 Conceptual overview and terminology #

A regular cluster can contain up to 32 nodes. With the pacemaker_remote service, High Availability clusters can be extended to include additional nodes beyond this limit.

The pacemaker_remote service can run on a physical machine (called a remote node) or on a virtual machine (called a guest node). Unlike regular cluster nodes, both remote and guest nodes are managed by the cluster as resources. As such, they are not bound by the 32-node limit of the cluster stack. However, from the resource management point of view, they behave like regular cluster nodes.

Remote nodes do not need to have the full cluster stack installed, as they only run the pacemaker_remote service. The service acts as a proxy, allowing the cluster stack on the “regular” cluster nodes to connect to the service. Thus, the node that runs the pacemaker_remote service is effectively integrated into the cluster as a remote node (see Terminology).

Terminology #

Cluster node

A node that runs the complete cluster stack, see Figure 1, “Regular cluster stack (two-node cluster)”.

Figure 1: Regular cluster stack (two-node cluster) #

A regular cluster node may perform the following tasks:

Run cluster resources.
Run all command line tools, such as crm, crm_mon.
Execute fencing actions.
Count toward cluster quorum.
Serve as the cluster's designated coordinator (DC).

Pacemaker remote (systemd service: pacemaker_remote)

A service daemon that makes it possible to use a node as a Pacemaker node without deploying the full cluster stack. Note that pacemaker_remote is the name of the systemd service, whereas the daemon itself is named pacemaker-remoted (with a trailing d).

Remote node

A physical machine that runs the pacemaker_remote daemon. A special resource (ocf:pacemaker:remote) needs to run on one of the cluster nodes to manage communication between the cluster node and the remote node (see Section 3, “Use case 1: setting up a cluster with remote nodes”).

Guest node

A virtual machine that runs the pacemaker_remote daemon. A guest node is created using a resource agent such as ocf:heartbeat:VirtualDomain with the remote-node meta attribute (see Section 4, “Use case 2: setting up a cluster with guest nodes”).

For a physical machine that contains several guest nodes, the process is as follows:

On the cluster node, virtual machines are launched by Pacemaker.
The cluster connects to the pacemaker_remote service of the virtual machines.
The virtual machines are integrated into the cluster by pacemaker_remote.

It is important to distinguish between several roles that a virtual machine can take in the High Availability cluster:

A virtual machine can run a full cluster stack. In this case, the virtual machine is a regular cluster node and is not itself managed by the cluster.
A virtual machine can be managed by the cluster as a resource, without the cluster being aware of the services that run inside the virtual machine. In this case, the virtual machine is opaque to the cluster.
A virtual machine can be a cluster resource and run pacemaker_remote, which allows the cluster to manage services inside the virtual machine. In this case, the virtual machine is a guest node and is transparent to the cluster.

Remote nodes and guest nodes can run cluster resources and most command line tools. However, they have the following limitations:

They cannot execute fencing actions.
They do not affect quorum.
They cannot serve as Designated Coordinator (DC).

2 Usage scenario #

The procedures in this document describe the process of setting up a minimal cluster with the following characteristics:

Two cluster nodes running SUSE Linux Enterprise High Availability 12 GA or higher. In this guide, their host names are alice and bob.
Depending on the setup you choose, your cluster will end up with one of the following nodes:
- One remote node running pacemaker_remote (the remote node is named charlie in this document).
  Or:
- One guest node running pacemaker_remote (the guest node is named doro in this document).
Pacemaker to manage guest nodes and remote nodes.
Failover of resources from one node to the other if the active host breaks down (active/passive setup).

3 Use case 1: setting up a cluster with remote nodes #

In the following example setup, a remote node charlie is used.

3.1 Preparing the cluster nodes and the remote node #

To prepare the cluster nodes and remote node, proceed as follows:

Install and set up a basic two-node cluster as described in the Installation and Setup Quick Start. This will lead to a two-node cluster with two physical hosts, alice and bob.
On a physical host (charlie) that you want to use as remote node, install SUSE Linux Enterprise Server 15 SP3 and add SUSE Linux Enterprise High Availability 15 SP3 as extension. However, do not install the High Availability installation pattern, because the remote node needs only individual packages (see Section 3.3).
On all cluster nodes, check /etc/hosts and add an entry for charlie.
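
The /etc/hosts check in the last step can be scripted. The following is a minimal sketch, not part of the product: the function name has_host_entry is hypothetical, and it assumes the usual hosts file layout of an address followed by one or more names.

```shell
#!/bin/sh
# has_host_entry NAME [FILE]
# Succeeds if FILE (default: /etc/hosts) contains a non-comment line that
# lists NAME as a host name or alias after the address field.
has_host_entry() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v name="$name" '
        {
            sub(/#.*/, "")                 # drop comments, including whole-line ones
            for (i = 2; i <= NF; i++)      # field 1 is the IP address
                if ($i == name) found = 1
        }
        END { exit !found }
    ' "$file"
}
```

On each cluster node, a call such as `has_host_entry charlie || echo "charlie is missing from /etc/hosts" >&2` reports a missing entry without modifying the file.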

3.2 Configuring an authentication key #

On the cluster node alice proceed as follows:

Create a specific authentication key for the pacemaker_remote service:
```
# mkdir -p --mode=0755 /etc/pacemaker
# dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4k count=1
```
The key for the pacemaker_remote service is different from the cluster authentication key that you create in the YaST cluster module.
Synchronize the authentication key among all cluster nodes and your future remote node with scp:
```
# scp -r -p /etc/pacemaker/ bob:/etc
# scp -r -p /etc/pacemaker/ charlie:/etc
```
The key needs to be kept synchronized all the time.
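
Because the key must stay identical on every node, it can help to compare checksums after copying. The sketch below is an illustration under assumptions: the function names make_authkey and key_checksum are hypothetical, and sha256sum is assumed to be installed on all nodes.

```shell
#!/bin/sh
# make_authkey PATH
# Writes 4096 random bytes to PATH, creating the parent directory if needed.
make_authkey() {
    key=$1
    mkdir -p "$(dirname "$key")" || return 1
    dd if=/dev/urandom of="$key" bs=4k count=1 2>/dev/null
}

# key_checksum PATH
# Prints the SHA-256 digest of PATH without the file name.
key_checksum() {
    sha256sum "$1" | awk '{ print $1 }'
}
```

For example, the output of `key_checksum /etc/pacemaker/authkey` on alice can be compared with the output of `ssh bob sha256sum /etc/pacemaker/authkey`. Differing digests mean that the key needs to be copied again.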

3.3 Configuring the remote node #

The following procedure configures the physical host charlie as a remote node:

On charlie, proceed as follows:
1. In the firewall settings, open the TCP port 3121 for pacemaker_remote.
2. Install the pacemaker-remote and crmsh packages:
```
# zypper in pacemaker-remote crmsh
```
3. Enable and start the pacemaker_remote service on charlie:
```
# systemctl enable pacemaker_remote
# systemctl start pacemaker_remote
```
On alice or bob, verify the host connection to the remote node by using ssh:
```
# ssh -p 3121 charlie
```
This SSH connection will fail, but how it fails shows if the setup is working:
Working setup
ssh_exchange_identification: read: Connection reset by peer.
Broken setup
ssh: connect to host charlie port 3121: No route to host
ssh: connect to host charlie port 3121: Connection refused
If you see either of those two messages, the setup does not work. Use the -v option for ssh and execute the command again to see debugging messages. This can be helpful to find connection, authentication, or configuration problems. Multiple -v options increase the verbosity.
If needed, add more remote nodes and configure them as described above.
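
The interpretation of the ssh probe can be captured in a small helper. This is a sketch only: the function name classify_probe and the labels it prints are hypothetical. It matches only the trailing part of each message, on the assumption that the prefix may be worded differently by other OpenSSH releases.

```shell
#!/bin/sh
# classify_probe MESSAGE
# Prints "working", "broken", or "unknown" for one line of ssh error output.
classify_probe() {
    case $1 in
        *"Connection reset by peer"*)
            echo working ;;   # the port answered, but not with the SSH protocol
        *"No route to host"*|*"Connection refused"*)
            echo broken ;;    # host unreachable or nothing listening on the port
        *)
            echo unknown ;;
    esac
}
```

A possible call is `classify_probe "$(ssh -o BatchMode=yes -p 3121 charlie 2>&1 | tail -n 1)"`. If the result is unknown, inspect the full ssh output manually.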

3.4 Integrating the remote node into the cluster #

To integrate the remote node into the cluster, proceed as follows:

On node alice, create an ocf:pacemaker:remote primitive:

# crm configure
crm(live)configure# primitive charlie ocf:pacemaker:remote \
     params server=charlie reconnect_interval=15m \
     op monitor interval=30s
crm(live)configure# commit
crm(live)configure# quit

Check the status of the cluster with the command crm status. It should contain a running cluster with nodes that are all accessible:

# crm status
[...]
Online: [ alice bob ]
RemoteOnline: [ charlie ]

Full list of resources:
charlie (ocf:pacemaker:remote): Started alice
 [...]
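
To check the status output from a script, the node lists can be parsed. The following sketch assumes the output layout shown above (a label, then a bracketed list of names). Other Pacemaker versions may format the lines differently, and the function name node_listed is hypothetical.

```shell
#!/bin/sh
# node_listed LABEL NODE
# Reads `crm status` output on stdin and succeeds if NODE appears in the
# bracketed list that follows LABEL (for example "RemoteOnline").
node_listed() {
    awk -v label="$1:" -v node="$2" '
        $1 == label {
            for (i = 3; i < NF; i++)       # skip LABEL and "[", stop before "]"
                if ($i == node) found = 1
        }
        END { exit !found }
    '
}
```

For example, `crm status | node_listed RemoteOnline charlie` succeeds only if charlie is reported as an online remote node.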

3.5 Starting resources on the remote node #

After the remote node is integrated into the cluster, you can start resources on the remote node in the same way as on any cluster node.

Warning: Restrictions regarding groups and constraints

Never involve a remote node connection resource in a resource group, colocation constraint, or order constraint. This may lead to unexpected behavior on cluster transitions.

Fencing remote nodes. Remote nodes are fenced in the same way as cluster nodes. Configure fencing resources for use with remote nodes in the same way as with cluster nodes.

Remote nodes do not take part in initiating a fencing action. Only cluster nodes can execute a fencing operation against another node.

4 Use case 2: setting up a cluster with guest nodes #

In the following example setup, KVM is used for setting up the virtual guest node (doro).

4.1 Preparing the cluster nodes and the guest node #

To prepare the cluster nodes and guest node, proceed as follows:

Install and set up a basic two-node cluster as described in the Installation and Setup Quick Start. This will lead to a two-node cluster with two physical hosts, alice and bob.
Create a KVM guest on alice. For details, see the Virtualization Guide for SUSE Linux Enterprise Server 15 SP3.
On the KVM guest (doro) that you want to use as guest node, install SUSE Linux Enterprise Server 15 SP3 and add SUSE Linux Enterprise High Availability 15 SP3 as extension. However, do not install the High Availability installation pattern, because the guest node needs only individual packages (see Section 4.3).
On all cluster nodes, check /etc/hosts and add an entry for doro.

4.2 Configuring an authentication key #

On the cluster node alice proceed as follows:

Create a specific authentication key for the pacemaker_remote service:
```
# mkdir -p --mode=0755 /etc/pacemaker
# dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4k count=1
```
The key for the pacemaker_remote service is different from the cluster authentication key that you create in the YaST cluster module.
Synchronize the authentication key among all cluster nodes and your guest node with scp:
```
# scp -r -p /etc/pacemaker/ bob:/etc
# scp -r -p /etc/pacemaker/ doro:/etc
```
The key needs to be kept synchronized all the time.

4.3 Configuring the guest node #

The following procedure configures doro as a guest node on your cluster node alice:

On doro, proceed as follows:
1. In the firewall settings, open the TCP port 3121 for pacemaker_remote.
2. Install the pacemaker-remote and crmsh packages:
```
# zypper in pacemaker-remote crmsh
```
3. Enable and start the pacemaker_remote service on doro:
```
# systemctl enable pacemaker_remote
# systemctl start pacemaker_remote
```
On alice or bob, verify the host connection to the guest by running ssh:
```
# ssh -p 3121 doro
```
This SSH connection will fail, but how it fails shows if the setup is working:
Working setup
ssh_exchange_identification: read: Connection reset by peer.
Broken setup
ssh: connect to host doro port 3121: No route to host
ssh: connect to host doro port 3121: Connection refused
If you see either of those two messages, the setup does not work. Use the -v option for ssh and execute the command again to see debugging messages. This can be helpful to find connection, authentication, or configuration problems. Multiple -v options increase the verbosity.
If needed, add more guest nodes and configure them as described above.
Shut down the guest node and proceed with Section 4.4, “Integrating a guest node into the cluster”.

4.4 Integrating a guest node into the cluster #

To integrate the guest node into the cluster, proceed as follows:

Dump the XML configuration of the KVM guest(s) that you need in the next step:

# virsh list --all
 Id    Name         State
-----------------------------------
 -     doro       shut off
# virsh dumpxml doro > /etc/pacemaker/doro.xml

On node alice, create a VirtualDomain resource to launch the virtual machine. Use the dumped configuration from Step 1:
```
# crm configure
crm(live)configure# primitive vm-doro ocf:heartbeat:VirtualDomain \
  params hypervisor="qemu:///system" \
         config="/etc/pacemaker/doro.xml" \
  meta remote-node=doro
crm(live)configure# commit
crm(live)configure# quit
```
Pacemaker will automatically monitor pacemaker_remote connections for failure, so it is not necessary to create a recurring monitor on the VirtualDomain resource.
Tip: Enabling live migration
To enable live migration of the resource, set the meta attribute allow-migrate to true. The default is false.
Check the status of the cluster with the command crm status. It should contain a running cluster with nodes that are all accessible.
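
To show where the allow-migrate meta attribute from the tip fits, the sketch below prints a primitive definition for a guest node. It only prints text and does not change the cluster. The function name guest_primitive is hypothetical, and the vm- prefix simply follows the naming used in this document.

```shell
#!/bin/sh
# guest_primitive NODE XML_PATH [ALLOW_MIGRATE]
# Prints a crmsh primitive definition for a guest node.
# ALLOW_MIGRATE defaults to "false", which matches the documented default.
guest_primitive() {
    node=$1
    xml=$2
    migrate=${3:-false}
    printf 'primitive vm-%s ocf:heartbeat:VirtualDomain \\\n' "$node"
    printf '  params hypervisor="qemu:///system" config="%s" \\\n' "$xml"
    printf '  meta remote-node=%s allow-migrate=%s\n' "$node" "$migrate"
}
```

Running `guest_primitive doro /etc/pacemaker/doro.xml true` prints a definition that can be reviewed and then entered at the crm(live)configure# prompt.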

4.5 Testing the setup #

To demonstrate how resources are executed, use a dummy resource. It serves for testing purposes only.

Create a dummy resource:

# crm configure primitive fake1 ocf:pacemaker:Dummy

Check the cluster status with the crm status command. You should see something like the following:

# crm status
[...]
Online: [ alice bob ]
GuestOnline: [ doro@alice ]

Full list of resources:
vm-doro (ocf:heartbeat:VirtualDomain): Started alice
fake1           (ocf:pacemaker:Dummy): Started bob

To move the Dummy primitive to the guest node (doro), use the following command:

# crm resource move fake1 doro

The status will change to this:

# crm status
[...]
Online: [ alice bob ]
GuestOnline: [ doro@alice ]

Full list of resources:
vm-doro (ocf:heartbeat:VirtualDomain): Started alice
fake1           (ocf:pacemaker:Dummy): Started doro
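
The placement of a resource can also be read from the status output by a script. This sketch assumes the resource line layout shown above (name, agent, state, node), and the function name resource_node is hypothetical.

```shell
#!/bin/sh
# resource_node RESOURCE
# Reads `crm status` output on stdin and prints the node on which RESOURCE is
# reported as "Started". Prints nothing if the resource is not started.
resource_node() {
    awk -v rsc="$1" '$1 == rsc && $(NF - 1) == "Started" { print $NF }'
}
```

After the move, `crm status | resource_node fake1` should print doro.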

To test whether fencing works, kill the pacemaker-remoted daemon on doro:
```
# kill -9 $(pidof pacemaker-remoted)
```

After a few seconds, check the status of the cluster again. It should look like this:

# crm status
[...]
Online: [ alice bob ]

Full list of resources:
vm-doro (ocf:heartbeat:VirtualDomain): Started alice
fake1           (ocf:pacemaker:Dummy): Stopped

Failed Actions:
* doro_monitor_30000 on alice 'unknown error' (1): call=8, status=Error, exitreason='none',
    last-rc-change='Tue Jul 18 13:11:51 2017', queued=0ms, exec=0ms
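
Instead of checking manually after a few seconds, a script can poll until the expected state appears. The helper below is a generic sketch, not a Pacemaker tool: the function name wait_until is hypothetical, and the try count and delay are arbitrary example parameters.

```shell
#!/bin/sh
# wait_until TRIES DELAY COMMAND [ARGS...]
# Runs COMMAND up to TRIES times, sleeping DELAY seconds between attempts.
# Succeeds as soon as COMMAND succeeds and fails if every attempt fails.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}
```

For example, `wait_until 10 3 sh -c 'crm status | grep -q "Failed Actions"'` waits up to about 30 seconds for the failed monitor action to show up in the status output.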

5 Upgrading the cluster and pacemaker_remote nodes #

Find comprehensive information on different scenarios and supported upgrade paths at Chapter 28, Upgrading your cluster and updating software packages. For detailed information about any changes and new features of the product you are upgrading to, refer to its release notes. They are available from https://www.suse.com/releasenotes/.

6 For more information #

More documentation for this product is available at https://documentation.suse.com/sle-ha-15/. For further configuration and administration tasks, see the comprehensive Administration Guide.

Upstream documentation is available from http://www.clusterlabs.org/pacemaker/doc/. See the document Pacemaker Remote—Scaling High Availability Clusters.