This document describes how to set up highly available NFS storage in a two-node cluster, using the following components of SUSE Linux Enterprise High Availability Extension 12 SP5: DRBD* (Distributed Replicated Block Device), LVM (Logical Volume Manager), and Pacemaker, the cluster resource management framework.
The cluster used for the highly available NFS storage has the following properties:
Two nodes: alice (IP: 192.168.1.1) and bob (IP: 192.168.1.2), connected to each other via network.
Two floating, virtual IP addresses (192.168.1.10 and 192.168.2.1), allowing clients to connect to the service no matter which physical node it is running on. One IP address is used for cluster administration with Hawk2, the other IP address is used exclusively for the NFS exports.
A shared storage device, used as an SBD fencing mechanism. This avoids split brain scenarios.
Failover of resources from one node to the other if the active host breaks down (active/passive setup).
Local storage on each host. The data is synchronized between the hosts using DRBD on top of LVM.
A file system exported through NFS.
After installing and setting up the basic two-node cluster, and extending it with storage and cluster resources for NFS, you will have a highly available NFS storage server.
Before you proceed, install and set up a basic two-node cluster. This task is described in the Installation and Setup Quick Start, which explains how to use the ha-cluster-bootstrap package to set up a cluster with minimal effort.
LVM (Logical Volume Manager) enables flexible distribution of hard disk space over several file systems.
To prepare your disks for LVM, do the following:
Create an LVM physical volume, replacing /dev/sdbX with your corresponding device for LVM:
root # pvcreate /dev/sdbX
Create an LVM volume group nfs that includes this physical volume:
root # vgcreate nfs /dev/sdbX
Create one or more logical volumes in the volume group nfs. This example assumes a 20 gigabyte volume, named work:
root # lvcreate -n work -L 20G nfs
Activate the volume group:
root # vgchange -ay nfs
After you have successfully executed the above steps, your system will make visible the following device: /dev/VOLGROUP/LOGICAL_VOLUME. In this case it will be /dev/nfs/work.
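The naming rule above can be sketched as a small shell helper. The function name lv_device_path is hypothetical and not part of LVM. It only builds the expected path from the volume group and logical volume names, then checks whether a block device exists at that path on the current host.

```shell
#!/bin/bash
# Hypothetical helper (not an LVM command): build the device path
# that LVM exposes for a logical volume, /dev/VOLGROUP/LOGICAL_VOLUME.
lv_device_path() {
    local vg=$1 lv=$2
    printf '/dev/%s/%s\n' "$vg" "$lv"
}

# Check whether the device node for the example volume exists on this host.
dev=$(lv_device_path nfs work)
if [ -b "$dev" ]; then
    echo "$dev is available"
else
    echo "$dev not found; check the output of 'lvs nfs'"
fi
```

Run this on each node after activating the volume group, as both nodes need the same logical volume as a DRBD back-end.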
This section describes how to set up a DRBD device on top of LVM. The configuration of LVM as a back-end of DRBD has some benefits:
Easier setup than with LVM on top of DRBD.
Easier administration in case the LVM disks need to be resized or more disks are added to the volume group.
As the LVM volume group is named nfs, the DRBD resource uses the same name.
For consistency reasons, it is highly recommended to follow this advice:
Use the directory /etc/drbd.d/ for your configuration.
Name the file according to the purpose of the resource.
Put your resource configuration in a file with a .res extension. In the following examples, the file /etc/drbd.d/nfs.res is used.
Proceed as follows:
Create the file /etc/drbd.d/nfs.res with the following contents:
    resource nfs {
       device /dev/drbd0;            # (1)
       disk   /dev/nfs/work;         # (2)
       meta-disk internal;           # (3)
       net {
          protocol  C;               # (4)
       }
       connection-mesh {             # (5)
          hosts     alice bob;
       }
       on alice {                    # (6)
          address   192.168.1.1:7790;
          node-id   0;
       }
       on bob {                      # (6)
          address   192.168.1.2:7790;
          node-id   1;
       }
    }
(1) The DRBD device that applications are supposed to access.
(2) The lower-level block device used by DRBD to store the actual data. This is the LVM device that was created in Section 3, “Creating an LVM Device”.
(3) Where the metadata is stored. Using internal, the metadata is stored together with the user data on the same device.
(4) The protocol to be used for this connection. For protocol C, a write is considered complete only when it has reached the disks on both the local and the remote node.
(5) Defines all nodes of a mesh. The hosts parameter lists the host names that share this DRBD setup.
(6) Contains the IP address and a unique identifier for each node.
Open /etc/csync2/csync2.cfg and check whether the following two lines exist:
    include /etc/drbd.conf;
    include /etc/drbd.d/*.res;
If not, add them to the file.
Copy the file to the other nodes:
root # csync2 -xv
For information about Csync2, refer to Section 4.5, “Transferring the Configuration to All Nodes”.
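The check for the two include lines can be scripted. The following is a minimal sketch: the function name missing_drbd_includes is hypothetical, and it does a plain fixed-string search, so a commented-out include line would still count as present.

```shell
#!/bin/bash
# Hypothetical helper: print each required include line that is
# missing from the given Csync2 configuration file.
missing_drbd_includes() {
    local cfg=$1 line
    for line in 'include /etc/drbd.conf;' 'include /etc/drbd.d/*.res;'; do
        # -F treats the pattern as a fixed string, so '*' and '.' are literal.
        grep -qF -- "$line" "$cfg" || printf '%s\n' "$line"
    done
}

# Only inspect the real file if it is readable on this host.
cfg=/etc/csync2/csync2.cfg
if [ -r "$cfg" ]; then
    missing_drbd_includes "$cfg"
fi
```

No output means both lines are present. Any line that is printed still needs to be added before running csync2 -xv.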
After you have prepared your DRBD configuration, proceed as follows:
If you use a firewall in your cluster, open port 7790 in your firewall configuration.
The first time you do this, execute the following commands on both nodes (in our example, alice and bob):
root # drbdadm create-md nfs
root # drbdadm up nfs
This initializes the metadata storage and creates the /dev/drbd0 device.
If the DRBD devices on all nodes have the same data, skip the initial resynchronization. Use the following command:
root # drbdadm new-current-uuid --clear-bitmap nfs/0
Make alice primary:
root # drbdadm primary --force nfs
Check the DRBD status:
root # drbdadm status nfs
When run on alice, this returns a message similar to the following:
    nfs role:Primary
      disk:UpToDate
      bob role:Secondary
        peer-disk:UpToDate
After the synchronization is complete, you can access the DRBD resource on the block device /dev/drbd0. Use this device for creating your file system.
Find more information about DRBD in Chapter 20, DRBD.
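Before creating the file system, it can help to confirm that both sides report UpToDate. The following sketch assumes the status layout shown above. The function name drbd_in_sync is hypothetical and only inspects the disk and peer-disk fields of text passed on standard input.

```shell
#!/bin/bash
# Hypothetical helper: read `drbdadm status` output from stdin and succeed
# only if a local disk field exists and every disk/peer-disk field is UpToDate.
drbd_in_sync() {
    local status
    status=$(cat)
    # Require at least one local "disk:" field.
    printf '%s\n' "$status" | grep -Eq '(^|[[:space:]])disk:' || return 1
    # Fail if any disk or peer-disk field shows a state other than UpToDate.
    ! printf '%s\n' "$status" | grep -Eo '(peer-)?disk:[A-Za-z]+' | grep -qv ':UpToDate$'
}

# Query the real resource only where drbdadm is installed.
if command -v drbdadm >/dev/null 2>&1; then
    if drbdadm status nfs | drbd_in_sync; then
        echo "nfs: all disks are UpToDate"
    else
        echo "nfs: synchronization is not complete yet"
    fi
fi
```

A state such as peer-disk:Inconsistent makes the check fail, which indicates that the resynchronization is still running.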
After you have finished Section 4.2, “Activating the DRBD Device”, you should see a DRBD device on /dev/drbd0. Create an ext3 file system on it:
root # mkfs.ext3 /dev/drbd0
A resource might fail back to its original node when that node is back online and in the cluster. To prevent a resource from failing back to the node that it was running on, or to specify a different node for the resource to fail back to, change its resource stickiness value. You can either specify resource stickiness when you are creating a resource or afterward.
To adjust the option, open the crm shell as root (or any non-root user that is part of the haclient group) and run the following commands:
root # crm configure
crm(live)configure# rsc_defaults resource-stickiness="200"
crm(live)configure# commit
For more information about global cluster options, refer to Section 6.2, “Quorum Determination”.
The following sections cover the configuration of the required resources for a highly available NFS cluster. The configuration steps use the crm shell. The following list shows the necessary cluster resources:
DRBD primitive and multi-state resources: These resources are used to replicate data. The multi-state resource is switched between the Primary and Secondary roles as deemed necessary by the cluster resource manager.
NFS kernel server resource: With this resource, Pacemaker ensures that the NFS server daemons are always available.
NFS exports: One or more NFS exports, typically corresponding to the file system.
The following configuration examples assume that 192.168.2.1 is the virtual IP address to use for an NFS server which serves clients in the 192.168.2.0/24 subnet.
The service exports data served from /srv/nfs/work.
Into this export directory, the cluster will mount an ext3 file system from the DRBD device /dev/drbd0.
This DRBD device sits on top of the LVM logical volume work in the volume group nfs.
To configure these resources, run the following commands from the crm shell:
crm(live)# configure
crm(live)configure# primitive drbd_nfs \
  ocf:linbit:drbd \
  params drbd_resource="nfs" \
  op monitor interval="15" role="Master" \
  op monitor interval="30" role="Slave"
crm(live)configure# ms ms-drbd_nfs drbd_nfs \
  meta master-max="1" master-node-max="1" clone-max="2" \
  clone-node-max="1" notify="true"
crm(live)configure# commit
This will create a Pacemaker multi-state resource corresponding to the DRBD resource nfs. Pacemaker should now activate your DRBD resource on both nodes and promote it to the master role on one of them.
Check the state of the cluster with the crm status command, or run drbdadm status.
In the crm shell, the resource for the NFS server daemons must be configured as a clone of a systemd resource type.
crm(live)configure# primitive nfsserver \
  systemd:nfs-server \
  op monitor interval="30s"
crm(live)configure# clone cl-nfsserver nfsserver \
  meta interleave=true
crm(live)configure# commit
After you have committed this configuration, Pacemaker should start the NFS Kernel server processes on both nodes.
Configure the file system type resource as follows (but do not commit this configuration yet):
crm(live)configure# primitive fs_work \
  ocf:heartbeat:Filesystem \
  params device=/dev/drbd0 \
    directory=/srv/nfs/work \
    fstype=ext3 \
  op monitor interval="10s"
Combine these resources into a Pacemaker resource group:
crm(live)configure# group g-nfs fs_work
Add the following constraints to make sure that the group is started on the same node on which the DRBD multi-state resource is in the master role:
crm(live)configure# order o-drbd_before_nfs inf: \
  ms-drbd_nfs:promote g-nfs:start
crm(live)configure# colocation col-nfs_on_drbd inf: \
  g-nfs ms-drbd_nfs:Master
Commit this configuration:
crm(live)configure# commit
After these changes have been committed, Pacemaker mounts the DRBD device to /srv/nfs/work on the same node. Confirm this with mount (or by looking at /proc/mounts).
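The mount check can be sketched as follows. The function name is_mounted is hypothetical. It compares the first two fields of each mount table entry (device and mount point) and reads /proc/mounts unless another file is given.

```shell
#!/bin/bash
# Hypothetical helper: succeed if DEVICE is mounted on DIR according to
# a mount table file (defaults to /proc/mounts).
is_mounted() {
    local device=$1 dir=$2 table=${3:-/proc/mounts}
    awk -v dev="$device" -v dir="$dir" \
        '$1 == dev && $2 == dir { found = 1 } END { exit !found }' "$table"
}

if is_mounted /dev/drbd0 /srv/nfs/work; then
    echo "/dev/drbd0 is mounted on /srv/nfs/work"
else
    echo "/dev/drbd0 is not mounted on /srv/nfs/work on this node"
fi
```

In an active/passive setup, the check is expected to succeed only on the node where the DRBD resource is currently in the master role.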
When your DRBD, LVM, and file system resources are working properly, continue with the resources managing your NFS exports. To create highly available NFS export resources, use the exportfs resource type.
To export the /srv/nfs/work directory to clients, create the following primitive:
crm(live)configure# primitive exportfs_work \
  ocf:heartbeat:exportfs \
  params directory="/srv/nfs/work" \
    options="rw,mountpoint" \
    clientspec="192.168.2.0/24" \
    wait_for_leasetime_on_stop=true \
    fsid=100 \
  op monitor interval="30s"
After you have created this resource, append it to the existing g-nfs resource group:
crm(live)configure# modgroup g-nfs add exportfs_work
Commit this configuration:
crm(live)configure# commit
Pacemaker will now export the /srv/nfs/work directory.
Confirm that the NFS exports are set up properly:
root # exportfs -v
/srv/nfs/work   IP_ADDRESS_OF_CLIENT(OPTIONS)
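This confirmation can also be scripted. The sketch below assumes that exportfs -v prints the exported path as the first field of a line, as in the output above. The function name is_exported is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read `exportfs -v` output from stdin and succeed
# if DIR appears as an exported path (first field of a line).
is_exported() {
    awk -v dir="$1" '$1 == dir { found = 1 } END { exit !found }'
}

# Query the real export table only where exportfs is installed.
if command -v exportfs >/dev/null 2>&1; then
    if exportfs -v | is_exported /srv/nfs/work; then
        echo "/srv/nfs/work is exported"
    else
        echo "/srv/nfs/work is not exported on this node"
    fi
fi
```

As with the mount check, only the node that currently runs the g-nfs group is expected to list the export.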
The initial installation creates an administrative virtual IP address for Hawk2. Although you could use this IP address for your NFS exports too, create another one exclusively for NFS exports. This makes it easier to apply security restrictions later. Use the following commands in the crm shell:
crm(live)configure# primitive vip_nfs IPaddr2 \
  params ip=192.168.2.1 cidr_netmask=24 \
  op monitor interval=10 timeout=20
crm(live)configure# modgroup g-nfs add vip_nfs
crm(live)configure# commit
This section outlines how to use the highly available NFS service from an NFS client.
To connect to the NFS service, make sure to use the virtual IP address to connect to the cluster rather than a physical IP configured on one of the cluster nodes' network interfaces. For compatibility reasons, use the full path of the NFS export on the server.
In its simplest form, the command to mount the NFS export looks like this:
root # mount -t nfs 192.168.2.1:/srv/nfs/work /home/work
To configure a specific transport protocol (proto, TCP in this example) and maximum read and write request sizes (rsize and wsize), use:
root # mount -o proto=tcp,rsize=32768,wsize=32768 \
  192.168.2.1:/srv/nfs/work /home/work
In case you need to be compatible with NFS version 3, include the value vers=3 after the -o option.
For further NFS mount options, consult the nfs man page.
Copyright © 2006–2024 SUSE LLC and contributors. All rights reserved.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) Version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”.
For SUSE trademarks, see http://www.suse.com/company/legal/. All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks.
All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors, nor the translators shall be held liable for possible errors or the consequences thereof.