3 Deploying compute nodes #
High Performance Computing clusters consist of one or more sets of identical compute nodes. In large clusters, each set could contain thousands of machines. To help deploy and manage compute nodes at this scale, the HPC module provides the deployment tool Warewulf.
3.1 About Warewulf #
Warewulf is a deployment system for compute nodes in High Performance Computing clusters. Compute nodes are booted and deployed over the network with a kernel and node image provided by Warewulf. To generate the node image, Warewulf uses a Warewulf container, which is a base operating system container with a kernel and an init implementation installed. Warewulf configures images for the individual compute nodes using node profiles and Warewulf overlays.
Node profiles are used to apply the same configuration to multiple nodes. Node profiles can include settings such as the container to use, overlays to apply, and IPMI details. New nodes automatically use the default node profile. You can also create additional node profiles, for example, if two groups of nodes require different containers.
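For example, a separate profile for a second group of nodes might be created as follows. This is a sketch only: the profile name gpu and the container name gpunode15.6 are hypothetical, and flag names can vary between Warewulf versions.

```shell
# Sketch: create a profile for a second node group and point it at a
# different container. "gpu" and "gpunode15.6" are hypothetical names.
sudo wwctl profile add gpu
sudo wwctl profile set gpu --container gpunode15.6

# Assign the new profile to a range of nodes.
sudo wwctl node set node[11-20] --profile gpu
```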
Warewulf overlays are compiled for each individual compute node:

- System (or wwinit) overlays are applied to nodes at boot time by the wwinit process, before systemd starts. These overlays are required to start the compute nodes, and contain basic node-specific configuration to start the first network interface. System overlays are not updated during runtime.

- Runtime (or generic) overlays are updated periodically at runtime by the wwclient service. The default is once per minute. These overlays are used to apply configuration changes to the nodes.

- The Host overlay is used for configuration that applies to the Warewulf server itself, such as adding entries to /etc/hosts or setting up the DHCP service and NFS exports.
System and runtime overlays can be layered on top of each other. For example, instead of altering a configuration setting in an existing overlay, you can override it with a new overlay. You can set a list of system and runtime overlays to apply to individual nodes, or to multiple nodes via profiles.
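Layering might look like the following sketch, which adds a custom runtime overlay on top of the default one. The overlay name sitemounts and the imported file are hypothetical, and the exact flag names (for example, --runtime-overlays) can differ between Warewulf versions.

```shell
# Sketch: create a custom runtime overlay and layer it over the default.
# "sitemounts" and the imported file are hypothetical examples.
sudo wwctl overlay create sitemounts
sudo wwctl overlay import sitemounts /etc/fstab.d/scratch.conf

# Apply the default runtime overlay first, then the custom one on top.
sudo wwctl node set node[01-10] --runtime-overlays "generic,sitemounts"

# Recompile the overlays so the change takes effect.
sudo wwctl overlay build
```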
3.2 Deploying compute nodes with Warewulf #
This procedure assumes the following setup:

- The Warewulf server has a static IP address.
- The compute nodes are set to PXE boot.
- The Warewulf server is accessible from an external network, but is connected to the compute nodes via an internal cluster network used for deployment. This is important because Warewulf configures DHCP and TFTP on the Warewulf server, which might conflict with DHCP on the external network.
On the Warewulf server, install Warewulf:
> sudo zypper install warewulf4
The installation creates a basic configuration for the Warewulf server in the file /etc/warewulf/warewulf.conf. Review this file to make sure the details are correct. In particular, check the following settings:
ipaddr: 192.168.1.250
netmask: 255.255.255.0
network: 192.168.1.0
ipaddr is the IP address of the Warewulf server on the internal cluster network to be used for node deployment. netmask and network must match this network.
Additionally, check that the DHCP range is in the cluster network:
dhcp:
  range start: 192.168.1.21
  range end: 192.168.1.50
In the file /etc/sysconfig/dhcpd, check that DHCPD_INTERFACE has the correct value. This must be the interface on which the cluster network is running.
Start and enable the Warewulf service:
> sudo systemctl enable --now warewulfd
Configure the services required by Warewulf:
> sudo wwctl configure --all
This command performs the following tasks:
- Configures DHCP and enables the DHCP service.
- Writes the required PXE files to the TFTP root directory and enables the TFTP service.
- Updates the /etc/hosts file.
- Configures an NFS server on the Warewulf server and enables the NFS service.
- Creates host keys and user keys for passwordless SSH access to the nodes.
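To confirm that these services came up after running wwctl configure --all, you can check their status. This is a sketch: the exact unit names (dhcpd, tftp.socket, nfs-server) are typical for SUSE systems but may differ on your installation.

```shell
# Sketch: verify the services that Warewulf configured are active.
# Unit names are assumptions and may differ on your system.
systemctl status dhcpd tftp.socket nfs-server warewulfd
```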
When the configuration is finished, log out of the Warewulf server and back into it. This creates an SSH key pair to allow passwordless login to the deployed compute nodes. If you require a password to secure the private key, set it now:
> ssh-keygen -p -f $HOME/.ssh/cluster
Importing the Warewulf container from the SUSE registry requires SUSE Customer Center credentials. Set your SCC credentials as environment variables before you import the container:
> export WAREWULF_OCI_USERNAME=USER@EXAMPLE.COM
> export WAREWULF_OCI_PASSWORD=REGISTRATION_CODE
Import the Warewulf container from the SUSE registry:
> sudo wwctl container import \
  docker://registry.suse.com/suse/hpc/warewulf4-x86_64/sle-hpc-node:15.6 \
  hpcnode15.6 --setdefault
The --setdefault argument sets this as the default container in the default node profile.
Configure the networking details for the default profile:
> sudo wwctl profile set -y default --netname default \
  --netmask 255.255.255.0 --gateway 192.168.1.250
To see the details of this profile, run the following command:
> sudo wwctl profile list -a default
Add compute nodes to Warewulf. For example, to add ten discoverable nodes with preconfigured IP addresses, run the following command:
> sudo wwctl node add node[01-10] \  (1)
  --netdev eth0 -I 192.168.1.100 \  (2)
  --discoverable=true  (3)
(1) One or more node names. Node names must be unique. If you have node groups or multiple clusters, add descriptors to the node names, for example, node01.cluster01.
(2) The IP address for the first node. Subsequent nodes are given incremental IP addresses.
(3) Allows Warewulf to assign a MAC address to the nodes when they boot for the first time.
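Because subsequent nodes receive incremental IP addresses, you can preview the address each node will get before adding them. This is a local calculation only, not a Warewulf command:

```shell
# Preview the IP address each node receives when node01 starts at
# 192.168.1.100 and the final octet increments per node.
base=100
for i in $(seq -w 1 10); do
  # 10#$i forces base-10 so leading zeros (01, 02, ...) are handled.
  printf 'node%s -> 192.168.1.%d\n' "$i" $((base + 10#$i - 1))
done
# First line: node01 -> 192.168.1.100; last line: node10 -> 192.168.1.109
```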
To view the settings for these nodes, run the following command:
> sudo wwctl node list -a node[01-10]
Add the nodes to the /etc/hosts file:
> sudo wwctl configure hostfile
Rebuild the container image to make sure it is ready to use:
> sudo wwctl container build hpcnode15.6
Build the default system and runtime overlays:
> sudo wwctl overlay build
This command compiles overlays for all the nodes.
You can now boot the compute nodes with PXE. Warewulf provides all the required information.
3.3 Advanced Warewulf tasks #
3.3.1 Using Warewulf with UEFI Secure Boot #
To boot compute nodes with UEFI Secure Boot enabled, the packages shim and grub2-x86_64-efi must be installed in the Warewulf container. For the container you imported in Procedure 3.1, “Deploying compute nodes with Warewulf”, this should already be the default. Use the following procedure to verify that the packages are installed:
Open a shell in the Warewulf container:
> sudo wwctl container shell hpcnode15.6
Search for the packages and check their installation status in the S column:
[hpcnode15.6] Warewulf> zypper search shim grub2
If shim and grub2-x86_64-efi are not installed, install them now:
[hpcnode15.6] Warewulf> zypper install shim grub2-x86_64-efi
Exit the container's shell:
[hpcnode15.6] Warewulf> exit
If any changes were made, Warewulf automatically rebuilds the container. We recommend rebuilding the container again manually to make sure the changes are applied:
> sudo wwctl container build hpcnode15.6
By default, Warewulf boots nodes via iPXE, which cannot be used when UEFI Secure Boot is enabled. Use the following procedure to switch to GRUB 2 as the boot method:
Open the file /etc/warewulf/warewulf.conf and change the value of grubboot to true:
warewulf:
  [...]
  grubboot: true
Reconfigure DHCP and TFTP to recognize the configuration change:
> sudo wwctl configure dhcp
> sudo wwctl configure tftp
Rebuild the system and runtime overlays:
> sudo wwctl overlay build
3.3.2 Configuring local node storage #
Nodes provisioned by Warewulf are ephemeral, so local disk storage is not required. However, local storage can still be useful, for example, as scratch storage for computational tasks.
Warewulf can set up and manage local storage for compute nodes via the disk provisioning tool Ignition. Before booting the compute nodes, you must install Ignition in the Warewulf container and add the disk details to either a node profile or individual nodes. A node or profile can have multiple disks.
Use the following procedure to install Ignition in the Warewulf container:
Open a shell in the Warewulf container:
> sudo wwctl container shell hpcnode15.6
Install the ignition and gptfdisk packages:
[hpcnode15.6] Warewulf> zypper install ignition gptfdisk
Exit the container's shell:
[hpcnode15.6] Warewulf> exit
Warewulf automatically rebuilds the container. We recommend rebuilding the container again manually to make sure the changes are applied:
> sudo wwctl container build hpcnode15.6
The following examples demonstrate how to add a disk to a compute node's configuration file. To set up the disk, Ignition requires details about the physical storage device, the partitions on the disk, and the file system to use.
To add disks to a profile instead of an individual node, use the same commands but replace wwctl node set NODENAME with wwctl profile set PROFILENAME.
> sudo wwctl node set node01 \
  --diskname /dev/vda \  (1)
  --diskwipe \
  --partname scratch \  (2)
  --partcreate \
  --fsname scratch \  (3)
  --fsformat btrfs \  (4)
  --fspath /scratch \  (5)
  --fswipe
(1) The path to the physical storage device.
(2) The name of the partition. This is used as the partition label. Because this is the last partition, it does not require a partition size or number; it is extended to the maximum possible size.
(3) The path to the partition that will contain the file system.
(4) The type of file system to use. Ignition fails if no type is defined.
(5) The absolute path for the mount point. This is mandatory if you intend to mount the file system.
For more information about the available options, run wwctl node set --help.
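Because a node or profile can have multiple disks, a second device can be configured by repeating the same flags with a different disk name. This is a sketch: the device /dev/vdb, the label data, and the xfs choice are hypothetical examples, not defaults.

```shell
# Sketch: add a second local disk to the same node. The device /dev/vdb
# and the partition label "data" are hypothetical examples.
sudo wwctl node set node01 \
  --diskname /dev/vdb --diskwipe \
  --partname data --partcreate \
  --fsname data --fsformat xfs --fspath /data --fswipe
```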
3.4 For more information #
Node profiles: https://warewulf.org/docs/development/contents/profiles.html
Warewulf overlays: https://warewulf.org/docs/development/contents/overlays.html
The node provisioning process: https://warewulf.org/docs/development/contents/provisioning.html