24 Management Cluster #
This section covers how to perform various Day 2 operations on a management cluster.
24.1 RKE2 upgrade #
To ensure disaster recovery, we advise backing up the RKE2 cluster data before upgrading. For information on how to do this, see the RKE2 backup and restore documentation. The default location for the rke2 binary is /opt/rke2/bin.
You can upgrade the RKE2 version using the RKE2 installation script as follows:
curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=vX.Y.Z+rke2rN sh -
Remember to restart the rke2 process after installing:
# For server nodes:
systemctl restart rke2-server
# For agent nodes:
systemctl restart rke2-agent
To avoid any unforeseen upgrade problems, use the following node upgrade order:
Server nodes - should be upgraded one node at a time.
Agent nodes - should be upgraded after all server node upgrades have finished. Can be upgraded in parallel.
For further information, see the RKE2 upgrade documentation.
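The per-node procedure above can be sketched as a small shell helper. This is a minimal sketch rather than official tooling: the function names rke2_service_for_role and upgrade_rke2_node are hypothetical, and the sketch assumes it runs as root on the node that is being upgraded.

```shell
# Hypothetical helpers (the names are illustrative, not part of RKE2).

# Map a node role to the systemd unit that must be restarted after the upgrade.
rke2_service_for_role() {
    case "$1" in
        server) echo "rke2-server" ;;
        agent)  echo "rke2-agent" ;;
        *)      echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Upgrade RKE2 on the current node and restart the matching service.
# $1 = role (server|agent), $2 = target version, e.g. vX.Y.Z+rke2rN
upgrade_rke2_node() {
    role="$1"
    version="$2"
    service="$(rke2_service_for_role "$role")" || return 1
    # Same installation command as shown above, with the version parameterized.
    curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION="$version" sh - || return 1
    systemctl restart "$service"
}
```

Run upgrade_rke2_node server vX.Y.Z+rke2rN on each server node in turn, confirm with kubectl get nodes that the node is Ready and reports the new version, and only then move on to the next node. Agent nodes can then be upgraded with upgrade_rke2_node agent vX.Y.Z+rke2rN.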
24.2 OS upgrade #
This section assumes that you have registered your system to https://scc.suse.com.
SUSE regularly releases new SLE Micro package updates. To retrieve the updated package versions, SLE Micro uses transactional-update.
transactional-update provides an application and library to update a Linux operating system in a transactional way, i.e. the update is performed in the background while the system continues running as it is. The update takes effect only after you reboot the system. For further information, see the transactional-update GitHub page.
To update all packages in the system, execute:
transactional-update
Since rebooting the node will result in it being unavailable for some time, if you are running a multi-node cluster, you can cordon and drain the node before the reboot.
To cordon a node, execute:
kubectl cordon <node>
This will result in the node being taken out of the default scheduling mechanism, ensuring that no pods will be assigned to it by mistake.
To drain a node, execute:
kubectl drain <node>
This will ensure that all workloads on the node will be transferred to other available nodes.
Depending on what workloads you are running on the node, you might also need to provide additional flags (e.g. --delete-emptydir-data, --ignore-daemonsets) to the command.
Reboot node:
sudo reboot
After a successful reboot, the packages on your node will be updated. The only thing left is to bring the node back to the default scheduling mechanism with the uncordon command.
Uncordon node:
kubectl uncordon <node>
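The cordon, drain and uncordon steps above can be wrapped in two small helpers, one to run before the reboot and one after it. This is a minimal sketch: the function names are hypothetical, the drain flags are the optional ones mentioned above, and the 10 minute default timeout is an arbitrary choice.

```shell
# Hypothetical helpers wrapping the kubectl commands shown above.

# Take a node out of scheduling and evict its workloads before a reboot.
# $1 = node name as shown by `kubectl get nodes`
prepare_node_for_reboot() {
    node="$1"
    kubectl cordon "$node" || return 1
    # Both flags are optional; drop them if your workloads do not need them.
    kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
}

# Wait until the rebooted node is Ready again, then allow scheduling on it.
# $1 = node name, $2 = optional timeout (default 10m, an arbitrary choice)
return_node_to_service() {
    node="$1"
    timeout="${2:-10m}"
    kubectl wait --for=condition=Ready "node/$node" --timeout="$timeout" || return 1
    kubectl uncordon "$node"
}
```

A typical sequence would be to call prepare_node_for_reboot <node> from a machine with kubectl access, reboot the node, and then call return_node_to_service <node>.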
In case you want to revert the update, use the above steps with the following transactional-update command:
transactional-update rollback last
24.3 Helm upgrade #
This section assumes you have installed helm on your system. For helm installation instructions, see the Helm installation documentation.
This section covers how to upgrade both an EIB (Section 24.3.1, “EIB deployed helm chart”) and non-EIB (Section 24.3.2, “Non-EIB deployed helm chart”) deployed helm chart.
24.3.1 EIB deployed helm chart #
EIB deploys helm charts defined in its image definition file (Section 3.3, “Creating the image definition file”) by using RKE2’s manifest auto-deploy functionality.
To upgrade a chart that is deployed in such a manner, you need to upgrade the chart manifest file that EIB creates under the /var/lib/rancher/rke2/server/manifests directory on your initializer node.
To ensure disaster recovery, we advise that you always back up your chart manifest file and follow any disaster recovery documentation that your chart offers.
To upgrade the chart manifest file, follow these steps:
Locate the initializer node:
For multi-node clusters - in your EIB image definition file, you should have specified the initializer: true property for one of your nodes. If you have not specified this property, the initializer node will be the first server node in your node list.
For single-node clusters - the initializer is the currently running node.
SSH to the initializer node:
ssh root@<node_ip>
Pull the helm chart:
For helm charts hosted in a helm chart repository:
helm repo add <chart_repo_name> <chart_repo_url>
helm pull <chart_repo_name>/<chart_name>
# Alternatively if you want to pull a specific version
helm pull <chart_repo_name>/<chart_name> --version=X.Y.Z
For OCI-based helm charts:
helm pull oci://<chart_oci_url>
# Alternatively if you want to pull a specific version
helm pull oci://<chart_oci_url> --version=X.Y.Z
Encode the pulled .tgz archive so that it can be passed to a HelmChart CR config:
base64 -w 0 <chart_name>-X.Y.Z.tgz > <chart_name>-X.Y.Z.txt
Make a copy of the chart manifest file that we will edit:
cp /var/lib/rancher/rke2/server/manifests/<chart_name>.yaml ./<chart_name>.yaml
Change the chartContent and version configurations of the <chart_name>.yaml file:
sed -i -e "s|chartContent:.*|chartContent: $(cat <chart_name>-X.Y.Z.txt)|" -e "s|version:.*|version: X.Y.Z|" <chart_name>.yaml
Note: If you need to do any additional upgrade changes to the chart (e.g. adding new custom chart values), you need to manually edit the chart manifest file.
Replace the original chart manifest file:
cp <chart_name>.yaml /var/lib/rancher/rke2/server/manifests/
The above commands will trigger an upgrade of the helm chart. The upgrade will be handled by the helm-controller.
To track the helm chart upgrade, you need to view the logs of the pod that the helm-controller creates for the chart upgrade. Refer to the Examples (Section 24.3.1.1, “Examples”) section for more information.
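The encode and edit steps of the procedure above can be combined into one helper that works on your local copy of the manifest. This is a minimal sketch under stated assumptions: the function name update_chart_manifest is hypothetical, GNU base64 and GNU sed are assumed to be available, and the helper additionally keeps a .bak copy of the file it edits.

```shell
# Hypothetical helper: update the chartContent and version fields of a copy
# of an EIB chart manifest. Assumes GNU base64 (for -w 0) and GNU sed (for -i).
# $1 = path to the manifest copy, $2 = pulled chart archive (.tgz), $3 = version
update_chart_manifest() {
    manifest="$1"
    archive="$2"
    version="$3"
    [ -f "$manifest" ] && [ -f "$archive" ] || { echo "missing input file" >&2; return 1; }
    # Keep a backup next to the copy so the change can be reverted.
    cp "$manifest" "$manifest.bak" || return 1
    encoded="$(base64 -w 0 "$archive")" || return 1
    # Same substitutions as in the steps above. '|' is safe as a delimiter
    # because it is not part of the base64 alphabet.
    sed -i -e "s|chartContent:.*|chartContent: $encoded|" \
           -e "s|version:.*|version: $version|" "$manifest"
}
```

For example, update_chart_manifest ./<chart_name>.yaml <chart_name>-X.Y.Z.tgz X.Y.Z edits the copy in the current directory, which you can then review and copy back to the manifests directory. Like the sed command in the steps above, the helper rewrites every line that contains version:, so check the result if your chart values also use such a key.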
24.3.1.1 Examples #
The examples in this section assume that you have already located and connected to your initializer node.
This section offers examples on how to upgrade a:
Rancher (Section 24.3.1.1.1, “Rancher upgrade”) helm chart
Metal3 (Section 24.3.1.1.2, “Metal3 upgrade”) helm chart
24.3.1.1.1 Rancher upgrade #
To ensure disaster recovery, we advise taking a Rancher backup before upgrading. For information on how to do this, see the Rancher backup documentation.
This example shows how to upgrade Rancher to the 2.8.5 version.
Add the Rancher Prime Helm repository:
helm repo add rancher-prime https://charts.rancher.com/server-charts/prime
Pull the 2.8.5 version of the Rancher Prime helm chart:
helm pull rancher-prime/rancher --version=2.8.5
Encode the .tgz archive so that it can be passed to a HelmChart CR config:
base64 -w 0 rancher-2.8.5.tgz > rancher-2.8.5-encoded.txt
Make a copy of the rancher.yaml file that we will edit:
cp /var/lib/rancher/rke2/server/manifests/rancher.yaml ./rancher.yaml
Change the chartContent and version configurations of the rancher.yaml file:
sed -i -e "s|chartContent:.*|chartContent: $(<rancher-2.8.5-encoded.txt)|" -e "s|version:.*|version: 2.8.5|" rancher.yaml
Note: If you need to do any additional upgrade changes to the chart (e.g. adding new custom chart values), you need to manually edit the rancher.yaml file.
Replace the original rancher.yaml file:
cp rancher.yaml /var/lib/rancher/rke2/server/manifests/
To verify the update:
List pods in the default namespace:
kubectl get pods -n default
# Example output
NAME                              READY   STATUS      RESTARTS   AGE
helm-install-cert-manager-7v7nm   0/1     Completed   0          88m
helm-install-rancher-p99k5        0/1     Completed   0          3m21s
Look at the logs of the helm-install-rancher-* pod:
kubectl logs <helm_install_rancher_pod> -n default
# Example
kubectl logs helm-install-rancher-p99k5 -n default
Verify Rancher pods are running:
kubectl get pods -n cattle-system
# Example output
NAME                               READY   STATUS      RESTARTS   AGE
helm-operation-mccvd               0/2     Completed   0          3m52s
helm-operation-np8kn               0/2     Completed   0          106s
helm-operation-q8lf7               0/2     Completed   0          2m53s
rancher-648d4fbc6c-qxfpj           1/1     Running     0          5m27s
rancher-648d4fbc6c-trdnf           1/1     Running     0          9m57s
rancher-648d4fbc6c-wvhbf           1/1     Running     0          9m57s
rancher-webhook-649dcc48b4-zqjs7   1/1     Running     0          100s
Verify Rancher version upgrade:
kubectl get settings.management.cattle.io server-version
# Example output
NAME             VALUE
server-version   v2.8.5
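Instead of listing pods and looking up the generated pod name by hand, you can wait for the upgrade job and read its logs in one go. This is a minimal sketch: the function name wait_for_chart_upgrade is hypothetical, the 5 minute default timeout is arbitrary, and the job name helm-install-<chart> in the default namespace is an assumption based on the pod names in the example output above.

```shell
# Hypothetical helper: block until the helm-controller job for a chart has
# completed, then print the logs of that job.
# $1 = chart name as used in the manifest file name (e.g. rancher)
# $2 = optional timeout (default 5m, an arbitrary choice)
wait_for_chart_upgrade() {
    chart="$1"
    timeout="${2:-5m}"
    # Assumption: the job is named helm-install-<chart> and lives in the
    # default namespace, matching the pod names in the example output above.
    kubectl wait --for=condition=complete "job/helm-install-$chart" \
        -n default --timeout="$timeout" || return 1
    kubectl logs "job/helm-install-$chart" -n default
}
```

For the Rancher example, this would be called as wait_for_chart_upgrade rancher.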
24.3.1.1.2 Metal3 upgrade #
This example shows how to upgrade Metal3 to the 0.7.3 version.
Pull the 0.7.3 version of the Metal3 helm chart:
helm pull oci://registry.suse.com/edge/metal3-chart --version 0.7.3
Encode the .tgz archive so that it can be passed to a HelmChart CR config:
base64 -w 0 metal3-chart-0.7.3.tgz > metal3-chart-0.7.3-encoded.txt
Make a copy of the Metal3 manifest file that we will edit:
cp /var/lib/rancher/rke2/server/manifests/metal3.yaml ./metal3.yaml
Change the chartContent and version configurations of the Metal3 manifest file:
sed -i -e "s|chartContent:.*|chartContent: $(<metal3-chart-0.7.3-encoded.txt)|" -e "s|version:.*|version: 0.7.3|" metal3.yaml
Note: If you need to do any additional upgrade changes to the chart (e.g. adding new custom chart values), you need to manually edit the metal3.yaml file.
Replace the original Metal3 manifest file:
cp metal3.yaml /var/lib/rancher/rke2/server/manifests/
To verify the update:
List pods in the default namespace:
kubectl get pods -n default
# Example output
NAME                        READY   STATUS      RESTARTS   AGE
helm-install-metal3-7p7bl   0/1     Completed   0          27s
Look at the logs of the helm-install-metal3-* pod:
kubectl logs <helm_install_metal3_pod> -n default
# Example
kubectl logs helm-install-metal3-7p7bl -n default
Verify Metal3 pods are running:
kubectl get pods -n metal3-system
# Example output
NAME                                                     READY   STATUS    RESTARTS      AGE
baremetal-operator-controller-manager-785f99c884-9z87p   2/2     Running   2 (25m ago)   36m
metal3-metal3-ironic-96fb66cdd-lkss2                     4/4     Running   0             3m54s
metal3-metal3-mariadb-55fd44b648-q6zhk                   1/1     Running   0             36m
Verify the HelmChart resource version is upgraded:
kubectl get helmchart metal3 -n default
# Example output
NAME     JOB                   CHART   TARGETNAMESPACE   VERSION   REPO   HELMVERSION   BOOTSTRAP
metal3   helm-install-metal3           metal3-system     0.7.3
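The last check can also be scripted by reading the version field of the HelmChart resource directly. This is a minimal sketch: the function name helmchart_version_is is hypothetical, and the sketch assumes that the VERSION column shown above is backed by the .spec.version field of the resource.

```shell
# Hypothetical helper: compare the version recorded in a HelmChart resource
# with the version you expect after the upgrade.
# $1 = HelmChart name, $2 = expected version, $3 = optional namespace
helmchart_version_is() {
    name="$1"
    expected="$2"
    namespace="${3:-default}"
    # Assumption: the VERSION column is taken from .spec.version.
    actual="$(kubectl get helmchart "$name" -n "$namespace" \
        -o jsonpath='{.spec.version}')" || return 1
    if [ "$actual" = "$expected" ]; then
        echo "HelmChart $name is at version $actual"
    else
        echo "HelmChart $name is at version '$actual', expected $expected" >&2
        return 1
    fi
}
```

For the Metal3 example, helmchart_version_is metal3 0.7.3 returns a non-zero exit code if the resource still records an older version, which makes the check usable in scripts.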
24.3.2 Non-EIB deployed helm chart #
Get the values of the currently running helm chart, save them to a .yaml file and make any changes to them if necessary:
helm get values <chart_name> -n <chart_namespace> -o yaml > <chart_name>-values.yaml
Update the helm chart:
# For charts using a chart repository
helm upgrade <chart_name> <chart_repo_name>/<chart_name> \
  --namespace <chart_namespace> \
  -f <chart_name>-values.yaml \
  --version=X.Y.Z
# For OCI based charts
helm upgrade <chart_name> oci://<oci_registry_url>/<chart_name> \
  --namespace <chart_namespace> \
  -f <chart_name>-values.yaml \
  --version=X.Y.Z
Verify the chart upgrade. Depending on the chart, you may need to verify different resources. For examples of chart upgrades, see the Examples (Section 24.3.2.1, “Examples”) section.
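When no value changes are needed, the export and upgrade steps above can be chained in one helper. This is a minimal sketch: the function name upgrade_release is hypothetical, and the final helm history call is an addition that is not part of the steps above.

```shell
# Hypothetical helper: upgrade a release while reusing its current values.
# $1 = release name, $2 = namespace,
# $3 = chart reference (<chart_repo_name>/<chart_name> or oci://...),
# $4 = target chart version
upgrade_release() {
    release="$1"
    namespace="$2"
    chart_ref="$3"
    version="$4"
    values_file="$release-values.yaml"
    # Export the values of the running release into the current directory.
    helm get values "$release" -n "$namespace" -o yaml > "$values_file" || return 1
    helm upgrade "$release" "$chart_ref" \
        --namespace "$namespace" \
        -f "$values_file" \
        --version="$version" || return 1
    # Show the revision history so that the new revision is visible.
    helm history "$release" -n "$namespace"
}
```

If you do need to change values, run the helm get values command on its own, edit the file, and then run helm upgrade manually as described above.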
24.3.2.1 Examples #
This section offers examples on how to upgrade a:
Rancher (Section 24.3.2.1.1, “Rancher”) helm chart
Metal3 (Section 24.3.2.1.2, “Metal3”) helm chart
24.3.2.1.1 Rancher #
To ensure disaster recovery, we advise taking a Rancher backup before upgrading. For information on how to do this, see the Rancher backup documentation.
This example shows how to upgrade Rancher to the 2.8.5 version.
Get the values for the current Rancher release and print them to a rancher-values.yaml file:
helm get values rancher -n cattle-system -o yaml > rancher-values.yaml
Update the helm chart:
helm upgrade rancher rancher-prime/rancher \
  --namespace cattle-system \
  -f rancher-values.yaml \
  --version=2.8.5
Verify Rancher version upgrade:
kubectl get settings.management.cattle.io server-version
# Example output
NAME             VALUE
server-version   v2.8.5
For additional information on the Rancher helm chart upgrade, see the Rancher upgrade documentation.
24.3.2.1.2 Metal3 #
This example shows how to upgrade Metal3 to the 0.7.3 version.
Get the values for the current Metal3 release and print them to a metal3-values.yaml file:
helm get values metal3 -n metal3-system -o yaml > metal3-values.yaml
Update the helm chart:
helm upgrade metal3 oci://registry.suse.com/edge/metal3-chart \
  --namespace metal3-system \
  -f metal3-values.yaml \
  --version=0.7.3
Verify Metal3 pods are running:
kubectl get pods -n metal3-system
# Example output
NAME                                                     READY   STATUS    RESTARTS   AGE
baremetal-operator-controller-manager-785f99c884-fvsx4   2/2     Running   0          12m
metal3-metal3-ironic-96fb66cdd-j9mgf                     4/4     Running   0          2m41s
metal3-metal3-mariadb-55fd44b648-7fmvk                   1/1     Running   0          12m
Verify Metal3 helm release version change:
helm ls -n metal3-system
# Expected output
NAME     NAMESPACE       REVISION   UPDATED                                   STATUS     CHART          APP VERSION
metal3   metal3-system   2          2024-06-17 12:43:06.774802846 +0000 UTC   deployed   metal3-0.7.3   1.16.0
24.4 Cluster API upgrade #
The Cluster API (CAPI) controllers on a Metal3 management cluster are not currently managed via Helm. This section describes how to upgrade them.
This section assumes you have installed and configured clusterctl on your system as described in the Metal3 quickstart (Chapter 1, BMC automated deployments with Metal3).
When upgrading to Edge 3.0.2 from any previous version, it is necessary to upgrade the RKE2 providers:
clusterctl upgrade apply --bootstrap "rke2:v0.4.1" --control-plane "rke2:v0.4.1"
Please ensure the versions selected align with those described in the Release Notes (Chapter 33, Release Notes). Usage of other upstream releases is not supported.
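To reduce the risk of a typo in one of the two provider versions, the upgrade command above can be wrapped in a helper that takes the version once. This is a minimal sketch: the function name upgrade_rke2_providers is hypothetical, it assumes that the bootstrap and control plane providers use the same version as in the command above, and the preceding clusterctl upgrade plan call is an optional addition that only displays the current provider versions and available upgrades.

```shell
# Hypothetical helper: show the upgrade plan, then upgrade both RKE2
# providers to the same version.
# $1 = provider version, e.g. v0.4.1 (take it from the Release Notes)
upgrade_rke2_providers() {
    version="$1"
    case "$version" in
        v*) ;;
        *) echo "version must look like vX.Y.Z, got: $version" >&2; return 1 ;;
    esac
    # Display installed providers and available upgrades before changing anything.
    clusterctl upgrade plan || return 1
    clusterctl upgrade apply \
        --bootstrap "rke2:$version" \
        --control-plane "rke2:$version"
}
```

For the Edge 3.0.2 upgrade described above, the call would be upgrade_rke2_providers v0.4.1.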