
5 Cluster Updates

5.1 Update Requirements

Important

Attempting a cluster update without first updating the packages installed by the pattern on the management workstation can lead to an incomplete or failed update.

Before updating a SUSE CaaS Platform cluster, you must update the packages installed by the SUSE-CaaSP-Management pattern on the management workstation.

The cluster update depends on an updated skuba, but it might also require new helm, Terraform, or other dependencies, all of which are updated along with the refreshed pattern.

Run sudo zypper update on the management workstation before any attempt to update the cluster.
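
If you want to check the state of the pattern itself before updating, you can, for example, query it with zypper and then run the update:

sudo zypper info -t pattern SUSE-CaaSP-Management
sudo zypper update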

5.2 Updating Kubernetes Components

Updates of Kubernetes and its components from one minor version to the next (for example from 1.16 to 1.17) are handled by skuba, because minor updates require special plan and apply procedures. Patch updates (for example 1.16.1 to 1.16.2) are handled differently, by skuba-update, as described in Section 5.4, “Base OS Updates”.

Important

Generally speaking: if you have other deployments that were not installed via Kubernetes or helm, update them last in the upgrade process.

However, if your applications/deployments in their current versions are incompatible with the Kubernetes version that you are upgrading to, you must update these applications/deployments to a compatible version before attempting a cluster upgrade.

Refer to the documentation of the individual application/deployment for its Kubernetes version and dependency requirements.

The general procedure should look like this (a condensed command sketch follows the list):

  1. Check if all current versions of applications and deployments in the cluster will work on the new Kubernetes version you plan to install.

    • If an application/deployment is incompatible with the new Kubernetes version, update the application/deployment before performing any of the other upgrade steps.

  2. Update the packages and reboot your management workstation to get all the latest changes to skuba, helm and their dependencies.

  3. Run skuba addon upgrade plan and then skuba addon upgrade apply on the management workstation.

  4. Re-apply any configuration files that you modified for addons; the upgrade will have reset those configurations to their defaults.

  5. Check if there are addon upgrades available for the current cluster using skuba addon upgrade plan.

    • Check if all the deployments in the cluster are compatible with the Kubernetes release that will be installed (refer to your individual deployments' documentation). If a deployment is not compatible, you must update it to ensure it works with the updated Kubernetes.

  6. Upgrade all master nodes by sequentially running skuba node upgrade plan and skuba node upgrade apply.

    • Make sure to wait until all Pods/Deployments/DaemonSets are up and running as expected before moving to the next node.

  7. Upgrade all worker nodes by sequentially running skuba node upgrade plan and skuba node upgrade apply.

    • Make sure to wait until all Pods/Deployments/DaemonSets are up and running as expected before moving to the next node.

  8. Check if new addons are available for the new version using skuba addon upgrade plan.

  9. Once all nodes are up to date, update helm and tiller (as needed) and subsequently the helm deployments.
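
Condensed to the commands involved, an upgrade run roughly follows this sketch (replace <NODE>, <NODE_IP> and <USER> with values from your environment; the following sections show the individual commands and their expected output in detail):

# on the management workstation
sudo zypper update
sudo reboot

# upgrade the addons
skuba addon upgrade plan
skuba addon upgrade apply

# then, for each master node and afterwards for each worker node
skuba node upgrade plan <NODE>
skuba node upgrade apply --target <NODE_IP> --user <USER> --sudo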

5.2.1 Update Management Workstation

Run sudo zypper up on your management workstation to get the latest version of skuba and its dependencies. Reboot the machine to make sure that all system changes are correctly applied.

5.2.2 Generating an Overview of Available Platform Updates

In order to get an overview of the updates available, you can run:

skuba cluster upgrade plan

This will show you a list of updates (if available) for different components installed on the cluster. If the cluster is already running the latest available versions, the output should look like this:

Current Kubernetes cluster version: 1.16.2
Latest Kubernetes version: 1.16.2

Congratulations! You are already at the latest version available

If the cluster has a new patch-level or minor Kubernetes version available, the output should look like this:

Current Kubernetes cluster version: 1.15.2
Latest Kubernetes version: 1.16.2

Upgrade path to update from 1.15.2 to 1.16.2:
 - 1.15.2 -> 1.16.2

Similarly, you can also fetch this information on a per-node basis with the following command:

skuba node upgrade plan <NODE>

For example, if the cluster has a node named worker0 which is running the latest available versions, the output should look like this:

Current Kubernetes cluster version: 1.16.2
Latest Kubernetes version: 1.16.2

Node worker0 is up to date

On the other hand, if this same node has a new patch-level or minor Kubernetes version available, the output should look like this:

Current Kubernetes cluster version: 1.15.2
Latest Kubernetes version: 1.16.2

Current Node version: 1.15.2

Component versions in worker0
  - kubelet: 1.15.2 -> 1.16.2
  - cri-o: 1.15.0 -> 1.16.0

You will get a similar output if there is a version available on a master node (named master0 in this example):

Current Kubernetes cluster version: 1.15.2
Latest Kubernetes version: 1.16.2

Current Node version: 1.15.2

Component versions in master0
  - apiserver: 1.15.2 -> 1.16.2
  - controller-manager: 1.15.2 -> 1.16.2
  - scheduler: 1.15.2 -> 1.16.2
  - etcd: 3.3.11 -> 3.3.15
  - kubelet: 1.15.2 -> 1.16.2
  - cri-o: 1.15.0 -> 1.16.0

It may happen that the Kubernetes version on the control plane is too outdated for the update to progress. In this case, you would get output similar to the following:

Current Kubernetes cluster version: 1.15.0
Latest Kubernetes version: 1.15.0

Unable to plan node upgrade: at least one control plane does not tolerate the current cluster version
Tip

The control plane consists of these components:

  • apiserver

  • controller-manager

  • scheduler

  • etcd

  • kubelet

  • cri-o

5.2.3 Generating an Overview of Available Addon Updates

Note

Due to changes in the way skuba handles addons, some existing components might be shown as new addons in the status output. This is expected and no cause for concern. In any subsequent upgrade, the addon will be considered known and only available upgrades will be shown.

In order to get an overview of the addon updates available, you can run:

skuba addon upgrade plan

This will show you a list of updates (if available) for different addons installed on the cluster:

Current Kubernetes cluster version: 1.15.2
Latest Kubernetes version: 1.16.2

Addon upgrades:
  - cilium: 1.5.3 -> 1.5.3 (manifest version from 0 to 4)
  - kured: 1.2.0 -> 1.3.0
  - dex: 2.16.0 -> 2.16.1
  - gangway: 3.1.0 -> 3.1.0 (manifest version from 1 to 2)
  - psp: 1.0.0 -> 1.1.0

If the cluster is already running the latest available versions, the output should look like this:

Current Kubernetes cluster version: 1.16.2
Latest Kubernetes version: 1.16.2

Congratulations! Addons are already at the latest version available

Before updating the nodes, you must apply the addon upgrades from your management workstation. Run:

skuba addon upgrade apply
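
After applying the upgrades, you can, for example, check that the addon pods come back up as expected in the kube-system namespace, where components such as kured and cilium run:

kubectl -n kube-system get pods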

5.3 Updating Nodes

Note

It is recommended to use a load balancer with active health checks and pool management that will take care of adding/removing nodes to/from the pool during this process.

Updates have to be applied separately to each node, starting with the control plane all the way down to the worker nodes.

Note that the upgrade via skuba node upgrade apply will:

  • Upgrade the containerized control plane.

  • Upgrade the rest of the Kubernetes system stack (kubelet, cri-o).

  • Restart services.

During the upgrade to a newer version, the API server will be unavailable.

During the upgrade, all pods on the worker node will be restarted, so it is recommended to drain the node first if your application requires high availability. In most cases, the restart is handled by the ReplicaSet.
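
If you want to drain a node manually before upgrading it, a minimal sequence could look like the following (depending on your workloads, additional drain flags such as --delete-local-data may be needed; see Section 5.4.2, “Completely Disabling Reboots” for a more complete drain command with timeouts):

kubectl cordon <NODE_ID>
kubectl drain --ignore-daemonsets=true <NODE_ID>
# ... perform the node upgrade ...
kubectl uncordon <NODE_ID>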

5.3.1 How To Update Nodes

  1. Upgrade the master nodes:

    skuba node upgrade apply --target <MASTER_NODE_IP> --user <USER> --sudo

  2. When all master nodes are upgraded, upgrade the worker nodes as well:

    skuba node upgrade apply --target <WORKER_NODE_IP> --user <USER> --sudo

  3. Verify that your cluster nodes are upgraded by running:

    skuba cluster upgrade plan

Tip

The upgrade via skuba node upgrade apply will:

  • upgrade the containerized control plane.

  • upgrade the rest of the Kubernetes system stack (kubelet, cri-o).

  • restart services.

5.3.2 Check for Upgrades to New Version

Once you have upgraded all nodes, run skuba cluster upgrade plan again. This will show whether any upgrades are available that require the versions you just installed. If there are upgrades available, repeat the procedure until no more new upgrades are shown.

5.4 Base OS Updates

Base operating system updates are handled by skuba-update, which works together with the kured reboot daemon.

5.4.1 Disabling Automatic Updates

Nodes added to a cluster have the skuba-update.timer service, which is responsible for running automatic updates, activated by default. This service calls the skuba-update utility and can be configured via the /etc/sysconfig/skuba-update file. To disable automatic updates on a node, log in to it via SSH and edit /etc/sysconfig/skuba-update to set the following runtime option:

## Path           : System/Management
## Description    : Extra switches for skuba-update
## Type           : string
## Default        : ""
## ServiceRestart : skuba-update
#
SKUBA_UPDATE_OPTIONS="--annotate-only"
Tip

It is not required to reload or restart skuba-update.timer.

The --annotate-only flag makes the skuba-update utility only check if updates are available and annotate the node accordingly. When this flag is activated no updates are installed at all.

When automatic OS updates are disabled, you have to manage OS updates manually by calling skuba-update yourself on each node.
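
A manual update round could then, for example, look like this on each node (assuming SSH access and sudo rights on the node):

ssh <USER>@<NODE_IP>
sudo skuba-update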

Warning

Do not use zypper up/zypper patch commands as these do not manage the Kubernetes annotations used by kured. If you perform a manual update using these commands you might render your cluster unusable.

After that, whether the node gets rebooted depends on whether you have also disabled reboots. If you have disabled reboots for this node, follow the instructions in Section 5.4.2, “Completely Disabling Reboots”. Otherwise, wait until kured performs the reboot of the node.

5.4.2 Completely Disabling Reboots

If you would like to take care of reboots manually, either as a temporary measure or permanently, you can disable them by creating a lock:

kubectl -n kube-system annotate ds kured weave.works/kured-node-lock='{"nodeID":"manual"}'

This command modifies an annotation (annotate) on the daemonset (ds) named kured.
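
To check whether the lock annotation is currently set, you can, for example, inspect the annotations of the kured DaemonSet:

kubectl -n kube-system get ds kured -o jsonpath='{.metadata.annotations}'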

When automatic reboots are disabled, you will have to manage reboots yourself. In order to do this, you will have to follow some steps whenever you want to issue a reboot marker for a node. First of all, you will have to cordon and drain the node:

kubectl cordon <NODE_ID>
kubectl drain --force=true \
  --ignore-daemonsets=true \ 1
  --delete-local-data=true \ 2
  --grace-period 600 \ 3
  --timeout=900s \ 4
  <NODE_ID>

1

Core components like kured and cilium are running as DaemonSet and draining those pods will fail if this is not set to true.

2

Continues even if there are pods using emptyDir (local data that will be deleted when the node is drained; e.g. metrics-server).

3

Running applications will be notified of termination and given 10 minutes (600 seconds) to safely store data.

4

Draining of the node will fail after 15 minutes (900 seconds) have elapsed without success.

Important

Depending on your deployed applications, you must adjust the values for --grace-period and --timeout to grant the applications enough time to safely shut down without losing data. The values here are meant to represent a conservative default for an application like SUSE Cloud Application Platform.

If you do not set these values appropriately, applications might never finish shutting down and draining of the node will hang indefinitely.

Only then can you safely reboot the node manually.
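
The reboot itself can then, for example, be triggered over SSH:

ssh <USER>@<NODE_IP> sudo reboot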

Once the node is back up, remember to uncordon it so it is schedulable again:

kubectl uncordon <NODE_ID>

Perform the above steps first on control plane nodes, and afterwards on worker nodes.

Tip

If the node that should be rebooted does not contain any workload you can skip the above steps and simply reboot the node.

5.4.3 Manual Unlock

In exceptional circumstances, such as a node experiencing a permanent failure whilst rebooting, manual intervention may be required to remove the cluster lock:

kubectl -n kube-system annotate ds kured weave.works/kured-node-lock-

This command modifies an annotation (annotate) on the daemonset (ds) named kured. It explicitly performs an "unset" (-) of the value of the annotation named weave.works/kured-node-lock.
