13.3 Cloud Lifecycle Manager Maintenance Update Procedure #
Procedure 13.2: Preparing for Update #
Ensure that the update repositories have been properly set up on all nodes. The easiest way to provide the required repositories on the Cloud Lifecycle Manager Server is to set up an SMT server as described in Book “Installing with Cloud Lifecycle Manager”, Chapter 4 “Installing and Setting Up an SMT Server on the Cloud Lifecycle Manager server (Optional)”. Alternatives to setting up an SMT server are described in Book “Installing with Cloud Lifecycle Manager”, Chapter 5 “Software Repository Setup”.
Read the Release Notes for the security and maintenance updates that will be installed.
Have a backup strategy in place. For further information, see Chapter 14, Backup and Restore.
Ensure that you have a known starting state by resolving any unexpected alarms.
Determine if you need to reboot your cloud after updating the software. Rebooting is highly recommended to ensure that all affected services are restarted. Reboot may be required after installing Linux kernel updates, but it can be skipped if the impact on running services is non-existent or well understood.
Review steps in Section 13.1.4.1, “Adding a Neutron Network Node” and Section 13.1.1.2, “Rolling Reboot of the Cloud” to minimize the impact on existing workloads. These steps are critical when the Neutron services are not provided via external SDN controllers.
Before the update, prepare your workloads by consolidating all of your instances onto one or more Compute Nodes. After the update is complete on the evacuated Compute Nodes, reboot them and move the instances from the remaining Compute Nodes to the newly booted ones. Then, update the remaining Compute Nodes.
13.3.1 Performing the Update #
Before you proceed, get the status of all your services:
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts ardana-status.yml
If the status check returns an error for a specific service, run the SERVICE-reconfigure.yml playbook. Then run the SERVICE-status.yml playbook to check that the issue has been resolved.
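For example, if the Nova service reported an error, the corresponding playbooks would be run as follows (the service name is only an illustration of the SERVICE-reconfigure.yml and SERVICE-status.yml pattern described above; substitute the playbooks matching the service that failed in your cloud):
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts nova-reconfigure.yml
ardana > ansible-playbook -i hosts/verb_hosts nova-status.yml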
Update and reboot all nodes in the cloud one by one. Start with the deployer node, then follow the order recommended in Section 13.1.1.2, “Rolling Reboot of the Cloud”.
Note
The described workflow also covers cases in which the deployer node is itself provisioned as an active cloud node.
To minimize the impact on the existing workloads, the node should first be prepared for an update and a subsequent reboot by following the steps leading up to stopping services listed in Section 13.1.1.2, “Rolling Reboot of the Cloud”, such as migrating singleton agents on Control Nodes and evacuating Compute Nodes. Do not stop services running on the node, as they need to be running during the update.
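As an illustration of evacuating a Compute Node before updating it, a minimal sketch using the OpenStack CLI could look like the following. The node name comp0001-mgmt is a placeholder, and the exact migration command depends on your hypervisor and client version:
ardana > openstack server list --all-projects --host comp0001-mgmt
ardana > nova live-migration INSTANCE_ID
Repeat the live migration for each instance reported on the node, then confirm the node is empty before proceeding.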
Procedure 13.3: Update Instructions #
Install all available security and maintenance updates on the deployer using the zypper patch command.
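For example, a minimal sequence on the deployer might look like the following (the repositories involved and the number of applicable patches depend on your installation):
ardana > sudo zypper refresh
ardana > sudo zypper patch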
Initialize the Cloud Lifecycle Manager and prepare the update playbooks.
Run the ardana-init initialization script to update the deployer.
Redeploy cobbler:
ardana > cd ~/openstack/ardana/ansible
ardana > ansible-playbook -i hosts/localhost cobbler-deploy.yml
Run the configuration processor:
ardana > cd ~/openstack/ardana/ansible
ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
Update your deployment directory:
ardana > cd ~/openstack/ardana/ansible
ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
Installation and management of updates can be automated with the following playbooks:
ardana-update-pkgs.yml
ardana-update.yml
ardana-update-status.yml
Important
Some playbooks are being deprecated. To determine how your system is affected, run:
ardana > rpm -qa ardana-ansible
The result will be ardana-ansible-8.0+git. followed by a version number string.
If the first part of the version number string is greater than or equal to 1553878455 (for example, ardana-ansible-8.0+git.1553878455.7439e04), use the newly introduced parameters:
pending_clm_update
pending_service_update
pending_system_reboot
If the first part of the version number string is less than 1553878455 (for example, ardana-ansible-8.0+git.1552032267.5298d45), use the following parameters:
update_status_var
update_status_set
update_status_reset
ardana-reboot.yml
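To check which set of parameters applies (see the Important note above), a small shell sketch such as the following can extract and compare the timestamp. The sed pattern is an assumption based on the version string format shown in the note:
ardana > ver=$(rpm -q --qf '%{VERSION}' ardana-ansible | sed 's/.*+git\.\([0-9]*\)\..*/\1/')
ardana > if [ "$ver" -ge 1553878455 ]; then echo "use the pending_* parameters"; else echo "use the update_status_* parameters"; fi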
Confirm version changes by running hostnamectl before and after running the ardana-update-pkgs.yml playbook on each node:
ardana > hostnamectl
Notice that the Boot ID: and Kernel: information has changed.
By default, the ardana-update-pkgs.yml playbook will install patches and updates that do not require a system reboot. Patches and updates that do require a system reboot will be installed later in this process.
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts ardana-update-pkgs.yml \
  --limit TARGET_NODE_NAME
There may be a delay in the playbook output at the following task while updates are pulled from the deployer.
TASK: [ardana-upgrade-tools | pkg-update | Download and install package updates] ***
After running the ardana-update-pkgs.yml playbook to install patches and updates not requiring reboot, check the status of remaining tasks:
ardana > ansible-playbook -i hosts/verb_hosts ardana-update-status.yml \
  --limit TARGET_NODE_NAME
To install patches that require reboot, run the ardana-update-pkgs.yml playbook with the parameter -e zypper_update_include_reboot_patches=true:
ardana > ansible-playbook -i hosts/verb_hosts ardana-update-pkgs.yml \
  --limit TARGET_NODE_NAME \
  -e zypper_update_include_reboot_patches=true
If the output of ardana-update-pkgs.yml indicates that a reboot is required, run ardana-reboot.yml after completing the ardana-update.yml step below. Running ardana-reboot.yml will cause cloud service interruption.
Note
To update a single package (for example, to apply a PTF on a single node or on all nodes), run zypper update PACKAGE. To install all package updates, use zypper update.
Update services:
ardana > ansible-playbook -i hosts/verb_hosts ardana-update.yml \
  --limit TARGET_NODE_NAME
If indicated by the ardana-update-status.yml playbook, reboot the node. There may also be a warning to reboot after running the ardana-update-pkgs.yml playbook. This check can be overridden by setting the SKIP_UPDATE_REBOOT_CHECKS environment variable or the skip_update_reboot_checks Ansible variable.
ardana > ansible-playbook -i hosts/verb_hosts ardana-reboot.yml \
  --limit TARGET_NODE_NAME
To recheck pending system reboot status at a later time, run the following commands:
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts ardana-update-status.yml \
  --limit ardana-cp1-c1-m2
The pending system reboot status can be reset by running:
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts ardana-update-status.yml \
  --limit ardana-cp1-c1-m2 \
  -e pending_system_reboot=off
Multiple servers can be patched at the same time with ardana-update-pkgs.yml by setting the option -e skip_single_host_checks=true.
Warning
When patching multiple servers at the same time, take care not to compromise HA capability by updating an entire cluster (controller, database, monitor, logging) at the same time.
If multiple nodes are specified on the command line (with --limit), services on those servers will experience outages as the packages are shut down and updated. Before updating a Compute Node (or group of Compute Nodes), migrate its workload off. The same applies to Control Nodes: move singleton services off the control plane node that will be updated (see the example after the Important note below).
Important
Do not reboot all of your controllers at the same time.
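For example, two Compute Nodes could be patched in one run as follows (the node names are placeholders, and the single-host restriction must be explicitly skipped as described above):
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts ardana-update-pkgs.yml \
  --limit comp0001-mgmt,comp0002-mgmt \
  -e skip_single_host_checks=true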
When the node comes up after the reboot, run the spark-start.yml playbook:
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts spark-start.yml
Verify that Spark is running on all Control Nodes:
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts spark-status.yml
After all nodes have been updated, check the status of all services:
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts ardana-status.yml
13.3.2 Summary of the Update Playbooks #
- ardana-update-pkgs.yml
Top-level playbook that automates the installation of package updates on a single node. It also works for multiple nodes if the single-node restriction is overridden by setting the SKIP_SINGLE_HOST_CHECKS environment variable or by passing the corresponding Ansible option:
ardana-update-pkgs.yml -e skip_single_host_checks=true
Provide the following -e options to modify the default behavior (a combined example follows this list):
zypper_update_method (default: patch)
patch installs all patches for the system. Patches are intended for specific bug and security fixes.
update installs all packages that have a higher version number than the installed packages.
dist-upgrade replaces each installed package with the version from the repository and deletes packages not available in the repositories.
zypper_update_repositories (default: all) restricts the list of repositories used.
zypper_update_gpg_checks (default: true) enables GPG checks. If set to true, checks whether packages are correctly signed.
zypper_update_licenses_agree (default: false) automatically agrees with licenses. If set to true, zypper automatically accepts third-party licenses.
zypper_update_include_reboot_patches (default: false) includes patches that require reboot. Setting this to true installs patches that require a reboot (such as kernel or glibc updates).
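These options can be combined on a single invocation. A minimal sketch, using only the variables documented above:
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts ardana-update-pkgs.yml \
  --limit TARGET_NODE_NAME \
  -e zypper_update_method=update \
  -e zypper_update_include_reboot_patches=true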
- ardana-update.yml
Top-level playbook that automates the update of all the services. Runs on all nodes by default, or can be limited to a single node by adding --limit nodename.
- ardana-reboot.yml
Top-level playbook that automates the steps required to reboot a node. It includes pre-boot and post-boot phases, which can be extended to include additional checks.
- ardana-update-status.yml
This playbook can be used to check or reset the update-related status variables maintained by the update playbooks. The main reason for having this mechanism is to allow the update status to be checked at any point during the update procedure. It is also used heavily by the automation scripts to orchestrate installing maintenance updates on multiple nodes.
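For example, a sketch of checking and then clearing one of the status variables on a node. The reset form mirrors the pending_system_reboot reset shown earlier; that the same -e syntax applies to the other status variables is an assumption:
ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts ardana-update-status.yml \
  --limit TARGET_NODE_NAME
ardana > ansible-playbook -i hosts/verb_hosts ardana-update-status.yml \
  --limit TARGET_NODE_NAME \
  -e pending_service_update=off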