Applies to SUSE Enterprise Storage 6

6 Upgrading from Previous Releases

This chapter introduces the steps to upgrade SUSE Enterprise Storage 5.5 to version 6. Note that version 5.5 is essentially version 5 with all the latest patches applied.

Note: Upgrade from Older Releases Not Supported

Upgrading from SUSE Enterprise Storage versions older than 5.5 is not supported. You first need to upgrade to the latest version of SUSE Enterprise Storage 5.5 and then follow the steps in this chapter.

6.1 Points to Consider before the Upgrade

  • Read the release notes - there you can find additional information on changes since the previous release of SUSE Enterprise Storage. Check the release notes to see whether:

    • Your hardware needs special considerations.

    • Any used software packages have changed significantly.

    • Special precautions are necessary for your installation.

    The release notes also provide information that could not make it into the manual on time. They also contain notes about known issues.

    After having installed the package release-notes-ses, find the release notes locally in the directory /usr/share/doc/release-notes or online at https://www.suse.com/releasenotes/.

  • If you use CHAP authentication for iSCSI, the password must be changed to meet SUSE Enterprise Storage 6 requirements. Ensure you change the username and password on all initiators as well. For more information on changing your password, see Section 10.4.4.3, “CHAP Authentication”.

  • In case you previously upgraded from version 4, verify that the upgrade to version 5 was completed successfully:

    • Check for the existence of the file

      /srv/salt/ceph/configuration/files/ceph.conf.import

      It is created by the engulf process during the upgrade from SES 4 to 5. Also, the configuration_init: default-import option is set in the file

      /srv/pillar/ceph/proposals/config/stack/default/ceph/cluster.yml

      If configuration_init is still set to default-import, the cluster is using ceph.conf.import as its configuration file and not DeepSea's default ceph.conf which is compiled from files in

      /srv/salt/ceph/configuration/files/ceph.conf.d/

      Therefore you need to inspect ceph.conf.import for any custom configuration, and possibly move the configuration to one of the files in

      /srv/salt/ceph/configuration/files/ceph.conf.d/

      Then remove the configuration_init: default-import line from

      /srv/pillar/ceph/proposals/config/stack/default/ceph/cluster.yml
      Warning: Default DeepSea Configuration

      If you do not merge the configuration from ceph.conf.import and remove the configuration_init: default-import option, any default configuration settings we ship as part of DeepSea (stored in /srv/salt/ceph/configuration/files/ceph.conf.j2) will not be applied to the cluster.
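
      For example, the following commands (plain shell checks using the default paths mentioned above; adjust them if your setup differs) show whether the imported file is still active and let you review its contents before merging:

      root@master # grep configuration_init /srv/pillar/ceph/proposals/config/stack/default/ceph/cluster.yml
      root@master # less /srv/salt/ceph/configuration/files/ceph.conf.import
      root@master # ls /srv/salt/ceph/configuration/files/ceph.conf.d/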

    • Run the salt-run upgrade.check command to verify that the cluster uses the new bucket type straw2 and that the Admin Node is not a storage node. The default is straw2 for any newly created buckets.

      Important

      The new straw2 bucket type fixes several limitations in the original straw bucket type. The previous straw buckets would change some mappings that should not have changed when a weight was adjusted. straw2 achieves the original goal of only changing mappings to or from the bucket item whose weight has changed.

      Changing a bucket type from straw to straw2 results in a small amount of data movement, depending on how much the bucket item weights vary from each other. When the weights are all the same, no data will move. When an item's weight varies significantly there will be more movement. To migrate, execute:

      cephadm@adm > ceph osd getcrushmap -o backup-crushmap
      cephadm@adm > ceph osd crush set-all-straw-buckets-to-straw2

      If there are problems, you can revert this change with:

      cephadm@adm > ceph osd setcrushmap -i backup-crushmap

      Moving to straw2 buckets unlocks a few recent features, such as the crush-compat balancer mode that was added in Luminous.

    • Check that the Ceph 'jewel' CRUSH tunables profile is in use:

      cephadm@adm > ceph osd crush dump | grep profile
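
      The relevant part of the output should contain a line similar to the following (a 'jewel' value indicates that the expected CRUSH tunables profile is in use):

      "profile": "jewel",
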
  • In case old RBD kernel clients (older than SUSE Linux Enterprise Server 12 SP3) are being used, refer to Section 12.9, “Mapping RBD Using Old Kernel Clients”. We recommend upgrading old RBD kernel clients if possible.

  • If openATTIC is located on the Admin Node, it will be unavailable after you upgrade the node. The new Ceph Dashboard will not be available until you deploy it by using DeepSea.

  • The cluster upgrade may take a long time—approximately the time it takes to upgrade one machine multiplied by the number of cluster nodes.

  • A single node cannot be upgraded while running the previous SUSE Linux Enterprise Server release, but needs to be rebooted into the new version's installer. Therefore the services that the node provides will be unavailable for some time. The core cluster services will still be available—for example if one MON is down during upgrade, there are still at least two active MONs. Unfortunately, single instance services, such as a single iSCSI Gateway, will be unavailable.

  • Certain types of daemons depend upon others. For example, Ceph Object Gateways depend upon Ceph MON and OSD daemons. We recommend upgrading in this order:

    1. Admin Node

    2. Ceph Monitors/Ceph Managers

    3. Metadata Servers

    4. Ceph OSDs

    5. Object Gateways

    6. iSCSI Gateways

    7. NFS Ganesha

    8. Samba Gateways

  • If you used AppArmor in either 'complain' or 'enforce' mode, you need to set a Salt pillar variable before upgrading. Because SUSE Linux Enterprise Server 15 SP1 ships with AppArmor by default, AppArmor management was integrated into DeepSea stage 0. The default behavior in SUSE Enterprise Storage 6 is to remove AppArmor and related profiles. If you want to retain the behavior configured in SUSE Enterprise Storage 5.5, verify that one of the following lines is present in the /srv/pillar/ceph/stack/global.yml file before starting the upgrade:

    apparmor_init: default-enforce

    or

    apparmor_init: default-complain
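
    To check whether AppArmor is currently enabled on a node and which mode its profiles use, you can, for example, run the standard AppArmor utilities (this is an optional check and does not change any configuration):

    root@minion > systemctl is-enabled apparmor
    root@minion > aa-status
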
  • Starting with SUSE Enterprise Storage 6, MDS names beginning with a digit are no longer allowed, and such MDS daemons will refuse to start. You can check whether your daemons have such names either by running the ceph fs status command, or by restarting an MDS and checking its logs for the following message:

    deprecation warning: MDS id 'mds.1mon1' is invalid and will be forbidden in
    a future version.  MDS names may not start with a numeric digit.

    If you see the above message, the MDS names need to be migrated before you attempt the upgrade to SUSE Enterprise Storage 6. DeepSea provides an orchestration to automate such a migration. MDS names starting with a digit will have 'mds.' prepended to them:

    root@master # salt-run state.orch ceph.mds.migrate-numerical-names
    Tip: Custom Configuration Bound to MDS Names

    If you have configuration settings that are bound to MDS names and your MDS daemons have names starting with a digit, verify that your configuration settings apply to the new names as well (with the 'mds.' prefix). Consider the following example section in the /etc/ceph/ceph.conf file:

    [mds.123-my-mds] # config setting specific to an MDS with a name starting with a digit
    mds cache memory limit = 1073741824
    mds standby for name = 456-another-mds

    The ceph.mds.migrate-numerical-names orchestrator will change the MDS daemon name '123-my-mds' to 'mds.123-my-mds'. You need to adjust the configuration to reflect the new name:

    [mds.mds.123-my-mds] # config setting specific to the MDS with the new, migrated name
    mds cache memory limit = 1073741824
    mds standby for name = mds.456-another-mds

    This will add MDS daemons with the new names before removing the old MDS daemons. The number of MDS daemons will double for a short time. Clients will be able to access CephFS only after a short failover pause. Therefore plan the migration for times when you expect little or no CephFS load.
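
    If you are unsure whether such settings exist, a quick way to find ceph.conf sections bound to MDS names that start with a digit is, for example:

    root@minion > grep -E '^\[mds\.[0-9]' /etc/ceph/ceph.conf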

6.2 Back Up Cluster Data

Although creating backups of a cluster's configuration and data is not mandatory, we strongly recommend backing up important configuration files and cluster data. Refer to Chapter 3, Backing Up Cluster Configuration and Data for more details.

6.3 Migrate from ntpd to chronyd

SUSE Linux Enterprise Server 15 SP1 no longer uses ntpd to synchronize the local host time. Instead, chronyd is used. You need to migrate the time synchronization daemon on each cluster node. You can either migrate to chronyd before upgrading the cluster, or upgrade the cluster first and migrate to chronyd afterward.

Warning

Before you continue, review your current ntpd settings and determine whether you want to keep using the same time servers. Keep in mind that, by default, the upgrade switches the system to chronyd.

If you want to manually maintain the chronyd configuration, follow the instructions below and ensure you disable ntpd time configuration. See Procedure 7.1, “Disabling Time Synchronization” for more information.

Procedure 6.1: Migrate to chronyd before the Cluster Upgrade
  1. Install the chrony package:

    root@minion > zypper install chrony
  2. Edit the chronyd configuration file /etc/chrony.conf and add NTP sources from the current ntpd configuration in /etc/ntp.conf (an example configuration follows this procedure).

    Tip: More Details on chronyd Configuration

    Refer to https://documentation.suse.com/sles/15-SP1/html/SLES-all/cha-ntp.html to find more details about how to include time sources in chronyd configuration.

  3. Disable and stop the ntpd service:

    root@minion > systemctl disable ntpd.service && systemctl stop ntpd.service
  4. Start and enable the chronyd service:

    root@minion > systemctl start chronyd.service && systemctl enable chronyd.service
  5. Verify the status of chrony:

    root@minion > chronyc tracking
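
The NTP source entries in /etc/chrony.conf (see step 2 of Procedure 6.1) may end up looking similar to the following sketch. The server names are placeholders; copy the actual server or pool entries from your /etc/ntp.conf:

# time sources taken over from /etc/ntp.conf
server ntp1.example.com iburst
server ntp2.example.com iburst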
Procedure 6.2: Migrate to chronyd after the Cluster Upgrade
  1. During cluster upgrade, add the following software repositories:

    • SLE-Module-Legacy15-SP1-Pool

    • SLE-Module-Legacy15-SP1-Updates

  2. Upgrade the cluster to version 6.

  3. Edit the chronyd configuration file /etc/chrony.conf and add NTP sources from the current ntpd configuration in /etc/ntp.conf.

    Tip: More Details on chronyd Configuration

    Refer to https://documentation.suse.com/sles/15-SP1/html/SLES-all/cha-ntp.html to find more details about how to include time sources in chronyd configuration.

  4. Disable and stop the ntpd service:

    root@minion > systemctl disable ntpd.service && systemctl stop ntpd.service
  5. Start and enable the chronyd service:

    root@minion > systemctl start chronyd.service && systemctl enable chronyd.service
  6. Migrate from ntpd to chronyd.

  7. Verify the status of chrony:

    root@minion > chronyc tracking
  8. Remove the legacy software repositories that you added to keep ntpd in the system during the upgrade process.

6.4 Patch Cluster Prior to Upgrade

Apply the latest patches to all cluster nodes prior to upgrade.

6.4.1 Required Software Repositories

Check that required repositories are configured on each host of the cluster. To list all available repositories, run

root@minion > zypper lr
Important: Remove SUSE Enterprise Storage 5.5 LTSS Repositories

Upgrades will fail if LTSS repositories are configured in SUSE Enterprise Storage 5.5. Find their IDs and remove them from the system; for example:

root # zypper lr
[...]
12 | SUSE_Linux_Enterprise_Server_LTSS_12_SP3_x86_64:SLES12-SP3-LTSS-Debuginfo-Updates
13 | SUSE_Linux_Enterprise_Server_LTSS_12_SP3_x86_64:SLES12-SP3-LTSS-Updates
[...]
root # zypper rr 12 13
Tip: Upgrade Without Using SCC, SMT, or RMT

If your nodes are not subscribed to one of the supported software channel providers that handle automatic channel adjustment, such as SMT, RMT, or SCC, you may need to enable the additional software modules and channels manually (an example zypper command follows the repository lists below).

SUSE Enterprise Storage 5.5 requires:

  • SLES12-SP3-Installer-Updates

  • SLES12-SP3-Pool

  • SLES12-SP3-Updates

  • SUSE-Enterprise-Storage-5-Pool

  • SUSE-Enterprise-Storage-5-Updates

NFS/SMB Gateway on SLE-HA on SUSE Linux Enterprise Server 12 SP3 requires:

  • SLE-HA12-SP3-Pool

  • SLE-HA12-SP3-Updates
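
If you need to add any of these repositories manually, you can do so with zypper ar, as sketched below (the URL and alias are placeholders; use the path of your local mirror or installation media):

root@minion > zypper ar -f http://repo.example.com/SUSE-Enterprise-Storage-5-Updates/ SES5-Updates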

6.4.2 Repository Staging Systems

If you are using one of the repository staging systems (SMT or RMT), create a new frozen patch level for the current and the new SUSE Enterprise Storage version.

Find more information in the SMT and RMT documentation.

6.4.3 Patch the Whole Cluster to the Latest Patches

  1. Apply the latest patches of SUSE Enterprise Storage 5.5 and SUSE Linux Enterprise Server 12 SP3 to each Ceph cluster node. Verify that correct software repositories are connected to each cluster node (see Section 6.4.1, “Required Software Repositories”) and run DeepSea stage 0:

    root@master # salt-run state.orch ceph.stage.0
  2. After stage 0 completes, verify that each cluster node reports 'HEALTH_OK'. If not, resolve the problem before the possible reboots in the next steps.

  3. Run zypper ps to check for processes that may run with outdated libraries or binaries, and reboot if there are any.

  4. Verify that the running kernel is the latest available and reboot if not. Check outputs of the following commands:

    cephadm@adm > uname -a
    cephadm@adm > rpm -qa kernel-default
  5. Verify that the ceph package is version 12.2.12 or newer, and that the deepsea package is version 0.8.9 or newer (see the example check after this procedure).

  6. If you previously used any of the bluestore_cache settings, be aware that they are no longer effective as of Ceph version 12.2.10. The new setting bluestore_cache_autotune, which is set to 'true' by default, disables manual cache sizing. To turn on the old behavior, you need to set bluestore_cache_autotune=false. Refer to Section 14.2.1, “Automatic Cache Sizing” for details.
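
To check the package versions required in step 5, a quick verification such as the following can be used (rpm -q reports the installed version on the node where it is run):

root@master # rpm -q deepsea
root@minion > rpm -q ceph
cephadm@adm > ceph versions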

6.5 Verify the Current Environment

  • If the system has obvious problems, fix them before starting the upgrade. Upgrading never fixes existing system problems.

  • Check cluster performance. You can use commands such as rados bench, ceph tell osd.* bench, or iperf3 (see the example after this list).

  • Verify access to gateways (such as iSCSI Gateway or Object Gateway) and RADOS Block Device.

  • Document specific parts of the system setup, such as network setup, partitioning, or installation details.

  • Use supportconfig to collect important system information and save it outside cluster nodes. Find more information in https://documentation.suse.com/sles/12-SP5/single-html/SLES-admin/#sec-admsupport-supportconfig.

  • Ensure there is enough free disk space on each cluster node. Check free disk space with df -h. When needed, free disk space by removing unneeded files/directories or removing obsolete OS snapshots. If there is not enough free disk space, do not continue with the upgrade until you have freed enough disk space.
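
For example, a simple performance baseline with rados bench could look like the following sketch. POOL_NAME is a placeholder for an existing test pool, each benchmark runs for 10 seconds, and --no-cleanup keeps the written objects so that the subsequent read test has data to work with:

cephadm@adm > rados bench -p POOL_NAME 10 write --no-cleanup
cephadm@adm > rados bench -p POOL_NAME 10 seq
cephadm@adm > rados -p POOL_NAME cleanup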

6.6 Check the Cluster's State

  • Check the cluster health before starting the upgrade procedure. Do not start the upgrade unless each cluster node reports 'HEALTH_OK'.

  • Verify that all services are running:

    • Salt master and Salt minion daemons.

    • Ceph Monitor and Ceph Manager daemons.

    • Metadata Server daemons.

    • Ceph OSD daemons.

    • Object Gateway daemons.

    • iSCSI Gateway daemons.

The following commands provide details of the cluster state and specific configuration:

ceph -s

Prints a brief summary of Ceph cluster health, running services, data usage, and I/O statistics. Verify that it reports 'HEALTH_OK' before starting the upgrade.

ceph health detail

Prints details if Ceph cluster health is not OK.

ceph versions

Prints versions of running Ceph daemons.

ceph df

Prints total and free disk space on the cluster. Do not start the upgrade if the cluster's free disk space is less than 25% of the total disk space.

salt '*' cephprocesses.check results=true

Prints running Ceph processes and their PIDs sorted by Salt minions.

ceph osd dump | grep ^flags

Verify that the 'recovery_deletes' and 'purged_snapdirs' flags are present. If not, you can force a scrub on all placement groups by running the following command. Be aware that this forced scrub may have a negative impact on your Ceph clients' performance.

cephadm@adm > ceph pg dump pgs_brief | cut -d " " -f 1 | xargs -n1 ceph pg scrub

6.7 Offline Upgrade of CTDB Clusters

CTDB provides a clustered database used by Samba Gateways. The CTDB protocol does not support clusters of nodes communicating with different protocol versions. Therefore, CTDB nodes need to be taken offline prior to performing a SUSE Enterprise Storage upgrade.

CTDB refuses to start when nodes run incompatible versions alongside each other. For example, a SUSE Enterprise Storage 6 CTDB instance will fail to start while SUSE Enterprise Storage 5.5 CTDB instances are still running.

To take CTDB offline, stop the SLE-HA cloned CTDB resource. For example:

root@master # crm resource stop cl-ctdb

This will stop the resource across all gateway nodes assigned to the cloned resource. Verify that all the services are stopped by running the following command:

root@master # crm status
Note

Ensure CTDB is brought offline before the CTDB and Samba Gateway packages are upgraded from SUSE Enterprise Storage 5.5 to SUSE Enterprise Storage 6. SLE-HA may also specify requirements for the upgrade of the underlying pacemaker/Linux-HA cluster; these should be tracked separately.

The SLE-HA cloned CTDB resource can be restarted once the new packages have been installed on all Samba Gateway nodes and the underlying pacemaker/Linux-HA cluster is up. To restart the CTDB resource run the following command:

root@master # crm resource start cl-ctdb

6.8 Per Node Upgrade—Basic Procedure

To ensure the core cluster services remain available during the upgrade, you need to upgrade the cluster nodes one at a time. There are two ways you can perform the upgrade of a node: either using the installer DVD or using the distribution migration system.

After upgrading each node, we recommend running rpmconfigcheck to check for any updated configuration files that have been edited locally. If the command returns a list of file names with a suffix .rpmnew, .rpmorig, or .rpmsave, compare these files against the current configuration files to ensure that no local changes have been lost. If necessary, update the affected files. For more information on working with .rpmnew, .rpmorig, and .rpmsave files, refer to https://documentation.suse.com/sles/15-SP1/single-html/SLES-admin/#sec-rpm-packages-manage.

Tip: Orphaned Packages

After a node is upgraded, a number of packages will be in an 'orphaned' state without a parent repository. This happens because python3-related packages do not obsolete their python2 counterparts.

Find more information about listing orphaned packages in https://documentation.suse.com/sles/12-SP5/single-html/SLES-admin/#sec-zypper-softup-orphaned.
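
For example, to list the orphaned packages on an upgraded node:

root@minion > zypper packages --orphaned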

6.8.1 Manual Node Upgrade Using the Installer DVD

  1. Reboot the node from the SUSE Linux Enterprise Server 15 SP1 installer DVD/image.

  2. On the YaST command line, add the option YAST_ACTIVATE_LUKS=0. This option ensures that no password is requested for encrypted disks.

    Warning

    This must not be enabled by default, as it would break full-disk encryption on the system disk or on part of the system disk. This parameter only works if it is provided to the installer; if it is not provided, you will be prompted for an encryption password for each individual disk partition.

    This is only supported since the 3rd Quarterly Update of SLES 15 SP1. You need SLE-15-SP1-Installer-DVD-*-QU3-DVD1.iso media or newer.

  3. Select Upgrade from the boot menu.

  4. On the Select the Migration Target screen, verify that 'SUSE Linux Enterprise Server 15 SP1' is selected and activate the Manually Adjust the Repositories for Migration check box.

    Figure 6.1: Select the Migration Target
  5. Select the following modules to install:

    • SUSE Enterprise Storage 6 x86_64

    • Basesystem Module 15 SP1 x86_64

    • Desktop Applications Module 15 SP1 x86_64

    • Legacy Module 15 SP1 x86_64

    • Server Applications Module 15 SP1 x86_64

  6. On the Previously Used Repositories screen, verify that the correct repositories are selected. If the system is not registered with SCC/SMT, you need to add the repositories manually.

    SUSE Enterprise Storage 6 requires:

    • SLE-Module-Basesystem15-SP1-Pool

    •  SLE-Module-Basesystem15-SP1-Updates

    •  SLE-Module-Server-Applications15-SP1-Pool

    •  SLE-Module-Server-Applications15-SP1-Updates

    • SLE-Module-Desktop-Applications15-SP1-Pool

    • SLE-Module-Desktop-Applications15-SP1-Updates

    •  SLE-Product-SLES15-SP1-Pool

    •  SLE-Product-SLES15-SP1-Updates

    •  SLE15-SP1-Installer-Updates

    •  SUSE-Enterprise-Storage-6-Pool

    •  SUSE-Enterprise-Storage-6-Updates

    If you intend to migrate from ntpd to chronyd after the SES migration (refer to Section 6.3, “Migrate from ntpd to chronyd”), include the following repositories:

    • SLE-Module-Legacy15-SP1-Pool

    • SLE-Module-Legacy15-SP1-Updates

    NFS/SMB Gateway on SLE-HA on SUSE Linux Enterprise Server 15 SP1 requires:

    • SLE-Product-HA15-SP1-Pool

    • SLE-Product-HA15-SP1-Updates

  7. Review the Installation Settings and start the installation procedure by clicking Update.

6.8.2 Node Upgrade Using the SUSE Distribution Migration System

The Distribution Migration System (DMS) provides an upgrade path for an installed SUSE Linux Enterprise system from one major version to another. The following procedure utilizes DMS to upgrade SUSE Enterprise Storage 5.5 to version 6, including the underlying SUSE Linux Enterprise Server 12 SP3 to SUSE Linux Enterprise Server 15 SP1 migration.

Refer to https://documentation.suse.com/suse-distribution-migration-system/1.0/single-html/distribution-migration-system/ to find both general and detailed information about DMS.

  1. Install the migration RPM packages. They adjust the GRUB boot loader to automatically trigger the upgrade on next reboot. Install the SLES15-SES-Migration and suse-migration-sle15-activation packages:

    root@minion > zypper install SLES15-SES-Migration suse-migration-sle15-activation
    1. If the node being upgraded is registered with a repository staging system such as SCC, SMT, RMT, or SUSE Manager, create the /etc/sle-migration-service.yml with the following content:

      use_zypper_migration: true
      preserve:
        rules:
          - /etc/udev/rules.d/70-persistent-net.rules
    2. If the node being upgraded is not registered with a repository staging system such as SCC, SMT, RMT, or SUSE Manager, perform the following changes:

      1. Create the /etc/sle-migration-service.yml with the following content:

        use_zypper_migration: false
        preserve:
          rules:
            - /etc/udev/rules.d/70-persistent-net.rules
      2. Disable or remove the SLE 12 SP3 and SES 5 repos, and add the SLE 15 SP1 and SES6 repos. Find the list of related repositories in Section 6.4.1, “Required Software Repositories”.

  2. Reboot to start the upgrade. While the upgrade is running, you can log in to the upgraded node via ssh as the migration user using the existing SSH key from the host system as described in https://documentation.suse.com/suse-distribution-migration-system/1.0/single-html/distribution-migration-system/. For SUSE Enterprise Storage, if you have physical access or direct console access to the machine, you can also log in as root on the system console using the password sesupgrade. The node will reboot automatically after the upgrade.

    Tip: Upgrade Failure

    If the upgrade fails, inspect /var/log/distro_migration.log. Fix the problem, re-install the migration RPM packages, and reboot the node.

6.9 Upgrade the Admin Node

Tip: Status of Cluster Nodes

After the Admin Node is upgraded, you can run the salt-run upgrade.status command to view useful information about cluster nodes. The command lists the Ceph and OS versions of all nodes, and recommends the order in which to upgrade any nodes that are still running old versions.

root@master # salt-run upgrade.status
The newest installed software versions are:
  ceph: ceph version 14.2.1-468-g994fd9e0cc (994fd9e0ccc50c2f3a55a3b7a3d4e0ba74786d50) nautilus (stable)
  os: SUSE Linux Enterprise Server 15 SP1

Nodes running these software versions:
  admin.ceph (assigned roles: master)
  mon2.ceph (assigned roles: admin, mon, mgr)

Nodes running older software versions must be upgraded in the following order:
   1: mon1.ceph (assigned roles: admin, mon, mgr)
   2: mon3.ceph (assigned roles: admin, mon, mgr)
   3: data1.ceph (assigned roles: storage)
[...]

6.10 Upgrade Ceph Monitor/Ceph Manager Nodes

  • If your cluster does not use MDS roles, upgrade MON/MGR nodes one by one.

  • If your cluster uses MDS roles, and MON/MGR and MDS roles are co-located, you need to shrink the MDS cluster and then upgrade the co-located nodes. Refer to Section 6.11, “Upgrade Metadata Servers” for more details.

  • If your cluster uses MDS roles and they run on dedicated servers, upgrade all MON/MGR nodes one by one, then shrink the MDS cluster and upgrade it. Refer to Section 6.11, “Upgrade Metadata Servers” for more details.

Note: Ceph Monitor Upgrade

Due to a limitation in the Ceph Monitor design, once two MONs have been upgraded to SUSE Enterprise Storage 6 and have formed a quorum, the third MON (while still on SUSE Enterprise Storage 5.5) will not rejoin the MON cluster if it is restarted for any reason, including a node reboot. Therefore, when two MONs have been upgraded, it is best to upgrade the rest as soon as possible.

Use the procedure described in Section 6.8, “Per Node Upgrade—Basic Procedure”.

6.11 Upgrade Metadata Servers

You need to shrink the Metadata Server (MDS) cluster. Because of incompatible features between the SUSE Enterprise Storage 5.5 and 6 versions, the older MDS daemons will shut down as soon as they see a single SES 6 level MDS join the cluster. Therefore it is necessary to shrink the MDS cluster to a single active MDS (and no standbys) for the duration of the MDS node upgrades. As soon as the second node is upgraded, you can extend the MDS cluster again.

Tip

On a heavily loaded MDS cluster, you may need to reduce the load (for example by stopping clients) so that a single active MDS is able to handle the workload.

  1. Note the current value of the max_mds option:

    cephadm@adm > ceph fs get cephfs | grep max_mds
  2. Shrink the MDS cluster if you have more than one active MDS daemon, that is, if max_mds is greater than 1. To shrink the MDS cluster, run

    cephadm@adm > ceph fs set FS_NAME max_mds 1

    where FS_NAME is the name of your CephFS instance ('cephfs' by default).

  3. Find the node hosting one of the standby MDS daemons. Consult the output of the ceph fs status command and start the upgrade of the MDS cluster on this node.

    cephadm@adm > ceph fs status
    cephfs - 2 clients
    ======
    +------+--------+--------+---------------+-------+-------+
    | Rank | State  |  MDS   |    Activity   |  dns  |  inos |
    +------+--------+--------+---------------+-------+-------+
    |  0   | active | mon1-6 | Reqs:    0 /s |   13  |   16  |
    +------+--------+--------+---------------+-------+-------+
    +-----------------+----------+-------+-------+
    |       Pool      |   type   |  used | avail |
    +-----------------+----------+-------+-------+
    | cephfs_metadata | metadata | 2688k | 96.8G |
    |   cephfs_data   |   data   |    0  | 96.8G |
    +-----------------+----------+-------+-------+
    +-------------+
    | Standby MDS |
    +-------------+
    |    mon3-6   |
    |    mon2-6   |
    +-------------+

    In this example, you need to start the upgrade procedure either on node 'mon3-6' or 'mon2-6'.

  4. Upgrade the node with the standby MDS daemon. After the upgraded MDS node starts, the outdated MDS daemons will shut down automatically. At this point, clients may experience a short downtime of the CephFS service.

    Use the procedure described in Section 6.8, “Per Node Upgrade—Basic Procedure”.

  5. Upgrade the remaining MDS nodes.

  6. Reset max_mds to the desired configuration:

    cephadm@adm > ceph fs set FS_NAME max_mds ACTIVE_MDS_COUNT

6.12 Upgrade Ceph OSDs

For each storage node, follow these steps:

  1. Identify which OSD daemons are running on a particular node:

    cephadm@adm > ceph osd tree
  2. Set the noout flag for each OSD daemon on the node that is being upgraded:

    cephadm@adm > ceph osd add-noout osd.OSD_ID

    For example:

    cephadm@adm > for i in $(ceph osd ls-tree OSD_NODE_NAME);do echo "osd: $i"; ceph osd add-noout osd.$i; done

    Verify with:

    cephadm@adm > ceph health detail | grep noout

    or

    cephadm@adm > ceph -s
    cluster:
     id:     44442296-033b-3275-a803-345337dc53da
     health: HEALTH_WARN
          6 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set
  3. Create /etc/ceph/osd/*.json files for all existing OSDs by running the following command on the node that is going to be upgraded:

    cephadm@osd > ceph-volume simple scan --force
  4. Upgrade the OSD node. Use the procedure described in Section 6.8, “Per Node Upgrade—Basic Procedure”.

  5. Activate all OSDs found in the system:

    cephadm@osd > ceph-volume simple activate --all
    Tip: Activating Data Partitions Individually

    If you want to activate data partitions individually, you need to find the correct ceph-volume command for each partition to activate it. Replace X1 with the partition's correct letter/number:

     cephadm@osd > ceph-volume simple scan /dev/sdX1

    For example:

    cephadm@osd > ceph-volume simple scan /dev/vdb1
    [...]
    --> OSD 8 got scanned and metadata persisted to file:
    /etc/ceph/osd/8-d7bd2685-5b92-4074-8161-30d146cd0290.json
    --> To take over management of this scanned OSD, and disable ceph-disk
    and udev, run:
    -->     ceph-volume simple activate 8 d7bd2685-5b92-4074-8161-30d146cd0290

    The last line of the output contains the command to activate the partition:

    cephadm@osd > ceph-volume simple activate 8 d7bd2685-5b92-4074-8161-30d146cd0290
    [...]
    --> All ceph-disk systemd units have been disabled to prevent OSDs
    getting triggered by UDEV events
    [...]
    Running command: /bin/systemctl start ceph-osd@8
    --> Successfully activated OSD 8 with FSID
    d7bd2685-5b92-4074-8161-30d146cd0290
  6. Verify that the OSD node will start properly after the reboot.

  7. Address the 'Legacy BlueStore stats reporting detected on XX OSD(s)' message:

    cephadm@adm > ceph -s
    cluster:
     id:     44442296-033b-3275-a803-345337dc53da
     health: HEALTH_WARN
     Legacy BlueStore stats reporting detected on 6 OSD(s)

    The warning is normal when upgrading Ceph to 14.2.2. You can disable it by setting:

    bluestore_warn_on_legacy_statfs = false

    The proper fix is to run the following command on all OSDs while they are stopped:

    cephadm@osd > ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-XXX

    The following helper script runs ceph-bluestore-tool repair for all OSDs on the OSD_NODE_NAME node:

    cephadm@adm > OSDNODE=OSD_NODE_NAME;\
     for OSD in $(ceph osd ls-tree $OSDNODE);\
     do echo "osd=" $OSD;\
     salt $OSDNODE* cmd.run "systemctl stop ceph-osd@$OSD";\
     salt $OSDNODE* cmd.run "ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-$OSD";\
     salt $OSDNODE* cmd.run "systemctl start ceph-osd@$OSD";\
     done
  8. Unset the 'noout' flag for each OSD daemon on the node that has been upgraded:

    cephadm@adm > ceph osd rm-noout osd.OSD_ID

    For example:

    cephadm@adm > for i in $(ceph osd ls-tree OSD_NODE_NAME);do echo "osd: $i"; ceph osd rm-noout osd.$i; done

    Verify with:

    cephadm@adm > ceph health detail | grep noout

    Note:

    cephadm@adm > ceph -s
    cluster:
     id:     44442296-033b-3275-a803-345337dc53da
     health: HEALTH_WARN
     Legacy BlueStore stats reporting detected on 6 OSD(s)
  9. Verify the cluster status. It will be similar to the following output:

    cephadm@adm > ceph status
    cluster:
      id:     e0d53d64-6812-3dfe-8b72-fd454a6dcf12
      health: HEALTH_WARN
              3 monitors have not enabled msgr2
    
    services:
      mon: 3 daemons, quorum mon1,mon2,mon3 (age 2h)
      mgr: mon2(active, since 22m), standbys: mon1, mon3
      osd: 30 osds: 30 up, 30 in
    
    data:
      pools:   1 pools, 1024 pgs
      objects: 0 objects, 0 B
      usage:   31 GiB used, 566 GiB / 597 GiB avail
      pgs:     1024 active+clean
  10. Once the last OSD node has been upgraded, issue the following command:

    cephadm@adm > ceph osd require-osd-release nautilus

    This disallows OSDs from releases older than SUSE Enterprise Storage 6 (Nautilus) and enables all new SUSE Enterprise Storage 6 (Nautilus-only) OSD functionality.

  11. Enable the new v2 network protocol by issuing the following command:

    cephadm@adm > ceph mon enable-msgr2

    This instructs all monitors that bind to the old default port 6789 for the legacy v1 Messenger protocol to also bind to the new 3300 v2 protocol port. To see if all monitors have been updated, run:

    cephadm@adm > ceph mon dump

    Verify that each monitor has both a v2: and v1: address listed.

  12. Verify that all OSD nodes were rebooted and that OSDs started automatically after the reboot.

6.13 Final Steps

Perform the following steps to complete the upgrade:

  1. For each host that has been upgraded (OSD, MON, MGR, MDS, and Gateway nodes, as well as client hosts), update your ceph.conf file so that it either specifies no monitor port (if you are running the monitors on the default ports) or references both the v2 and v1 addresses and ports explicitly (see the example mon_host line after this procedure).

    Note

    Things will still work if only the v1 IP and port are listed, but each CLI instantiation or daemon will need to reconnect after learning that the monitors also speak the v2 protocol. This slows things down and prevents a full transition to the v2 protocol.

  2. Finally, consider enabling the Telemetry module to send anonymized usage statistics and crash information to the upstream Ceph developers. To see what would be reported (without actually sending any information to anyone):

    cephadm@adm > ceph mgr module enable telemetry
    cephadm@adm > ceph telemetry show

    If you are comfortable with the high-level cluster metadata that will be reported, you can opt-in to automatically report it:

    cephadm@adm > ceph telemetry on
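
For step 1, an explicit mon_host line referencing both protocols may look similar to the following sketch (the IP addresses are placeholders; 3300 is the v2 port and 6789 the legacy v1 port):

[global]
mon_host = [v2:192.168.10.1:3300,v1:192.168.10.1:6789],[v2:192.168.10.2:3300,v1:192.168.10.2:6789],[v2:192.168.10.3:3300,v1:192.168.10.3:6789]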

6.14 OSD Migration to BlueStore

OSD BlueStore is a new back-end for the OSD daemons. It is the default option since SUSE Enterprise Storage 5. Compared to FileStore, which stores objects as files in an XFS file system, BlueStore can deliver increased performance because it stores objects directly on the underlying block device. BlueStore also enables other features, such as built-in compression and EC overwrites, that are unavailable with FileStore.

Specifically for BlueStore, an OSD has a 'wal' (Write Ahead Log) device and a 'db' (RocksDB database) device. The RocksDB database holds the metadata for a BlueStore OSD. By default, these two devices reside on the same device as the OSD data, but either can be placed on different (for example, faster) media.

In SUSE Enterprise Storage 5, both FileStore and BlueStore are supported and it is possible for FileStore and BlueStore OSDs to co-exist in a single cluster. During the SUSE Enterprise Storage upgrade procedure, FileStore OSDs are not automatically converted to BlueStore. Be aware that the BlueStore-specific features will not be available on OSDs that have not been migrated to BlueStore.

Before converting to BlueStore, the OSDs need to be running SUSE Enterprise Storage 5. The conversion is a slow process as all data gets re-written twice. Though the migration process can take a long time to complete, there is no cluster outage and all clients can continue accessing the cluster during this period. However, do expect lower performance for the duration of the migration. This is caused by rebalancing and backfilling of cluster data.
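
To see how many OSDs still use FileStore and how many already use BlueStore, you can, for example, count the osd_objectstore field that ceph osd metadata reports for every OSD:

cephadm@adm > ceph osd metadata | grep '"osd_objectstore"' | sort | uniq -c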

Use the following procedure to migrate FileStore OSDs to BlueStore:

Tip: Turn Off Safety Measures

The Salt commands needed to run the migration are blocked by safety measures. To turn these precautions off, run the following command:

 root@master # salt-run disengage.safety

Rebuild the nodes before continuing:

 root@master #  salt-run rebuild.node TARGET

You can also choose to rebuild each node individually. For example:

root@master #  salt-run rebuild.node data1.ceph

The rebuild.node runner always removes and recreates all OSDs on the node.

Important

If one OSD fails to convert, re-running the rebuild destroys the already converted BlueStore OSDs. Instead of re-running the rebuild, you can run:

root@master # salt-run disks.deploy TARGET

After the migration to BlueStore, the object count will remain the same and disk usage will be nearly the same.

6.15 Upgrade Application Nodes

Upgrade application nodes in the following order:

  1. Object Gateways

    • If the Object Gateways are fronted by a load balancer, then a rolling upgrade of the Object Gateways should be possible without an outage.

    • Validate that the Object Gateway daemons are running after each upgrade, and test with an S3/Swift client.

    • Use the procedure described in Section 6.8, “Per Node Upgrade—Basic Procedure”.

  2. iSCSI Gateways

    Important: Package Dependency Conflict

    During package dependency resolution, you need to resolve a conflict caused by the patterns-ses-ceph_iscsi version mismatch.

    Figure 6.2: Dependency Conflict Resolution

    From the four presented solutions, choose to deinstall the patterns-ses-ceph_iscsi pattern. This way you keep the required lrbd package installed.

    • If iSCSI initiators are configured with multipath, then a rolling upgrade of the iSCSI Gateways should be possible without an outage.

    • Validate that the lrbd daemon is running after each upgrade, and test with an initiator.

    • Use the procedure described in Section 6.8, “Per Node Upgrade—Basic Procedure”.

  3. NFS Ganesha. Use the procedure described in Section 6.8, “Per Node Upgrade—Basic Procedure”.

  4. Samba Gateways. Use the procedure described in Section 6.8, “Per Node Upgrade—Basic Procedure”.

6.16 Update policy.cfg and Deploy Ceph Dashboard Using DeepSea

On the Admin Node, edit /srv/pillar/ceph/proposals/policy.cfg and apply the following changes:

Important: No New Services

During cluster upgrade, do not add new services to the policy.cfg file. Change the cluster architecture only after the upgrade is completed.

  1. Remove role-openattic.

  2. Add role-prometheus and role-grafana to the node that had Prometheus and Grafana installed, usually the Admin Node.

  3. The role profile-PROFILE_NAME is now ignored. Add a corresponding new role-storage line instead. For example, for the existing

    profile-default/cluster/*.sls

    add

    role-storage/cluster/*.sls
  4. Synchronize all Salt modules:

    root@master # salt '*' saltutil.sync_all
  5. Update the Salt pillar by running DeepSea stage 1 and stage 2:

    root@master # salt-run state.orch ceph.stage.1
    root@master # salt-run state.orch ceph.stage.2
  6. Clean up openATTIC:

    root@master # salt OA_MINION state.apply ceph.rescind.openattic
    root@master # salt OA_MINION state.apply ceph.remove.openattic
  7. Unset the restart_igw grain to prevent stage 0 from restarting iSCSI Gateway, which is not installed yet:

    root@master # salt '*' grains.delkey restart_igw
  8. Finally, run through DeepSea stages 0-4:

    root@master # salt-run state.orch ceph.stage.0
    root@master # salt-run state.orch ceph.stage.1
    root@master # salt-run state.orch ceph.stage.2
    root@master # salt-run state.orch ceph.stage.3
    root@master # salt-run state.orch ceph.stage.4
    Tip: 'subvolume missing' Errors during Stage 3

    DeepSea stage 3 may fail with an error similar to the following:

    subvolume : ['/var/lib/ceph subvolume missing on 4510-2', \
    '/var/lib/ceph subvolume missing on 4510-1', \
    [...]
    'See /srv/salt/ceph/subvolume/README.md']

    In this case, you need to edit /srv/pillar/ceph/stack/global.yml and add the following line:

    subvolume_init: disabled

    Then refresh the Salt pillar and re-run DeepSea stage.3:

    root@master # salt '*' saltutil.refresh_pillar
     root@master # salt-run state.orch ceph.stage.3

    After DeepSea has successfully finished stage 3, the Ceph Dashboard will be running. Refer to Chapter 20, Ceph Dashboard for a detailed overview of Ceph Dashboard features.

    To list the nodes running the dashboard, run:

    cephadm@adm > ceph mgr services | grep dashboard

    To list admin credentials, run:

    root@master # salt-call grains.get dashboard_creds
  9. Sequentially restart the Object Gateway services so that they use the 'beast' Web server instead of the outdated 'civetweb':

    root@master # salt-run state.orch ceph.restart.rgw.force
  10. Before you continue, we strongly recommend enabling the Ceph telemetry module. See Section 10.2, “Telemetry Module” for information and instructions.

6.17 Migration from Profile-based Deployments to DriveGroups

In SUSE Enterprise Storage 5.5, DeepSea offered so-called 'profiles' to describe the layout of your OSDs. Starting with SUSE Enterprise Storage 6, we moved to a different approach called DriveGroups (find more details in Section 5.5.2, “DriveGroups”).

Note

Migrating to the new approach is not immediately mandatory. Destructive operations, such as salt-run osd.remove, salt-run osd.replace, or salt-run osd.purge are still available. However, adding new OSDs will require your action.

Because of the different approach of these implementations, we do not offer an automated migration path. However, we offer a variety of tools—Salt runners—to make the migration as simple as possible.

6.17.1 Analyze the Current Layout

To view information about the currently deployed OSDs, use the following command:

root@master # salt-run disks.discover

Alternatively, you can inspect the content of the files in the /srv/pillar/ceph/proposals/profile-*/ directories. They have a similar structure to the following:

ceph:
  storage:
    osds:
      /dev/disk/by-id/scsi-drive_name:
        format: bluestore
      /dev/disk/by-id/scsi-drive_name2:
        format: bluestore

6.17.2 Create DriveGroups Matching the Current Layout

Refer to Section 5.5.2.1, “Specification” for more details on DriveGroups specification.

The difference between a fresh deployment and an upgrade scenario is that the drives to be migrated are already 'used'. Because

root@master # salt-run disks.list

looks for unused disks only, use

root@master # salt-run disks.list include_unavailable=True

Adjust the DriveGroups specification until it matches your current setup. For a more visual representation of what will happen, use the following command (note that it has no output if there are no free disks):

root@master # salt-run disks.report bypass_pillar=True
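
For example, a DriveGroups specification that simply turns all remaining available data disks into stand-alone BlueStore OSDs, mirroring the profile shown in Section 6.17.1, could look like the following sketch (assuming the drive_groups.yml location described in Section 5.5.2, /srv/salt/ceph/configuration/files/drive_groups.yml; adjust the target and add filters to match your environment):

drive_group_default:
  target: '*'
  data_devices:
    all: true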

If you verified that your DriveGroups are properly configured and want to apply the new approach, remove the files from the /srv/pillar/ceph/proposals/profile-PROFILE_NAME/ directory, remove the corresponding profile-PROFILE_NAME/cluster/*.sls lines from the /srv/pillar/ceph/proposals/policy.cfg file, and run DeepSea stage 2 to refresh the Salt pillar.

root@master # salt-run state.orch ceph.stage.2

Verify the result by running the following commands:

root@master # salt target_node pillar.get ceph:storage
root@master # salt-run disks.report
Warning: Incorrect DriveGroups Configuration

If your DriveGroups are not properly configured and there are spare disks in your setup, they will be deployed in the way you specified them. We recommend running:

root@master # salt-run disks.report

6.17.3 OSD Deployment

As of the Mimic release, the ceph-disk tool is deprecated, and as of the Nautilus release it is no longer shipped upstream.

ceph-disk is still supported in SUSE Enterprise Storage 6. Any pre-deployed ceph-disk OSDs will continue to function normally. However, when a disk breaks, there is no migration path: the OSD will need to be re-deployed.

For completeness, consider migrating OSDs on the whole node. There are two paths for SUSE Enterprise Storage 6 users:

  • Keep OSDs deployed with ceph-disk: the ceph-volume simple command provides a way to take over their management while disabling the ceph-disk triggers.

  • Re-deploy existing OSDs with ceph-volume. For more information on replacing your OSDs, see Section 2.8, “Replacing an OSD Disk”.

Tip: Migrate to LVM Format

Whenever a single legacy OSD needs to be replaced on a node, all OSDs that share devices with it need to be migrated to the LVM-based format.

6.17.4 More Complex Setups

If you have a more sophisticated setup than just stand-alone OSDs, for example dedicated WAL/DB devices or encrypted OSDs, the migration can only happen after all OSDs assigned to a given WAL/DB device have been removed. This is because the ceph-volume command creates Logical Volumes on disks before deployment, which prevents mixing partition-based deployments with LV-based deployments. In such cases, it is best to manually remove all OSDs that are assigned to a WAL/DB device and re-deploy them using the DriveGroups approach.
