
10 Upgrade from SUSE Enterprise Storage 6 to 7.1

This chapter describes the steps to upgrade SUSE Enterprise Storage 6 to version 7.1.

The upgrade includes the following tasks:

  • Upgrading from Ceph Nautilus to Pacific.

  • Switching from installing and running Ceph via RPM packages to running it in containers.

  • Completely removing DeepSea and replacing it with ceph-salt and cephadm.

Warning

The upgrade information in this chapter only applies to upgrades from DeepSea to cephadm. Do not attempt to follow these instructions if you want to deploy SUSE Enterprise Storage on SUSE CaaS Platform.

Important

Upgrading from SUSE Enterprise Storage versions older than 6 is not supported. First, you must upgrade to the latest version of SUSE Enterprise Storage 6, and then follow the steps in this chapter.
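
To confirm your starting point, you can check which release the Ceph daemons are running and which OS version a node is on. On a fully updated SUSE Enterprise Storage 6 cluster, the daemons report a 14.2.x (Nautilus) version:

cephuser@adm > ceph versions
root@master # grep PRETTY_NAME /etc/os-release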

10.1 Before upgrading

The following tasks must be completed before you start the upgrade. They can be done at any time during the SUSE Enterprise Storage 6 lifetime.

10.1.1 Points to consider

Before upgrading, read through the following sections to make sure you understand all the tasks that need to be executed.

  • Read the release notes. In them, you can find additional information on changes since the previous release of SUSE Enterprise Storage. Check the release notes to see whether:

    • Your hardware needs special considerations.

    • Any used software packages have changed significantly.

    • Special precautions are necessary for your installation.

    The release notes also provide information that could not make it into the manual in time, as well as notes about known issues.

    You can find SES 7.1 release notes online at https://www.suse.com/releasenotes/.

    Additionally, after installing the package release-notes-ses from the SES 7.1 repository, you can find the release notes locally in the directory /usr/share/doc/release-notes.

  • Read Part II, “Deploying Ceph Cluster” to familiarize yourself with ceph-salt and the Ceph orchestrator, and in particular the information on service specifications.

  • The cluster upgrade may take a long time—approximately the time it takes to upgrade one machine multiplied by the number of cluster nodes.

  • You need to upgrade the Salt Master first, then replace DeepSea with ceph-salt and cephadm. You will not be able to start using the cephadm orchestrator module until at least all Ceph Manager nodes are upgraded.

  • The upgrade from using Nautilus RPMs to Pacific containers needs to happen in a single step. This means upgrading an entire node at a time, not one daemon at a time.

  • The upgrade of core services (MON, MGR, OSD) happens in an orderly fashion. Each service is available during the upgrade. The gateway services (Metadata Server, Object Gateway, NFS Ganesha, iSCSI Gateway) need to be redeployed after the core services are upgraded. There is a certain amount of downtime for each of the following services:

    • Important

      Metadata Servers and Object Gateways are down from the time the nodes are upgraded from SUSE Linux Enterprise Server 15 SP1 to SUSE Linux Enterprise Server 15 SP3 until the services are redeployed at the end of the upgrade procedure. This is particularly important to bear in mind if these services are colocated with MONs, MGRs, or OSDs, because in that case they may be down for the duration of the whole cluster upgrade. If this is going to be a problem, consider deploying these services separately on additional nodes before upgrading, so that their downtime is limited to the duration of the gateway node upgrades rather than that of the entire cluster.

    • NFS Ganesha and iSCSI Gateways are down only while nodes are rebooting during the upgrade from SUSE Linux Enterprise Server 15 SP1 to SUSE Linux Enterprise Server 15 SP3, and again briefly when each service is redeployed in containerized mode.

10.1.2 Backing up cluster configuration and data

We strongly recommend backing up all cluster configuration and data before starting your upgrade to SUSE Enterprise Storage 7.1. For instructions on how to back up all your data, see Chapter 15, Backup and restore.

10.1.3 Verifying steps from the previous upgrade

If you previously upgraded from version 5, verify that the upgrade to version 6 completed successfully:

Check for the existence of the /srv/salt/ceph/configuration/files/ceph.conf.import file.

This file is created by the engulf process during the upgrade from SUSE Enterprise Storage 5 to 6. The configuration_init: default-import option is set in /srv/pillar/ceph/proposals/config/stack/default/ceph/cluster.yml.

If configuration_init is still set to default-import, the cluster is using ceph.conf.import as its configuration file and not DeepSea's default ceph.conf, which is compiled from files in /srv/salt/ceph/configuration/files/ceph.conf.d/.

Therefore, you need to inspect ceph.conf.import for any custom configuration, and possibly move the configuration to one of the files in /srv/salt/ceph/configuration/files/ceph.conf.d/.

Then remove the configuration_init: default-import line from /srv/pillar/ceph/proposals/config/stack/default/ceph/cluster.yml.
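
A minimal sketch of these checks, assuming the default DeepSea paths:

root@master # ls -l /srv/salt/ceph/configuration/files/ceph.conf.import
root@master # grep configuration_init /srv/pillar/ceph/proposals/config/stack/default/ceph/cluster.yml

If the grep command reports configuration_init: default-import, migrate any custom settings from ceph.conf.import as described above before removing the line.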

10.1.4 Updating cluster nodes and verifying cluster health

Verify that all latest updates of SUSE Linux Enterprise Server 15 SP1 and SUSE Enterprise Storage 6 are applied to all cluster nodes:

# zypper refresh && zypper patch
Tip

Refer to https://documentation.suse.com/ses/6/html/ses-all/storage-salt-cluster.html#deepsea-rolling-updates for detailed information about updating the cluster nodes.

After updates are applied, restart the Salt Master, synchronize new Salt modules, and check the cluster health:

root@master # systemctl restart salt-master.service
root@master # salt '*' saltutil.sync_all
cephuser@adm > ceph -s

10.1.4.1 Disable insecure clients

Nautilus v14.2.20 introduced a health warning that informs you when insecure clients are allowed to join the cluster. The warning is on by default. The Ceph Dashboard shows the cluster in the HEALTH_WARN status, and checking the cluster status on the command line reports the following:

 cephuser@adm > ceph status
 cluster:
   id:     3fe8b35a-689f-4970-819d-0e6b11f6707c
   health: HEALTH_WARN
   mons are allowing insecure global_id reclaim
 [...]

This warning means that the Ceph Monitors are still allowing old, unpatched clients to connect to the cluster. This ensures existing clients can still connect while the cluster is being upgraded, but warns you that there is a problem that needs to be addressed. When the cluster and all clients are upgraded to the latest version of Ceph, disallow unpatched clients by running the following command:

cephuser@adm > ceph config set mon auth_allow_insecure_global_id_reclaim false
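
After the option is set and all clients are patched, the warning should clear. A quick check:

cephuser@adm > ceph health
HEALTH_OK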

10.1.4.2 Disable FSMap sanity check

Before you start upgrading cluster nodes, disable the FSMap sanity check:

cephuser@adm > ceph config set mon mon_mds_skip_sanity true

10.1.5 Verifying access to software repositories and container images

Verify that each cluster node has access to the SUSE Linux Enterprise Server 15 SP3 and SUSE Enterprise Storage 7.1 software repositories, as well as the registry of container images.

10.1.5.1 Software repositories

If all nodes are registered with SCC, you will be able to use the zypper migration command to upgrade. Refer to https://documentation.suse.com/sles/15-SP3/html/SLES-all/cha-upgrade-online.html#sec-upgrade-online-zypper for more details.

If nodes are not registered with SCC, disable all existing software repositories and add both the Pool and Updates repositories for each of the following extensions (a sketch of the commands follows the list):

  • SLE-Product-SLES/15-SP3

  • SLE-Module-Basesystem/15-SP3

  • SLE-Module-Server-Applications/15-SP3

  • SUSE-Enterprise-Storage-7.1
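
As a sketch only, assuming a local RMT or media server: REPO_SERVER and the repository paths below are placeholders that you must replace with URLs valid for your environment. One Pool/Updates pair is shown; repeat for each extension listed above:

# zypper mr --all --disable
# zypper ar -f http://REPO_SERVER/SUSE/Products/Storage/7.1/x86_64/product/ SES-7.1-Pool
# zypper ar -f http://REPO_SERVER/SUSE/Updates/Storage/7.1/x86_64/update/ SES-7.1-Updates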

10.1.5.2 Container images

All cluster nodes need access to the container image registry. In most cases, you will use the public SUSE registry at registry.suse.com. You need the following images:

  • registry.suse.com/ses/7.1/ceph/ceph

  • registry.suse.com/ses/7.1/ceph/grafana

  • registry.suse.com/ses/7.1/ceph/prometheus-server

  • registry.suse.com/ses/7.1/ceph/prometheus-node-exporter

  • registry.suse.com/ses/7.1/ceph/prometheus-alertmanager

Alternatively—for example, for air-gapped deployments—configure a local registry and verify that you have the correct set of container images available. Refer to Section 7.2.10, “Using the container registry” for more details about configuring a local container image registry.
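
To verify that a node can actually pull the Ceph image from the configured registry, you can run a test pull with podman (substitute your local registry URL if you are not using registry.suse.com):

# podman pull registry.suse.com/ses/7.1/ceph/ceph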

10.2 Upgrading the Salt Master

The following procedure describes the process of upgrading the Salt Master:

Important

Before continuing, ensure you have followed the steps in Section 7.2.10.2, “Configuring the path to container images”. Without this configuration, the podman image pull fails in cephadm, even though it may succeed when run manually in a terminal session that has the https_proxy and http_proxy environment variables set.

  1. Upgrade the underlying OS to SUSE Linux Enterprise Server 15 SP3:

    • If all the cluster's nodes are registered with SCC, run zypper migration.

    • If the cluster's nodes have software repositories assigned manually, run zypper dup followed by reboot.

  2. Disable the DeepSea stages to avoid accidental use. Add the following content to /srv/pillar/ceph/stack/global.yml:

    stage_prep: disabled
    stage_discovery: disabled
    stage_configure: disabled
    stage_deploy: disabled
    stage_services: disabled
    stage_remove: disabled

    Save the file and apply the changes:

    root@master # salt '*' saltutil.refresh_pillar
  3. If you are not using container images from registry.suse.com but rather the locally configured registry, edit /srv/pillar/ceph/stack/global.yml to inform DeepSea which Ceph container image and registry to use. For example, to use 192.168.121.1:5000/my/ceph/image add the following lines:

    ses7_container_image: 192.168.121.1:5000/my/ceph/image
    ses7_container_registries:
      - location: 192.168.121.1:5000

    If you need to specify authentication information for the registry, add the ses7_container_registry_auth: block, for example:

    ses7_container_image: 192.168.121.1:5000/my/ceph/image
    ses7_container_registries:
      - location: 192.168.121.1:5000
    ses7_container_registry_auth:
      registry: 192.168.121.1:5000
      username: USER_NAME
      password: PASSWORD

    Save the file and apply the changes:

    root@master # salt '*' saltutil.refresh_pillar
  4. Assimilate existing configuration:

    cephuser@adm > ceph config assimilate-conf -i /etc/ceph/ceph.conf
  5. Verify the upgrade status. Your output may differ depending on your cluster configuration:

    root@master # salt-run upgrade.status
    The newest installed software versions are:
     ceph: ceph version 16.2.7-640-gceb23c7491b (ceb23c7491bd96ab7956111374219a4cdcf6f8f4) pacific (stable)
     os: SUSE Linux Enterprise Server 15 SP3
    
    Nodes running these software versions:
     admin.ceph (assigned roles: master, prometheus, grafana)
    
    Nodes running older software versions must be upgraded in the following order:
     1: mon1.ceph (assigned roles: admin, mon, mgr)
     2: mon2.ceph (assigned roles: admin, mon, mgr)
     3: mon3.ceph (assigned roles: admin, mon, mgr)
     4: data4.ceph (assigned roles: storage, mds)
     5: data1.ceph (assigned roles: storage)
     6: data2.ceph (assigned roles: storage)
     7: data3.ceph (assigned roles: storage)
     8: data5.ceph (assigned roles: storage, rgw)

10.3 Upgrading the MON, MGR, and OSD nodes

Upgrade the Ceph Monitor, Ceph Manager, and OSD nodes one at a time. For each service, follow these steps:

  1. Before adopting any OSD node, you need to perform a format conversion of the OSDs to improve the accounting for OMAP data. Do so by running the following commands on the Admin Node:

    cephuser@adm > cephadm unit --name osd.OSD_DAEMON_ID stop
    cephuser@adm > cephadm shell --name osd.OSD_DAEMON_ID ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-OSD_DAEMON_ID --command quick-fix
    cephuser@adm > cephadm unit --name osd.OSD_DAEMON_ID start

    The conversion may take minutes to hours, depending on how much OMAP data the related disk contains. For more details, refer to https://docs.ceph.com/en/latest/releases/pacific/#upgrading-non-cephadm-clusters.

    Tip

    You can run the above commands in parallel on multiple OSD daemons on the same OSD node to help accelerate the upgrade. A sketch of such a loop follows this procedure.

  2. If the node you are upgrading is an OSD node, avoid having the OSD marked out during the upgrade by running the following command:

    cephuser@adm > ceph osd add-noout SHORT_NODE_NAME

    Replace SHORT_NODE_NAME with the short name of the node as it appears in the output of the ceph osd tree command. In the following output, the short host names are ses-node1 and ses-node2.

    root@master # ceph osd tree
    ID   CLASS  WEIGHT   TYPE NAME       STATUS  REWEIGHT  PRI-AFF
     -1         0.60405  root default
    -11         0.11691      host ses-node1
      4    hdd  0.01949          osd.4       up   1.00000  1.00000
      9    hdd  0.01949          osd.9       up   1.00000  1.00000
     13    hdd  0.01949          osd.13      up   1.00000  1.00000
    [...]
     -5         0.11691      host ses-node2
      2    hdd  0.01949          osd.2       up   1.00000  1.00000
      5    hdd  0.01949          osd.5       up   1.00000  1.00000
    [...]
  3. Upgrade the underlying OS to SUSE Linux Enterprise Server 15 SP3:

    • If the cluster's nodes are all registered with SCC, run zypper migration.

    • If the cluster's nodes have software repositories assigned manually, run zypper dup followed by reboot.

  4. If the node you are upgrading is an OSD node, then, after the OSD node with the Salt minion ID MINION_ID has been rebooted and is now up, run the following command:

    root@master # salt MINION_ID state.apply ceph.upgrade.ses7.adopt
  5. If the node you are upgrading is not an OSD node, then after the node is rebooted, containerize all existing MON and MGR daemons on that node by running the following command on the Salt Master:

    root@master # salt MINION_ID state.apply ceph.upgrade.ses7.adopt

    Replace MINION_ID with the ID of the minion that you are upgrading. You can get the list of minion IDs by running the salt-key -L command on the Salt Master.

    Tip

    To see the status and progress of the adoption, check the Ceph Dashboard or run one of the following commands on the Salt Master:

    root@master # ceph status
    root@master # ceph versions
    root@master # salt-run upgrade.status
  6. After the adoption has successfully finished, unset the noout flag if the node you are upgrading is an OSD node:

    cephuser@adm > ceph osd rm-noout SHORT_NODE_NAME
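
As mentioned in the tip in step 1, the OMAP format conversion can be parallelized across the OSD daemons of a node. A minimal sketch of such a loop, assuming the OSD daemon IDs on the node are 4, 9, and 13 (substitute the IDs reported by ceph osd tree):

cephuser@adm > for ID in 4 9 13; do
  ( cephadm unit --name osd.$ID stop && \
    cephadm shell --name osd.$ID ceph-bluestore-tool \
      --path /var/lib/ceph/osd/ceph-$ID --command quick-fix && \
    cephadm unit --name osd.$ID start ) &  # convert each OSD in the background
done; wait  # wait for all conversions to finish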

10.4 Upgrading gateway nodes

Upgrade your separate gateway nodes (Samba Gateway, Metadata Server, Object Gateway, NFS Ganesha, or iSCSI Gateway) next. Upgrade the underlying OS to SUSE Linux Enterprise Server 15 SP3 for each node:

  • If the cluster's nodes are all registered with SUSE Customer Center, run the zypper migration command.

  • If the cluster's nodes have software repositories assigned manually, run the zypper dup command followed by the reboot command.

This step also applies to any nodes that are part of the cluster, but do not yet have any roles assigned (if in doubt, check the list of hosts on the Salt Master provided by the salt-key -L command and compare it to the output of the salt-run upgrade.status command).

When the OS is upgraded on all nodes in the cluster, the next step is to install the ceph-salt package and apply the cluster configuration. The actual gateway services are redeployed in a containerized mode at the end of the upgrade procedure.

Important

To successfully upgrade the Metadata Servers, reduce the number of active Metadata Server daemons to 1 (see the example below).

Run salt-run upgrade.status to ensure that all Metadata Servers on standby are stopped.

Ensure you stop the Metadata Server before upgrading the Ceph Monitor nodes as it may otherwise result in a failed quorum.
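
A sketch of how to reduce the active Metadata Server count, assuming a file system named cephfs (use the name reported by ceph fs ls):

cephuser@adm > ceph fs set cephfs max_mds 1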

Note

Metadata Server and Object Gateway services are unavailable from the time of upgrading to SUSE Linux Enterprise Server 15 SP3 until they are redeployed at the end of the upgrade procedure.

Important

SUSE Enterprise Storage 7.1 does not use the rgw_frontend_ssl_key option. Instead, both the SSL key and certificate are concatenated under the rgw_frontend_ssl_certificate option. If the Object Gateway deployment uses the rgw_frontend_ssl_key option, it will not be available after the upgrade to SUSE Enterprise Storage 7.1. In this case, the Object Gateway must be redeployed with the rgw_frontend_ssl_certificate option. Refer to Section 8.3.4.1, “Using secure SSL access” for more details.

10.5 Installing ceph-salt and applying the cluster configuration

Before you start the procedure of installing ceph-salt and applying the cluster configuration, check the cluster and upgrade status by running the following commands:

root@master # ceph status
root@master # ceph versions
root@master # salt-run upgrade.status
  1. Remove the rbd_exporter and rgw_exporter cron jobs created by DeepSea. On the Salt Master, run the crontab -e command as root to edit the crontab. Delete the following entries if present:

    # SALT_CRON_IDENTIFIER:deepsea rbd_exporter cron job
    */5 * * * * /var/lib/prometheus/node-exporter/rbd.sh > \
     /var/lib/prometheus/node-exporter/rbd.prom 2> /dev/null
    # SALT_CRON_IDENTIFIER:Prometheus rgw_exporter cron job
    */5 * * * * /var/lib/prometheus/node-exporter/ceph_rgw.py > \
     /var/lib/prometheus/node-exporter/ceph_rgw.prom 2> /dev/null
  2. Export the cluster configuration from DeepSea by running the following commands:

    root@master # salt-run upgrade.ceph_salt_config > ceph-salt-config.json
    root@master # salt-run upgrade.generate_service_specs > specs.yaml
  3. Uninstall DeepSea and install ceph-salt on the Salt Master:

    root@master # zypper remove 'deepsea*'
    root@master # zypper install ceph-salt
  4. Restart the Salt Master and synchronize Salt modules:

    root@master # systemctl restart salt-master.service
    root@master # salt \* saltutil.sync_all
  5. Import DeepSea's cluster configuration into ceph-salt:

    root@master # ceph-salt import ceph-salt-config.json
  6. Generate SSH keys for cluster node communication:

    root@master # ceph-salt config /ssh generate
    Tip

    Verify that the cluster configuration was imported from DeepSea and specify any potentially missing options:

    root@master # ceph-salt config ls

    For a complete description of cluster configuration, refer to Section 7.2, “Configuring cluster properties”.

  7. Apply the configuration and enable cephadm:

    root@master # ceph-salt apply
  8. If you need to supply a local container registry URL and access credentials, follow the steps described in Section 7.2.10, “Using the container registry”.

    1. If you are using container images from registry.suse.com, you need to set the container_image option:

      root@master # ceph config set global container_image registry.suse.com/ses/7.1/ceph/ceph:latest
    2. If you are not using container images from registry.suse.com but rather the locally-configured registry, inform Ceph which container image to use by running the following command:

      root@master # ceph config set global container_image IMAGE_NAME

      For example:

      root@master # ceph config set global container_image 192.168.121.1:5000/my/ceph/image
  9. Stop and disable the SUSE Enterprise Storage 6 ceph-crash daemons. New containerized forms of these daemons are started later automatically.

    root@master # salt '*' service.stop ceph-crash
    root@master # salt '*' service.disable ceph-crash
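
At this point, cephadm should be managing the cluster. As a quick check that the orchestrator backend is active, run:

cephuser@adm > ceph orch status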

10.6 Upgrading and adopting the monitoring stack

The following procedure adopts all components of the monitoring stack (see Chapter 16, Monitoring and alerting for more details).

  1. Pause the orchestrator:

    cephuser@adm > ceph orch pause
  2. On whichever node is running Prometheus, Grafana and Alertmanager (the Salt Master by default), run the following commands:

    cephuser@adm > cephadm adopt --style=legacy --name prometheus.$(hostname)
    cephuser@adm > cephadm adopt --style=legacy --name alertmanager.$(hostname)
    cephuser@adm > cephadm adopt --style=legacy --name grafana.$(hostname)
    Tip

    If you are not running the default container image registry registry.suse.com, you need to specify the image to use on each command, for example:

    cephuser@adm > cephadm --image 192.168.121.1:5000/ses/7.1/ceph/prometheus-server:2.32.1 \
      adopt --style=legacy --name prometheus.$(hostname)
    cephuser@adm > cephadm --image 192.168.121.1:5000/ses/7.1/ceph/prometheus-alertmanager:0.23.0 \
      adopt --style=legacy --name alertmanager.$(hostname)
    cephuser@adm > cephadm --image 192.168.121.1:5000/ses/7.1/ceph/grafana:8.3.10 \
     adopt --style=legacy --name grafana.$(hostname)

    The container images required and their respective versions are listed in Section 16.1, “Configuring custom or local images”.

  3. Remove Node-Exporter from all nodes. The Node-Exporter does not need to be migrated and will be re-installed as a container when the specs.yaml file is applied.

    > sudo zypper rm golang-github-prometheus-node_exporter

    Alternatively, you can remove Node-Exporter from all nodes simultaneously using Salt on the admin node:

    root@master # salt '*' pkg.remove golang-github-prometheus-node_exporter
  4. If you are using a custom container image registry that requires authentication, run a login command to verify that the images can be pulled:

    cephuser@adm > ceph cephadm registry-login URL USERNAME PASSWORD
  5. Apply the service specifications that you previously exported from DeepSea:

    cephuser@adm > ceph orch apply -i specs.yaml
    Tip

    If you are not running the default container image registry registry.suse.com but a local container registry, configure cephadm to use the container image from the local registry for the deployment of the Node-Exporter before deploying it. Otherwise, you can safely skip this step and ignore the following warning.

    cephuser@adm > ceph config set mgr mgr/cephadm/container_image_node_exporter QUALIFIED_IMAGE_PATH

    Make sure that all container images for monitoring services point to the local registry, not only the one for the Node-Exporter. This step requires you to do so for the Node-Exporter only, but it is advised that you point all the monitoring container images in cephadm to the local registry at this point.

    If you do not do so, new deployments and re-deployments of monitoring services will use the default cephadm configuration, and you may end up unable to deploy services (in the case of air-gapped deployments) or with services deployed in mixed versions.

    How cephadm needs to be configured to use container images from the local registry is described in Section 16.1, “Configuring custom or local images”.

  6. Resume the orchestrator:

    cephuser@adm > ceph orch resume
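
To confirm that the adopted monitoring daemons are now running as cephadm-managed containers, you can list them by daemon type:

cephuser@adm > ceph orch ps --daemon_type prometheus
cephuser@adm > ceph orch ps --daemon_type grafana
cephuser@adm > ceph orch ps --daemon_type alertmanager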

10.7 Gateway service redeployment

10.7.1 Upgrading the Object Gateway

In SUSE Enterprise Storage 7.1, the Object Gateways are always configured with a realm, which allows for multi-site (see Section 21.13, “Multisite Object Gateways” for more details) in the future. If you used a single-site Object Gateway configuration in SUSE Enterprise Storage 6, follow these steps to add a realm. If you do not plan to use the multi-site functionality, you can use default for the realm, zonegroup and zone names.

  1. Create a new realm:

    cephuser@adm > radosgw-admin realm create --rgw-realm=REALM_NAME --default
  2. Optionally, rename the default zone and zonegroup.

    cephuser@adm > radosgw-admin zonegroup rename \
     --rgw-zonegroup default \
     --zonegroup-new-name=ZONEGROUP_NAME
    cephuser@adm > radosgw-admin zone rename \
     --rgw-zone default \
     --zone-new-name ZONE_NAME \
     --rgw-zonegroup=ZONEGROUP_NAME
  3. Configure the master zonegroup:

    cephuser@adm > radosgw-admin zonegroup modify \
     --rgw-realm=REALM_NAME \
     --rgw-zonegroup=ZONEGROUP_NAME \
     --endpoints http://RGW.EXAMPLE.COM:80 \
     --master --default
  4. Configure the master zone. For this, you will need the ACCESS_KEY and SECRET_KEY of an Object Gateway user with the system flag enabled. This is usually the admin user. To get the ACCESS_KEY and SECRET_KEY, run radosgw-admin user info --uid admin --rgw-zone=ZONE_NAME.

    cephuser@adm > radosgw-admin zone modify \
     --rgw-realm=REALM_NAME \
     --rgw-zonegroup=ZONEGROUP_NAME \
     --rgw-zone=ZONE_NAME \
     --endpoints http://RGW.EXAMPLE.COM:80 \
     --access-key=ACCESS_KEY \
     --secret=SECRET_KEY \
     --master --default
  5. Commit the updated configuration:

    cephuser@adm > radosgw-admin period update --commit

To have the Object Gateway service containerized, create its specification file as described in Section 8.3.4, “Deploying Object Gateways”, and apply it.
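
A minimal sketch of such a specification file RGW.yml, assuming the realm and zone names created above and a hypothetical placement host ses-node5:

service_type: rgw
service_id: REALM_NAME.ZONE_NAME
placement:
  hosts:
  - ses-node5
spec:
  rgw_realm: REALM_NAME
  rgw_zone: ZONE_NAME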

cephuser@adm > ceph orch apply -i RGW.yml
Tip

After applying the new Object Gateway specification, run ceph config dump and inspect the lines that contain client.rgw. to see if there are any old settings that need to be applied to the new Object Gateway instances.

10.7.2 Upgrading NFS Ganesha

Important

NFS Ganesha supports NFS version 4.1 and newer. It does not support NFS version 3.

Important

The upgrade process disables the nfs module in the Ceph Manager daemon. You can re-enable it by executing the following command from the Admin Node:

cephuser@adm > ceph mgr module enable nfs

The following demonstrates how to migrate an existing NFS Ganesha service running on Ceph Nautilus to an NFS Ganesha container running on Ceph Pacific.

Warning

The following documentation requires you to have already successfully upgraded the core Ceph services.

NFS Ganesha stores additional per-daemon configuration and export configuration in a RADOS pool. The configured RADOS pool can be found on the watch_url line of the RADOS_URLS block in the ganesha.conf file. By default, this pool is named ganesha_config.

Before attempting any migration, we strongly recommend making a copy of the export and daemon configuration objects located in the RADOS pool. To locate the configured RADOS pool, run the following command:

cephuser@adm > grep -A5 RADOS_URLS /etc/ganesha/ganesha.conf

To list the contents of the RADOS pool:

cephuser@adm > rados --pool ganesha_config --namespace ganesha ls | sort
  conf-node3
  export-1
  export-2
  export-3
  export-4

To copy the RADOS objects:

cephuser@adm > RADOS_ARGS="--pool ganesha_config --namespace ganesha"
cephuser@adm > OBJS=$(rados $RADOS_ARGS ls)
cephuser@adm > for obj in $OBJS; do rados $RADOS_ARGS get $obj $obj; done
cephuser@adm > ls -lah
total 40K
drwxr-xr-x 2 root root 4.0K Sep 8 03:30 .
drwx------ 9 root root 4.0K Sep 8 03:23 ..
-rw-r--r-- 1 root root 90 Sep 8 03:30 conf-node2
-rw-r--r-- 1 root root 90 Sep 8 03:30 conf-node3
-rw-r--r-- 1 root root 350 Sep 8 03:30 export-1
-rw-r--r-- 1 root root 350 Sep 8 03:30 export-2
-rw-r--r-- 1 root root 350 Sep 8 03:30 export-3
-rw-r--r-- 1 root root 358 Sep 8 03:30 export-4

On a per-node basis, any existing NFS Ganesha service needs to be stopped and then replaced with a container managed by cephadm.

  1. Stop and disable the existing NFS Ganesha service:

    cephuser@adm > systemctl stop nfs-ganesha
    cephuser@adm > systemctl disable nfs-ganesha
  2. After the existing NFS Ganesha service has been stopped, a new one can be deployed in a container using cephadm. To do so, you need to create a service specification that contains a service_id identifying the new NFS cluster, the host name of the node being migrated listed as a host in the placement specification, and the RADOS pool and namespace that contain the configured NFS export objects. For example:

    service_type: nfs
    service_id: SERVICE_ID
    placement:
      hosts:
      - node2
    spec:
      pool: ganesha_config
      namespace: ganesha

    For more information on creating a placement specification, see Section 8.2, “Service and placement specification”.

  3. Apply the placement specification:

    cephuser@adm > ceph orch apply -i FILENAME.yaml
  4. Confirm the NFS Ganesha daemon is running on the host:

    cephuser@adm > ceph orch ps --daemon_type nfs
    NAME           HOST   STATUS         REFRESHED  AGE  VERSION  IMAGE NAME                                IMAGE ID      CONTAINER ID
    nfs.foo.node2  node2  running (26m)  8m ago     27m  3.3      registry.suse.com/ses/7.1/ceph/ceph:latest  8b4be7c42abd  c8b75d7c8f0d
  5. Repeat these steps for each NFS Ganesha node. You do not need to create a separate service specification for each node. It is sufficient to add each node's host name to the existing NFS service specification and re-apply it.

The existing exports can be migrated in two different ways:

  • Manually re-create or re-assign them using the Ceph Dashboard.

  • Manually copy the contents of each per-daemon RADOS object into the newly created NFS Ganesha common configuration.

Procedure 10.1: Manually copying exports to NFS Ganesha common configuration file
  1. Determine the list of per-daemon RADOS objects:

    cephuser@adm > RADOS_ARGS="--pool ganesha_config --namespace ganesha"
    cephuser@adm > DAEMON_OBJS=$(rados $RADOS_ARGS ls | grep 'conf-')
  2. Make a copy of the per-daemon RADOS objects:

    cephuser@adm > for obj in $DAEMON_OBJS; do rados $RADOS_ARGS get $obj $obj; done
    cephuser@adm > ls -lah
    total 20K
    drwxr-xr-x 2 root root 4.0K Sep 8 16:51 .
    drwxr-xr-x 3 root root 4.0K Sep 8 16:47 ..
    -rw-r--r-- 1 root root 90 Sep 8 16:51 conf-nfs.SERVICE_ID
    -rw-r--r-- 1 root root 90 Sep 8 16:51 conf-node2
    -rw-r--r-- 1 root root 90 Sep 8 16:51 conf-node3
  3. Sort and merge into a single list of exports:

    cephuser@adm > cat conf-* | sort -u > conf-nfs.SERVICE_ID
    cephuser@adm > cat conf-nfs.foo
    %url "rados://ganesha_config/ganesha/export-1"
    %url "rados://ganesha_config/ganesha/export-2"
    %url "rados://ganesha_config/ganesha/export-3"
    %url "rados://ganesha_config/ganesha/export-4"
  4. Write the new NFS Ganesha common configuration file:

    cephuser@adm > rados $RADOS_ARGS put conf-nfs.SERVICE_ID conf-nfs.SERVICE_ID
  5. Notify the NFS Ganesha daemon:

    cephuser@adm > rados $RADOS_ARGS notify conf-nfs.SERVICE_ID conf-nfs.SERVICE_ID
    Note

    This action will cause the daemon to reload the configuration.

After the service has been successfully migrated, the Nautilus-based NFS Ganesha service can be removed.

  1. Remove NFS Ganesha:

    cephuser@adm > zypper rm nfs-ganesha
    Reading installed packages...
    Resolving package dependencies...
    The following 5 packages are going to be REMOVED:
      nfs-ganesha nfs-ganesha-ceph nfs-ganesha-rados-grace nfs-ganesha-rados-urls nfs-ganesha-rgw
    5 packages to remove.
    After the operation, 308.9 KiB will be freed.
    Continue? [y/n/v/...? shows all options] (y): y
    (1/5) Removing nfs-ganesha-ceph-2.8.3+git0.d504d374e-3.3.1.x86_64 ...[done]
    (2/5) Removing nfs-ganesha-rgw-2.8.3+git0.d504d374e-3.3.1.x86_64 ...[done]
    (3/5) Removing nfs-ganesha-rados-urls-2.8.3+git0.d504d374e-3.3.1.x86_64 ...[done]
    (4/5) Removing nfs-ganesha-rados-grace-2.8.3+git0.d504d374e-3.3.1.x86_64 ...[done]
    (5/5) Removing nfs-ganesha-2.8.3+git0.d504d374e-3.3.1.x86_64 ...[done]
    Additional rpm output:
    warning: /etc/ganesha/ganesha.conf saved as /etc/ganesha/ganesha.conf.rpmsave
  2. Remove the legacy cluster settings from the Ceph Dashboard:

    cephuser@adm > ceph dashboard reset-ganesha-clusters-rados-pool-namespace

10.7.3 Upgrading the Metadata Server

Unlike MONs, MGRs, and OSDs, Metadata Servers cannot be adopted in-place. Instead, you need to redeploy them in containers using the Ceph orchestrator.

  1. Run the ceph fs ls command to obtain the name of your file system, for example:

    cephuser@adm > ceph fs ls
    name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
  2. Create a new service specification file mds.yml as described in Section 8.3.3, “Deploying Metadata Servers” by using the file system name as the service_id and specifying the hosts that will run the MDS daemons. For example:

    service_type: mds
    service_id: cephfs
    placement:
      hosts:
      - ses-node1
      - ses-node2
      - ses-node3
  3. Run the ceph orch apply -i mds.yml command to apply the service specification and start the MDS daemons.
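
You can then confirm that the containerized MDS daemons are up:

cephuser@adm > ceph orch ps --daemon_type mds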

10.7.4 Upgrading the iSCSI Gateway

To upgrade the iSCSI Gateway, you need to redeploy it in containers using the Ceph orchestrator. If you have multiple iSCSI Gateways, you need to redeploy them one-by-one to reduce the service downtime.

  1. Stop and disable the existing iSCSI daemons on each iSCSI Gateway node:

    > sudo systemctl stop rbd-target-gw
    > sudo systemctl disable rbd-target-gw
    > sudo systemctl stop rbd-target-api
    > sudo systemctl disable rbd-target-api
  2. Create a service specification for the iSCSI Gateway as described in Section 8.3.5, “Deploying iSCSI Gateways”. For this, you need the pool, trusted_ip_list, and api_* settings from the existing /etc/ceph/iscsi-gateway.cfg file. If you have SSL support enabled (api_secure = true), you also need the SSL certificate (/etc/ceph/iscsi-gateway.crt) and key (/etc/ceph/iscsi-gateway.key).

    For example, if /etc/ceph/iscsi-gateway.cfg contains the following:

    [config]
    cluster_client_name = client.igw.ses-node5
    pool = iscsi-images
    trusted_ip_list = 10.20.179.203,10.20.179.201,10.20.179.205,10.20.179.202
    api_port = 5000
    api_user = admin
    api_password = admin
    api_secure = true

    Then you need to create the following service specification file iscsi.yml:

    service_type: iscsi
    service_id: igw
    placement:
      hosts:
      - ses-node5
    spec:
      pool: iscsi-images
      trusted_ip_list: "10.20.179.203,10.20.179.201,10.20.179.205,10.20.179.202"
      api_port: 5000
      api_user: admin
      api_password: admin
      api_secure: true
      ssl_cert: |
        -----BEGIN CERTIFICATE-----
        MIIDtTCCAp2gAwIBAgIYMC4xNzc1NDQxNjEzMzc2MjMyXzxvQ7EcMA0GCSqGSIb3
        DQEBCwUAMG0xCzAJBgNVBAYTAlVTMQ0wCwYDVQQIDARVdGFoMRcwFQYDVQQHDA5T
        [...]
        -----END CERTIFICATE-----
      ssl_key: |
        -----BEGIN PRIVATE KEY-----
        MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC5jdYbjtNTAKW4
        /CwQr/7wOiLGzVxChn3mmCIF3DwbL/qvTFTX2d8bDf6LjGwLYloXHscRfxszX/4h
        [...]
        -----END PRIVATE KEY-----
    Note

    The pool, trusted_ip_list, api_port, api_user, api_password, and api_secure settings are identical to the ones in the /etc/ceph/iscsi-gateway.cfg file. The ssl_cert and ssl_key values can be copied in from the existing SSL certificate and key files. Verify that they are indented correctly and that the pipe character | appears at the end of the ssl_cert: and ssl_key: lines (see the content of the iscsi.yml file above).

  3. Run the ceph orch apply -i iscsi.yml command to apply the service specification and start the iSCSI Gateway daemons.

  4. Remove the old ceph-iscsi package from each of the existing iSCSI gateway nodes:

    cephuser@adm > zypper rm -u ceph-iscsi
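
As with the other services, you can confirm that the containerized iSCSI daemons are running:

cephuser@adm > ceph orch ps --daemon_type iscsi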

10.8 Post-upgrade clean-up

After the upgrade, perform the following clean-up steps:

  1. Verify that the cluster was successfully upgraded by checking the current Ceph version:

    cephuser@adm > ceph versions
  2. Check that the osdspec_affinity entry is properly set for existing OSDs. If the ceph osd stat command shows some OSDs as not running or unknown, refer to https://www.suse.com/support/kb/doc/?id=000020667 for details on properly mapping OSDs to a service specification.

  3. Make sure that no old OSDs will join the cluster:

    cephuser@adm > ceph osd require-osd-release pacific
  4. Set the pg_autoscale_mode of existing pools if necessary:

    Important

    Pools in SUSE Enterprise Storage 6 had the pg_autoscale_mode set to warn by default. This resulted in a warning message in case of suboptimal number of PGs, but autoscaling did not actually happen. The default in SUSE Enterprise Storage 7.1 is that the pg_autoscale_mode option is set to on for new pools, and PGs will actually autoscale. The upgrade process does not automatically change the pg_autoscale_mode of existing pools. If you want to change it to on to get the full benefit of the autoscaler, see the instructions in Section 17.4.12, “Enabling the PG auto-scaler”.


  5. Set the FSMap sanity check option back to its default value, then remove it from the configuration:

    cephuser@adm > ceph config set mon mon_mds_skip_sanity false
    cephuser@adm > ceph config rm mon mon_mds_skip_sanity
  6. Prevent pre-Luminous clients:

    cephuser@adm > ceph osd set-require-min-compat-client luminous
  7. Enable the balancer module:

    cephuser@adm > ceph balancer mode upmap
    cephuser@adm > ceph balancer on

    Find more details in Section 29.1, “Balancer”.

  8. Optionally, enable the telemetry module:

    cephuser@adm > ceph mgr module enable telemetry
    cephuser@adm > ceph telemetry on

    Find more details in Section 29.2, “Enabling the telemetry module”.