10 Upgrade from SUSE Enterprise Storage 6 to 7.1 #
This chapter introduces steps to upgrade SUSE Enterprise Storage 6 to version 7.1.
The upgrade includes the following tasks:
Upgrading from Ceph Nautilus to Pacific.
Switching from installing and running Ceph via RPM packages to running in containers.
Completely removing DeepSea and replacing it with ceph-salt and cephadm.
The upgrade information in this chapter only applies to upgrades from DeepSea to cephadm. Do not attempt to follow these instructions if you want to deploy SUSE Enterprise Storage on SUSE CaaS Platform.
Upgrading from SUSE Enterprise Storage versions older than 6 is not supported. First, you must upgrade to the latest version of SUSE Enterprise Storage 6, and then follow the steps in this chapter.
10.1 Before upgrading #
The following tasks must be completed before you start the upgrade. This can be done at any time during the SUSE Enterprise Storage 6 lifetime.
The OSD migration from FileStore to BlueStore must happen before the upgrade, because FileStore is not supported in SUSE Enterprise Storage 7.1. Find more details about BlueStore and how to migrate from FileStore at https://documentation.suse.com/ses/6/html/ses-all/cha-ceph-upgrade.html#filestore2bluestore.
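To check whether any FileStore OSDs remain, you can count the object store types the OSDs report. This is a minimal sketch using a standard Ceph command; the output shown is only an example:
cephuser@adm > ceph osd count-metadata osd_objectstore
{
    "bluestore": 24
}
If the output still contains a filestore entry, migrate those OSDs before proceeding.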
If you are running an older cluster that still uses ceph-disk OSDs, you need to switch to ceph-volume before the upgrade. Find more details at https://documentation.suse.com/ses/6/html/ses-all/cha-ceph-upgrade.html#upgrade-osd-deployment.
10.1.1 Points to consider #
Before upgrading, read through the following sections to make sure you understand all the tasks that need to be performed.
Read the release notes. In them, you can find additional information on changes since the previous release of SUSE Enterprise Storage. Check the release notes to see whether:
Your hardware needs special considerations.
Any used software packages have changed significantly.
Special precautions are necessary for your installation.
The release notes also provide information that could not make it into the manual on time. They also contain notes about known issues.
You can find SES 7.1 release notes online at https://www.suse.com/releasenotes/.
Additionally, after having installed the package release-notes-ses from the SES 7.1 repository, find the release notes locally in the directory /usr/share/doc/release-notes or online at https://www.suse.com/releasenotes/.
Read Part II, “Deploying Ceph Cluster” to familiarize yourself with ceph-salt and the Ceph orchestrator, and in particular the information on service specifications.
The cluster upgrade may take a long time: approximately the time it takes to upgrade one machine multiplied by the number of cluster nodes.
You need to upgrade the Salt Master first, then replace DeepSea with ceph-salt and cephadm. You will not be able to start using the cephadm orchestrator module until at least all Ceph Manager nodes are upgraded.
The upgrade from using Nautilus RPMs to Pacific containers needs to happen in a single step. This means upgrading an entire node at a time, not one daemon at a time.
The upgrade of core services (MON, MGR, OSD) happens in an orderly fashion. Each service is available during the upgrade. The gateway services (Metadata Server, Object Gateway, NFS Ganesha, iSCSI Gateway) need to be redeployed after the core services are upgraded. There is a certain amount of downtime for each of the following services:
Important:
Metadata Servers and Object Gateways are down from the time the nodes are upgraded from SUSE Linux Enterprise Server 15 SP1 to SUSE Linux Enterprise Server 15 SP3 until the services are redeployed at the end of the upgrade procedure. This is particularly important to bear in mind if these services are colocated with MONs, MGRs or OSDs as they may be down for the duration of the cluster upgrade. If this is going to be a problem, consider deploying these services separately on additional nodes before upgrading, so that they are down for the shortest possible time. This is the duration of the upgrade of the gateway nodes, not the duration of the upgrade of the entire cluster.
NFS Ganesha and iSCSI Gateways are down only while nodes are rebooting during upgrade from SUSE Linux Enterprise Server 15 SP1 to SUSE Linux Enterprise Server 15 SP3, and again briefly when each service is redeployed on the containerized mode.
10.1.2 Backing up cluster configuration and data #
We strongly recommend backing up all cluster configuration and data before starting your upgrade to SUSE Enterprise Storage 7.1. For instructions on how to back up all your data, see Chapter 15, Backup and restore.
10.1.3 Verifying steps from the previous upgrade #
In case you previously upgraded from version 5, verify that the upgrade to version 6 was completed successfully:
Check for the existence of the /srv/salt/ceph/configuration/files/ceph.conf.import file.
This file is created by the engulf process during the upgrade from SUSE Enterprise Storage 5 to 6. The configuration_init: default-import option is set in /srv/pillar/ceph/proposals/config/stack/default/ceph/cluster.yml.
If configuration_init is still set to default-import, the cluster is using ceph.conf.import as its configuration file and not DeepSea's default ceph.conf, which is compiled from files in /srv/salt/ceph/configuration/files/ceph.conf.d/.
Therefore, you need to inspect ceph.conf.import for any custom configuration, and possibly move the configuration to one of the files in /srv/salt/ceph/configuration/files/ceph.conf.d/.
Then remove the configuration_init: default-import line from /srv/pillar/ceph/proposals/config/stack/default/ceph/cluster.yml.
10.1.4 Updating cluster nodes and verifying cluster health #
Verify that all latest updates of SUSE Linux Enterprise Server 15 SP1 and SUSE Enterprise Storage 6 are applied to all cluster nodes:
# zypper refresh && zypper patch
Refer to https://documentation.suse.com/ses/6/html/ses-all/storage-salt-cluster.html#deepsea-rolling-updates for detailed information about updating the cluster nodes.
After updates are applied, restart the Salt Master, synchronize new Salt modules, and check the cluster health:
root@master # systemctl restart salt-master.service
root@master # salt '*' saltutil.sync_all
cephuser@adm > ceph -s
10.1.4.1 Disable insecure clients #
Since Nautilus v14.2.20, a new health warning informs you that insecure clients are allowed to join the cluster. This warning is on by default. The Ceph Dashboard shows the cluster in the HEALTH_WARN status, and checking the cluster status on the command line reports the following:
cephuser@adm > ceph status
  cluster:
    id:     3fe8b35a-689f-4970-819d-0e6b11f6707c
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
[...]
This warning means that the Ceph Monitors are still allowing old, unpatched clients to connect to the cluster. This ensures existing clients can still connect while the cluster is being upgraded, but warns you that there is a problem that needs to be addressed. When the cluster and all clients are upgraded to the latest version of Ceph, disallow unpatched clients by running the following command:
cephuser@adm > ceph config set mon auth_allow_insecure_global_id_reclaim false
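After the cluster and all clients have been upgraded and insecure reclaim has been disallowed, you can confirm that the warning has cleared; a simple verification sketch:
cephuser@adm > ceph health detail
If no other issues are present, the cluster should report HEALTH_OK.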
10.1.4.2 Disable FSMap sanity check #
Before you start upgrading cluster nodes, disable the FSMap sanity check:
cephuser@adm > ceph config set mon mon_mds_skip_sanity true
10.1.5 Verifying access to software repositories and container images #
Verify that each cluster node has access to the SUSE Linux Enterprise Server 15 SP3 and SUSE Enterprise Storage 7.1 software repositories, as well as the registry of container images.
10.1.5.1 Software repositories #
If all nodes are registered with SCC, you will be able to use the zypper migration command to upgrade. Refer to https://documentation.suse.com/sles/15-SP3/html/SLES-all/cha-upgrade-online.html#sec-upgrade-online-zypper for more details.
If nodes are not registered with SCC, disable all existing software repositories and add both the Pool and Updates repositories for each of the following extensions (a verification sketch follows this list):
SLE-Product-SLES/15-SP3
SLE-Module-Basesystem/15-SP3
SLE-Module-Server-Applications/15-SP3
SUSE-Enterprise-Storage-7.1
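The following is a minimal verification sketch: it lists the registration status and the enabled repositories so you can confirm that the SP3 and 7.1 sources are present on a node. The commands are standard SUSE tools, but the exact repository names depend on your environment; run them on each cluster node:
> sudo SUSEConnect --status-text
> sudo zypper lr -E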
10.1.5.2 Container images #
All cluster nodes need access to the container image registry. In most cases, you will use the public SUSE registry at registry.suse.com. You need the following images:
registry.suse.com/ses/7.1/ceph/ceph
registry.suse.com/ses/7.1/ceph/grafana
registry.suse.com/ses/7.1/ceph/prometheus-server
registry.suse.com/ses/7.1/ceph/prometheus-node-exporter
registry.suse.com/ses/7.1/ceph/prometheus-alertmanager
Alternatively—for example, for air-gapped deployments—configure a local registry and verify that you have the correct set of container images available. Refer to Section 7.2.10, “Using the container registry” for more details about configuring a local container image registry.
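To verify that a node can actually pull the images, you can try pulling the Ceph image manually. This is a sketch only; substitute your local registry path if you are not using registry.suse.com:
> sudo podman pull registry.suse.com/ses/7.1/ceph/ceph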
10.2 Upgrading the Salt Master #
The following procedure describes the process of upgrading the Salt Master:
Before continuing, ensure that the steps from Section 7.2.10.2, “Configuring the path to container images” have been followed. Without this configuration, the image pull performed by cephadm via podman fails, although it will succeed when run manually in a terminal if the following proxy variables are set:
https_proxy= http_proxy=
Upgrade the underlying OS to SUSE Linux Enterprise Server 15 SP3:
If the cluster's nodes are all registered with SCC, run zypper migration.
If the cluster's nodes have software repositories assigned manually, run zypper dup followed by reboot.
Disable the DeepSea stages to avoid accidental use. Add the following content to /srv/pillar/ceph/stack/global.yml:
stage_prep: disabled
stage_discovery: disabled
stage_configure: disabled
stage_deploy: disabled
stage_services: disabled
stage_remove: disabled
Save the file and apply the changes:
root@master # salt '*' saltutil.pillar_refresh
If you are not using container images from registry.suse.com but rather the locally configured registry, edit /srv/pillar/ceph/stack/global.yml to inform DeepSea which Ceph container image and registry to use. For example, to use 192.168.121.1:5000/my/ceph/image, add the following lines:
ses7_container_image: 192.168.121.1:5000/my/ceph/image
ses7_container_registries:
  - location: 192.168.121.1:5000
If you need to specify authentication information for the registry, add the ses7_container_registry_auth: block, for example:
ses7_container_image: 192.168.121.1:5000/my/ceph/image
ses7_container_registries:
  - location: 192.168.121.1:5000
ses7_container_registry_auth:
  registry: 192.168.121.1:5000
  username: USER_NAME
  password: PASSWORD
Save the file and apply the changes:
root@master # salt '*' saltutil.refresh_pillar
Assimilate existing configuration:
cephuser@adm > ceph config assimilate-conf -i /etc/ceph/ceph.conf
Verify the upgrade status. Your output may differ depending on your cluster configuration:
root@master # salt-run upgrade.status
The newest installed software versions are:
  ceph: ceph version 16.2.7-640-gceb23c7491b (ceb23c7491bd96ab7956111374219a4cdcf6f8f4) pacific (stable)
  os: SUSE Linux Enterprise Server 15 SP3
Nodes running these software versions:
  admin.ceph (assigned roles: master, prometheus, grafana)
Nodes running older software versions must be upgraded in the following order:
  1: mon1.ceph (assigned roles: admin, mon, mgr)
  2: mon2.ceph (assigned roles: admin, mon, mgr)
  3: mon3.ceph (assigned roles: admin, mon, mgr)
  4: data4.ceph (assigned roles: storage, mds)
  5: data1.ceph (assigned roles: storage)
  6: data2.ceph (assigned roles: storage)
  7: data3.ceph (assigned roles: storage)
  8: data5.ceph (assigned roles: storage, rgw)
10.3 Upgrading the MON, MGR, and OSD nodes #
Upgrade the Ceph Monitor, Ceph Manager, and OSD nodes one at a time. For each service, follow these steps:
Before adopting any OSD node, you need to perform a format conversion of OSD nodes to improve the accounting for OMAP data. You can do so by running the following commands on the Admin Node:
cephuser@adm > cephadm unit --name osd.OSD_DAEMON_ID stop
cephuser@adm > cephadm shell --name osd.OSD_DAEMON_ID ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-OSD_DAEMON_ID --command quick-fix
cephuser@adm > cephadm unit --name osd.OSD_DAEMON_ID start
The conversion may take minutes to hours, depending on how much OMAP data the related disk contains. For more details, refer to https://docs.ceph.com/en/latest/releases/pacific/#upgrading-non-cephadm-clusters.
Tip: You can run the above commands in parallel on multiple OSD daemons on the same OSD node to help accelerate the upgrade; see the sketch below.
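The following is a minimal sketch of such a parallel run. It assumes the commands are run where the OSD daemons reside, that the ceph CLI is available there, and that ceph osd ls-tree can list that node's OSD IDs; OSD_NODE_SHORT_NAME is a placeholder for the node's short host name:
cephuser@adm > for ID in $(ceph osd ls-tree OSD_NODE_SHORT_NAME); do
  (
    cephadm unit --name osd.$ID stop && \
    cephadm shell --name osd.$ID ceph-bluestore-tool \
      --path /var/lib/ceph/osd/ceph-$ID --command quick-fix && \
    cephadm unit --name osd.$ID start
  ) &    # convert each OSD in a background subshell
done; wait    # wait until all conversions on this node have finished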
If the node you are upgrading is an OSD node, avoid having the OSDs marked out during the upgrade by running the following command:
cephuser@adm > ceph osd add-noout SHORT_NODE_NAME
Replace SHORT_NODE_NAME with the short name of the node as it appears in the output of the ceph osd tree command. In the following output, the short host names are ses-node1 and ses-node2:
root@master # ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME            STATUS  REWEIGHT  PRI-AFF
 -1         0.60405  root default
-11         0.11691      host ses-node1
  4    hdd  0.01949          osd.4            up   1.00000  1.00000
  9    hdd  0.01949          osd.9            up   1.00000  1.00000
 13    hdd  0.01949          osd.13           up   1.00000  1.00000
[...]
 -5         0.11691      host ses-node2
  2    hdd  0.01949          osd.2            up   1.00000  1.00000
  5    hdd  0.01949          osd.5            up   1.00000  1.00000
[...]
Upgrade the underlying OS to SUSE Linux Enterprise Server 15 SP3:
If the cluster's nodes are all registered with SCC, run zypper migration.
If the cluster's nodes have software repositories assigned manually, run zypper dup followed by reboot.
If the node you are upgrading is an OSD node, then, after the OSD node with the Salt minion ID MINION_ID has been rebooted and is now up, run the following command:
root@master # salt MINION_ID state.apply ceph.upgrade.ses7.adopt
If the node you are upgrading is not an OSD node, then after the node is rebooted, containerize all existing MON and MGR daemons on that node by running the following command on the Salt Master:
root@master # salt MINION_ID state.apply ceph.upgrade.ses7.adopt
Replace MINION_ID with the ID of the minion that you are upgrading. You can get the list of minion IDs by running the salt-key -L command on the Salt Master.
Tip: To see the status and progress of the adoption, check the Ceph Dashboard or run one of the following commands on the Salt Master:
root@master # ceph status
root@master # ceph versions
root@master # salt-run upgrade.status
After the adoption has successfully finished, unset the noout flag if the node you are upgrading is an OSD node:
cephuser@adm > ceph osd rm-noout SHORT_NODE_NAME
10.4 Upgrading gateway nodes #
Upgrade your separate gateway nodes (Samba Gateway, Metadata Server, Object Gateway, NFS Ganesha, or iSCSI Gateway) next. Upgrade the underlying OS to SUSE Linux Enterprise Server 15 SP3 for each node:
If the cluster's nodes are all registered with SUSE Customer Center, run the zypper migration command.
If the cluster's nodes have software repositories assigned manually, run the zypper dup command followed by the reboot command.
This step also applies to any nodes that are part of the cluster but do not yet have any roles assigned (if in doubt, check the list of hosts on the Salt Master provided by the salt-key -L command and compare it to the output of the salt-run upgrade.status command).
When the OS is upgraded on all nodes in the cluster, the next step is to install the ceph-salt package and apply the cluster configuration. The actual gateway services are redeployed in a containerized mode at the end of the upgrade procedure.
To successfully upgrade the Metadata Servers, ensure you reduce the number of active Metadata Servers to 1. Run salt-run upgrade.status to ensure that all standby Metadata Servers are stopped.
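A minimal sketch of reducing the active Metadata Server count, assuming the CephFS file system is named cephfs (use the name reported by ceph fs ls):
cephuser@adm > ceph fs set cephfs max_mds 1    # keep a single active MDS during the upgrade
cephuser@adm > ceph fs status cephfs           # confirm only one MDS remains active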
Ensure you stop the Metadata Server before upgrading the Ceph Monitor nodes as it may otherwise result in a failed quorum.
Metadata Server and Object Gateway services are unavailable from the time of upgrading to SUSE Linux Enterprise Server 15 SP3 until they are redeployed at the end of the upgrade procedure.
SUSE Enterprise Storage 7.1 does not use the rgw_frontend_ssl_key option. Instead, both the SSL key and certificate are concatenated under the rgw_frontend_ssl_certificate option. If the Object Gateway deployment uses the rgw_frontend_ssl_key option, it will not be available after the upgrade to SUSE Enterprise Storage 7.1. In this case, the Object Gateway must be redeployed with the rgw_frontend_ssl_certificate option. Refer to Section 8.3.4.1, “Using secure SSL access” for more details.
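If you need to prepare the combined file manually, the following sketch concatenates an existing certificate and key into a single PEM file; the file names are placeholders, and you should check Section 8.3.4.1 for the expected format and order of the two parts:
cephuser@adm > cat rgw-ssl.crt rgw-ssl.key > rgw-ssl.pem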
10.5 Installing ceph-salt and applying the cluster configuration #
Before you start the procedure of installing ceph-salt and applying the cluster configuration, check the cluster and upgrade status by running the following commands:
root@master # ceph status
root@master # ceph versions
root@master # salt-run upgrade.status
Remove the DeepSea-created rbd_exporter and rgw_exporter cron jobs. On the Salt Master, as the root user, run the crontab -e command to edit the crontab. Delete the following items if present:
# SALT_CRON_IDENTIFIER:deepsea rbd_exporter cron job
*/5 * * * * /var/lib/prometheus/node-exporter/rbd.sh > \
 /var/lib/prometheus/node-exporter/rbd.prom 2> /dev/null
# SALT_CRON_IDENTIFIER:Prometheus rgw_exporter cron job
*/5 * * * * /var/lib/prometheus/node-exporter/ceph_rgw.py > \
 /var/lib/prometheus/node-exporter/ceph_rgw.prom 2> /dev/null
Export cluster configuration from DeepSea, by running the following commands:
root@master # salt-run upgrade.ceph_salt_config > ceph-salt-config.json
root@master # salt-run upgrade.generate_service_specs > specs.yaml
Uninstall DeepSea and install ceph-salt on the Salt Master:
root@master # zypper remove 'deepsea*'
root@master # zypper install ceph-salt
Restart the Salt Master and synchronize Salt modules:
root@master # systemctl restart salt-master.service
root@master # salt \* saltutil.sync_all
Import DeepSea's cluster configuration into ceph-salt:
root@master # ceph-salt import ceph-salt-config.json
Generate SSH keys for cluster node communication:
root@master # ceph-salt config /ssh generate
Tip: Verify that the cluster configuration was imported from DeepSea and specify potentially missed options:
root@master # ceph-salt config ls
For a complete description of cluster configuration, refer to Section 7.2, “Configuring cluster properties”.
Apply the configuration and enable cephadm:
root@master # ceph-salt apply
If you need to supply the local container registry URL and access credentials, follow the steps described in Section 7.2.10, “Using the container registry”.
If you are using container images from registry.suse.com, you need to set the container_image option:
root@master # ceph config set global container_image registry.suse.com/ses/7.1/ceph/ceph:latest
If you are not using container images from registry.suse.com but rather the locally-configured registry, inform Ceph which container image to use by running the following command:
root@master # ceph config set global container_image IMAGE_NAME
For example:
root@master # ceph config set global container_image 192.168.121.1:5000/my/ceph/image
Stop and disable the SUSE Enterprise Storage 6 ceph-crash daemons. New containerized forms of these daemons are started later automatically.
root@master # salt '*' service.stop ceph-crash
root@master # salt '*' service.disable ceph-crash
10.6 Upgrading and adopting the monitoring stack #
The following procedure adopts all components of the monitoring stack (see Chapter 16, Monitoring and alerting for more details).
Pause the orchestrator:
cephuser@adm > ceph orch pause
On whichever node is running Prometheus, Grafana, and Alertmanager (the Salt Master by default), run the following commands:
cephuser@adm > cephadm adopt --style=legacy --name prometheus.$(hostname)
cephuser@adm > cephadm adopt --style=legacy --name alertmanager.$(hostname)
cephuser@adm > cephadm adopt --style=legacy --name grafana.$(hostname)
Tip: If you are not running the default container image registry registry.suse.com, you need to specify the image to use on each command, for example:
cephuser@adm > cephadm --image 192.168.121.1:5000/ses/7.1/ceph/prometheus-server:2.32.1 \
 adopt --style=legacy --name prometheus.$(hostname)
cephuser@adm > cephadm --image 192.168.121.1:5000/ses/7.1/ceph/prometheus-alertmanager:0.23.0 \
 adopt --style=legacy --name alertmanager.$(hostname)
cephuser@adm > cephadm --image 192.168.121.1:5000/ses/7.1/ceph/grafana:8.3.10 \
 adopt --style=legacy --name grafana.$(hostname)
The container images required and their respective versions are listed in Section 16.1, “Configuring custom or local images”.
Remove Node-Exporter from all nodes. The Node-Exporter does not need to be migrated and will be re-installed as a container when the specs.yaml file is applied.
> sudo zypper rm golang-github-prometheus-node_exporter
Alternatively, you can remove Node-Exporter from all nodes simultaneously using Salt on the admin node:
root@master # salt '*' pkg.remove golang-github-prometheus-node_exporter
If you are using a custom container image registry that requires authentication, run a login command to verify that the images can be pulled:
cephuser@adm > ceph cephadm registry-login URL USERNAME PASSWORD
Apply the service specifications that you previously exported from DeepSea:
cephuser@adm > ceph orch apply -i specs.yaml
Tip: If you are not running the default container image registry registry.suse.com, but a local container registry, configure cephadm to use the container image from the local registry for the deployment of Node-Exporter before deploying the Node-Exporter. Otherwise, you can safely skip this step and ignore the following warning.
cephuser@adm > ceph config set mgr mgr/cephadm/container_image_node_exporter QUALIFIED_IMAGE_PATH
Make sure that all container images for monitoring services point to the local registry, not only the one for Node-Exporter. This step requires you to do so for the Node-Exporter only, but it is advised that you set all the monitoring container images in cephadm to point to the local registry at this point.
If you do not do so, new deployments of monitoring services as well as re-deployments will use the default cephadm configuration and you may end up being unable to deploy services (in the case of air-gapped deployments), or with services deployed with mixed versions.
How cephadm needs to be configured to use container images from the local registry is described in Section 16.1, “Configuring custom or local images”.
Resume the orchestrator:
cephuser@adm > ceph orch resume
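To verify that the adoption succeeded, you can list the containerized monitoring daemons; a short verification sketch (output omitted):
cephuser@adm > ceph orch ps --daemon_type prometheus
cephuser@adm > ceph orch ps --daemon_type alertmanager
cephuser@adm > ceph orch ps --daemon_type grafana
cephuser@adm > ceph orch ps --daemon_type node-exporter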
10.7 Gateway service redeployment #
10.7.1 Upgrading the Object Gateway #
In SUSE Enterprise Storage 7.1, the Object Gateways are always configured with a realm, which allows for multi-site (see Section 21.13, “Multisite Object Gateways” for more details) in the future. If you used a single-site Object Gateway configuration in SUSE Enterprise Storage 6, follow these steps to add a realm. If you do not plan to use the multi-site functionality, you can use default for the realm, zonegroup, and zone names.
Create a new realm:
cephuser@adm > radosgw-admin realm create --rgw-realm=REALM_NAME --default
Optionally, rename the default zone and zonegroup:
cephuser@adm > radosgw-admin zonegroup rename \
 --rgw-zonegroup default \
 --zonegroup-new-name=ZONEGROUP_NAME
cephuser@adm > radosgw-admin zone rename \
 --rgw-zone default \
 --zone-new-name ZONE_NAME \
 --rgw-zonegroup=ZONEGROUP_NAME
Configure the master zonegroup:
cephuser@adm > radosgw-admin zonegroup modify \
 --rgw-realm=REALM_NAME \
 --rgw-zonegroup=ZONEGROUP_NAME \
 --endpoints http://RGW.EXAMPLE.COM:80 \
 --master --default
Configure the master zone. For this, you will need the ACCESS_KEY and SECRET_KEY of an Object Gateway user with the system flag enabled. This is usually the admin user. To get the ACCESS_KEY and SECRET_KEY, run radosgw-admin user info --uid admin --rgw-zone=ZONE_NAME.
cephuser@adm > radosgw-admin zone modify \
 --rgw-realm=REALM_NAME \
 --rgw-zonegroup=ZONEGROUP_NAME \
 --rgw-zone=ZONE_NAME \
 --endpoints http://RGW.EXAMPLE.COM:80 \
 --access-key=ACCESS_KEY \
 --secret=SECRET_KEY \
 --master --default
Commit the updated configuration:
cephuser@adm > radosgw-admin period update --commit
To have the Object Gateway service containerized, create its specification file as described in Section 8.3.4, “Deploying Object Gateways”, and apply it.
cephuser@adm > ceph orch apply -i RGW.yml
After applying the new Object Gateway specification, run ceph config dump and inspect the lines that contain client.rgw. to see if there are any old settings that need to be applied to the new Object Gateway instances.
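For example, the following sketch filters the configuration dump for Object Gateway related entries; the grep pattern simply matches the client.rgw. prefix mentioned above:
cephuser@adm > ceph config dump | grep client.rgw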
10.7.2 Upgrading NFS Ganesha #
NFS Ganesha supports NFS version 4.1 and newer. It does not support NFS version 3.
The upgrade process disables the nfs module in the Ceph Manager daemon. You can re-enable it by executing the following command from the Admin Node:
cephuser@adm > ceph mgr module enable nfs
The following demonstrates how to migrate an existing NFS Ganesha service running Ceph Nautilus to an NFS Ganesha container running Ceph Pacific.
The following documentation requires you to have already successfully upgraded the core Ceph services.
NFS Ganesha stores additional per-daemon configuration and exports configuration in a RADOS pool. The configured RADOS pool can be found on the watch_url line of the RADOS_URLS block in the ganesha.conf file. By default, this pool will be named ganesha_config.
Before attempting any migration, we strongly recommend making a copy of the export and daemon configuration objects located in the RADOS pool. To locate the configured RADOS pool, run the following command:
cephuser@adm > grep -A5 RADOS_URLS /etc/ganesha/ganesha.conf
To list the contents of the RADOS pool:
cephuser@adm > rados --pool ganesha_config --namespace ganesha ls | sort
conf-node3
export-1
export-2
export-3
export-4
To copy the RADOS objects:
cephuser@adm > RADOS_ARGS="--pool ganesha_config --namespace ganesha"
cephuser@adm > OBJS=$(rados $RADOS_ARGS ls)
cephuser@adm > for obj in $OBJS; do rados $RADOS_ARGS get $obj $obj; done
cephuser@adm > ls -lah
total 40K
drwxr-xr-x 2 root root 4.0K Sep  8 03:30 .
drwx------ 9 root root 4.0K Sep  8 03:23 ..
-rw-r--r-- 1 root root   90 Sep  8 03:30 conf-node2
-rw-r--r-- 1 root root   90 Sep  8 03:30 conf-node3
-rw-r--r-- 1 root root  350 Sep  8 03:30 export-1
-rw-r--r-- 1 root root  350 Sep  8 03:30 export-2
-rw-r--r-- 1 root root  350 Sep  8 03:30 export-3
-rw-r--r-- 1 root root  358 Sep  8 03:30 export-4
On a per-node basis, any existing NFS Ganesha service needs to be stopped and then replaced with a container managed by cephadm.
Stop and disable the existing NFS Ganesha service:
cephuser@adm > systemctl stop nfs-ganesha
cephuser@adm > systemctl disable nfs-ganesha
After the existing NFS Ganesha service has been stopped, a new one can be deployed in a container using cephadm. To do so, you need to create a service specification that contains a service_id that will be used to identify this new NFS cluster, the host name of the node we are migrating listed as a host in the placement specification, and the RADOS pool and namespace that contain the configured NFS export objects. For example:
service_type: nfs
service_id: SERVICE_ID
placement:
  hosts:
  - node2
pool: ganesha_config
namespace: ganesha
For more information on creating a placement specification, see Section 8.2, “Service and placement specification”.
Apply the placement specification:
cephuser@adm > ceph orch apply -i FILENAME.yaml
Confirm the NFS Ganesha daemon is running on the host:
cephuser@adm > ceph orch ps --daemon_type nfs
NAME           HOST   STATUS         REFRESHED  AGE  VERSION  IMAGE NAME                                  IMAGE ID      CONTAINER ID
nfs.foo.node2  node2  running (26m)  8m ago     27m  3.3      registry.suse.com/ses/7.1/ceph/ceph:latest  8b4be7c42abd  c8b75d7c8f0d
Repeat these steps for each NFS Ganesha node. You do not need to create a separate service specification for each node. It is sufficient to add each node's host name to the existing NFS service specification and re-apply it.
The existing exports can be migrated in two different ways:
Manually re-create or re-assign them using the Ceph Dashboard.
Manually copy the contents of each per-daemon RADOS object into the newly created NFS Ganesha common configuration.
Determine the list of per-daemon RADOS objects:
cephuser@adm > RADOS_ARGS="--pool ganesha_config --namespace ganesha"
cephuser@adm > DAEMON_OBJS=$(rados $RADOS_ARGS ls | grep 'conf-')
Make a copy of the per-daemon RADOS objects:
cephuser@adm > for obj in $DAEMON_OBJS; do rados $RADOS_ARGS get $obj $obj; done
cephuser@adm > ls -lah
total 20K
drwxr-xr-x 2 root root 4.0K Sep  8 16:51 .
drwxr-xr-x 3 root root 4.0K Sep  8 16:47 ..
-rw-r--r-- 1 root root   90 Sep  8 16:51 conf-nfs.SERVICE_ID
-rw-r--r-- 1 root root   90 Sep  8 16:51 conf-node2
-rw-r--r-- 1 root root   90 Sep  8 16:51 conf-node3
Sort and merge into a single list of exports:
cephuser@adm > cat conf-* | sort -u > conf-nfs.SERVICE_ID
cephuser@adm > cat conf-nfs.foo
%url "rados://ganesha_config/ganesha/export-1"
%url "rados://ganesha_config/ganesha/export-2"
%url "rados://ganesha_config/ganesha/export-3"
%url "rados://ganesha_config/ganesha/export-4"
Write the new NFS Ganesha common configuration file:
cephuser@adm > rados $RADOS_ARGS put conf-nfs.SERVICE_ID conf-nfs.SERVICE_ID
Notify the NFS Ganesha daemon:
cephuser@adm > rados $RADOS_ARGS notify conf-nfs.SERVICE_ID conf-nfs.SERVICE_ID
Note: This action will cause the daemon to reload the configuration.
After the service has been successfully migrated, the Nautilus-based NFS Ganesha service can be removed.
Remove NFS Ganesha:
cephuser@adm > zypper rm nfs-ganesha
Reading installed packages...
Resolving package dependencies...
The following 5 packages are going to be REMOVED:
  nfs-ganesha nfs-ganesha-ceph nfs-ganesha-rados-grace nfs-ganesha-rados-urls nfs-ganesha-rgw
5 packages to remove.
After the operation, 308.9 KiB will be freed.
Continue? [y/n/v/...? shows all options] (y): y
(1/5) Removing nfs-ganesha-ceph-2.8.3+git0.d504d374e-3.3.1.x86_64 [done]
(2/5) Removing nfs-ganesha-rgw-2.8.3+git0.d504d374e-3.3.1.x86_64 [done]
(3/5) Removing nfs-ganesha-rados-urls-2.8.3+git0.d504d374e-3.3.1.x86_64 [done]
(4/5) Removing nfs-ganesha-rados-grace-2.8.3+git0.d504d374e-3.3.1.x86_64 [done]
(5/5) Removing nfs-ganesha-2.8.3+git0.d504d374e-3.3.1.x86_64 [done]
Additional rpm output:
warning: /etc/ganesha/ganesha.conf saved as /etc/ganesha/ganesha.conf.rpmsave
Remove the legacy cluster settings from the Ceph Dashboard:
cephuser@adm > ceph dashboard reset-ganesha-clusters-rados-pool-namespace
10.7.3 Upgrading the Metadata Server #
Unlike MONs, MGRs, and OSDs, Metadata Servers cannot be adopted in place. Instead, you need to redeploy them in containers using the Ceph orchestrator.
Run the ceph fs ls command to obtain the name of your file system, for example:
cephuser@adm > ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
Create a new service specification file mds.yml as described in Section 8.3.3, “Deploying Metadata Servers” by using the file system name as the service_id and specifying the hosts that will run the MDS daemons. For example:
service_type: mds
service_id: cephfs
placement:
  hosts:
  - ses-node1
  - ses-node2
  - ses-node3
Run the ceph orch apply -i mds.yml command to apply the service specification and start the MDS daemons.
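After the containerized MDS daemons are up, you can verify them and, if you reduced the number of active Metadata Servers to 1 before the upgrade, restore the previous value. A minimal sketch, assuming the file system is named cephfs and previously ran with two active MDS daemons:
cephuser@adm > ceph orch ps --daemon_type mds    # confirm the new MDS containers are running
cephuser@adm > ceph fs set cephfs max_mds 2      # restore the original number of active MDS daemons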
10.7.4 Upgrading the iSCSI Gateway #
To upgrade the iSCSI Gateway, you need to redeploy it in containers using the Ceph orchestrator. If you have multiple iSCSI Gateways, you need to redeploy them one-by-one to reduce the service downtime.
Stop and disable the existing iSCSI daemons on each iSCSI Gateway node:
> sudo systemctl stop rbd-target-gw
> sudo systemctl disable rbd-target-gw
> sudo systemctl stop rbd-target-api
> sudo systemctl disable rbd-target-api
Create a service specification for the iSCSI Gateway as described in Section 8.3.5, “Deploying iSCSI Gateways”. For this, you need the pool, trusted_ip_list, and api_* settings from the existing /etc/ceph/iscsi-gateway.cfg file. If you have SSL support enabled (api_secure = true), you also need the SSL certificate (/etc/ceph/iscsi-gateway.crt) and key (/etc/ceph/iscsi-gateway.key).
For example, if /etc/ceph/iscsi-gateway.cfg contains the following:
[config]
cluster_client_name = client.igw.ses-node5
pool = iscsi-images
trusted_ip_list = 10.20.179.203,10.20.179.201,10.20.179.205,10.20.179.202
api_port = 5000
api_user = admin
api_password = admin
api_secure = true
Then you need to create the following service specification file iscsi.yml:
service_type: iscsi
service_id: igw
placement:
  hosts:
  - ses-node5
spec:
  pool: iscsi-images
  trusted_ip_list: "10.20.179.203,10.20.179.201,10.20.179.205,10.20.179.202"
  api_port: 5000
  api_user: admin
  api_password: admin
  api_secure: true
  ssl_cert: |
    -----BEGIN CERTIFICATE-----
    MIIDtTCCAp2gAwIBAgIYMC4xNzc1NDQxNjEzMzc2MjMyXzxvQ7EcMA0GCSqGSIb3
    DQEBCwUAMG0xCzAJBgNVBAYTAlVTMQ0wCwYDVQQIDARVdGFoMRcwFQYDVQQHDA5T
    [...]
    -----END CERTIFICATE-----
  ssl_key: |
    -----BEGIN PRIVATE KEY-----
    MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC5jdYbjtNTAKW4
    /CwQr/7wOiLGzVxChn3mmCIF3DwbL/qvTFTX2d8bDf6LjGwLYloXHscRfxszX/4h
    [...]
    -----END PRIVATE KEY-----
Note: The pool, trusted_ip_list, api_port, api_user, api_password, and api_secure settings are identical to the ones from the /etc/ceph/iscsi-gateway.cfg file. The ssl_cert and ssl_key values can be copied in from the existing SSL certificate and key files. Verify that they are indented correctly and that the pipe character | appears at the end of the ssl_cert: and ssl_key: lines (see the content of the iscsi.yml file above).
Run the ceph orch apply -i iscsi.yml command to apply the service specification and start the iSCSI Gateway daemons.
Remove the old ceph-iscsi package from each of the existing iSCSI Gateway nodes:
cephuser@adm > zypper rm -u ceph-iscsi
10.8 Post-upgrade clean-up #
After the upgrade, perform the following clean-up steps:
Verify that the cluster was successfully upgraded by checking the current Ceph version:
cephuser@adm > ceph versions
Check that the osdspec_affinity entry is properly set for existing OSDs. If the ceph osd stat command shows some OSDs as not running or unknown, refer to https://www.suse.com/support/kb/doc/?id=000020667 to get more details on properly mapping OSDs to a service specification.
Make sure that no old OSDs will join the cluster:
cephuser@adm > ceph osd require-osd-release pacific
Set the pg_autoscale_mode of existing pools if necessary (see the sketch below):
Important: Pools in SUSE Enterprise Storage 6 had the pg_autoscale_mode set to warn by default. This resulted in a warning message in case of a suboptimal number of PGs, but autoscaling did not actually happen. The default in SUSE Enterprise Storage 7.1 is that the pg_autoscale_mode option is set to on for new pools, and PGs will actually autoscale. The upgrade process does not automatically change the pg_autoscale_mode of existing pools. If you want to change it to on to get the full benefit of the autoscaler, see the instructions in Section 17.4.12, “Enabling the PG auto-scaler”.
Find more details in Section 17.4.12, “Enabling the PG auto-scaler”.
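For example, the following sketch enables autoscaling for a single pool and then reviews the autoscaler recommendations; POOL_NAME is a placeholder, and you should repeat the first command for each pool you want to autoscale:
cephuser@adm > ceph osd pool set POOL_NAME pg_autoscale_mode on
cephuser@adm > ceph osd pool autoscale-status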
Set the FSMap sanity check back to its default value, and then remove the setting:
cephuser@adm > ceph config set mon mon_mds_skip_sanity false
cephuser@adm > ceph config rm mon mon_mds_skip_sanity
Prevent pre-Luminous clients:
cephuser@adm > ceph osd set-require-min-compat-client luminous
Enable the balancer module:
cephuser@adm > ceph balancer mode upmap
cephuser@adm > ceph balancer on
Find more details in Section 29.1, “Balancer”.
Optionally, enable the telemetry module:
cephuser@adm > ceph mgr module enable telemetry
cephuser@adm > ceph telemetry on
Find more details in Section 29.2, “Enabling the telemetry module”.