D Ceph maintenance updates based on upstream 'Octopus' point releases
Several key packages in SUSE Enterprise Storage 7 are based on the Octopus release series of Ceph. When the Ceph project (https://github.com/ceph/ceph) publishes new point releases in the Octopus series, SUSE Enterprise Storage 7 is updated to ensure that the product benefits from the latest upstream bug fixes and feature backports.
This chapter summarizes the notable changes in each upstream point release that has been, or is planned to be, included in the product.
Octopus 15.2.11 Point Release
This release includes a security fix that ensures the global_id value (a numeric value that should be unique for every authenticated client or daemon in the cluster) is reclaimed after a network disconnect or ticket renewal in a secure fashion. Two new health alerts may appear during the upgrade, indicating that there are clients or daemons that are not yet patched with the appropriate fix.
To temporarily mute the health alerts around insecure clients for the duration of the upgrade, you may want to run:
cephuser@adm > ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM 1h
cephuser@adm > ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1h
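To find out which clients or daemons still need the fix, you can inspect the details of the two health alerts, which list the affected entities. This is a minimal check, assuming the alerts are still being raised:
cephuser@adm > ceph health detail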
When all clients are updated, enable the new secure behavior, which does not allow old, insecure clients to join the cluster:
cephuser@adm > ceph config set mon auth_allow_insecure_global_id_reclaim false
For more details, refer to https://docs.ceph.com/en/latest/security/CVE-2021-20288/.
Octopus 15.2.10 Point Release
This backport release includes the following fixes:
The containers include an updated tcmalloc that avoids crashes seen on 15.2.9.
RADOS: BlueStore handling of huge (>4GB) writes from RocksDB to BlueFS has been fixed.
When upgrading from a previous cephadm release, systemctl may hang when trying to start or restart the monitoring containers. This is caused by a change in the systemd unit to use type=forking. After the upgrade, please run:
cephuser@adm > ceph orch redeploy nfs
cephuser@adm > ceph orch redeploy iscsi
cephuser@adm > ceph orch redeploy node-exporter
cephuser@adm > ceph orch redeploy prometheus
cephuser@adm > ceph orch redeploy grafana
cephuser@adm > ceph orch redeploy alertmanager
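After the redeploys, it is worth confirming that the affected services are running again. A minimal check, assuming the cephadm orchestrator backend is active (the exact columns of the output may vary between point releases):
cephuser@adm > ceph orch ls
cephuser@adm > ceph orch ps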
Octopus 15.2.9 Point Release
This backport release includes the following fixes:
MGR: The progress module can now be turned on or off using the commands ceph progress on and ceph progress off.
OSD: PG removal has been optimized in this release.
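As an illustration (the module is enabled by default, so this is only a sketch), the progress bars that the module adds to the output of ceph -s can be suppressed during maintenance and re-enabled afterwards:
cephuser@adm > ceph progress off
cephuser@adm > ceph progress on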
Octopus 15.2.8 Point Release
This release fixes a security flaw in CephFS and includes a number of bug fixes:
OpenStack Manila's use of the ceph_volume_client.py library allowed tenant access to any Ceph credential's secret.
ceph-volume: The lvm batch subcommand received a major rewrite. This closes a number of bugs and improves usability in terms of size specification and calculation, as well as idempotency behaviour and the disk replacement process (see the example after this list). Please refer to https://docs.ceph.com/en/latest/ceph-volume/lvm/batch/ for more detailed information.
MON: The cluster log now logs health detail every mon_health_to_clog_interval, which has been changed from 1 hour to 10 minutes. Logging of health detail is skipped if the health summary has not changed since it was last logged.
The ceph df command now lists the number of PGs in each pool.
The bluefs_preextend_wal_files option has been removed.
It is now possible to specify the initial monitor to contact for Ceph tools and daemons using the mon_host_override config option or the --mon-host-override command line switch (see the example after this list). This should generally only be used for debugging, and it only affects the initial communication with Ceph's monitor cluster.
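Two brief illustrations of the items above. The rewritten lvm batch subcommand can do a dry run on an OSD node (for example, from within cephadm shell); the --report flag only prints the proposed layout, and the device paths below are placeholders:
ceph-volume lvm batch --report /dev/sdb /dev/sdc
The initial monitor can likewise be overridden for a single command; the address below is a placeholder and must point to one of your monitors:
cephuser@adm > ceph -s --mon-host-override 192.168.100.10:6789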
Octopus 15.2.7 Point Release
This release fixes a serious bug in RGW that has been shown to cause data loss when a read of a large RGW object (for example, one with at least one tail segment) takes longer than one half the time specified in the configuration option rgw_gc_obj_min_wait. The bug causes the tail segments of that read object to be added to the RGW garbage collection queue, which will in turn cause them to be deleted after a period of time.
Octopus 15.2.6 Point Release
This release fixes a security flaw affecting Messenger V2 for Octopus and Nautilus.
Octopus 15.2.5 Point Release
The Octopus point release 15.2.5 brought the following fixes and other changes:
CephFS: Automatic static sub-tree partitioning policies may now be configured using the new distributed and random ephemeral pinning extended attributes on directories (see the example after this list). See the following documentation for more information: https://docs.ceph.com/docs/master/cephfs/multimds/
Monitors now have a configuration option mon_osd_warn_num_repaired, which is set to 10 by default. If any OSD has repaired more than this many I/O errors in stored data, an OSD_TOO_MANY_REPAIRS health warning is generated.
Now, when the noscrub and/or nodeep-scrub flags are set globally or per pool, scheduled scrubs of the disabled type will be aborted. All user-initiated scrubs are NOT interrupted.
Fixed an issue with osdmaps not being trimmed in a healthy cluster.
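A minimal sketch of the new ephemeral pinning attributes, run on a client where the CephFS file system is mounted; the mount point and directory names below are placeholders. Distributed pinning spreads a directory's immediate children across the active MDS daemons, while random pinning exports descendant sub-trees with the given probability:
setfattr -n ceph.dir.pin.distributed -v 1 /mnt/cephfs/home
setfattr -n ceph.dir.pin.random -v 0.01 /mnt/cephfs/tmp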
Octopus 15.2.4 Point Release
The Octopus point release 15.2.4 brought the following fixes and other changes:
CVE-2020-10753: rgw: sanitize newlines in s3 CORSConfiguration’s ExposeHeader
Object Gateway: The radosgw-admin sub-commands dealing with orphans, namely radosgw-admin orphans find, radosgw-admin orphans finish, and radosgw-admin orphans list-jobs, have been deprecated. They had not been actively maintained, and since they store intermediate results on the cluster, they could potentially fill a nearly-full cluster. They have been replaced by a tool, rgw-orphan-list, which is currently considered experimental (see the example at the end of this list).
RBD: The name of the RBD pool object that is used to store the RBD trash purge schedule has changed from rbd_trash_trash_purge_schedule to rbd_trash_purge_schedule. Users that have already started using the RBD trash purge schedule functionality and have per-pool or per-namespace schedules configured should copy the rbd_trash_trash_purge_schedule object to rbd_trash_purge_schedule before the upgrade and remove rbd_trash_trash_purge_schedule using the following commands in every RBD pool and namespace where a trash purge schedule was previously configured:
rados -p pool-name [-N namespace] cp rbd_trash_trash_purge_schedule rbd_trash_purge_schedule
rados -p pool-name [-N namespace] rm rbd_trash_trash_purge_schedule
Alternatively, use any other convenient way to restore the schedule after the upgrade.
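As a sketch of the replacement orphan-scanning tool: rgw-orphan-list searches an RGW data pool for RADOS objects that are no longer referenced by RGW. The pool name below is a placeholder (default.rgw.buckets.data is a common default), and since the tool is experimental, its invocation and output format may change:
cephuser@adm > rgw-orphan-list default.rgw.buckets.data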
Octopus 15.2.3 Point Release
The Octopus point release 15.2.3 was a hot-fix release to address an issue where WAL corruption was seen when bluefs_preextend_wal_files and bluefs_buffered_io were enabled at the same time. The fix in 15.2.3 is only a temporary measure (changing the default value of bluefs_preextend_wal_files to false). The permanent fix is to remove the bluefs_preextend_wal_files option completely; this happened in the 15.2.8 point release (see above).
Octopus 15.2.2 Point Release
The Octopus point release 15.2.2 patched one security vulnerability:
CVE-2020-10736: Fixed an authorization bypass in MONs and MGRs
Octopus 15.2.1 Point Release
The Octopus point release 15.2.1 fixed an issue where upgrading quickly from Luminous (SES5.5) to Nautilus (SES6) to Octopus (SES7) caused OSDs to crash. In addition, it patched two security vulnerabilities that were present in the initial Octopus (15.2.0) release:
CVE-2020-1759: Fixed nonce reuse in msgr V2 secure mode
CVE-2020-1760: Fixed XSS because of RGW GetObject header-splitting