10 Ceph OSD management #
10.1 Ceph OSD management #
Ceph Object Storage Daemons (OSDs) are the heart and soul of the Ceph storage platform. Each OSD manages a local device, and together they provide the distributed storage. Rook automates the creation and management of OSDs based on the desired state in the CephCluster CR, hiding as much of the complexity as possible. This guide walks through some of the scenarios where more configuration of OSDs may be required.
10.1.1 Analyzing OSD health #
The rook-ceph-tools
pod provides a simple environment to
run Ceph tools. The Ceph commands mentioned in this document should be
run from the toolbox.
Once created, connect to the pod to execute the ceph commands and analyze the health of the cluster, in particular the OSDs and placement groups (PGs):
kubectl@adm > kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
Some common commands to analyze OSDs include:
cephuser@adm > ceph status
cephuser@adm > ceph osd tree
cephuser@adm > ceph osd status
cephuser@adm > ceph osd df
cephuser@adm > ceph osd utilization
10.1.2 Adding an OSD #
To add more OSDs, simply add nodes or devices to your cluster: Rook automatically watches for them, and if they match the filters or other settings in the storage section of the cluster CR, the operator creates new OSDs.
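For reference, a minimal sketch of such a storage section is shown below; the node name, device name, and filter are placeholders for your own environment, not values from this guide:
storage:
  useAllNodes: false
  useAllDevices: false
  # deviceFilter: "^sd[b-c]"    # alternatively, match devices by a filter
  nodes:
    - name: "node-1"            # hypothetical node name
      devices:
        - name: "sdb"           # hypothetical device name
When a new node or device matching these settings appears, the operator creates an OSD on it without further manual steps.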
10.1.3 Adding an OSD on a PVC #
In more dynamic environments where storage can be dynamically provisioned with a raw block storage provider, the OSDs can be backed by PVCs.
To add more OSDs, you can either increase the count
of
the OSDs in an existing device set or you can add more device sets to the
cluster CR. The operator will then automatically create new OSDs according
to the updated cluster CR.
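As an illustration, a storageClassDeviceSets entry might look like the following sketch; the set name, storage class, and size are hypothetical and depend on your raw block storage provider:
storage:
  storageClassDeviceSets:
    - name: set1
      count: 3                  # increase this value to add more PVC-backed OSDs
      portable: true
      volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            resources:
              requests:
                storage: 100Gi                          # hypothetical size per OSD
            storageClassName: my-block-storage-class    # hypothetical storage class
            volumeMode: Block
            accessModes:
              - ReadWriteOnce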
10.1.4 Removing an OSD #
Removal of OSDs is intentionally not automated. Rook’s charter is to keep your data safe, not to delete it. If you are sure you need to remove OSDs, it can be done. We just want you to be in control of this action.
To remove an OSD due to a failed disk or other re-configuration, consider the following to ensure the health of the data throughout the removal process:
- Confirm you will have enough space on your cluster after removing your OSDs to properly handle the deletion.
- Confirm the remaining OSDs and their placement groups (PGs) are healthy, so that they can handle the rebalancing of the data.
- Do not remove too many OSDs at once, and wait for rebalancing to finish between removing multiple OSDs.
- On host-based clusters, you may need to stop the Rook Operator while performing OSD removal steps, to prevent Rook from detecting the old OSD and trying to re-create it before the disk is wiped or removed (see the example right after this list).
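If you need to stop the operator for this purpose, one way is to scale its deployment down and back up again; the deployment name below assumes a default Rook installation in the rook-ceph namespace:
kubectl@adm > kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
kubectl@adm > kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1
Scale the deployment back up only after the OSD removal steps are complete and the disk has been wiped or removed.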
If all the PGs are active+clean
and there are no
warnings about being low on space, this means the data is fully replicated
and it is safe to proceed. If an OSD is failing, the PGs will not be
perfectly clean, and you will need to proceed anyway.
10.1.4.1 From the toolbox #
1. Determine the OSD ID for the OSD to be removed. The OSD pod may be in an error state, such as CrashLoopBackOff, or the ceph commands in the toolbox may show which OSD is down.
2. Mark the OSD as out if not already marked as such by Ceph. This signals Ceph to start moving (backfilling) the data that was on that OSD to another OSD:
cephuser@adm > ceph osd out osd.ID
For example:
cephuser@adm > ceph osd out osd.23
3. Wait for the data to finish backfilling to other OSDs. ceph status will indicate the backfilling is done when all of the PGs are active+clean. It is safe to remove the disk after that.
4. Update your CephCluster CR such that the operator will not create an OSD on the device anymore. Depending on your CR settings, you may need to remove the device from the list or update the device filter (see the sketch after this procedure). If you are using useAllDevices: true, no change to the CR is necessary.
5. Remove the OSD from the Ceph cluster:
cephuser@adm > ceph osd purge ID --yes-i-really-mean-it
6. Verify the OSD is removed from the node in the CRUSH map:
cephuser@adm > ceph osd tree
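To illustrate step 4, if your CR lists individual devices per node, removing the entry for the failed device keeps the operator from re-creating the OSD on it. The node and device names below are hypothetical:
storage:
  nodes:
    - name: "node-1"
      devices:
        - name: "sdb"
        # the entry for the failed device (for example "sdc") has been removed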
10.1.4.2 Removing the OSD deployment #
The operator can automatically remove OSD deployments that are considered “safe-to-destroy” by Ceph. After the steps above, the OSD will be considered safe to remove since the data has all been moved to other OSDs. But this will only be done automatically by the operator if you have this setting in the cluster CR:
removeOSDsIfOutAndSafeToRemove: true
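A minimal sketch of where this option typically fits in the CephCluster CR (other fields omitted); the cluster name matches the rook-ceph example used elsewhere in this guide:
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  removeOSDsIfOutAndSafeToRemove: true
  # ... remaining cluster settings ...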
Otherwise, you will need to delete the deployment directly:
kubectl@adm > kubectl delete deployment -n rook-ceph rook-ceph-osd-ID
10.1.5 Replacing an OSD #
To replace a disk that has failed:
1. Follow the steps in Section 10.1.4, “Removing an OSD”.
2. Replace the physical device and verify the new device is attached.
3. Check if your cluster CR will find the new device. If you are using useAllDevices: true, you can skip this step. If your cluster CR lists individual devices or uses a device filter, you may need to update the CR.
4. Ideally, the operator will automatically create the new OSD within a few minutes of adding the new device or updating the CR. If you do not see a new OSD automatically created, restart the operator (by deleting the operator pod, as shown below) to trigger the OSD creation.
5. Verify that the OSD was created on the node by running ceph osd tree from the toolbox.
The OSD might have a different ID than the previous OSD that was replaced.
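One way to restart the operator, as mentioned in step 4, is to delete its pod and let the deployment re-create it; the label below assumes a default Rook installation:
kubectl@adm > kubectl -n rook-ceph delete pod -l app=rook-ceph-operator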
10.1.6 Removing an OSD from a PVC #
If you have installed your OSDs on top of PVCs and you desire to reduce the size of your cluster by removing OSDs:
1. Shrink the number of OSDs in the storageClassDeviceSet in the CephCluster CR:
kubectl@adm > kubectl -n rook-ceph edit cephcluster rook-ceph
Reduce the count of the OSDs to the desired number. Rook will not take any action to automatically remove the extra OSD(s), but will effectively stop managing the orphaned OSD.
2. Identify the orphaned PVC that belongs to the orphaned OSD.
Note: The orphaned PVC will have the highest index among the PVCs for the device set.
kubectl@adm > kubectl -n rook-ceph get pvc -l ceph.rook.io/DeviceSet=deviceSet
For example, if the device set is named set1 and the count was reduced from 3 to 2, the orphaned PVC would have the index 2 and might be named set1-2-data-vbwcf.
3. Identify the orphaned OSD.
Note: The OSD assigned to the PVC can be found in the labels on the PVC.
kubectl@adm > kubectl -n rook-ceph get pod -l ceph.rook.io/pvc=ORPHANED_PVC -o yaml | grep ceph-osd-id
For example, this might return:
ceph-osd-id: "0"
4. Proceed with the steps in Section 10.1.4, “Removing an OSD” for the orphaned OSD ID.
5. If desired, delete the orphaned PVC after the OSD is removed (an example is shown after this procedure).
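Continuing the earlier example, the orphaned PVC from device set set1 could then be deleted as follows:
kubectl@adm > kubectl -n rook-ceph delete pvc set1-2-data-vbwcf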