Orphaned Instance Cleanup
SUSE Storage can identify and clean up orphaned instances on each node.
Orphaned Runtime Instance
When a network outage affects a SUSE Storage node, it might leave behind engine or replica runtime instances that are no longer tracked by the SUSE Storage system. The corresponding engine and replica Custom Resources (CRs) might be removed or rescheduled to another node during the outage, so when the node recovers, SUSE Storage no longer tracks these runtime instances. Such instances, for example the engine and replica processes for v1 volumes, are called orphaned instances. Orphaned instances continue to consume node resources such as CPU and memory.
SUSE Storage supports the detection and cleanup of orphaned instances. It identifies these instances and creates orphan resources that describe them. By default, SUSE Storage does not automatically delete orphan resources. Users can trigger the deletion of orphaned instances manually or configure SUSE Storage to delete them automatically.
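For a quick check of whether any orphaned runtime instances have been detected, you can list the orphan resources by type. The command below uses the same label selector that appears in the examples later on this page:

# kubectl -n longhorn-system get orphan -l "longhorn.io/orphan-type in (engine-instance,replica-instance)"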
Example
The following example demonstrates how to manage orphaned instances using kubectl.
Manage Orphaned Instances via kubectl
- In this example, orphaned instance processes are running on the following nodes:
  - Orphaned replica instance on Node worker1:

    Name:         instance-manager-8ff396d6d3744979b32abafc6346781c
    Namespace:    longhorn-system
    Kind:         InstanceManager
    ...
    Status:
      Instance Replicas:
        pvc-569e44c0-b352-4aca-bf14-2cf7a6cfe86f-r-05660b73:   # This instance might be an orphan
          Spec:
            Data Engine:  v1
            Name:         pvc-569e44c0-b352-4aca-bf14-2cf7a6cfe86f-r-05660b73
          Status:
            Conditions:         <nil>
            Endpoint:
            Error Msg:
            Listen:
            Port End:           10020
            Port Start:         10011
            Resource Version:   0
            State:              running
            Target Port End:    0
            Target Port Start:  0
            Type:               replica
    ...
  - Orphaned engine instance on Node worker2:

    Name:         instance-manager-b87f10b867cec1dca2b814f5e78bcc90
    Namespace:    longhorn-system
    Kind:         InstanceManager
    ...
    Status:
      Instance Engines:
        pvc-569e44c0-b352-4aca-bf14-2cf7a6cfe86f-e-0:   # This instance might be an orphan
          Spec:
            Data Engine:  v1
            Name:         pvc-569e44c0-b352-4aca-bf14-2cf7a6cfe86f-e-0
          Status:
            Conditions:
              Filesystem Read Only:  false
            Endpoint:
            Error Msg:
            Listen:
            Port End:           10020
            Port Start:         10020
            Resource Version:   0
            State:              running
            Target Port End:    10020
            Target Port Start:  10020
            Type:               engine
    ...
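  If you do not know the instance manager names in advance, you can first list and then describe the instance managers on each node. This is a sketch that reuses the node label selector shown later in this example:

  # kubectl -n longhorn-system get instancemanager -l "longhorn.io/node=worker1"
  # kubectl -n longhorn-system describe instancemanager -l "longhorn.io/node=worker1"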
- SUSE Storage detects the orphaned instances and creates orphan resources describing them:

  NAME                                                                      TYPE               NODE
  orphan-1807009489e50534c35c350e22680449c97deca4e5d3b72f4591976145f8bc41   engine-instance    worker2
  orphan-a91aa42ab5eda6b8b9fe1116d5b5f5673e5108d89be3db6fd18a275913463eef   replica-instance   worker1
- You can view the list of orphan resources created by the SUSE Storage system by running kubectl -n longhorn-system get orphan:

  # kubectl -n longhorn-system get orphan
- Get the detailed information of one of the orphaned replica instances, recorded in spec.parameters, by running kubectl -n longhorn-system get orphan <name> -o yaml:

  apiVersion: longhorn.io/v1beta2
  kind: Orphan
  metadata:
    creationTimestamp: "2025-05-02T06:07:32Z"
    finalizers:
    - longhorn.io
    generation: 1
    labels:
      longhorn.io/component: orphan
      longhorn.io/managed-by: longhorn-manager
      longhorn.io/orphan-type: replica-instance
      longhornnode: worker1
      longhornreplica: pvc-569e44c0-b352-4aca-bf14-2cf7a6cfe86f-r-05660b73
    # ... (other metadata fields omitted)
  spec:
    dataEngine: v1
    nodeID: worker1
    orphanType: replica-instance
    parameters:
      InstanceManager: instance-manager-8ff396d6d3744979b32abafc6346781c
      InstanceName: pvc-569e44c0-b352-4aca-bf14-2cf7a6cfe86f-r-05660b73
  status:
    conditions:
    - lastProbeTime: ""
      lastTransitionTime: "2025-05-02T06:06:39Z"
      message: ""
      reason: running
      status: "True"
      type: InstanceExist
    - lastProbeTime: ""
      lastTransitionTime: "2025-05-02T06:06:39Z"
      message: ""
      reason: ""
      status: "False"
      type: Error
    ownerID: worker1
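  If you only need the owning instance manager and instance name, a jsonpath query against spec.parameters is a convenient shortcut. This is a sketch that reuses the orphan name from the listing above:

  # kubectl -n longhorn-system get orphan orphan-a91aa42ab5eda6b8b9fe1116d5b5f5673e5108d89be3db6fd18a275913463eef -o jsonpath='{.spec.parameters}{"\n"}'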
- Get the detailed information of one of the orphaned engine instances, recorded in spec.parameters, by running kubectl -n longhorn-system get orphan <name> -o yaml:

  apiVersion: longhorn.io/v1beta2
  kind: Orphan
  metadata:
    creationTimestamp: "2025-05-02T06:47:25Z"
    finalizers:
    - longhorn.io
    generation: 1
    labels:
      longhorn.io/component: orphan
      longhorn.io/managed-by: longhorn-manager
      longhorn.io/orphan-type: engine-instance
      longhornengine: pvc-569e44c0-b352-4aca-bf14-2cf7a6cfe86f-e-0
      longhornnode: worker2
    # ... (other metadata fields omitted)
  spec:
    dataEngine: v1
    nodeID: worker2
    orphanType: engine-instance
    parameters:
      InstanceManager: instance-manager-b87f10b867cec1dca2b814f5e78bcc90
      InstanceName: pvc-569e44c0-b352-4aca-bf14-2cf7a6cfe86f-e-0
  status:
    conditions:
    - lastProbeTime: ""
      lastTransitionTime: "2025-05-02T06:47:25Z"
      message: ""
      reason: running
      status: "True"
      type: InstanceExist
    - lastProbeTime: ""
      lastTransitionTime: "2025-05-02T06:47:25Z"
      message: ""
      reason: ""
      status: "False"
      type: Error
    ownerID: worker2
- You can delete an orphan resource by running kubectl -n longhorn-system delete orphan <name>. The corresponding orphaned instance will also be removed:

  # kubectl -n longhorn-system delete orphan orphan-a91aa42ab5eda6b8b9fe1116d5b5f5673e5108d89be3db6fd18a275913463eef
  # kubectl -n longhorn-system get orphan -l "longhorn.io/orphan-type in (engine-instance,replica-instance)"
  NAME                                                                      TYPE              NODE
  orphan-1807009489e50534c35c350e22680449c97deca4e5d3b72f4591976145f8bc41   engine-instance   worker2
  The orphaned instance is deleted:

  # kubectl -n longhorn-system describe instancemanager -l "longhorn.io/node=worker1"
  Name:         instance-manager-8ff396d6d3744979b32abafc6346781c
  Namespace:    longhorn-system
  Kind:         InstanceManager
  ...
  Status:
    Instance Replicas:
  ...
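  Similarly, you can delete every remaining orphan resource of a given type at once. This is a sketch that uses the longhorn.io/orphan-type label shown in the orphan metadata above, here removing all engine-instance orphans:

  # kubectl -n longhorn-system delete orphan -l "longhorn.io/orphan-type=engine-instance"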
- By default, SUSE Storage does not automatically delete orphaned instances. You can enable automatic deletion by configuring the orphan-resource-auto-deletion setting:

  # kubectl -n longhorn-system edit settings.longhorn.io orphan-resource-auto-deletion
  Then, add engineInstance and replicaInstance to the list as semicolon-separated items:

  NAME                            VALUE                            APPLIED   AGE
  orphan-resource-auto-deletion   engineInstance;replicaInstance   true      45h
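  If you prefer a non-interactive change, the same setting can also be patched directly. This is a sketch; it assumes engineInstance;replicaInstance is the complete list of resource types you want deleted automatically:

  # kubectl -n longhorn-system patch settings.longhorn.io orphan-resource-auto-deletion --type merge -p '{"value":"engineInstance;replicaInstance"}'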
- After enabling automatic deletion and waiting for a while, the orphan resources and the corresponding processes are deleted automatically:

  # kubectl -n longhorn-system get orphan -l "longhorn.io/orphan-type in (engine-instance,replica-instance)"
  No resources found in longhorn-system namespace.
  The orphaned instances are deleted from the instance managers:

  # kubectl -n longhorn-system describe instancemanager -l "longhorn.io/node=worker1"
  Name:         instance-manager-8ff396d6d3744979b32abafc6346781c
  Namespace:    longhorn-system
  Kind:         InstanceManager
  ...
  Status:
    Instance Replicas:
  ...

  # kubectl -n longhorn-system describe instancemanager -l "longhorn.io/node=worker2"
  Name:         instance-manager-b87f10b867cec1dca2b814f5e78bcc90
  Namespace:    longhorn-system
  Kind:         InstanceManager
  ...
  Status:
    Instance Engines:
  ...
Additionally, you can delete all orphaned instances on the specified node by running:
# kubectl -n longhorn-system delete orphan -l "longhorn.io/orphan-type in (engine-instance,replica-instance),longhornnode=<node name>"
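For example, to remove all orphaned engine and replica instances reported for worker1 in the scenario above:

# kubectl -n longhorn-system delete orphan -l "longhorn.io/orphan-type in (engine-instance,replica-instance),longhornnode=worker1"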
Manage Orphaned Instances via SUSE Storage UI
- In the top navigation bar, select Settings > Orphan Resources > Instances.
- Review the list of orphaned instances, which displays relevant instance information.
- To delete a specific orphaned instance, select Operation > Delete for that instance.
Deleting orphaned instances from the UI is a manual operation; by default, SUSE Storage still does not delete orphaned instances automatically. To enable automatic deletion, or to configure other orphaned-resource settings, navigate to Setting > General > Orphan and adjust the relevant options. Refer to the kubectl section above for details on the orphan-resource-auto-deletion setting.
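Whichever method you use, you can verify the current value of the setting from the command line, assuming you have kubectl access to the cluster:

# kubectl -n longhorn-system get settings.longhorn.io orphan-resource-auto-deletion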
Exceptions
SUSE Storage does not create an orphan resource in the following scenarios:
- The orphaned engine or replica instance is rescheduled back to its original node and is correctly tracked again.
- The engine or replica instance is in a transient state such as migrating, starting, or stopping.
- The node where the instance was running is evicted from the Kubernetes cluster.