This is unreleased documentation for SUSE® Storage 1.12 (Dev).
SUSE® Storage Metrics for Monitoring
Volume
| Name | Description | Example |
|---|---|---|
| | Actual space used by each replica of the volume on the corresponding node | |
| | Configured size in bytes for this volume | |
| | Encrypted volume | |
| | Volume state. This metric uses the | |
| | Volume robustness. This metric uses the | |
| | Read throughput of this volume (Bytes/s) | |
| | Write throughput of this volume (Bytes/s) | |
| | Read IOPS of this volume | |
| | Write IOPS of this volume | |
| | Read latency of this volume (ns) | |
| | Write latency of this volume (ns) | |
| | Indicates that the volume is in read-only mode. The value is 1 while the volume is read-only; otherwise no record is emitted for that volume | |
| | Unix timestamp of the last successful backup of this volume, or 0 if no such backup exists | |
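
The last-backup timestamp described above lends itself to a simple staleness check. The following is a minimal sketch, not part of the product: the function names and the age threshold are hypothetical, and the timestamp value is assumed to have already been read from your monitoring system.

```python
import time
from typing import Optional


def backup_age_seconds(last_backup_ts: float, now: Optional[float] = None) -> Optional[float]:
    """Return the seconds elapsed since the last successful backup.

    Returns None when the metric value is 0, which the table above
    defines as "no successful backup exists".
    """
    if last_backup_ts == 0:
        return None
    if now is None:
        now = time.time()
    # Clamp at zero in case of small clock skew between hosts.
    return max(0.0, now - last_backup_ts)


def backup_is_stale(last_backup_ts: float, max_age_seconds: float,
                    now: Optional[float] = None) -> bool:
    """Treat a missing backup, or one older than max_age_seconds, as stale."""
    age = backup_age_seconds(last_backup_ts, now)
    return age is None or age > max_age_seconds
```

Passing `now` explicitly keeps the helpers deterministic, which is convenient when the check runs against recorded samples rather than live data.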
Node
| Name | Description | Example |
|---|---|---|
| | Status of this node: 1=true, 0=false | |
| | Total number of nodes in the Longhorn system | |
| | The maximum allocatable CPU on this node | |
| | The CPU usage on this node | |
| | The maximum allocatable memory on this node | |
| | The memory usage on this node | |
| | The storage capacity of this node | |
| | The used storage of this node | |
| | The reserved storage for other applications and system on this node | |
Replica
| Name | Description | Example |
|---|---|---|
| | Static metadata for each Replica CR | |
| | Current runtime state of the replica: running, stopped, error, starting, stopping, unknown | |
Engine
| Name | Description | Example |
|---|---|---|
| | Static metadata for each Engine CR | |
| | Runtime state of an engine: running, stopped, error, starting, stopping, unknown | |
| | The mode reported for each replica by the engine: RW, WO, ERR | |
| | Engine rebuild progress, ranging from 0 to 100 percent. This metric is visible only when a replica is rebuilding. | |
Disk
| Name | Description | Example |
|---|---|---|
| | The storage capacity of this disk | |
| | The used storage of this disk | |
| | The reserved storage for other applications and system on this disk | |
| | The status of this disk | |
| | Read throughput of this disk (Bytes/s) | |
| | Write throughput of this disk (Bytes/s) | |
| | Read IOPS of this disk | |
| | Write IOPS of this disk | |
| | Read latency of this disk (nanoseconds) | |
| | Write latency of this disk (nanoseconds) | |
| | Disk health status (1 = healthy, 0 = unhealthy). See [Disk Health Monitoring](../disk-heath) for details | |
| | Raw SMART health attribute value for the disk. Available only when SMART data is supported. See Disk Health Monitoring for details | |
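
The throughput, IOPS, and latency values above can be combined into figures that are easier to read on a dashboard. The sketch below is illustrative arithmetic only; the function names are hypothetical, and the inputs are assumed to be samples of the Bytes/s, IOPS, and nanosecond metrics taken at the same moment.

```python
def average_io_size_bytes(throughput_bytes_per_s: float, iops: float) -> float:
    """Average size of one I/O request: bytes per second divided by requests per second."""
    if iops <= 0:
        # No I/O in the sampling window, so there is no meaningful average.
        return 0.0
    return throughput_bytes_per_s / iops


def ns_to_ms(latency_ns: float) -> float:
    """Convert a latency reported in nanoseconds to milliseconds."""
    return latency_ns / 1_000_000
```

For instance, 4,194,304 Bytes/s at 1,024 IOPS corresponds to an average request size of 4,096 bytes, and a latency of 2,500,000 ns is 2.5 ms.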
Instance Manager
| Name | Description | Example |
|---|---|---|
| | The CPU usage of this Longhorn instance manager | |
| | Requested CPU resources in Kubernetes of this Longhorn instance manager | |
| | The memory usage of this Longhorn instance manager | |
| | Requested memory in Kubernetes of this Longhorn instance manager | |
| | The number of proxy gRPC connections of this Longhorn instance manager | |
Manager
| Name | Description | Example |
|---|---|---|
| | The CPU usage of this Longhorn Manager | |
| | The memory usage of this Longhorn Manager | |
Backup
| Name | Description | Example |
|---|---|---|
| | Actual size of this backup | |
| | State of this backup: 0=New, 1=Pending, 2=InProgress, 3=Completed, 4=Error, 5=Unknown | |
| | Uploaded data size of this backup | |
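
Because the backup state is reported as a number, it can help to decode it back into a readable name before alerting or logging. The following sketch mirrors the 0 to 5 mapping listed above; the enum and function names are hypothetical, and falling back to `UNKNOWN` for unexpected values is an assumption of this example rather than documented behavior.

```python
from enum import IntEnum


class BackupState(IntEnum):
    """Numeric backup states as listed in the Backup table."""
    NEW = 0
    PENDING = 1
    IN_PROGRESS = 2
    COMPLETED = 3
    ERROR = 4
    UNKNOWN = 5


def decode_backup_state(value: float) -> BackupState:
    """Map a sampled metric value to a BackupState.

    Values outside the documented range fall back to UNKNOWN
    (an assumption made for this sketch).
    """
    try:
        return BackupState(int(value))
    except ValueError:
        return BackupState.UNKNOWN
```

Metric samples usually arrive as floats, so the helper converts to `int` before the lookup.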
Backup Target
| Name | Description | Example |
|---|---|---|
| | Number of backup volumes on this backup target | |
Backup Volume
| Name | Description | Example |
|---|---|---|
| | Number of backups belonging to this backup volume | |
Snapshot
| Name | Description | Example |
|---|---|---|
| | Actual size of this snapshot | |
BackingImage
| Name | Description | Example |
|---|---|---|
| | Actual size of this backing image | |
| | State of this backing image: 0=Pending, 1=Starting, 2=InProgress, 3=ReadyForTransfer, 4=Ready, 5=Failed, 6=FailedAndCleanUp, 7=Unknown | |
BackupBackingImage
| Name | Description | Example |
|---|---|---|
| | Actual size of this backup backing image | |
| | State of this backup backing image: 0=New, 1=Pending, 2=InProgress, 3=Completed, 4=Error, 5=Unknown | |
CSI
The CSI sidecar components expose built-in metrics that give insight into CSI operations. These metrics cover total call count, error count, and call latency. Longhorn enables them by setting the --http-endpoint flag on each CSI sidecar component. You can collect the metrics with a Prometheus Operator PodMonitor.
| Name | Port |
|---|---|
| longhorn-csi-attacher | 8000 |
| longhorn-csi-provisioner | 8000 |
| longhorn-csi-resizer | 8000 |
| longhorn-csi-snapshotter | 8000 |
The CSI sidecar components report these metrics in histogram format. For example, the following sample records how long successful ControllerPublishVolume calls (the CSI operation that attaches a Longhorn volume to a node) took:
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="0.1"} 0
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="0.25"} 0
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="0.5"} 0
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="1"} 0
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="2.5"} 3
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="5"} 3
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="10"} 3
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="15"} 9
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="25"} 9
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="50"} 9
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="120"} 9
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="300"} 9
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="600"} 9
csi_sidecar_operations_seconds_bucket{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume",le="+Inf"} 9
csi_sidecar_operations_seconds_sum{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume"} 66.816478825
csi_sidecar_operations_seconds_count{driver_name="driver.longhorn.io",grpc_status_code="OK",method_name="/csi.v1.Controller/ControllerPublishVolume"} 9
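
To make the histogram easier to interpret, the sketch below computes the mean call latency from the `_sum` and `_count` series and estimates a quantile by linear interpolation inside the matching bucket, similar in spirit to PromQL's `histogram_quantile`. The bucket values are copied from the sample above; the function names are hypothetical, and the calculation works on a single snapshot of cumulative counters, whereas a production query would normally apply `rate()` over a time window first.

```python
import math

# Cumulative bucket counts from the sample: (upper bound in seconds, count).
BUCKETS = [
    (0.1, 0), (0.25, 0), (0.5, 0), (1, 0), (2.5, 3), (5, 3), (10, 3),
    (15, 9), (25, 9), (50, 9), (120, 9), (300, 9), (600, 9), (math.inf, 9),
]
TOTAL_SECONDS = 66.816478825  # csi_sidecar_operations_seconds_sum
TOTAL_CALLS = 9               # csi_sidecar_operations_seconds_count


def mean_latency(total_seconds: float, total_calls: int) -> float:
    """Average seconds per call: sum divided by count."""
    return total_seconds / total_calls if total_calls else 0.0


def estimate_quantile(q: float, buckets) -> float:
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    Assumes observations are spread evenly inside each bucket, so the
    result is an approximation bounded by the bucket edges.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            in_bucket = count - prev_count
            if math.isinf(bound) or in_bucket == 0:
                # No finite upper edge or no observations to interpolate over.
                return prev_bound
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


print(f"mean: {mean_latency(TOTAL_SECONDS, TOTAL_CALLS):.3f} s")
print(f"p50:  {estimate_quantile(0.5, BUCKETS):.2f} s")
```

In this sample, three calls finished within 2.5 seconds and the remaining six took between 10 and 15 seconds, so the estimated median falls in the 10 to 15 second bucket while the mean is about 7.4 seconds.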