This is unreleased documentation for SUSE® Storage 1.10 (Dev).

Important Notes

Please see here for the full release notes.

Warning

Upgrade

If your SUSE Storage cluster was initially deployed with a version earlier than v1.3.0, the Custom Resources (CRs) were created by using the v1beta1 APIs. While the upgrade from SUSE Storage v1.8 to v1.9 automatically migrates all CRs to the new v1beta2 version, a manual CR migration is strongly advised before you upgrade from SUSE Storage v1.9 to v1.10.

Certain operations, such as an etcd or CRD restore, might leave behind v1beta1 data. Manually migrating your CRs ensures that all SUSE Storage data is properly updated to the v1beta2 API, which prevents potential compatibility issues and unexpected behavior with the new SUSE Storage version.

Following the manual migration, verify that v1beta1 has been removed from the CRD stored versions to ensure completion and a successful upgrade.

Migration Requirement Before SUSE Storage v1.10 Upgrade

Before you upgrade from SUSE Storage v1.9 to v1.10, perform the following manual CRD storage version migration.

If your SUSE Storage installation uses a namespace other than longhorn-system, replace longhorn-system with your custom namespace throughout the commands.

# Temporarily disable the CR validation webhook to allow updating read-only settings CRs.
kubectl patch validatingwebhookconfiguration longhorn-webhook-validator \
  --type=merge \
  -p "$(kubectl get validatingwebhookconfiguration longhorn-webhook-validator -o json | \
  jq '.webhooks[0].rules |= map(if .apiGroups == ["longhorn.io"] and .resources == ["settings"] then
    .operations |= map(select(. != "UPDATE")) else . end)')"

# Migrate every CRD that has ever stored v1beta1 resources.
migration_time="$(date +%Y-%m-%dT%H:%M:%S)"
crds=($(kubectl get crd -l app.kubernetes.io/name=longhorn -o json | jq -r '.items[] | select(.status.storedVersions | index("v1beta1")) | .metadata.name'))
for crd in "${crds[@]}"; do
  echo "Migrating ${crd} ..."
  for name in $(kubectl -n longhorn-system get "$crd" -o jsonpath='{.items[*].metadata.name}'); do
    # Add an annotation so that each resource is rewritten in the latest storage version.
    kubectl patch "${crd}" "${name}" -n longhorn-system --type=merge -p='{"metadata":{"annotations":{"migration-time":"'"${migration_time}"'"}}}'
  done
  # Clean up the stored version in CRD status
  kubectl patch crd "${crd}" --type=merge -p '{"status":{"storedVersions":["v1beta2"]}}' --subresource=status
done

# Re-enable the CR validation webhook.
kubectl patch validatingwebhookconfiguration longhorn-webhook-validator \
  --type=merge \
  -p "$(kubectl get validatingwebhookconfiguration longhorn-webhook-validator -o json | \
  jq '.webhooks[0].rules |= map(if .apiGroups == ["longhorn.io"] and .resources == ["settings"] then
    .operations |= (. + ["UPDATE"] | unique) else . end)')"
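For installations in a custom namespace, editing every command by hand is easy to get wrong. The following is a minimal sketch, assuming the commands above were saved to a file; the function name use_namespace and the file name in the usage comment are hypothetical. It rewrites each occurrence of longhorn-system in a script read from standard input.

```shell
# Rewrite every "longhorn-system" in a script read from stdin to the given namespace.
# The function name is hypothetical and used for illustration only.
use_namespace() (
  ns="$1"
  # Accept only a valid DNS-1123 label, which also keeps the sed expression safe.
  printf '%s' "$ns" | grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?$' || {
    echo "invalid namespace: ${ns}" >&2
    exit 1
  }
  sed "s/longhorn-system/${ns}/g"
)

# Usage (file names are assumptions):
# use_namespace my-storage-ns < migrate-crs.sh > migrate-crs-custom.sh
```

Review the generated script before you run it against a cluster.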

Migration Verification

After running the script, verify the CRD stored versions using this command:

kubectl get crd -l app.kubernetes.io/name=longhorn -o=jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.storedVersions}{"\n"}{end}'

Before upgrading to SUSE Storage v1.10, you must confirm that all SUSE Storage Custom Resource Definitions (CRDs) have only "v1beta2" listed in their storedVersions. The "v1beta1" version must be completely removed.

Example of successful output:

backingimagedatasources.longhorn.io: ["v1beta2"]
backingimagemanagers.longhorn.io: ["v1beta2"]
backingimages.longhorn.io: ["v1beta2"]
backupbackingimages.longhorn.io: ["v1beta2"]
backups.longhorn.io: ["v1beta2"]
backuptargets.longhorn.io: ["v1beta2"]
backupvolumes.longhorn.io: ["v1beta2"]
engineimages.longhorn.io: ["v1beta2"]
engines.longhorn.io: ["v1beta2"]
instancemanagers.longhorn.io: ["v1beta2"]
nodes.longhorn.io: ["v1beta2"]
orphans.longhorn.io: ["v1beta2"]
recurringjobs.longhorn.io: ["v1beta2"]
replicas.longhorn.io: ["v1beta2"]
settings.longhorn.io: ["v1beta2"]
sharemanagers.longhorn.io: ["v1beta2"]
snapshots.longhorn.io: ["v1beta2"]
supportbundles.longhorn.io: ["v1beta2"]
systembackups.longhorn.io: ["v1beta2"]
systemrestores.longhorn.io: ["v1beta2"]
volumeattachments.longhorn.io: ["v1beta2"]
volumes.longhorn.io: ["v1beta2"]

With these steps completed, the SUSE Storage upgrade to v1.10 should now proceed without issues.
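Checking more than twenty lines by eye is error-prone, so the comparison can be scripted. The following is a minimal sketch, assuming the verification command prints one "name: versions" line per CRD as in the example output; the function name check_stored_versions is hypothetical.

```shell
# Read "<crd>: <storedVersions>" lines on stdin and fail if any CRD lists
# something other than exactly ["v1beta2"]. The function name is hypothetical.
check_stored_versions() {
  awk '
    BEGIN { bad = 0 }
    NF == 0 { next }                                # skip blank lines
    {
      name = $0;     sub(/: .*/, "", name)          # text before ": "
      versions = $0; sub(/^[^:]*: /, "", versions)  # text after ": "
      if (versions != "[\"v1beta2\"]") {
        printf "NOT MIGRATED: %s %s\n", name, versions
        bad = 1
      }
    }
    END { exit bad }
  '
}

# Usage (requires cluster access):
# kubectl get crd -l app.kubernetes.io/name=longhorn \
#   -o=jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.storedVersions}{"\n"}{end}' \
#   | check_stored_versions && echo "All CRDs store only v1beta2."
```

The function exits with a non-zero status when any CRD still lists v1beta1, so it can gate an upgrade script.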

Troubleshooting CRD Upgrade Failures During Upgrade to SUSE Storage v1.10

If you did not apply the required pre-upgrade migration steps and the CRs are not fully migrated to v1beta2, the longhorn-manager Pods may fail to operate correctly. A common error message for this issue is:

Upgrade failed: cannot patch "backingimagedatasources.longhorn.io" with kind CustomResourceDefinition: CustomResourceDefinition.apiextensions.k8s.io "backingimagedatasources.longhorn.io" is invalid: status.storedVersions[0]: Invalid value: "v1beta1": missing from spec.versions; v1beta1 was previously a storage version, and must remain in spec.versions until a storage migration ensures no data remains persisted in v1beta1 and removes v1beta1 from status.storedVersions

To fix this issue, you must perform a forced downgrade back to the exact SUSE Storage v1.9.x version that was running before the failed upgrade attempt.
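Before you downgrade, it can help to record which CRD and which stale version the error refers to. The following is a minimal sketch, assuming the upgrade output contains a line in the same form as the error message above; the function name stale_crd_from_error is hypothetical.

```shell
# Print "<crd> <stale-version>" for each matching error line read from stdin.
# The function name is hypothetical; the pattern follows the error text shown above.
stale_crd_from_error() {
  sed -n 's/.*cannot patch "\([^"]*\)" with kind CustomResourceDefinition.*Invalid value: "\([^"]*\)": missing from spec\.versions.*/\1 \2/p'
}

# Usage (upgrade.log is an assumed file that holds the captured upgrade output):
# stale_crd_from_error < upgrade.log
```

Lines that do not match the pattern produce no output.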

Downgrade Procedure (kubectl Installation)

If SUSE Storage was installed by using kubectl, you must patch the current-longhorn-version setting before you downgrade. In the following commands, replace "v1.9.1" with the version that was running before the upgrade.

# Attach an annotation that allows patching current-longhorn-version.
kubectl patch settings.longhorn.io current-longhorn-version -n longhorn-system --type=merge -p='{"metadata":{"annotations":{"longhorn.io/update-setting-from-longhorn":""}}}'
# Temporarily override the current version so that the older version can be installed.
# Replace "v1.9.1" with the version that was running before the upgrade.
kubectl patch settings.longhorn.io current-longhorn-version -n longhorn-system --type=merge -p='{"value":"v1.9.1"}'

After modifying current-longhorn-version, you can proceed to downgrade to the original SUSE Storage v1.9.x deployment.
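Because the patched value must name a real v1.9.x release, a small guard can catch a placeholder that was left in by mistake. The following is a minimal sketch; the function name version_patch is hypothetical, and the accepted pattern (v1.9 followed by a numeric patch level) is an assumption that excludes pre-release tags.

```shell
# Print the merge patch for current-longhorn-version, accepting only v1.9.<number>.
# The function name and the accepted pattern are assumptions for illustration.
version_patch() (
  version="$1"
  printf '%s' "$version" | grep -Eq '^v1\.9\.[0-9]+$' || {
    echo "expected a v1.9.x release such as v1.9.1, got: ${version}" >&2
    exit 1
  }
  printf '{"value":"%s"}\n' "$version"
)

# Usage (requires cluster access):
# kubectl patch settings.longhorn.io current-longhorn-version -n longhorn-system \
#   --type=merge -p "$(version_patch v1.9.1)"
```

The literal placeholder v1.9.x is rejected, as is any v1.10 value.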

Downgrade Procedure (Helm Installation)

If SUSE Storage was installed by using Helm, the downgrade is allowed by disabling the preUpgradeChecker.upgradeVersionCheck flag.
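The following is a minimal sketch of how the flag might be passed on the command line. It only prints the Helm command so that you can review it first. The function name helm_downgrade_cmd is hypothetical, and the default release name, chart reference, and namespace are assumptions that you must adapt to your installation.

```shell
# Print (do not run) a Helm command that downgrades with the version check disabled.
# The function name and the defaults for release, chart, and namespace are assumptions.
helm_downgrade_cmd() (
  version="$1"
  release="${2:-longhorn}"
  chart="${3:-longhorn/longhorn}"
  namespace="${4:-longhorn-system}"
  [ -n "$version" ] || {
    echo "usage: helm_downgrade_cmd <chart-version> [release] [chart] [namespace]" >&2
    exit 1
  }
  printf 'helm upgrade %s %s --namespace %s --version %s --set preUpgradeChecker.upgradeVersionCheck=false\n' \
    "$release" "$chart" "$namespace" "$version"
)

# Usage:
# helm_downgrade_cmd 1.9.1
```

If you maintain a values file, setting the same key there is equivalent to using --set.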

Post-Downgrade

Once the downgrade is complete and the Longhorn system is stable on the v1.9.x version, you must immediately follow the steps outlined in the Manual CRD Migration Guide. This step is crucial to migrate all remaining v1beta1 CRs to v1beta2 before you try the Longhorn v1.10 upgrade again.

Removal

longhorn.io/v1beta1 API

The v1beta1 Longhorn API version was removed in SUSE Storage v1.10.0.

For more details, see Issue #10249.

replica.status.evictionRequested field

The deprecated replica.status.evictionRequested field has been removed.

For more details, see Issue #7022.

General

Kubernetes Version Requirement

Due to the upgrade of the CSI external snapshotter to v8.2.0, you must be running Kubernetes v1.25 or later to upgrade to SUSE Storage v1.8.0 or a newer version.

CRD Upgrade Validation

During an upgrade, a new Longhorn manager might start before the Custom Resource Definitions (CRDs) are applied. This sequence ensures the controller doesn’t process objects with deprecated data or fields. However, if the CRD hasn’t yet been applied, the Longhorn manager can fail during the initial upgrade phase.

If the Longhorn manager crashes during the upgrade, check the logs to determine whether the failure was caused by CRDs that had not yet been applied. In such cases, the logs might contain error messages similar to the following:

time="2025-03-27T06:59:55Z" level=fatal msg="Error starting manager: upgrade resources failed: BackingImage in version \"v1beta2\" cannot be handled as a BackingImage: strict decoding error: unknown field \"spec.diskFileSpecMap\", unknown field \"spec.diskSelector\", unknown field \"spec.minNumberOfCopies\", unknown field \"spec.nodeSelector\", unknown field \"spec.secret\", unknown field \"spec.secretNamespace\"" func=main.main.DaemonCmd.func3 file="daemon.go:94"
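Searching the logs for this pattern can be scripted. The following is a minimal sketch, assuming the manager logs are piped in on standard input; the function name crd_not_applied is hypothetical, and the Pod label in the usage comment is an assumption.

```shell
# Succeed if the log text on stdin contains the strict decoding failure shown above.
# The function name is hypothetical.
crd_not_applied() {
  grep -q 'upgrade resources failed:.*strict decoding error: unknown field'
}

# Usage (requires cluster access; the label app=longhorn-manager is an assumption):
# kubectl -n longhorn-system logs -l app=longhorn-manager --tail=200 \
#   | crd_not_applied && echo "The CRDs were probably not applied yet."
```

For a container that has already restarted, adding --previous to kubectl logs returns the output of the crashed instance.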

Upgrade Check Events

When you upgrade with Helm or the Rancher App Marketplace, SUSE Storage performs pre-upgrade checks. If a check fails, the upgrade stops and the reason for the failure is recorded in an event.

For more details, see Upgrading Longhorn Manager.

Manual Checks Before Upgrade

Automated pre-upgrade checks do not cover all scenarios. Perform the following manual checks by using kubectl or the SUSE Storage UI.

  • Ensure all V2 Data Engine volumes are detached and replicas are stopped. The V2 engine does not support live upgrades.

  • Avoid upgrading when volumes are Faulted. Unusable replicas may be deleted, causing permanent data loss if no backups exist.

  • Avoid upgrading if a failed BackingImage exists. For more information, see Backing Image.

  • Create a Longhorn System Backup before upgrading to ensure recoverability.
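The volume-related checks above can be partly scripted. The following is a minimal sketch that covers only attached V2 volumes and Faulted volumes, not replicas or backing images. It assumes that the Volume CR exposes spec.dataEngine, status.state, and status.robustness; the function name preupgrade_volume_report is hypothetical.

```shell
# Read "<name> <dataEngine> <state> <robustness>" rows on stdin and report
# volumes that should block an upgrade. The function name is hypothetical.
preupgrade_volume_report() {
  awk '
    BEGIN { bad = 0 }
    $2 == "v2" && $3 != "detached" { printf "V2 volume not detached: %s (%s)\n", $1, $3; bad = 1 }
    $4 == "faulted"                { printf "Faulted volume: %s\n", $1; bad = 1 }
    END { exit bad }
  '
}

# Usage (requires cluster access; the field paths are assumptions):
# kubectl -n longhorn-system get volumes.longhorn.io --no-headers \
#   -o custom-columns=NAME:.metadata.name,ENGINE:.spec.dataEngine,STATE:.status.state,ROBUSTNESS:.status.robustness \
#   | preupgrade_volume_report
```

A non-zero exit status means that at least one volume needs attention before the upgrade.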

Consolidated SUSE Storage Settings

Settings have been consolidated for easier management across V1 and V2 Data Engines. Each setting now uses one of these formats:

  • Single value for all supported data engines:

    • Format: Non-JSON string (for example, 1024)

    • This value applies to all supported data engines and must be the same across them. Data engine-specific values are not allowed.

  • Data engine-specific values for V1 and V2 data engines:

    • Format: JSON object (for example, {"v1": "value1", "v2": "value2"})

    • This allows you to specify different values for the V1 and V2 data engines.

  • Data engine-specific values for V1 data engine only:

    • Format: JSON object with a v1 key only (for example, {"v1": "value1"})

    • This allows you to configure only the V1 data engine, and it does not affect the V2 data engine.

  • Data engine-specific values for V2 data engine only:

    • Format: JSON object with a v2 key only (for example, {"v2": "value1"})

    • This allows you to configure only the V2 data engine, and it does not affect the V1 data engine.

For more information, see the SUSE Storage Settings.
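To illustrate how the four formats differ, the following minimal sketch reports which data engines a given setting value targets. It is a simple text check, not a JSON parser, and the function name setting_scope is hypothetical.

```shell
# Report which data engines a setting value applies to.
# Simple text matching only; the function name is hypothetical.
setting_scope() (
  value="$1"
  # True when the value contains the given key followed by a colon.
  has_key() { printf '%s' "$value" | grep -Eq "\"$1\"[[:space:]]*:"; }
  case "$value" in
    '{'*'}') ;;                                # JSON object: inspect its keys below
    *) echo "all data engines"; exit 0 ;;      # non-JSON string
  esac
  if has_key v1 && has_key v2; then echo "v1 and v2"
  elif has_key v1; then echo "v1 only"
  elif has_key v2; then echo "v2 only"
  else echo "unrecognized"; exit 1
  fi
)

# Usage:
# setting_scope 1024
# setting_scope '{"v1": "value1", "v2": "value2"}'
```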

System Info Category in Setting

A new System Info category has been added to show cluster-level information more clearly.

For more details, see Issue #11656.

Volume Attachment Summary

The SUSE Storage UI now displays a summary of attachment tickets on each volume overview page for improved visibility into volume state.

For more details, see Issue #11400.

Scheduling

Pod Scheduling with CSIStorageCapacity

SUSE Storage now supports Kubernetes CSIStorageCapacity, which enables the scheduler to verify node storage before it schedules pods that use StorageClasses with WaitForFirstConsumer. This reduces scheduling errors and improves reliability.

For more details, see Issue #10685.
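The following is a minimal sketch of a StorageClass that uses the WaitForFirstConsumer binding mode. The function name wffc_storageclass and the default class name are hypothetical, and the manifest omits all optional parameters.

```shell
# Print a StorageClass manifest with volumeBindingMode set to WaitForFirstConsumer.
# The function name and the default class name are hypothetical.
wffc_storageclass() {
  cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ${1:-longhorn-wffc}
provisioner: driver.longhorn.io
volumeBindingMode: WaitForFirstConsumer
EOF
}

# Usage (requires cluster access):
# wffc_storageclass | kubectl apply -f -
# kubectl get csistoragecapacities --all-namespaces
```

The second command lists the capacity objects that the scheduler consults, if the CSI driver has published any.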

Performance

Configurable Backup Block Size

Backup block size can now be configured when you create a volume, starting in SUSE Storage v1.10.0. This allows you to optimize for performance, efficiency, and cost.

For more information, see Create Longhorn Volumes.

Profiling Support for Backup Sync Agent

The backup sync agent has a pprof server for profiling runtime resource usage during backup sync operations.

For more information, see Profiling.

Resilience

Configurable Liveness Probe for Instance Manager

You can now configure the instance-manager pod liveness probes. This allows the system to better distinguish between temporary delays and actual failures, which helps reduce unnecessary restarts and improves overall cluster stability.

For more information, see SUSE Storage Settings.

Backing Image Manager CR Naming

Backing Image Manager CRs now use a compact, collision-resistant naming format to reduce the risk of conflicts.

For more details, see Issue #11455.

Security

Refined RBAC Permissions

RBAC permissions have been refined to minimize privileges and improve cluster security.

For more details, see Issue #11345.

V1 Data Engine

IPv6 Support

V1 volumes now support single-stack IPv6 Kubernetes clusters.

Dual-stack Kubernetes clusters and V2 volumes are not supported in this release.

For more details, see Issue #2259.

V2 Data Engine

SUSE Storage System Upgrade

Live upgrades of V2 volumes are not supported. Before you upgrade, make sure all V2 volumes are detached.

Newly Introduced Functionalities since SUSE Storage v1.10.0

V2 Data Engine Without Hugepage Support

The V2 Data Engine can run without Hugepages by setting data-engine-hugepage-enabled to {"v2":"false"}. This reduces memory pressure on low-spec nodes and increases deployment flexibility. Performance may be lower than when running with Hugepages.
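Because the setting value is itself a JSON object, it must be escaped when it is embedded in a kubectl merge patch. The following is a minimal sketch; the function name setting_patch is hypothetical, and patching the setting through kubectl instead of the UI is an assumption.

```shell
# Wrap a setting value in a merge patch, escaping it for use inside a JSON string.
# The function name is hypothetical.
setting_patch() {
  # Escape backslashes first, then double quotes.
  escaped="$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')"
  printf '{"value":"%s"}\n' "$escaped"
}

# Usage (requires cluster access):
# kubectl patch settings.longhorn.io data-engine-hugepage-enabled -n longhorn-system \
#   --type=merge -p "$(setting_patch '{"v2":"false"}')"
```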

V2 Data Engine Interrupt Mode Support

Interrupt mode has been added to the V2 Data Engine to help reduce CPU usage. This feature is especially beneficial for clusters with idle or low I/O workloads, where conserving CPU resources is more important than minimizing latency.

While interrupt mode lowers CPU consumption, it may introduce slightly higher I/O latency compared to polling mode. In addition, the current implementation uses a hybrid approach, which still incurs a minimal, constant CPU load even when interrupts are enabled.

For more information, see Interrupt Mode.

Interrupt mode currently supports only AIO disks.

V2 Data Engine Volume Clone Support

SUSE Storage now supports volume and snapshot cloning for V2 data engine volumes.

For more information, see Volume Clone Support.

V2 Data Engine Replica Rebuild QoS

SUSE Storage now provides Quality of Service (QoS) control for V2 volume replica rebuilds. You can configure bandwidth limits globally or per volume to prevent storage throughput overload on source and destination nodes.

For more information, see Replica Rebuild QoS.

V2 Data Engine Volume Expansion

SUSE Storage now supports volume expansion for V2 Data Engine volumes. Users can expand the volume through the UI or by modifying the PVC manifest.

For more information, see V2 Volume Expansion.
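Expanding through the PVC amounts to raising spec.resources.requests.storage. The following is a minimal sketch that builds the corresponding merge patch. The function name pvc_resize_patch, the PVC name, and the namespace in the usage comment are hypothetical, and the StorageClass is assumed to allow volume expansion.

```shell
# Print a merge patch that requests a new PVC size such as 20Gi.
# The function name is hypothetical; only binary suffixes are accepted here.
pvc_resize_patch() (
  size="$1"
  printf '%s' "$size" | grep -Eq '^[1-9][0-9]*(Ki|Mi|Gi|Ti)$' || {
    echo "expected a size such as 20Gi, got: ${size}" >&2
    exit 1
  }
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}\n' "$size"
)

# Usage (requires cluster access; my-pvc and default are placeholders):
# kubectl patch pvc my-pvc -n default --type=merge -p "$(pvc_resize_patch 20Gi)"
```

Kubernetes rejects a request that is smaller than the current size, so the new value must be larger than the existing one.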