Upgrade from v1.3.2 to v1.4.0

General Information

An Upgrade button appears on the Dashboard screen whenever a new Harvester version that you can upgrade to becomes available. For more information, see Start an upgrade.

For air-gapped environments, see Prepare an air-gapped upgrade.

Prevent corruption of virtual machine images during upgrades

Before starting the upgrade, update the BackingImage CRD to the version included in SUSE Storage v1.7.2.

Skipping the CRD update may lead to backing image corruption, as described in issue #10644.

Perform the following steps before starting the upgrade.

  1. Patch the SUSE Virtualization ManagedChart object so that Fleet ignores differences in the BackingImage CRD. This avoids related errors and warnings.

    kubectl patch managedchart harvester \
    -n fleet-local \
    --type='json' \
    -p='[
      {
        "op":"add",
        "path":"/spec/diff/comparePatches/-",
        "value": {
          "apiVersion":"apiextensions.k8s.io/v1",
          "jsonPointers":["/spec","/metadata/annotations", "/metadata/labels", "/status"],
          "kind":"CustomResourceDefinition",
          "name":"backingimages.longhorn.io"
        }
      }
    ]'

  2. Apply the SUSE Storage v1.7.2 BackingImage CRD.

    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      annotations:
        controller-gen.kubebuilder.io/version: v0.15.0
      labels:
        app.kubernetes.io/name: longhorn
        app.kubernetes.io/instance: longhorn
        app.kubernetes.io/version: v1.7.2
        longhorn-manager: ""
      name: backingimages.longhorn.io
    spec:
      conversion:
        strategy: Webhook
        webhook:
          clientConfig:
            service:
              name: longhorn-conversion-webhook
              namespace: longhorn-system
              path: /v1/webhook/conversion
              port: 9501
          conversionReviewVersions:
          - v1beta2
          - v1beta1
      group: longhorn.io
      names:
        kind: BackingImage
        listKind: BackingImageList
        plural: backingimages
        shortNames:
        - lhbi
        singular: backingimage
      scope: Namespaced
      versions:
      - additionalPrinterColumns:
        - description: The backing image name
          jsonPath: .spec.image
          name: Image
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
        name: v1beta1
        schema:
          openAPIV3Schema:
            description: BackingImage is where Longhorn stores backing image object.
            properties:
              apiVersion:
                description: |-
                  APIVersion defines the versioned schema of this representation of an object.
                  Servers should convert recognized schemas to the latest internal value, and
                  may reject unrecognized values.
                  More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
                type: string
              kind:
                description: |-
                  Kind is a string value representing the REST resource this object represents.
                  Servers may infer this from the endpoint the client submits requests to.
                  Cannot be updated.
                  In CamelCase.
                  More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
                type: string
              metadata:
                type: object
              spec:
                x-kubernetes-preserve-unknown-fields: true
              status:
                x-kubernetes-preserve-unknown-fields: true
            type: object
        served: true
        storage: false
        subresources:
          status: {}
      - additionalPrinterColumns:
        - description: The system generated UUID
          jsonPath: .status.uuid
          name: UUID
          type: string
        - description: The source of the backing image file data
          jsonPath: .spec.sourceType
          name: SourceType
          type: string
        - description: The backing image file size in each disk
          jsonPath: .status.size
          name: Size
          type: string
        - description: The virtual size of the image (may be larger than file size)
          jsonPath: .status.virtualSize
          name: VirtualSize
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
        name: v1beta2
        schema:
          openAPIV3Schema:
            description: BackingImage is where Longhorn stores backing image object.
            properties:
              apiVersion:
                description: |-
                  APIVersion defines the versioned schema of this representation of an object.
                  Servers should convert recognized schemas to the latest internal value, and
                  may reject unrecognized values.
                  More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
                type: string
              kind:
                description: |-
                  Kind is a string value representing the REST resource this object represents.
                  Servers may infer this from the endpoint the client submits requests to.
                  Cannot be updated.
                  In CamelCase.
                  More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
                type: string
              metadata:
                type: object
              spec:
                description: BackingImageSpec defines the desired state of the Longhorn
                  backing image
                properties:
                  checksum:
                    type: string
                  diskFileSpecMap:
                    additionalProperties:
                      properties:
                        evictionRequested:
                          type: boolean
                      type: object
                    type: object
                  diskSelector:
                    items:
                      type: string
                    type: array
                  disks:
                    additionalProperties:
                      type: string
                    description: Deprecated. We are now using DiskFileSpecMap to assign
                      different spec to the file on different disks.
                    type: object
                  minNumberOfCopies:
                    type: integer
                  nodeSelector:
                    items:
                      type: string
                    type: array
                  secret:
                    type: string
                  secretNamespace:
                    type: string
                  sourceParameters:
                    additionalProperties:
                      type: string
                    type: object
                  sourceType:
                    enum:
                    - download
                    - upload
                    - export-from-volume
                    - restore
                    - clone
                    type: string
                type: object
              status:
                description: BackingImageStatus defines the observed state of the Longhorn
                  backing image status
                properties:
                  checksum:
                    type: string
                  diskFileStatusMap:
                    additionalProperties:
                      properties:
                        lastStateTransitionTime:
                          type: string
                        message:
                          type: string
                        progress:
                          type: integer
                        state:
                          type: string
                      type: object
                    nullable: true
                    type: object
                  diskLastRefAtMap:
                    additionalProperties:
                      type: string
                    nullable: true
                    type: object
                  ownerID:
                    type: string
                  size:
                    format: int64
                    type: integer
                  uuid:
                    type: string
                  virtualSize:
                    description: Virtual size of image, which may be larger than physical
                      size. Will be zero until known (e.g. while a backing image is uploading)
                    format: int64
                    type: integer
                type: object
            type: object
        served: true
        storage: true
        subresources:
          status: {}
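
The manifest can be applied and checked with standard kubectl commands. The following is a minimal sketch and not part of the official procedure. It assumes that the manifest above was saved to a file named backingimage-crd.yaml (a hypothetical file name) and that the app.kubernetes.io/version label reflects the applied CRD version. The function names are illustrative.

```shell
#!/bin/bash

# Hypothetical file name: save the CRD manifest shown above under this name.
CRD_FILE="backingimage-crd.yaml"
EXPECTED_VERSION="v1.7.2"

# Succeeds only when the version label is non-empty and equals the expected version.
crd_version_matches() {
  local actual=$1 expected=$2
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Applies the manifest, then reads the version label back from the cluster.
apply_and_verify_crd() {
  kubectl apply -f "$CRD_FILE" || return 1
  local actual
  # Dots inside the label key must be escaped in a kubectl JSONPath expression.
  actual=$(kubectl get crd backingimages.longhorn.io \
    -o jsonpath='{.metadata.labels.app\.kubernetes\.io/version}')
  if crd_version_matches "$actual" "$EXPECTED_VERSION"; then
    echo "BackingImage CRD is at $EXPECTED_VERSION"
  else
    echo "BackingImage CRD version label is '${actual}', expected $EXPECTED_VERSION" >&2
    return 1
  fi
}

# Usage (requires kubectl access to the cluster):
#   apply_and_verify_crd
```

The label check confirms only that the manifest was accepted. It does not validate the schema contents.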

Known Issues

1. Upgrade Stuck in "Pre-draining" State

A virtual machine with a container disk cannot be migrated because of a limitation of the live migration feature. This causes the upgrade process to become stuck in the "Pre-draining" state.

Manually stop the affected virtual machines to continue the upgrade process.
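
To find the virtual machines that need to be stopped, you can list the running instances that use a container disk. The following is a minimal sketch under the assumption that container disks appear as containerDisk volumes in the VirtualMachineInstance spec. The function names are illustrative.

```shell
#!/bin/bash

# Reads lines of the form "<namespace>/<name><TAB><container disk images>"
# and prints only the instances that have at least one container disk.
filter_container_disk_vms() {
  awk -F'\t' '$2 != "" { print $1 }'
}

list_container_disk_vms() {
  # For each running instance, print its namespaced name and the images of all
  # containerDisk volumes. Instances without such volumes yield an empty second column.
  kubectl get vmi -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.volumes[*].containerDisk.image}{"\n"}{end}' \
    | filter_container_disk_vms
}

# Usage (requires kubectl access to the cluster):
#   list_container_disk_vms
```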

Related issue: #7005

2. Upgrade Stuck on Waiting for Bundle to Become Ready

This issue is caused by a race condition when the Fleet agent (fleet-agent) is redeployed. The following error messages indicate that the issue exists.

> kubectl get bundles -n fleet-local
NAME                                          BUNDLEDEPLOYMENTS-READY   STATUS
mcc-harvester                                 0/1                       ErrApplied(1) [Cluster fleet-local/local: encountered 2 deletion errors. First is: admission webhook "validator.harvesterhci.io" denied the request: Internal error occurred: no route match found for DELETE /v1, Kind=Secret harvester-system/sh.helm.release.v1.harvester.v2]
mcc-harvester-crd                             0/1                       ErrApplied(1) [Cluster fleet-local/local: admission webhook "validator.harvesterhci.io" denied the request: Internal error occurred: no route match found for DELETE /v1, Kind=Secret harvester-system/sh.helm.release.v1.harvester-crd.v1]

You can run the following script to fix the issue.

#!/bin/bash

# Increment spec.forceSyncGeneration on a Fleet bundle to force a redeployment.
patch_fleet_bundle() {
  local bundleName=$1
  local generation new_generation patch_manifest
  # An unset forceSyncGeneration yields an empty string, which evaluates to 0 below.
  generation=$(kubectl get -n fleet-local bundle "${bundleName}" -o jsonpath='{.spec.forceSyncGeneration}')
  new_generation=$((generation+1))
  patch_manifest="$(mktemp)"
  cat > "$patch_manifest" <<EOF
{
  "spec": {
    "forceSyncGeneration": $new_generation
  }
}
EOF
  echo "patch bundle to new generation: $new_generation"
  kubectl patch -n fleet-local bundle "${bundleName}" --type=merge --patch-file "$patch_manifest"
  rm -f "$patch_manifest"
}

# Remove the validating webhook that rejects the DELETE requests shown above.
echo "removing harvester validating webhook"
kubectl delete validatingwebhookconfiguration harvester-validator

for bundle in mcc-harvester-crd mcc-harvester
do
  patch_fleet_bundle "${bundle}"
done

echo "removing longhorn services"
kubectl delete svc longhorn-engine-manager -n longhorn-system --ignore-not-found=true
kubectl delete svc longhorn-replica-manager -n longhorn-system --ignore-not-found=true
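
After the script finishes, you can check whether the bundles have recovered. The following is a minimal sketch that assumes the three-column output layout of kubectl get bundles shown above. The function name is illustrative.

```shell
#!/bin/bash

# Reads the table printed by "kubectl get bundles" and prints the name of every
# bundle whose BUNDLEDEPLOYMENTS-READY column (for example 0/1) is not fully ready.
not_ready_bundles() {
  awk 'NR > 1 { split($2, ready, "/"); if (ready[1] != ready[2]) print $1 }'
}

# Usage (requires kubectl access to the cluster):
#   kubectl get bundles -n fleet-local | not_ready_bundles
```

An empty result suggests that all bundles in the namespace are ready. Fleet may need some time to redeploy the bundles, so repeat the check if bundles are still listed.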

3. Upgrade Stuck on Waiting for Fleet

When upgrading from v1.3.2 to v1.4.0, the upgrade process may become stuck on waiting for Fleet to become ready. This issue is caused by a race condition when Rancher is redeployed.

Check the Harvester logs and Fleet history for the following indicators:

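  • The manifest pod is stuck waiting for the Fleet Helm release to reach the deployed status.

  • The Helm history shows a revision in the pending-upgrade status whose chart version is the same as that of the revision already deployed.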

Example:

> kubectl logs -n harvester-system -l harvesterhci.io/upgradeComponent=manifest
wait helm release cattle-fleet-system fleet fleet-104.0.2+up0.10.2 0.10.2 deployed

> helm history -n cattle-fleet-system fleet
REVISION	UPDATED                 	STATUS         	CHART                	APP VERSION	DESCRIPTION
26      	Tue Dec 10 03:09:13 2024	superseded     	fleet-103.1.5+up0.9.5	0.9.5      	Upgrade complete
27      	Sun Dec 15 09:26:54 2024	superseded     	fleet-103.1.5+up0.9.5	0.9.5      	Upgrade complete
28      	Sun Dec 15 09:27:03 2024	superseded     	fleet-103.1.5+up0.9.5	0.9.5      	Upgrade complete
29      	Mon Dec 16 05:57:03 2024	deployed       	fleet-103.1.5+up0.9.5	0.9.5      	Upgrade complete
30      	Mon Dec 16 05:57:13 2024	pending-upgrade	fleet-103.1.5+up0.9.5	0.9.5      	Preparing upgrade

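You can run the following command to fix the issue. Replace <last-deployed-revision> with the number of the most recent revision in the deployed status (29 in the example above).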

helm rollback fleet -n cattle-fleet-system <last-deployed-revision>
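
The revision number can also be extracted from the helm history output. The following is a minimal sketch that assumes the tab-separated table layout shown in the example above. The function name is illustrative.

```shell
#!/bin/bash

# Reads the table printed by "helm history" and prints the number of the most
# recent revision whose STATUS column is exactly "deployed".
last_deployed_revision() {
  awk -F'\t' '
    NR > 1 {
      status = $3
      gsub(/[ ]+$/, "", status)      # drop the padding added for column alignment
      if (status == "deployed") rev = $1
    }
    END {
      gsub(/[ ]+$/, "", rev)
      if (rev != "") print rev
    }
  '
}

# Usage (requires helm access to the cluster):
#   rev=$(helm history -n cattle-fleet-system fleet | last_deployed_revision)
#   helm rollback fleet -n cattle-fleet-system "$rev"
```

Confirm the extracted revision against the helm history output before you run the rollback.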

4. Upgrade Restarts Unexpectedly After Clicking "Dismiss it" Button

When you use Rancher to upgrade SUSE Virtualization, the Rancher UI displays a dialog with a button labeled "Dismiss it". Clicking this button may result in the following issues:

  • The status section of the harvesterhci.io/v1beta1/upgrade CR is cleared, causing the loss of all important information about the upgrade.

  • The upgrade process restarts unexpectedly.

This issue affects Rancher v2.10.x, which uses v1.0.2, v1.0.3, and v1.0.4 of the Harvester UI Extension. The SUSE Virtualization UI is not affected in any version. The issue is fixed in Harvester UI Extension v1.0.5 and v1.5.0.

To avoid this issue, perform either of the following actions:

  • Use the SUSE Virtualization UI for upgrades. Clicking the "Dismiss it" button on the SUSE Virtualization UI does not result in unexpected behavior.

  • Instead of clicking the button on the Rancher UI, run the following command against the cluster:

    kubectl -n harvester-system label upgrades -l harvesterhci.io/latestUpgrade=true harvesterhci.io/read-message=true

Related issue: #7791