Cluster API Addon Provider Fleet

Overview

Cluster API Add-on Provider for Fleet (CAAPF) is a Cluster API (CAPI) provider that integrates with Fleet to simplify deploying applications to CAPI-provisioned clusters.

For more information about the provider, refer to the CAAPF book.

Starting with Rancher v2.14.1, CAAPF is no longer installed by default. While standard Fleet integration is still available through Rancher, advanced CAAPF features require manual installation. Review the following scenarios to determine if CAAPF is necessary for your environment:

  • Standard Rancher-Fleet integration

    If you only need your CAPI workload clusters registered with Fleet for basic application deployment via Rancher, you do not need to install CAAPF, and no manual steps are necessary. When you use the cluster-api.cattle.io/rancher-auto-import label, Rancher automatically registers the cluster with Fleet and installs the fleet-agent as part of the standard import process. Upgrading to Rancher v2.14.1 causes Rancher to update the fleet-agent on any existing downstream clusters.

  • Advanced CAAPF automation

    If you want to use advanced automation provided by CAAPF—such as automatic creation of Fleet ClusterGroups based on ClusterClasses, automatic label propagation from CAPI to Fleet resources, or CAPI-based Helm templating—you must install and enable CAAPF manually. Refer to the prerequisites below for more information. In this scenario, CAAPF takes over the responsibility of registering the cluster with Fleet.
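For the standard scenario, opting clusters in comes down to a single label. The sketch below only writes a Namespace manifest carrying the cluster-api.cattle.io/rancher-auto-import label to a local file. The namespace name capi-clusters, the file name, the label value "true", and the choice to label a namespace rather than an individual CAPI Cluster are all assumptions for illustration; check them against your import configuration before applying anything.

```shell
# Write a Namespace manifest that carries the auto-import label.
# Assumptions: the namespace name and the label value "true" are illustrative.
cat > auto-import-namespace.yaml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: capi-clusters
  labels:
    cluster-api.cattle.io/rancher-auto-import: "true"
EOF
```

The file can then be reviewed and applied with kubectl apply -f auto-import-namespace.yaml against the Rancher management cluster.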

CAAPF depends on the WatchList Kubernetes feature gate, which must be explicitly enabled on Kubernetes 1.33. See the upstream Kubernetes documentation for further information.

Prerequisites

If you have chosen the Advanced CAAPF automation scenario and want to continue using CAAPF for its advanced features, two configuration changes are required:

  • Set providers.addonFleet.enabled: true in the SUSE® Rancher Prime Cluster API Providers chart values to deploy the CAAPF provider.

    • CAAPF provider installation example:

      helm upgrade --install rancher-turtles-providers oci://<REGISTRY_URL>/rancher/charts/rancher-turtles-providers \
          --namespace cattle-turtles-system \
          --set providers.addonFleet.enabled=true \
          --wait
  • Set features.use-caapf.enabled: true in the SUSE® Rancher Prime Cluster API chart values to enable the use-caapf alpha feature gate (disabled by default). For instructions on how to set feature gates, refer to this section.

    This flag must be set before upgrading to Rancher v2.14.1. If the upgrade completes with the flag at its default (false), the SUSE® Rancher Prime Cluster API controller will reconcile imported clusters without the provisioning.cattle.io/externally-managed annotation, handing Fleet agent management to Rancher. Enabling the flag afterward causes SUSE® Rancher Prime Cluster API to re-add the annotation and hand management back to CAAPF, resulting in the Fleet agent being re-installed on workload clusters.

    Once both settings are in place, SUSE® Rancher Prime Cluster API automatically adds the provisioning.cattle.io/externally-managed annotation to imported Rancher clusters, delegating Fleet agent installation on CAPI workload clusters to CAAPF instead of Rancher.
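If you prefer to keep these settings in version-controlled values files rather than on the command line, the two keys above can be written out as nested YAML. This is a minimal sketch: the file names are hypothetical, and the nesting simply expands the dotted paths given above (providers.addonFleet.enabled and features.use-caapf.enabled). How the second file reaches the SUSE® Rancher Prime Cluster API chart depends on how you set feature gates in your installation.

```shell
# Values for the providers chart: deploy the CAAPF provider.
cat > caapf-providers-values.yaml <<'EOF'
providers:
  addonFleet:
    enabled: true
EOF

# Values for the Cluster API chart: enable the use-caapf alpha feature gate.
cat > caapf-feature-values.yaml <<'EOF'
features:
  use-caapf:
    enabled: true
EOF
```

The first file could replace the --set flag in the installation example by passing -f caapf-providers-values.yaml to helm upgrade --install.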

Functionality

  • The provider will register a newly provisioned CAPI cluster with Fleet by creating a Fleet Cluster instance with the same name and namespace. Applications can be automatically deployed to the created cluster using GitOps.

  • The provider will automatically create a Fleet ClusterGroup for every CAPI ClusterClass in the ClusterClass namespace. This enables you to deploy the same applications to all clusters created from the same ClusterClass.

  • The provider will automatically create a Fleet ClusterGroup for every CAPI ClusterClass that is referenced by a Cluster located in a different namespace from the ClusterClass. This enables you to deploy the same applications to all clusters referencing the same ClusterClass in a particular namespace.

This allows a user to specify either a Bundle resource with raw application workloads, or a GitRepo resource to install applications from Git. Each resource can define targets using any combination of:

  targets:
  - clusterGroup: <cluster-class-name> # If the cluster is created from cluster-class
  - clusterName: <a specific CAPI cluster name>

Additionally, CAAPF automatically propagates CAPI cluster labels to the Fleet cluster resource, so users can specify a target matching a common cluster label with:

  targets:
  - clusterSelector: <label selector for the cluster instances, inherited from CAPI clusters>
  - clusterGroupSelector: <label selector for the cluster group instances, labels inherited from ClusterClass>
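To show where these targets sit in a complete resource, the sketch below writes a Fleet GitRepo manifest to a local file. Everything specific in it is a placeholder: the resource name, the namespace (assumed to be the namespace of the CAPI clusters, since CAAPF creates the Fleet Cluster there), the repository URL, the quick-start ClusterClass name, and the env: dev label.

```shell
# Write a GitRepo that targets one ClusterGroup and one label selector.
# All names, the repo URL, and the label are hypothetical placeholders.
cat > sample-gitrepo.yaml <<'EOF'
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: sample-apps
  namespace: default
spec:
  repo: https://example.com/org/fleet-apps
  branch: main
  paths:
  - apps
  targets:
  - clusterGroup: quick-start   # ClusterGroup created for the ClusterClass
  - clusterSelector:            # matches labels propagated from CAPI clusters
      matchLabels:
        env: dev
EOF
```

Listing several entries under targets lets one GitRepo reach both a whole ClusterClass and individually labeled clusters.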

Helm Chart templating based on CAPI Cluster and ControlPlane

The Cluster API Addon Provider Fleet automates application templating for imported CAPI clusters based on matching cluster state. It keeps the state of a CAPI cluster and its related resources up to date in the spec.templateValues.ClusterValues field of the Fleet cluster resource. This allows users to:

  • Reference specific parts of the CAPI cluster directly or via Helm substitution patterns referencing .ClusterValues.Cluster data.

  • Substitute based on the state of the control plane resource via the .ClusterValues.ControlPlane field.

  • Substitute based on the state of the infrastructure cluster resource via the .ClusterValues.InfrastructureCluster field.

  • Maintain a consistent application state across different clusters.

  • Use the same template for multiple matching clusters to simplify deployment and management.

For more information on the feature, refer to the templating documentation in the CAAPF book.
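As a rough illustration of the substitution patterns listed above, the sketch below writes a fleet.yaml whose Helm values are filled from .ClusterValues.Cluster. It assumes Fleet's ${ } template delimiters in fleet.yaml; the sample-app directory, release name, chart path, and the clusterName and podCidr value keys are hypothetical. The heredoc delimiter is quoted so the shell leaves the ${ } expressions untouched.

```shell
# Create a bundle directory with a templated fleet.yaml.
# The quoted 'EOF' prevents the shell from expanding ${ ... } itself.
mkdir -p sample-app
cat > sample-app/fleet.yaml <<'EOF'
helm:
  releaseName: sample-app
  chart: ./chart
  values:
    # Name of the CAPI Cluster this bundle is deployed to.
    clusterName: "${ .ClusterValues.Cluster.metadata.name }"
    # First pod CIDR block from the CAPI Cluster spec.
    podCidr: "${ index .ClusterValues.Cluster.spec.clusterNetwork.pods.cidrBlocks 0 }"
EOF
```

Because the values are resolved per cluster, the same fleet.yaml can serve every cluster matched by the bundle's targets.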

Example - deploying kindnet CNI

Demo: asciicast

Example - deploying Calico CNI using GitRepo

Demo: asciicast

For a tutorial and prerequisites, refer to the GitRepo tutorial section in the CAAPF book.

Migrating existing CAAPF-managed clusters to Rancher

When a CAPI workload cluster is imported into Rancher, it is registered with Fleet through a Fleet Cluster resource. Before Rancher v2.14.1, CAAPF always created this resource, placing it in the same namespace as the CAPI workload cluster. Rancher v2.14.1 introduced the use-caapf feature gate, which lets Rancher handle the registration instead by creating its own Fleet Cluster in the fleet-default namespace. A cluster that CAAPF already registered therefore ends up with a second Fleet Cluster resource in a different namespace, and the GitRepo, HelmOp, and Bundle resources that targeted the original CAAPF-created cluster no longer reach it. To keep applications deployed, switching these clusters from CAAPF to Rancher-managed registration requires moving those resources onto the Rancher-managed fleet-default cluster and removing the duplicate CAAPF-created cluster.

The migrate-caapf.sh script automates this migration. It runs in two phases:

  • pre — run before disabling the use-caapf feature gate, while CAAPF still manages the clusters. It scales down the SUSE® Rancher Prime Cluster API and CAAPF controllers, copies the Fleet cluster labels onto the corresponding Rancher management clusters, checks for bundle collisions, then pauses and labels the affected Fleet resources and creates the BundleNamespaceMappings needed to make those resources eligible for deployment in fleet-default.

  • post — run after the use-caapf feature gate has been disabled and Rancher has created the new fleet-default clusters. It copies templateValues onto the new Fleet clusters, unpauses the migrated resources, and deletes the old per-namespace Fleet clusters once their bundles are confirmed present on the new clusters.

The use-caapf feature gate was introduced in Rancher v2.14.1, so this migration is tied to that upgrade. Run the pre phase before upgrading to Rancher v2.14.1, perform the upgrade, then run the post phase immediately after the upgrade has completed.

Before you begin

  • kubectl, configured to point at the Rancher management cluster.

  • jq available on the PATH.

Download the script

curl -sSfLO https://raw.githubusercontent.com/rancher/turtles/refs/heads/release/v0.26/scripts/migrate-caapf.sh
chmod +x migrate-caapf.sh

Run the migration

The script defaults to a dry run (DRY_RUN=true), which prints the actions it would take without modifying any resources. Always start with a dry run and review the output before applying any changes.

  1. Dry-run the pre phase and review the planned changes:

    ./migrate-caapf.sh pre

    If the script reports a name collision or label collision, it aborts before pausing any Fleet resources or creating the BundleNamespaceMappings. Applying the BundleNamespaceMappings makes every migration-labeled bundle eligible to deploy to any cluster in fleet-default, so two bundles with the same name from different namespaces, or a clusterSelector that now matches multiple clusters, would produce conflicting BundleDeployments. Resolve the reported collisions before continuing.

  2. Apply the pre phase:

    DRY_RUN=false ./migrate-caapf.sh pre
  3. Upgrade Rancher to v2.14.1 (which disables the use-caapf feature gate) and wait for Rancher to create the new Fleet clusters in fleet-default. Run the post phase immediately after the upgrade completes: the affected Fleet resources stay paused until then, so any delay leaves applications undeployed.

  4. Dry-run and then apply the post phase:

    ./migrate-caapf.sh post
    DRY_RUN=false ./migrate-caapf.sh post

    The post phase waits for each new Fleet cluster to reconcile and verifies that every bundle from the old cluster is present on the new one before deleting the old cluster. If any bundles are still missing, it logs a warning and skips deleting that cluster, so it is safe to re-run the post phase after resolving the issue.
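The dry-run-then-apply pattern in the steps above can be wrapped in a small helper so that a phase is never applied without a saved dry-run log. This is only a sketch: run_phase, the MIGRATE_SCRIPT and APPLY variables, and the log file name are hypothetical, and it assumes the script exits non-zero when it aborts (for example on a reported collision).

```shell
# Hypothetical helper: dry-run a phase, keep the output, apply only on request.
# Assumption: migrate-caapf.sh exits non-zero when it aborts.
run_phase() {
  phase="$1"
  script="${MIGRATE_SCRIPT:-./migrate-caapf.sh}"
  log="migrate-caapf-${phase}-dry-run.log"

  # Always dry-run first and save the planned actions for review.
  if ! DRY_RUN=true "$script" "$phase" > "$log" 2>&1; then
    cat "$log"
    echo "Dry run of '$phase' failed; resolve the reported issues first." >&2
    return 1
  fi
  cat "$log"

  # Apply only when the operator has explicitly set APPLY=true.
  if [ "${APPLY:-false}" = "true" ]; then
    DRY_RUN=false "$script" "$phase"
  else
    echo "Dry run only. Review $log, then re-run with APPLY=true."
  fi
}
```

With this helper, run_phase pre performs only the dry run, and APPLY=true run_phase pre repeats the dry run before applying the changes; the same two calls with post cover the second phase after the upgrade.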

Disabling CAAPF

To disable CAAPF and return to Rancher’s default Fleet management, you only need to disable the feature gate: set features.use-caapf.enabled: false (the default value) in your SUSE® Rancher Prime Cluster API Helm chart configuration.

Once the feature gate is disabled, SUSE® Rancher Prime Cluster API will automatically remove the provisioning.cattle.io/externally-managed annotation from any previously managed clusters, and Rancher will resume responsibility for installing and managing the Fleet agent on the workload clusters.