Deploying and Installing SUSE AI|Alternative deployments
Applies to SUSE AI 1.0

5 Alternative deployments

Selected parts of the SUSE AI deployment can be automated or performed in a Web UI environment instead of on the command line. This includes the following methods:

5.1 Node installer

The SUSE AI Node Installer is an automation toolkit based on Ansible. It provides a fully automated, repeatable method to prepare and configure SUSE-driven servers for AI workloads. This approach simplifies and streamlines the installation of selected cluster components.

Important: SUSE AI Node Installer does not install OS

SUSE AI Node Installer is not an OS provisioning tool. You need to install a supported OS on the affected nodes before using SUSE AI Node Installer. Refer to Section 2.1, “Installing SUSE Linux Enterprise Server” or the SUSE Linux Micro documentation for steps to install one of the recommended operating systems.

5.1.1 How does SUSE AI Node Installer work?

SUSE AI Node Installer uses Ansible playbooks to install and configure selected cluster components on a specified group of servers. All playbooks run inside a containerized execution environment, ensuring portability, consistency and ease of use across development and production environments. Because the setup is done via Ansible playbooks, the process is idempotent: rerunning it does not produce conflicts, it simply ensures that the system matches the intended configuration.

The installer deploys:

  • RKE2 - SUSE’s CNCF-certified Kubernetes distribution

  • (Optional) Rancher / SUSE Rancher Prime - centralized multi-cluster management

  • (Optional) NVIDIA GPU support - NVIDIA drivers and NVIDIA GPU Operator

5.1.2 Key capabilities

Repeatable cluster bootstrap

Supports single-node and HA RKE2 deployments.

Portable execution

Playbooks run from a container image ensuring consistent Ansible versions and dependencies.

GPU-ready automation

Seamless installation of NVIDIA drivers and GPU Operator when supported hardware is present.

Production-aligned choices

Supports Rancher or SUSE Rancher Prime deployments, plus idempotent Ansible design enabling safe re-runs.

Flexible environment support

Works on bare metal, virtual machines or cloud instances.

5.1.3 Deploying nodes using SUSE AI Node Installer

SUSE AI Node Installer covers the following use cases:

  • You already have an OS provisioned on the servers and you want to deploy an RKE2 cluster with a Rancher management server.

  • You already have an OS provisioned on the servers and you want to deploy an RKE2 cluster to serve as the execution environment for SUSE AI workloads. This cluster can later be imported into a Rancher instance.

  • You already operate a Rancher instance with a downstream cluster under management, and you want to configure that downstream cluster to serve as the execution environment for SUSE AI workloads.

Prerequisites before running the SUSE AI Node Installer
Organizational readiness
  • Define the set of nodes to prepare for the RKE2 cluster.

  • Obtain a valid registration key for the SLE distribution. The key is available with your SUSE subscription.

  • Ensure IP addresses, host names and networking are functional.

Target host requirements
  • Supported Linux distributions: SLE, SUSE Linux Micro or openSUSE Leap variants recommended.

  • Sufficient CPU, memory and storage resources for the intended AI workload. Refer to SUSE AI Requirements for general requirements.

  • (Optional) NVIDIA GPUs if you plan to run GPU-accelerated workloads.

  • Fulfill prerequisites at RKE2 Prerequisites.

  • Target hosts must include Python 3.11 or later. Verify that python3 points to version 3.11 or higher by running python3 --version.

  • Confirm that sudo permissions are configured correctly.

Access requirements
  • A management workstation with Git and Docker or Podman installed, from which you run SUSE AI Node Installer.

  • Network connectivity and SSH access to each target node.
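
Before running the installer, you can verify the Python prerequisite over SSH from the management workstation. The helper below is a sketch and not part of SUSE AI Node Installer; the host names you pass are whatever your inventory targets.

```shell
#!/bin/sh
# Hypothetical pre-flight check: verify each target host has Python >= 3.11.
# Pass the target host names as arguments, e.g.:
#   ./check-python.sh node1.example.com node2.example.com

version_ok() {
  # True when version $1 sorts at or above the 3.11 minimum.
  [ "$(printf '%s\n%s\n' 3.11 "$1" | sort -V | head -n1)" = "3.11" ]
}

for host in "$@"; do
  v=$(ssh "$host" 'python3 --version 2>/dev/null' | awk '{print $2}')
  if [ -z "$v" ]; then
    echo "$host: python3 not found or SSH failed" >&2
  elif version_ok "$v"; then
    echo "$host: Python $v OK"
  else
    echo "$host: Python $v is too old (need >= 3.11)" >&2
  fi
done
```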

Installation and setup
  1. On your management workstation, clone the SUSE AI Node Installer repository.

    > git clone https://github.com/SUSE/suse-ai-node-ansible.git
    > cd suse-ai-node-ansible
  2. Build the Docker image from the source.

    > docker build \
      -t suse-ai-node-ansible-runner \
      -f Dockerfile.local .

    Or, if you prefer the newer buildx:

    > docker buildx build \
      -t suse-ai-node-ansible-runner \
      -f Dockerfile.local --load .
  3. For each RKE2 cluster, create the inventory.ini file.

    > cp inventory.ini.example inventory.ini
  4. For each RKE2 cluster, create the extra_vars.yml file and align its entries with your use case. Refer to Section 5.1.3.1, “extra_vars.yml examples tailored to specific use cases” for examples.

    > cp extra_vars.yml.example extra_vars.yml
  5. Run the site.yml playbook. At a high level, the playbook automates the node setup by performing the following steps:

    • Verifies that the target hosts are supported systems.

    • Registers hosts with the SUSE Customer Center if they are not already registered.

    • Installs required packages and optionally NVIDIA G06 drivers for compatible GPUs (Turing or newer).

    • Reboots the target hosts and runs checks after the reboot.

    • Installs RKE2 servers, RKE2 agents, Rancher and the GPU Operator.

    > docker run --rm \
      -v ~/.ssh/id_rsa:/root/.ssh/id_rsa:ro \
      -v ./inventory.ini:/workspace/inventory.ini \
      -v ./extra_vars.yml:/workspace/extra_vars.yml \
      suse-ai-node-ansible-runner \
      ansible-playbook -i inventory.ini playbooks/site.yml -e "@extra_vars.yml"

    If your target ansible_host is localhost, run the playbook in two stages:

    > docker run --rm \
      --network host \
      -v ~/.ssh/id_rsa:/root/.ssh/id_rsa:ro \
      -v ./inventory.ini:/workspace/inventory.ini \
      -v ./extra_vars.yml:/workspace/extra_vars.yml \
      suse-ai-node-ansible-runner \
      ansible-playbook -i inventory.ini playbooks/stage1.yml -e "@extra_vars.yml"
    
    > docker run --rm \
      -v ~/.ssh/id_rsa:/root/.ssh/id_rsa:ro \
      -v ./inventory.ini:/workspace/inventory.ini \
      -v ./extra_vars.yml:/workspace/extra_vars.yml \
      suse-ai-node-ansible-runner \
      ansible-playbook -i inventory.ini playbooks/stage2.yml -e "@extra_vars.yml"
    Note

    NVIDIA drivers are not installed when localhost is the target. For localhost deployments, we recommend installing the drivers manually.
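
The inventory.ini file created in step 3 defines which hosts the playbooks target. The sketch below is illustrative only: the group names and host entries are assumptions modeled on common RKE2 Ansible inventories, so use inventory.ini.example from the repository as the authoritative template.

```ini
; Hypothetical inventory sketch - group names are assumptions;
; inventory.ini.example in the repository is the authoritative template.
[rke2_servers]
server1 ansible_host=192.0.2.10 ansible_user=root

[rke2_agents]
agent1 ansible_host=192.0.2.20 ansible_user=root
agent2 ansible_host=192.0.2.21 ansible_user=root
```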

5.1.3.1 extra_vars.yml examples tailored to specific use cases

Example 5.1: You already have an OS provisioned on the servers and you want to deploy an RKE2 cluster with a Rancher management server.

Note that rancher.enabled is set to true.

# SCC
scc_registration:
  server: "https://scc.suse.com"
  email: "" #Leave empty if nodes are already registered with SCC
  sles_code: "" #Leave empty if nodes are already registered with SCC
  sle_micro_code: "" #Leave empty if nodes are already registered with SCC

# Nvidia
nvidia:
  driver_install: false # Set to true if nvidia drivers need to be installed on nodes with GPU
  gpu_operator_deploy: false # Set to true if you want to deploy nvidia gpu-operator

# RKE2
rke2:
  version: v1.34.3-rc2+rke2r1
  token: suse-rke2-rancher-token
  lb_address: "" # Set this to the address of an existing load balancer.
                 # If blank, the playbooks default to using the IP of the first RKE2 server as the server URL for all other nodes.

# Rancher
rancher:
  enabled: true
  version: 2.13.0 #version should be compatible with rke2.version
  replicas: 1
  hostname: suse-rancher.example.com
  bootstrap_password: rancher
Example 5.2: You already have an OS provisioned on the servers and you want to deploy an RKE2 cluster to serve as the execution environment for SUSE AI workloads.

This cluster can later be imported into a Rancher instance.

Note that rancher.enabled is set to false, while NVIDIA driver and GPU Operator installation is enabled.

# SCC
scc_registration:
  server: "https://scc.suse.com"
  email: "" #Leave empty if nodes are already registered with SCC
  sles_code: "" #Leave empty if nodes are already registered with SCC
  sle_micro_code: "" #Leave empty if nodes are already registered with SCC

# Nvidia
nvidia:
  driver_install: true # Set to true if nvidia drivers need to be installed on nodes with GPU
  gpu_operator_deploy: true # Set to true if you want to deploy nvidia gpu-operator

# RKE2
rke2:
  version: v1.34.3-rc2+rke2r1
  token: suse-rke2-rancher-token
  lb_address: "" # Set this to the address of an existing load balancer.
                 # If blank, the playbooks default to using the IP of the first RKE2 server as the server URL for all other nodes.

# Rancher
rancher:
  enabled: false
  version: 2.13.0 #version should be compatible with rke2.version
  replicas: 1
  hostname: suse-rancher.example.com
  bootstrap_password: rancher
Tip

Use the same extra_vars.yml if you already operate a Rancher instance with a downstream cluster under management, and you want to configure it to serve as the execution environment for SUSE AI workloads.

After SUSE AI Node Installer completes successfully, the nodes are fully prepared for the deployment of SUSE AI workloads. You can choose whether to deploy the workloads manually as described in Chapter 4, Installing applications from AI Library, or by means of SUSE AI Deployer as described in Section 5.2, “AI Library deployer”.

5.2 AI Library deployer

SUSE AI Deployer consists of a meta Helm chart that takes care of downloading and installing individual AI Library components required by SUSE AI on a Kubernetes cluster.

The following procedure describes how to customize and use SUSE AI Deployer to install AI Library components. It assumes that you have already completed the steps described in Section 4.1, “Installation procedure”, including the installation of cert-manager.

  1. Pull the SUSE AI Deployer Helm chart with the relevant chart version and untar it. You can find the latest version of the chart on the SUSE Application Collection page at https://apps.rancher.io/applications/suse-ai-deployer.

    > helm pull oci://dp.apps.rancher.io/charts/suse-ai-deployer \
      --version 1.0.0 --untar
    > cd suse-ai-deployer
  2. Inspect the downloaded chart and its default values.

    > helm show chart .
    > helm show values .
    Tip

    To see default values for the charts of the individual components within the meta chart, run the following commands.

    > helm show values charts/ollama/
    > helm show values charts/open-webui/
    > helm show values charts/milvus/
    > helm show values charts/pytorch/
  3. Explore downloaded example override files in the suse-ai-deployer/examples subdirectory. It typically includes the following files:

    suse-gen-ai-minimal.yaml

    Basic configuration to get started with GenAI. It deploys Ollama without GPU support, Open WebUI, and Milvus in stand-alone mode using local storage. PyTorch is disabled.

    suse-gen-ai.yaml

    Configuration optimized for production usage. It deploys Ollama with GPU support, Open WebUI, and Milvus in cluster mode using Longhorn storage. PyTorch is disabled.

    suse-ml-stack.yaml

    Basic configuration that enables deployment of PyTorch with no GPU support with Longhorn storage. It deploys PyTorch but disables Ollama, Open WebUI and Milvus.

  4. Create a custom-overrides.yaml override file based on one of the above examples. The examples use self-signed certificates for TLS communication. To use another option (see Section 4.6.6.1, “TLS sources”), copy the global section from the values.yaml file into your custom-overrides.yaml file and update its tls section as needed.

  5. Install the SUSE AI Deployer Helm chart while overriding values from the custom-overrides.yaml file. Use the appropriate RELEASE_NAME and SUSE_AI_NAMESPACE based on the configuration in custom-overrides.yaml.

    > helm upgrade --install \
      RELEASE_NAME \
      --namespace SUSE_AI_NAMESPACE \
      --create-namespace \
      --values ./custom-overrides.yaml \
      --version 1.0.0 \
      oci://dp.apps.rancher.io/charts/suse-ai-deployer
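
The custom-overrides.yaml file created in step 4 might follow the shape below for a minimal GenAI setup. This is an illustrative sketch only; the exact key names come from the files in the suse-ai-deployer/examples subdirectory and from helm show values, so verify them against those sources before installing.

```yaml
# Illustrative sketch of a minimal custom-overrides.yaml.
# Key names are assumptions - verify them against
# suse-ai-deployer/examples/suse-gen-ai-minimal.yaml and `helm show values .`
ollama:
  enabled: true
open-webui:
  enabled: true
milvus:
  enabled: true
pytorch:
  enabled: false
```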

5.3 Lifecycle manager

SUSE AI Lifecycle Manager is a Rancher UI Extension for managing SUSE AI components across Kubernetes clusters. This extension provides a unified interface for installing, managing and monitoring AI workloads in Rancher-managed clusters. Its usage and administration depend on which user role is being used.

5.3.1 Workflows and responsibilities

The workflow is defined by the user role and differs between a regular user and an administrator.

5.3.2 Benefits compared to manual deployment

Centralized, role-based control

Permissions are enforced inside Rancher and split into roles. This approach offers greater security and governance without giving everyone Helm access.

Rancher UI–driven workflows

Installed and used entirely from the Rancher Dashboard. Application deployment becomes guided and predictable.

Curated setup for AI workloads

Encodes best practices for AI applications. It assumes:

  • private registries

  • controlled deployment paths

  • complexity abstracted away from end users

Secure handling of private registries

Uses ClusterRepo objects where repositories are:

  • centrally configured

  • authenticated once

  • reused across clusters

5.3.3 Installation using Rancher UI

Important

You must have Rancher Administrator privileges to perform this task.

  1. In Rancher, navigate to Extensions › Manage Extensions Catalog.

  2. Import the extension catalog from the GitHub Container Registry. Enter ghcr.io/suse/suse-ai-lifecycle-manager:<VERSION> in the Catalog Image Reference input field. Replace <VERSION> with a tag published in the GitHub Container Registry. Confirm with Load.

  3. From the Manage Repositories page, verify that the SUSE AI Lifecycle Manager repository is in the Active state. If it is not, refresh the connection.

  4. Go back to the Extensions page and start the installation by clicking SUSE AI Lifecycle Manager.

Note

Newly published catalogs are not always available immediately. If the catalog does not show up after publishing, navigate to Extensions › Manage Repositories and manually refresh the repository to force a re-sync.

5.3.4 Installation using SUSE AI Operator

Important

You must have Rancher Administrator privileges to perform this task.

Important

For multi-cluster deployments, install the SUSE AI Operator to the same cluster where the main Rancher server runs.

  1. Install the SUSE AI Operator:

    > helm install suse-ai-operator \
      -n suse-ai-operator-system \
      --version 0.1.0 \
      --create-namespace \
      oci://ghcr.io/suse/chart/suse-ai-operator

    Modify the namespace (-n) to match your needs. You may replace the version (--version) with the production tag published in the GitHub Container Registry.

  2. Create the InstallAIExtension Custom Resource (CR). Save the suse-ai-extension.yaml file with the following content.

    apiVersion: ai-platform.suse.com/v1alpha1
    kind: InstallAIExtension
    metadata:
      name: suseai
    spec:
      helm:
        name: suse-ai-lifecycle-manager
        url: "oci://ghcr.io/suse/chart/suse-ai-lifecycle-manager"
        version: "1.0.0"
      extension:
        name: suse-ai-lifecycle-manager
        version: "1.0.0"

    In both version fields, you may replace the version with the production tag published in the GitHub Container Registry.

  3. Apply the suse-ai-extension.yaml file to install the extension.

    > kubectl apply -f suse-ai-extension.yaml

5.3.4.1 Uninstalling the SUSE AI Extension and Operator

  1. Remove the InstallAIExtension CR:

    > kubectl delete -f suse-ai-extension.yaml
  2. Uninstall the SUSE AI Operator:

    > helm uninstall suse-ai-operator -n suse-ai-operator-system
  3. Delete the associated Custom Resource Definitions (CRDs):

    > kubectl delete crd installaiextensions.ai-platform.suse.com

5.3.5 ClusterRepo repository

Helm charts of AI applications are hosted in private, SUSE-trusted registries: SUSE Application Collection and SUSE Registry. Therefore, SUSE AI Lifecycle Manager requires a ClusterRepo repository as the source for AI application charts. The extension uses the ClusterRepo name to identify and select the source of the charts.

Important

You must create both the SUSE Application Collection and the SUSE Registry ClusterRepos in the Rancher Server cluster and in all downstream clusters.

Important

You must have Rancher Administrator privileges to perform this task.

Creating ClusterRepo for the SUSE Application Collection repository
  1. Refer to the document Rancher Authentication to create the user name and access token (password) for SUSE Application Collection.

  2. In Rancher, navigate to Cluster › Apps › Repositories.

  3. Click Create and provide the following details:

    • Name: application-collection

    • Description: Leave this empty.

    • Target: OCI repository

    • OCI Repository Host URL: oci://dp.apps.rancher.io/charts

    • Authentication: Create an HTTP Basic Auth Secret

    • Username: The Application Collection user account or service account user name.

    • Password: The Application Collection access token or service account secret.

  4. Wait until the repository status changes to Active.
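
As an alternative to the Rancher UI, the same repository can be defined declaratively and applied with kubectl on each cluster. The following is a hedged sketch that assumes Rancher's catalog.cattle.io/v1 ClusterRepo API with OCI support and a pre-created Basic Auth Secret; verify the field names against your Rancher version.

```yaml
# Hedged sketch: declarative ClusterRepo for the SUSE Application Collection.
# Assumes Rancher's catalog.cattle.io/v1 API and an existing
# kubernetes.io/basic-auth Secret holding the Application Collection credentials.
apiVersion: catalog.cattle.io/v1
kind: ClusterRepo
metadata:
  name: application-collection
spec:
  url: oci://dp.apps.rancher.io/charts
  clientSecret:
    name: application-collection-auth   # hypothetical Secret name
    namespace: cattle-system
```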

Creating ClusterRepo for the SUSE Registry repository
  1. Refer to the document Using container images from the SUSE Registry to get the user name and the access token (password) for SUSE Registry. The access token is the same as the registration code of your SUSE AI subscription, which you can find in your SCC account at https://scc.suse.com/ by following these steps:

    1. Log in to https://scc.suse.com/.

    2. Select your organization under MY ORGANIZATIONS.

    3. In your organization dashboard, click Subscriptions.

    4. If your organization has purchased a SUSE AI subscription, you can find the SUSE AI subscription and its registration key in the subscriptions list.

  2. In Rancher, navigate to Cluster › Apps › Repositories.

  3. Click Create and provide the following details:

    • Name: suse-ai-registry

    • Description: Leave this empty.

    • Target: OCI repository

    • OCI Repository Host URL: oci://registry.suse.com/ai/charts

    • Authentication: Create an HTTP Basic Auth Secret

    • Username: The user name is 'regcode' for everyone.

    • Password: The password is the registration code of your SUSE AI subscription.

  4. Wait until the repository status changes to Active.

Note: Same repository name across clusters

We strongly recommend using the same repository name across all clusters. This ensures consistency and allows the SUSE AI Lifecycle Manager to easily reference the repository. For example, application-collection for the SUSE Application Collection, and suse-ai-registry for the SUSE Registry.

5.3.6 Installing and managing applications

After the administrator has installed the SUSE AI Lifecycle Manager extension and added the ClusterRepo repositories, users with the Rancher Admin and Cluster Owner roles can deploy and manage AI applications.

Installing applications
  1. Click the application’s tile to navigate to its instances page.

  2. Provide the basic information required for the Helm installation, such as the release name, namespace and version. The repository is automatically selected. Click Next to navigate to the Target Cluster page.

  3. On the Target Cluster page, choose one or more clusters that meet the application’s resource requirements and click Next to go to the Configuration page.

  4. On the Configuration page, customize the installation using the UI form for key settings or the YAML file for advanced options, and click Next to go to the Review page.

  5. Review the installation summary on the Review page and confirm by clicking Install. To navigate to previous steps and adjust the settings, click Previous.

Managing application instances
  1. Click the application’s tile to navigate to its instances page.

  2. Click Manage for the application to start its Manage wizard.

  3. Update the application’s details on the wizard screens.

  4. On the Review page, click Save to apply the changes.

Deleting application instances
  1. Click the application’s tile to navigate to its instances page.

  2. Click Delete for the instance you want to delete.

  3. In the pop-up window, confirm the deletion.