Deploying and Installing SUSE AI|Alternative deployments
Applies to SUSE AI 1.0

5 Alternative deployments

Selected parts of the SUSE AI deployment can be automated or performed in a Web UI environment instead of on the command line. This includes the following methods:

5.1 Node installer

The SUSE AI Node Installer is an automation toolkit based on Ansible. It provides a fully automated, repeatable method to prepare and configure SUSE-driven servers for AI workloads. This approach simplifies and streamlines the installation of selected cluster components.

Important: SUSE AI Node Installer does not install OS

SUSE AI Node Installer is not an OS provisioning tool. You need to install a supported OS on the affected nodes before using SUSE AI Node Installer. Refer to Section 2.1, “Installing SUSE Linux Enterprise Server” or the SUSE Linux Micro documentation for steps to install one of the recommended operating systems.

5.1.1 How does SUSE AI Node Installer work?

SUSE AI Node Installer uses Ansible playbooks to install and configure selected cluster components on a specified group of servers. All playbooks run inside a containerized execution environment, ensuring portability, consistency and ease of use across development and production environments. Because the setup is done via Ansible playbooks, the process is idempotent: rerunning it does not produce conflicts, it simply ensures that the system matches the intended configuration.

The installer deploys:

  • RKE2 - SUSE’s CNCF-certified Kubernetes distribution

  • (Optional) Rancher / SUSE Rancher Prime - centralized multi-cluster management

  • (Optional) NVIDIA GPU support - NVIDIA drivers and NVIDIA GPU Operator

5.1.2 Key capabilities

Repeatable cluster bootstrap

Supports single-node and HA RKE2 deployments.

Portable execution

Playbooks run from a container image ensuring consistent Ansible versions and dependencies.

GPU-ready automation

Seamless installation of NVIDIA drivers and GPU Operator when supported hardware is present.

Production-aligned choices

Supports Rancher or SUSE Rancher Prime deployments, plus idempotent Ansible design enabling safe re-runs.

Flexible environment support

Works on bare metal, virtual machines or cloud instances.

5.1.3 Deploying nodes using SUSE AI Node Installer

SUSE AI Node Installer covers the following use cases:

  • You already have an OS provisioned on the servers and you want to deploy an RKE2 cluster with a Rancher management server.

  • You already have an OS provisioned on the servers and you want to deploy an RKE2 cluster to serve as the execution environment for SUSE AI workloads. This cluster can later be imported into a Rancher instance.

  • You already operate a Rancher instance with a downstream cluster under management, and you want to configure that downstream cluster to serve as the execution environment for SUSE AI workloads.

Prerequisites before running the SUSE AI Node Installer
Organizational readiness
  • Define the set of nodes to prepare for the RKE2 cluster.

  • Obtain a valid registration key for the SLE distribution. The key is available with your SUSE subscription.

  • Ensure IP addresses, host names and networking are functional.

Target host requirements
  • Supported Linux distributions: SLE, SUSE Linux Micro or openSUSE Leap variants recommended.

  • Sufficient CPU, memory and storage resources for the intended AI workload. Refer to SUSE AI Requirements for general requirements.

  • (Optional) NVIDIA GPUs if you plan to run GPU-accelerated workloads.

  • Fulfill prerequisites at RKE2 Prerequisites.

  • Target hosts must include Python 3.11 or later. Verify that python3 points to version 3.11 or higher by running python3 --version.

  • Confirm that sudo permissions are configured correctly.

Access requirements
  • A management workstation with Git and Docker or Podman installed, from which you run SUSE AI Node Installer.

  • Network connectivity and SSH access to each target node.
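
Before running the installer, you can verify the Python prerequisite over SSH from the management workstation. The helper below is a sketch and not part of SUSE AI Node Installer; the host names you pass are whatever your inventory targets.

```shell
#!/bin/sh
# Hypothetical pre-flight check: verify each target host has Python >= 3.11.
# Pass the target host names as arguments, e.g.:
#   ./check-python.sh node1.example.com node2.example.com

version_ok() {
  # True when version $1 sorts at or above the 3.11 minimum.
  [ "$(printf '%s\n%s\n' 3.11 "$1" | sort -V | head -n1)" = "3.11" ]
}

for host in "$@"; do
  v=$(ssh "$host" 'python3 --version 2>/dev/null' | awk '{print $2}')
  if [ -z "$v" ]; then
    echo "$host: python3 not found or SSH failed" >&2
  elif version_ok "$v"; then
    echo "$host: Python $v OK"
  else
    echo "$host: Python $v is too old (need >= 3.11)" >&2
  fi
done
```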

Installation and setup
  1. On your management workstation, clone the SUSE AI Node Installer repository.

    > git clone https://github.com/SUSE/suse-ai-node-ansible.git
    > cd suse-ai-node-ansible
  2. Build the Docker image from the source.

    > docker build \
      -t suse-ai-node-ansible-runner \
      -f Dockerfile.local .

    Or, if you prefer the newer buildx:

    > docker buildx build \
      -t suse-ai-node-ansible-runner \
      -f Dockerfile.local --load .
  3. For each RKE2 cluster, create the inventory.ini file.

    > cp inventory.ini.example inventory.ini
  4. For each RKE2 cluster, create the extra_vars.yml file and align its entries with your use case. Refer to Section 5.1.3.1, “extra_vars.yml examples tailored to specific use cases” for examples.

    > cp extra_vars.yml.example extra_vars.yml
  5. Run the site.yml playbook. At a high level, the playbook automates the node setup by performing the following steps:

    • Verifies that the target hosts are supported systems.

    • Registers hosts with the SUSE Customer Center if they are not already registered.

    • Installs required packages and optionally NVIDIA G06 drivers for compatible GPUs (Turing or newer).

    • Reboots the target hosts and runs checks after the reboot.

    • Installs RKE2 servers, RKE2 agents, Rancher and the GPU Operator.

    > docker run --rm \
      -v ~/.ssh/id_rsa:/root/.ssh/id_rsa:ro \
      -v ./inventory.ini:/workspace/inventory.ini \
      -v ./extra_vars.yml:/workspace/extra_vars.yml \
      suse-ai-node-ansible-runner \
      ansible-playbook -i inventory.ini playbooks/site.yml -e "@extra_vars.yml"

    If your target ansible_host is localhost, run the playbook in two stages:

    > docker run --rm \
      --network host \
      -v ~/.ssh/id_rsa:/root/.ssh/id_rsa:ro \
      -v ./inventory.ini:/workspace/inventory.ini \
      -v ./extra_vars.yml:/workspace/extra_vars.yml \
      suse-ai-node-ansible-runner \
      ansible-playbook -i inventory.ini playbooks/stage1.yml -e "@extra_vars.yml"
    
    > docker run --rm \
      -v ~/.ssh/id_rsa:/root/.ssh/id_rsa:ro \
      -v ./inventory.ini:/workspace/inventory.ini \
      -v ./extra_vars.yml:/workspace/extra_vars.yml \
      suse-ai-node-ansible-runner \
      ansible-playbook -i inventory.ini playbooks/stage2.yml -e "@extra_vars.yml"
    Note

    NVIDIA drivers are not installed when localhost is the target. For localhost deployments, we recommend installing the drivers manually.
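
The inventory.ini file created in step 3 defines which hosts the playbooks target. The sketch below is illustrative only: the group names and host entries are assumptions modeled on common RKE2 Ansible inventories, so use inventory.ini.example from the repository as the authoritative template.

```ini
; Hypothetical inventory sketch - group names are assumptions;
; inventory.ini.example in the repository is the authoritative template.
[rke2_servers]
server1 ansible_host=192.0.2.10 ansible_user=root

[rke2_agents]
agent1 ansible_host=192.0.2.20 ansible_user=root
agent2 ansible_host=192.0.2.21 ansible_user=root
```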

5.1.3.1 extra_vars.yml examples tailored to specific use cases

Example 5.1: You already have an OS provisioned on the servers and you want to deploy an RKE2 cluster with a Rancher management server.

Note that rancher.enabled is set to true.

# SCC
scc_registration:
  server: "https://scc.suse.com"
  email: "" #Leave empty if nodes are already registered with SCC
  sles_code: "" #Leave empty if nodes are already registered with SCC
  sle_micro_code: "" #Leave empty if nodes are already registered with SCC

# Nvidia
nvidia:
  driver_install: false # Set to true if nvidia drivers need to be installed on nodes with GPU
  gpu_operator_deploy: false # Set to true if you want to deploy nvidia gpu-operator

# RKE2
rke2:
  version: v1.34.3-rc2+rke2r1
  token: suse-rke2-rancher-token
  lb_address: "" # Set this to the address of an existing load balancer.
                 # If blank, the playbooks default to using the IP of the first RKE2 server as the server URL for all other nodes.

# Rancher
rancher:
  enabled: true
  version: 2.13.0 #version should be compatible with rke2.version
  replicas: 1
  hostname: suse-rancher.example.com
  bootstrap_password: rancher
Example 5.2: You already have an OS provisioned on the servers and you want to deploy an RKE2 cluster to serve as the execution environment for SUSE AI workloads.

This cluster can later be imported into a Rancher instance.

Note that rancher.enabled is set to false, while NVIDIA driver and GPU Operator installation is enabled.

# SCC
scc_registration:
  server: "https://scc.suse.com"
  email: "" #Leave empty if nodes are already registered with SCC
  sles_code: "" #Leave empty if nodes are already registered with SCC
  sle_micro_code: "" #Leave empty if nodes are already registered with SCC

# Nvidia
nvidia:
  driver_install: true # Set to true if nvidia drivers need to be installed on nodes with GPU
  gpu_operator_deploy: true # Set to true if you want to deploy nvidia gpu-operator

# RKE2
rke2:
  version: v1.34.3-rc2+rke2r1
  token: suse-rke2-rancher-token
  lb_address: "" # Set this to the address of an existing load balancer.
                 # If blank, the playbooks default to using the IP of the first RKE2 server as the server URL for all other nodes.

# Rancher
rancher:
  enabled: false
  version: 2.13.0 #version should be compatible with rke2.version
  replicas: 1
  hostname: suse-rancher.example.com
  bootstrap_password: rancher
Tip

Use the same extra_vars.yml if you already operate a Rancher instance with a downstream cluster under management, and you want to configure it to serve as the execution environment for SUSE AI workloads.

After SUSE AI Node Installer completes successfully, the nodes are fully prepared for the deployment of SUSE AI workloads. You can choose whether to deploy the workloads manually as described in Chapter 4, Installing applications from AI Library, or by means of SUSE AI Deployer as described in Section 5.2, “AI Library deployer”.

5.2 AI Library deployer

SUSE AI Deployer consists of a meta Helm chart that takes care of downloading and installing individual AI Library components required by SUSE AI on a Kubernetes cluster.

The following procedure describes how to customize and use SUSE AI Deployer to install AI Library components. It assumes that you have already completed the steps described in Section 4.1, “Installation procedure”, including the installation of cert-manager.

  1. Pull the SUSE AI Deployer Helm chart with the relevant chart version and untar it. You can find the latest version of the chart on the SUSE Application Collection page at https://apps.rancher.io/applications/suse-ai-deployer.

    > helm pull oci://dp.apps.rancher.io/charts/suse-ai-deployer \
      --version 1.0.0 --untar
    > cd suse-ai-deployer
  2. Inspect the downloaded chart and its default values.

    > helm show chart .
    > helm show values .
    Tip

    To see default values for the charts of the individual components within the meta chart, run the following commands.

    > helm show values charts/ollama/
    > helm show values charts/open-webui/
    > helm show values charts/milvus/
    > helm show values charts/pytorch/
  3. Explore downloaded example override files in the suse-ai-deployer/examples subdirectory. It typically includes the following files:

    suse-gen-ai-minimal.yaml

    Basic configuration to get started with GenAI. It deploys Ollama without GPU support, Open WebUI, and Milvus in stand-alone mode using local storage. PyTorch is disabled.

    suse-gen-ai.yaml

    Configuration optimized for production usage. It deploys Ollama with GPU support, Open WebUI, and Milvus in cluster mode using Longhorn storage. PyTorch is disabled.

    suse-ml-stack.yaml

    Basic configuration that enables deployment of PyTorch with no GPU support with Longhorn storage. It deploys PyTorch but disables Ollama, Open WebUI and Milvus.

  4. Create a custom-overrides.yaml override file based on one of the above examples. The examples use self-signed certificates for TLS communication. To use another option (see Section 4.6.6.1, “TLS sources”), copy the global section from the values.yaml file into your custom-overrides.yaml file and update its tls section as needed.

  5. Install the SUSE AI Deployer Helm chart while overriding values from the custom-overrides.yaml file. Use the appropriate RELEASE_NAME and SUSE_AI_NAMESPACE based on the configuration in custom-overrides.yaml.

    > helm upgrade --install \
      RELEASE_NAME \
      --namespace SUSE_AI_NAMESPACE \
      --create-namespace \
      --values ./custom-overrides.yaml \
      --version 1.0.0 \
      oci://dp.apps.rancher.io/charts/suse-ai-deployer
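
The custom-overrides.yaml file created in step 4 might follow the shape below for a minimal GenAI setup. This is an illustrative sketch only; the exact key names come from the files in the suse-ai-deployer/examples subdirectory and from helm show values, so verify them against those sources before installing.

```yaml
# Illustrative sketch of a minimal custom-overrides.yaml.
# Key names are assumptions - verify them against
# suse-ai-deployer/examples/suse-gen-ai-minimal.yaml and `helm show values .`
ollama:
  enabled: true
open-webui:
  enabled: true
milvus:
  enabled: true
pytorch:
  enabled: false
```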

5.3 Lifecycle manager

SUSE AI Lifecycle Manager is a Rancher UI Extension for managing SUSE AI components across Kubernetes clusters. This extension provides a unified interface for installing, managing and monitoring AI workloads in Rancher-managed clusters. Its usage and administration depend on which user role is being used.

5.3.1 Workflows and responsibilities

The workflow is defined by the user role and differs between a regular user and an administrator.

5.3.2 Benefits compared to manual deployment

Centralized, role-based control

Permissions are enforced inside Rancher and split into roles. This approach offers greater security and governance without giving everyone Helm access.

Rancher UI–driven workflows

Installed and used entirely from the Rancher Dashboard. Application deployment becomes guided and predictable.

Curated setup for AI workloads

Encodes best practices for AI applications. It assumes:

  • private registries

  • controlled deployment paths

  • complexity abstracted away from end users

Secure handling of private registries

Uses ClusterRepo objects where repositories are:

  • centrally configured

  • authenticated once

  • reused across clusters

5.3.3 Installation using Rancher UI

Important

You must have Rancher Administrator privileges to perform this task.

  1. In Rancher, navigate to Extensions › Manage Extensions Catalog.

  2. Import the extension catalog from the GitHub Container Registry. Enter ghcr.io/suse/suse-ai-lifecycle-manager:<VERSION> in the Catalog Image Reference input field. Replace <VERSION> with a tag published in the GitHub Container Registry. Confirm with Load.

  3. From the Manage Repositories page, verify that the SUSE AI Lifecycle Manager repository is in the Active state. If it is not, refresh the connection.

  4. Go back to the Extensions page and start the installation by clicking SUSE AI Lifecycle Manager.

Note

Newly published catalogs are not always available immediately. If the catalog does not show up after publishing, navigate to Extensions › Manage Repositories and manually refresh the repository to force a re-sync.

5.3.4 Installation using SUSE AI Operator

Important

You must have Rancher Administrator privileges to perform this task.

Important

For multi-cluster deployments, install the SUSE AI Operator to the same cluster where the main Rancher server runs.

  1. Install the SUSE AI Operator:

    > helm install suse-ai-operator \
      -n suse-ai-operator-system \
      --version 0.1.0 \
      --create-namespace \
      oci://ghcr.io/suse/chart/suse-ai-operator

    Modify the namespace (-n) to match your needs. You may replace the version (--version) with the production tag published in the GitHub Container Registry.

  2. Create the InstallAIExtension Custom Resource (CR). Save the suse-ai-extension.yaml file with the following content.

    apiVersion: ai-platform.suse.com/v1alpha1
    kind: InstallAIExtension
    metadata:
      name: suseai
    spec:
      helm:
        name: suse-ai-lifecycle-manager
        url: "oci://ghcr.io/suse/chart/suse-ai-lifecycle-manager"
        version: "1.0.0"
      extension:
        name: suse-ai-lifecycle-manager
        version: "1.0.0"

    In both version fields, you may replace the version with the production tag published in the GitHub Container Registry.

  3. Apply the suse-ai-extension.yaml file to install the extension.

    > kubectl apply -f suse-ai-extension.yaml

5.3.4.1 Uninstalling the SUSE AI Extension and Operator

  1. Remove the InstallAIExtension CR:

    > kubectl delete -f suse-ai-extension.yaml
  2. Uninstall the SUSE AI Operator:

    > helm uninstall suse-ai-operator -n suse-ai-operator-system
  3. Delete the associated Custom Resource Definitions (CRDs):

    > kubectl delete crd installaiextensions.ai-platform.suse.com

5.3.5 ClusterRepo repository

Helm charts of AI applications are hosted in private, SUSE-trusted registries: SUSE Application Collection and SUSE Registry. Therefore, SUSE AI Lifecycle Manager requires a ClusterRepo repository as the source for AI application charts. The extension uses the ClusterRepo name to identify and select the source of the charts.

Important

You must create both the SUSE Application Collection and the SUSE Registry ClusterRepos in the Rancher Server cluster and in all downstream clusters.

Important

You must have Rancher Administrator privileges to perform this task.

Creating ClusterRepo for the SUSE Application Collection repository
  1. Refer to the document Rancher Authentication to create the user name and access token (password) for SUSE Application Collection.

  2. In Rancher, navigate to Cluster › Apps › Repositories.

  3. Click Create and provide the following details:

    • Name: application-collection

    • Description: Leave this empty.

    • Target: OCI repository

    • OCI Repository Host URL: oci://dp.apps.rancher.io/charts

    • Authentication: Create an HTTP Basic Auth Secret

    • Username: The Application Collection user account or service account user name.

    • Password: The Application Collection access token or service account secret.

  4. Wait until the repository status changes to Active.
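
As an alternative to the Rancher UI, the same repository can be defined declaratively and applied with kubectl on each cluster. The following is a hedged sketch that assumes Rancher's catalog.cattle.io/v1 ClusterRepo API with OCI support and a pre-created Basic Auth Secret; verify the field names against your Rancher version.

```yaml
# Hedged sketch: declarative ClusterRepo for the SUSE Application Collection.
# Assumes Rancher's catalog.cattle.io/v1 API and an existing
# kubernetes.io/basic-auth Secret holding the Application Collection credentials.
apiVersion: catalog.cattle.io/v1
kind: ClusterRepo
metadata:
  name: application-collection
spec:
  url: oci://dp.apps.rancher.io/charts
  clientSecret:
    name: application-collection-auth   # hypothetical Secret name
    namespace: cattle-system
```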

Creating ClusterRepo for the SUSE Registry repository
  1. Refer to the document Using container images from the SUSE Registry to get the user name and the access token (password) for SUSE Registry. The access token is the same as the registration code of your SUSE AI subscription, which you can find in your SCC account at https://scc.suse.com/ by following these steps:

    1. Log in to https://scc.suse.com/.

    2. Select your organization under MY ORGANIZATIONS.

    3. In your organization dashboard, click Subscriptions.

    4. If your organization has purchased a SUSE AI subscription, you can find the SUSE AI subscription and its registration key in the subscriptions list.

  2. In Rancher, navigate to Cluster › Apps › Repositories.

  3. Click Create and provide the following details:

    • Name: suse-ai-registry

    • Description: Leave this empty.

    • Target: OCI repository

    • OCI Repository Host URL: oci://registry.suse.com/ai/charts

    • Authentication: Create an HTTP Basic Auth Secret

    • Username: The user name is 'regcode' for everyone.

    • Password: The password is the registration code of your SUSE AI subscription.

  4. Wait until the repository status changes to Active.

Note: Same repository name across clusters

We strongly recommend using the same repository name across all clusters. This ensures consistency and allows the SUSE AI Lifecycle Manager to easily reference the repository. For example, application-collection for the SUSE Application Collection, and suse-ai-registry for the SUSE Registry.

5.3.6 Installing and managing applications

After the administrator has installed the SUSE AI Lifecycle Manager extension and added the ClusterRepo repositories, users with the Rancher Admin and Cluster Owner roles can deploy and manage AI applications.

Installing applications
  1. Click the application’s tile to navigate to its instances page.

  2. Provide the basic information required for the Helm installation, such as the release name, namespace and version. The repository is automatically selected. Click Next to navigate to the Target Cluster page.

  3. On the Target Cluster page, choose one or more clusters that meet the application’s resource requirements and click Next to go to the Configuration page.

  4. On the Configuration page, customize the installation using the UI form for key settings or the YAML file for advanced options, and click Next to go to the Review page.

  5. Review the installation summary on the Review page and confirm by clicking Install. To navigate to previous steps and adjust the settings, click Previous.

Managing application instances
  1. Click the application’s tile to navigate to its instances page.

  2. Click Manage for the application to start its Manage wizard.

  3. Update the application’s details on the wizard screens.

  4. On the Review page, click Save to apply the changes.

Deleting application instances
  1. Click the application’s tile to navigate to its instances page.

  2. Click Delete for the instance you want to delete.

  3. In the pop-up window, confirm the deletion.