SUSE AI 1.0

Deploying and Installing SUSE AI in Air-Gapped Environments

Publication Date: 19 Nov 2025

WHAT?: This document provides a comprehensive, step-by-step guide for the SUSE AI air-gapped deployment.
WHY?: To help users successfully complete the air-gapped deployment process.
GOAL: To learn enough information to deploy SUSE AI in both testing and production air-gapped environments.
EFFORT: Less than one hour of reading and an advanced knowledge of Linux deployment.

SUSE AI is a versatile product consisting of multiple software layers and components. This document outlines the complete workflow for air-gapped deployment and installation of all SUSE AI dependencies, as well as SUSE AI itself. You can also find references to recommended hardware and software requirements, as well as steps to take after the product installation.

Tip: Hardware and software requirements

For hardware, software and application-specific requirements, refer to SUSE AI requirements.

1 Air-gapped environments #

An air-gapped environment is a security measure where a single host or the whole network is isolated from all other networks, such as the public Internet. This “air gap” acts as a physical or logical barrier, preventing any direct connection that could be exploited by cyber threats.

1.1 Why do you need an air-gapped environment? #

The primary goal is to protect highly sensitive data and critical systems from unauthorized access, cyber attacks, malware and ransomware. Air-gapped environments are typically found in situations where security is of the utmost importance, such as:

Military and government networks handling classified information.
Industrial control systems (ICS) for critical infrastructure like power plants and water treatment facilities.
Financial institutions and stock exchanges.
Systems controlling nuclear power plants or other life-critical operations.

1.2 How do air-gapped environments work? #

There are two types of air gaps:

Physical air gaps: This is the most secure method, where the system is disconnected from any network. It might even involve placing the system in a shielded room.
Logical air gaps: This type uses software controls such as firewall rules and network segmentation to create a highly restricted connection. While it offers more convenience, it is not as secure as a physical air gap because the air-gapped system is still technically connected to a network.

Figure 1: General schema of an air-gapped environment #

1.3 What challenges do air-gapped environments face? #

When working in air-gapped systems, you usually face the following limitations:

Manual updates: Air-gapped systems cannot automatically receive software or security updates from external networks. You must manually download and install updates, which can be time-consuming and create vulnerabilities if not done regularly.
Insider threats and physical attacks: An air gap does not protect against threats that gain physical access to the system, such as a malicious insider with a compromised USB drive.
Limited functionality: The lack of connectivity limits the system's ability to communicate with other devices or services, making it less efficient for many modern applications.

2 Installation overview #

The following chart illustrates the installation process of SUSE AI. It outlines the following possible scenarios:

You have clean cluster nodes prepared without a supported Linux operating system installed.
You have a supported Linux operating system and Kubernetes distribution installed on cluster nodes.
You have SUSE Rancher Prime and all supportive components installed on the Kubernetes cluster and are prepared to install the required applications from the AI Library.

Figure 2: SUSE AI air-gapped installation process #

2.1 Air-gapped stack #

The air-gapped stack is a set of scripts that simplify the air-gapped installation of certain SUSE AI components. To use them, clone or download them from the stack's GitHub repository.

2.1.1 What scripts does the stack include? #

The following scripts are included in the air-gapped stack:

SUSE-AI-mirror-nvidia.sh: Mirrors all RPM packages from the specified Web URLs.
SUSE-AI-get-images.sh: Downloads Docker images of SUSE AI applications from SUSE Application Collection.
SUSE-AI-load-images.sh: Loads downloaded Docker images into a custom Docker image registry.

2.1.2 Where do the scripts fit into the air-gap installation? #

The scripts are required in several places during the SUSE AI air-gapped installation. The following simplified workflow outlines the intended usage:

Use SUSE-AI-mirror-nvidia.sh on a remote host to download the required NVIDIA RPM packages. Transfer the downloaded content to an air-gapped local host and add it as a Zypper repository to install NVIDIA drivers on local GPU nodes.
Use SUSE-AI-get-images.sh on a remote host to download Docker images of required SUSE AI components. Transfer them to an air-gapped local host.
Use SUSE-AI-load-images.sh to load the transferred Docker images of SUSE AI components into a custom local Docker image registry.
Install AI Library components on the local Kubernetes cluster from the local custom Docker registry.
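
Because the workflow above moves content between hosts by hand, it can help to confirm that nothing was corrupted or dropped in transit. The following is a minimal sketch, not part of the air-gapped stack: the function names and the manifest file are hypothetical, and it assumes GNU find, sort, xargs and sha256sum are available on both hosts. Store the manifest outside the directory being checksummed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; directory and manifest names are placeholders.

# On the remote host: record a SHA-256 checksum for every file under DIR.
create_manifest() {
  local dir="$1"
  local manifest="$2"
  (cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum) > "$manifest"
}

# On the air-gapped host: verify the transferred files against the manifest.
# Returns non-zero if any file is missing or its checksum differs.
verify_manifest() {
  local dir="$1"
  local manifest="$2"
  (cd "$dir" && sha256sum --quiet -c -) < "$manifest"
}

# Example usage:
#   create_manifest /LOCAL_MIRROR_DIRECTORY /tmp/mirror.sha256   # remote host
#   verify_manifest /LOCAL_MIRROR_DIRECTORY /tmp/mirror.sha256   # air-gapped host
```

Transfer the manifest together with the downloaded content and run the verification before adding repositories or loading images.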

3 Installing the Linux and Kubernetes distribution #

This procedure includes the steps to install the base Linux operating system and a Kubernetes distribution for users who start deploying on cluster nodes from scratch. If you already have a Kubernetes cluster installed and running, you can skip this procedure and continue with Section 5.1, “Installation procedure”.

Install and register a supported Linux operating system on each cluster node. We recommend using one of the following operating systems:
- SUSE Linux Server 15 SP6 for a traditional non-transactional operating system. For more information, see Section 3.1, “Installing SUSE Linux Server”.
- SUSE Linux Micro 6.1 for an immutable transactional operating system. For more information, see SUSE Linux Micro 6.1 documentation.
For a list of supported operating systems, refer to https://www.suse.com/suse-rancher/support-matrix/all-supported-versions/.
Install the NVIDIA GPU driver on cluster nodes with GPUs. Refer to Section 3.2, “Installing NVIDIA GPU drivers” for details.
Install Kubernetes on cluster nodes. We recommend using the supported SUSE Rancher Prime: RKE2 distribution. Refer to Section 3.3, “Installing SUSE Rancher Prime: RKE2 in air-gapped environments” for details. For a list of supported Kubernetes platforms, refer to https://www.suse.com/suse-rancher/support-matrix/all-supported-versions/.

3.1 Installing SUSE Linux Server #

Use the following procedures to install SLES on all supported hardware platforms. They assume you have successfully booted into the installation system. For more detailed installation instructions and deployment strategies, refer to SUSE Linux Server Deployment Guide.

3.1.1 The Unified Installer #

Starting with SLES 15, the installation medium consists only of the Unified Installer, a minimal system for installing, updating and registering all SUSE Linux base products. During the installation, you can add functionality by selecting modules and extensions to be installed on top of the Unified Installer.

3.1.2 Installing offline or without registration #

The default installation medium 15 SP6-Online-ARCH-GM-media1.iso is optimized for size and does not contain any modules and extensions. Therefore, the installation requires network access to register your product and retrieve repository data for the modules and extensions.

For installation without registering the system, use the 15 SP6-Full-ARCH-GM-media1.iso image from https://www.suse.com/download/sles/ and refer to Installing without registration.

Tip: Copying the installation media image to a removable flash disk

Use the following command to copy the contents of the installation image to a removable flash disk.

> sudo dd if=IMAGE of=FLASH_DISK bs=4M && sync

IMAGE needs to be replaced with the path to the 15 SP6-Online-ARCH-GM-media1.iso or 15 SP6-Full-ARCH-GM-media1.iso image file. FLASH_DISK needs to be replaced with the flash device. To identify the device, insert it and run:

# grep -Ff <(hwinfo --disk --short) <(hwinfo --usb --short)
disk:
  /dev/sdc             General USB Flash Disk

Make sure the size of the device is sufficient for the desired image. You can check the size of the device with:

# fdisk -l /dev/sdc | grep -e "^/dev"
     /dev/sdc1  *     2048 31490047 31488000  15G 83 Linux

In this example, the device has a capacity of 15 GB. The command to use for the 15 SP6-Full-ARCH-GM-media1.iso would be:

# dd if="15 SP6-Full-ARCH-GM-media1.iso" of=/dev/sdc bs=4M && sync

The device must not be mounted when running the dd command. Note that all data on the device will be erased.
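
The two conditions in this tip (the image must fit on the device, and the device must not be mounted) can be checked before running dd. The sketch below is illustrative only: the helper names are hypothetical, and the usage comments assume GNU stat and the blockdev utility, run as root.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers for the dd command.

# Succeeds when an image of IMAGE_BYTES fits on a device of DEVICE_BYTES.
image_fits() {
  [ "$1" -le "$2" ]
}

# Succeeds when DEVICE or one of its partitions appears in a mount table.
# The table defaults to /proc/mounts; a second argument overrides it.
is_mounted() {
  local device="$1"
  local mounts="${2:-/proc/mounts}"
  awk -v dev="$device" 'index($1, dev) == 1 { found = 1 } END { exit !found }' "$mounts"
}

# Example usage (as root), with IMAGE and /dev/sdc as placeholders:
#   image_fits "$(stat -c %s IMAGE)" "$(blockdev --getsize64 /dev/sdc)" \
#     || echo "Image does not fit on the device"
#   is_mounted /dev/sdc && echo "Unmount the device before running dd"
```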

3.1.3 The installation procedure #

To install SLES, boot or IPL into the installer from the Unified Installer medium and start the installation.

3.1.3.1 Language, keyboard and product selection #

Figure 3: Language, keyboard and product selection #

The Language and Keyboard Layout settings are initialized with the language you chose on the boot screen. If you do not change the default, it remains English (US). Change the settings here, if necessary. Use the Keyboard Test text box to test the layout.

Select SUSE Linux Server 15 SP6 for installation. You need to have a registration code for the product. Proceed with Next.

Tip: Light and high-contrast themes

If you have difficulty reading the labels in the installer, you can change the widget colors and theme.

Click the button or press Shift–F3 to open a theme selection dialog. Select a theme from the list and Close the dialog.

Shift–F4 switches to the color scheme for vision-impaired users. Press the buttons again to switch back to the default scheme.

3.1.3.2 License agreement #

Figure 4: License agreement #

Read the License Agreement. It is presented in the language you have chosen on the boot screen. Translations are available via the License Language drop-down list. You need to accept the agreement by checking I Agree to the License Terms to install SLES. Proceed with Next.

3.1.3.3 Network settings #

Figure 5: Network settings #

A system analysis is performed, where the installer probes for storage devices and tries to find other installed systems. If the network was automatically configured via DHCP during the start of the installation, you are taken to the registration step.

If the network is not yet configured, the Network Settings dialog opens. Choose a network interface from the list and configure it with Edit. Alternatively, Add an interface manually. See the sections on installer network settings and configuring a network connection with YaST for more information. If you prefer to do an installation without network access, skip this step without making any changes and proceed with Next.

3.1.3.4 Registration #

Figure 6: Registration #

To get technical support and product updates, you need to register and activate SLES with the SUSE Customer Center or a local registration server. Registering your product at this stage also grants you immediate access to the update repository. This enables you to install the system with the latest updates and patches available.

When registering, repositories and dependencies for modules and extensions are loaded from the registration server.

Register system at scc.suse.com

To register at the SUSE Customer Center, enter the E-mail Address associated with your SUSE Customer Center account and the Registration Code for SLES. Proceed with Next.

Register system via local RMT server

If your organization provides a local registration server, you may alternatively register to it. Activate Register System via local RMT Server and either choose a URL from the drop-down list or type in an address. Proceed with Next.

Skip registration

If you are offline or want to skip registration, activate Skip Registration. Accept the warning with OK and proceed with Next.

Important: Skipping the registration

Your system and extensions need to be registered to retrieve updates and to be eligible for support. Skipping the registration is only possible when installing from the 15 SP6-Full-ARCH-GM-media1.iso image.

If you do not register during the installation, you can do so at any time later from the running system. To do so, run YaST › Product Registration or the command-line tool SUSEConnect.

Tip: Installing product patches at installation time

After SLES has been successfully registered, you are asked whether to install the latest available online updates during the installation. If you choose Yes, the system is installed with the most current packages, so you do not need to apply the updates after installation. Activating this option is recommended.

Note: Firewall settings for receiving updates

By default, the firewall on SUSE AI only blocks incoming connections. If your system is behind another firewall that blocks outgoing traffic, make sure to allow connections to https://scc.suse.com/ and https://updates.suse.com on ports 80 and 443 to receive updates.

3.1.3.5 Extension and module selection #

Figure 7: Extension and module selection #

After the system is successfully registered, the installer lists modules and extensions that are available for SLES. Modules are components that allow you to customize the product according to your needs. They are included in your SLES subscription. Extensions add functionality to your product. They must be purchased separately.

The availability of certain modules or extensions depends on the product selected in the first step of the installation. For a description of the modules and their lifecycles, select a module to see the accompanying text. More detailed information is available in the Modules and Extensions Quick Start.

The selection of modules indirectly affects the scope of the installation, because it defines which software sources (repositories) are available for installation and in the running system.

The following modules and extensions are available for SUSE Linux Server:

Basesystem Module

This module adds a basic system on top of the Unified Installer. It is required by all other modules and extensions. The scope of an installation that only contains the base system is comparable to the installation pattern minimal system of previous SLES versions. This module is selected for installation by default and should not be deselected.

Dependencies: None

Certifications Module

Contains the FIPS certification packages.

Dependencies: Server Applications

Confidential Computing Technical Preview

Contains packages related to confidential computing.

Dependencies: Basesystem

Containers Module

Contains support and tools for containers.

Dependencies: Basesystem

Desktop Applications Module

Adds a graphical user interface and essential desktop applications to the system.

Dependencies: Basesystem

Development Tools Module

Contains the compilers (including gcc) and libraries required for compiling and debugging applications. Replaces the former Software Development Kit (SDK).

Dependencies: Basesystem, Desktop Applications

Legacy Module

Helps you with migrating applications from earlier versions of SLES and other systems to SLES 15 SP6 by providing packages which are discontinued on SUSE Linux. Packages in this module are selected based on the requirements for migration and the level of complexity of configuration.

This module is recommended when migrating from a previous product version.

Dependencies: Basesystem, Server Applications

NVIDIA Compute Module

Contains the NVIDIA CUDA (Compute Unified Device Architecture) drivers.

The software in this module is provided by NVIDIA under the CUDA End User License Agreement and is not supported by SUSE.

Dependencies: Basesystem

Public Cloud Module

Contains all tools required to create images for deploying SLES in cloud environments such as Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, or OpenStack.

Dependencies: Basesystem, Server Applications

Python 3 Module

This module contains the most recent versions of the selected Python 3 packages.

Dependencies: Basesystem

SAP Business One Server

This module contains packages and system configurations specific to SAP Business One Server. It is maintained and supported under the SUSE Linux Server product subscription.

Dependencies: Basesystem, Server Applications, Desktop Applications, Development Tools

Server Applications Module

Adds server functionality by providing network services such as DHCP server, name server, or Web server.

Dependencies: Basesystem

SUSE Linux High Availability

Adds clustering support for mission-critical setups to SLES. This extension requires a separate license key.

Dependencies: Basesystem, Server Applications

SUSE Linux Live Patching

Adds support for performing critical patching without having to shut down the system. This extension requires a separate license key.

Dependencies: Basesystem, Server Applications

SUSE Linux Workstation Extension

Extends the functionality of SLES with packages from SUSE Linux Desktop, like additional desktop applications (office suite, e-mail client, graphical editor, etc.) and libraries. It allows combining both products to create a fully featured workstation. This extension requires a separate license key.

Dependencies: Basesystem, Desktop Applications

SUSE Package Hub

Provides access to packages for SLES maintained by the openSUSE community. These packages are delivered without L3 support and do not interfere with the supportability of SLES. For more information, refer to https://packagehub.suse.com/.

Dependencies: Basesystem

Transactional Server Module

Adds support for transactional updates. Updates are either applied to the system as a single transaction or not applied at all. This happens without influencing the running system. If an update fails, or if the successful update is deemed to be incompatible or otherwise incorrect, it can be discarded to immediately return the system to its previous functioning state.

Dependencies: Basesystem

Web and Scripting Module

Contains packages intended for a running Web server.

Dependencies: Basesystem, Server Applications

Certain modules depend on the installation of other modules. Therefore, when selecting a module, other modules may be selected automatically to fulfill dependencies.
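
To illustrate how selecting one module pulls in others, the following sketch resolves the full set of modules for a given selection. It is not how the installer is implemented. The function names are hypothetical, and the dependency table is transcribed from a subset of the module list above.

```shell
#!/usr/bin/env bash
# Print the direct dependencies of a module, one per line.
# The table covers only a subset of the modules listed in this section.
module_deps() {
  case $1 in
    "Basesystem") ;;
    "Server Applications"|"Desktop Applications"|"Containers"|"Python 3")
      echo "Basesystem" ;;
    "Development Tools")
      printf '%s\n' "Basesystem" "Desktop Applications" ;;
    "Legacy"|"Public Cloud"|"Web and Scripting")
      printf '%s\n' "Basesystem" "Server Applications" ;;
    *) echo "unknown module: $1" >&2; return 1 ;;
  esac
}

# Print the module plus all direct and indirect dependencies,
# sorted and without duplicates.
resolve_module() {
  {
    echo "$1"
    module_deps "$1" | while IFS= read -r dep; do
      resolve_module "$dep"
    done
  } | sort -u
}

resolve_module "Development Tools"
# prints:
#   Basesystem
#   Desktop Applications
#   Development Tools
```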

Depending on the product, the registration server can mark modules and extensions as recommended. Recommended modules and extensions are preselected for registration and installation. To avoid installing these recommendations, deselect them manually.

Select the modules and extensions you want to install and proceed with Next. In case you have chosen one or more extensions, you will be prompted to provide the respective registration codes. Depending on your choice, it may also be necessary to accept additional license agreements.

Important: Default modules for offline installation

When performing an offline installation from the 15 SP6-Full-ARCH-GM-media1.iso, only the Basesystem Module is selected by default. To install the complete default package set of SUSE Linux Server, additionally select the Server Applications Module and the Python 3 Module.

3.1.3.6 Add-on product #

Figure 8: Add-on product #

The Add-On Product dialog allows you to add additional software sources (called “repositories”) to SLES that are not provided by the SUSE Customer Center. Add-on products may include third-party products and drivers as well as additional software for your system.

Tip: Adding drivers during the installation

You can also add driver update repositories via the Add-On Product dialog. Driver updates for SUSE Linux are provided at https://drivers.suse.com/. These drivers have been created through the SUSE SolidDriver Program.

To skip this step, proceed with Next. Otherwise, activate I would like to install an additional Add On Product. Specify a media type, a local path, or a network resource hosting the repository and follow the on-screen instructions.

Check Download Repository Description Files to download the files describing the repository now. If deactivated, they will be downloaded after the installation has started. Proceed with Next and insert a medium if required. Depending on the content of the product, it may be necessary to accept additional license agreements. Proceed with Next. If you have chosen an add-on product requiring a registration key, you will be asked to enter it before proceeding to the next step.

3.1.3.7 System role #

Figure 9: System role #

The availability of system roles depends on your selection of modules and extensions. System roles define, for example, the set of software patterns that are preselected for the installation. Refer to the description on the screen to make your choice. Select a role and proceed with Next. If the enabled modules leave only one suitable role or no suitable role for the base product, the System Role dialog is omitted.

Tip: Release notes

From this point on, the Release Notes can be viewed from any screen during the installation process by selecting Release Notes.

3.1.3.8 Suggested partitioning #

Figure 10: Suggested partitioning #

Review the partition setup proposed by the system. If necessary, change it. You have the following options:

Guided setup

Starts a wizard that lets you refine the partitioning proposal. The options available here depend on your system setup. If it contains more than a single hard disk, you can choose which disk or disks to use and where to place the root partition. If the disks already contain partitions, decide whether to remove or resize them.

In subsequent steps, you may also add LVM support and disk encryption. You can change the file system for the root partition and decide whether or not to have a separate home partition.

Expert partitioner

Opens the Expert Partitioner. This gives you full control over the partitioning setup and lets you create a custom setup. This option is intended for experts. For details, see the Expert Partitioner chapter.

Warning: Disk space units

For partitioning purposes, disk space is measured in binary units rather than in decimal units. For example, if you enter sizes of 1GB, 1GiB or 1G, they all signify 1 GiB (Gibibyte), as opposed to 1 GB (Gigabyte).

Binary: 1 GiB = 1 073 741 824 bytes.
Decimal: 1 GB = 1 000 000 000 bytes.
Difference: 1 GiB ≈ 1.07 GB.
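
As a quick worked example of the difference, the following sketch converts sizes to bytes in both unit systems. The function names are illustrative only. It shows that a partition entered as 40G occupies roughly 42.9 GB in decimal units.

```shell
#!/usr/bin/env bash
# Convert a size in GiB to bytes (binary units: 1 GiB = 1024^3 bytes).
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Convert a size in GB to bytes (decimal units: 1 GB = 1000^3 bytes).
gb_to_bytes() {
  echo $(( $1 * 1000 * 1000 * 1000 ))
}

gib_to_bytes 1    # prints 1073741824
gb_to_bytes 1     # prints 1000000000
gib_to_bytes 40   # prints 42949672960
```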

To accept the proposed setup without any changes, choose Next to proceed.

3.1.3.9 Clock and time zone #

Figure 11: Clock and time zone #

Select the clock and time zone to use in your system. To manually adjust the time or to configure an NTP server for time synchronization, choose Other Settings. See the section on Clock and Time Zone for detailed information. Proceed with Next.

3.1.3.10 Local user #

Figure 12: Local user creation #

To create a local user, type the first and last name in the User’s Full Name field, the login name in the Username field, and the password in the Password field.

The password should be at least eight characters long and should contain both uppercase and lowercase letters and numbers. The maximum length for passwords is 72 characters, and passwords are case-sensitive.

For security reasons, it is strongly recommended not to enable Automatic Login. Also, do not activate Use this Password for the System Administrator. Instead, provide a separate root password in the next installation step.

If you install on a system where a previous Linux installation was found, you may Import User Data from a Previous Installation. Click Choose User for a list of available user accounts. Select one or more users.

In an environment where users are centrally managed (for example, by NIS or LDAP), you can skip the creation of local users. Select Skip User Creation in this case.

Proceed with Next.

3.1.3.11 Authentication for the system administrator “root” #

Figure 13: Password for the system administrator “root” #

Type a password for the system administrator (called the root user) or provide a public SSH key. If you want, you can use both.

Because the root user is equipped with extensive permissions, the password should be chosen carefully. You should never forget the root password. After you have entered it here, the password cannot be retrieved.

Tip: Passwords and keyboard layout

It is recommended to use only US ASCII characters. In the event of a system error or when you need to start your system in rescue mode, the keyboard may not be localized.

To access the system remotely via SSH using a public key, import a key from removable media or an existing partition. See the section on Authentication for the system administrator root for more information.

Proceed with Next.

3.1.3.12 Installation settings #

Figure 14: Installation settings #

Use the Installation Settings screen to review and—if necessary—change several proposed installation settings. The current configuration is listed for each setting. To change it, click the headline. Certain settings, such as firewall or SSH, can be changed directly by clicking the respective links.

Important: Remote access

Changes you can make here can also be made later at any time from the installed system. However, if you need remote access right after the installation, you may need to open the SSH port in the Security settings.

Software

The scope of the installation is defined by the modules and extensions you have chosen for this installation. However, depending on your selection, not all packages available in a module are selected for installation.

Clicking Software opens the Software Selection and System Tasks screen, where you can change the software selection by selecting or deselecting patterns. Each pattern contains several software packages needed for specific functions (for example, KVM Host Server). For a more detailed selection based on software packages to install, select Details to switch to the YaST Software Manager. See Installing or removing software for more information.

Booting

This section shows the boot loader configuration. Changing the defaults is recommended only if really needed. Refer to The boot loader GRUB 2 for details.

Security

The CPU Mitigations refer to kernel boot command-line parameters for software mitigations that have been deployed to prevent CPU side-channel attacks. Click the selected entry to choose a different option. For details, see the section on CPU Mitigations.

By default, the Firewall is enabled on all configured network interfaces. To disable firewalld, click disable (not recommended). Refer to the Masquerading and Firewalls chapter for configuration details.

The SSH service is enabled by default, but its port (22) is closed in the firewall. Click open to open the port or disable to disable the service. If SSH is disabled, remote logins will not be possible. Refer to Securing network operations with OpenSSH for more information.

The default Major Linux Security Module is AppArmor. To disable it, select None as the module in the Security settings.

Security Policies

Click to enable the Defense Information Systems Agency STIG security policy. If any installation settings are incompatible with the policy, you will be prompted to modify them accordingly. Certain settings can be adjusted automatically while others require user input.

Enabling a security profile enables a full SCAP remediation on first boot. You can also perform a scan only or do nothing and manually remediate the system later with OpenSCAP. For more information, refer to the section on Security Profiles.

Network configuration

Displays the current network configuration. By default, wicked is used for server installations and NetworkManager for desktop workloads. Click Network Configuration to change the settings. For details, see the section on Configuring a network connection with YaST.

Important: Support for NetworkManager

SUSE only supports NetworkManager for desktop workloads with SLED or the Workstation extension. All server certifications are done with wicked as the network configuration tool, and using NetworkManager may invalidate them. NetworkManager is not supported by SUSE for server workloads.

Kdump

Kdump saves the memory image (“core dump”) to the file system in case the kernel crashes. This enables you to find the cause of the crash by debugging the dump file. Kdump is preconfigured and enabled by default. See the Basic Kdump configuration for more information.

Default systemd target

If you have installed the desktop applications module, the system boots into the graphical target, with network, multi-user and display manager support. Switch to multi-user if you do not need to log in via a display manager.

System

View detailed hardware information by clicking System. In the resulting screen, you can also change Kernel Settings—see the section on System Information for more information.

3.1.3.13 Start the installation #

Figure 15: Confirm installation #

After you have finalized the system configuration on the Installation Settings screen, click Install. Depending on your software selection, you may need to agree to license agreements before the installation confirmation screen pops up. Up to this point, no changes have been made to your system. After you click Install a second time, the installation process starts.

3.1.3.14 The installation process #

Figure 16: Performing the installation #

During the installation, the progress is shown. After the installation routine has finished, the computer is rebooted into the installed system.

3.2 Installing NVIDIA GPU drivers #

This section demonstrates how to implement host-level NVIDIA GPU support via the open-driver. The open-driver is part of the core package repositories. Therefore, there is no need to compile it or download executable packages. This driver is built into the operating system rather than dynamically loaded by the NVIDIA GPU Operator. This configuration is desirable for customers who want to pre-build all artifacts required for deployment into the image, and where the dynamic selection of the driver version via Kubernetes is not a requirement.

3.2.1 Installing NVIDIA GPU drivers on SUSE Linux Server #

3.2.1.1 Requirements #

This guide assumes that you already have the following available:

At least one host with SLES 15 SP6 installed, physical or virtual.
Your hosts are attached to a subscription as this is required for package access.
A compatible NVIDIA GPU installed or fully passed through to the virtual machine in which SLES is running.
Access to the root user—these instructions assume you are the root user, and not escalating your privileges via sudo.

3.2.1.2 Considerations before the installation #

3.2.1.2.1 Select the driver generation #

You must verify the driver generation for the NVIDIA GPU that your system has. For modern GPUs, the G06 driver is the most common choice. Find more details in the support database.

This section details the installation of the G06 generation of the driver.

3.2.1.2.2 Additional NVIDIA components #

Besides the NVIDIA open-driver provided by SUSE as part of SLES, you might also need additional NVIDIA components. These could include OpenGL libraries, CUDA toolkits, command-line utilities such as nvidia-smi, and container-integration components such as nvidia-container-toolkit. Many of these components are not shipped by SUSE as they are proprietary NVIDIA software. This section describes how to configure additional repositories that give you access to these components and provides examples of using these tools to achieve a fully functional system.

3.2.1.3 The installation procedure #

On the remote host, run the script SUSE-AI-mirror-nvidia.sh from the air-gapped stack (see Section 2.1, “Air-gapped stack”) to download all required NVIDIA RPM packages to a local directory, for example:
```
> SUSE-AI-mirror-nvidia.sh \
  -p /LOCAL_MIRROR_DIRECTORY \
  -l https://nvidia.github.io/libnvidia-container/stable/rpm/x86_64 \
  https://developer.download.nvidia.com/compute/cuda/repos/sles15/x86_64/
```
After the download is complete, transfer the downloaded directory with all its content to each GPU-enabled local host.
On each GPU-enabled local host, add a package repository from the safely transferred NVIDIA RPM packages directory. This allows pulling in additional utilities, for example, nvidia-smi.
```
# zypper ar --no-gpgcheck \
file://LOCAL_MIRROR_DIRECTORY \
nvidia-local-mirror
# zypper --gpg-auto-import-keys refresh
```

Install the Open Kernel driver KMP and detect the driver version.

# zypper install -y --auto-agree-with-licenses \
  nv-prefer-signed-open-driver
# version=$(rpm -qa --queryformat '%{VERSION}\n' \
  nv-prefer-signed-open-driver | cut -d "_" -f1 | sort -u | tail -n 1)

You can then install the appropriate packages for additional utilities that are useful for testing purposes.

# zypper install -y --auto-agree-with-licenses \
nvidia-compute-utils-G06=${version} \
nvidia-persistenced=${version}

Reboot the host to make the changes effective.
```
# reboot
```

Log back in and use the nvidia-smi tool to verify that the driver is loaded successfully and that it can both access and enumerate your GPUs.

# nvidia-smi

The output of this command should show you something similar to the following output. In the example below, the system has one GPU.

Fri Aug  1 15:32:10 2025       
+------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.07      Driver Version: 580.82.07    CUDA Version: 13.0    |
|------------------------------+------------------------+----------------------+
| GPU  Name      Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp Perf Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                              |                        |               MIG M. |
|==============================+========================+======================|
|   0  Tesla T4            On  |   00000000:00:1E.0 Off |                    0 |
| N/A   33C   P8   13W /   70W |       0MiB /  15360MiB |      0%      Default |
|                              |                        |                  N/A |
+------------------------------+------------------------+----------------------+
                                                                                   
+------------------------------------------------------------------------------+
| Processes:                                                                   |
|  GPU   GI   CI        PID   Type   Process name                   GPU Memory |
|        ID   ID                                                    Usage      |
|==============================================================================|
|  No running processes found                                                  |
+------------------------------------------------------------------------------+
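For scripted checks, the same verification can be reduced to a device count. The following is a minimal sketch, assuming the `nvidia-smi -L` listing mode, which prints one line per GPU; the function name `count_gpus` is illustrative and not part of any SUSE or NVIDIA tooling.

```shell
# Counts GPUs in the output of "nvidia-smi -L" read from stdin.
# Each device is expected on its own line, in the form "GPU 0: <name> (UUID: ...)".
count_gpus() {
  awk '/^GPU [0-9]+:/ { n++ } END { print n + 0 }'
}

# Usage on a host with the driver loaded:
#   nvidia-smi -L | count_gpus
```

A result of 0 after the reboot suggests that the driver did not load or that the GPU is not visible to the host.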

3.2.1.4 Validation of the driver installation #

Running the nvidia-smi command has verified that, at the host level, the NVIDIA device can be accessed and that the drivers are loading successfully. To validate that it is functioning, you need to validate that the GPU can take instructions from a user-space application, ideally via a container and through the CUDA library, as that is typically what a real workload would use. For this, we can make a further modification to the host OS by installing nvidia-container-toolkit.

Install the nvidia-container-toolkit package from the NVIDIA Container Toolkit repository.
```
# zypper ar \
"https://nvidia.github.io/libnvidia-container/stable/rpm/"\
nvidia-container-toolkit.repo
# zypper --gpg-auto-import-keys install \
  -y nvidia-container-toolkit
```
The nvidia-container-toolkit.repo file contains a stable repository nvidia-container-toolkit and an experimental repository nvidia-container-toolkit-experimental. Use the stable repository for production use. The experimental repository is disabled by default.
Verify that the system can successfully enumerate the devices using the NVIDIA Container Toolkit. The output should be verbose, with INFO and WARN messages, but no ERROR messages.
```
# nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```
This ensures that any container started on the machine can employ discovered NVIDIA GPU devices.
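The check for ERROR messages can be automated by filtering the command output. The sketch below is a hedged example: it assumes that error lines either start with `ERRO` (terminal output) or contain `level=error` (plain output). Compare the pattern with the messages you actually see before relying on it. The function name is hypothetical.

```shell
# Succeeds when the log read from stdin contains no error-level lines.
# Assumption: error lines start with "ERRO" or contain "level=error".
cdi_log_is_clean() {
  ! grep -Eq '(^ERRO|level=error)'
}

# Usage: capture both output streams of the generate command and test them.
#   nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml 2>&1 | cdi_log_is_clean \
#     && echo "CDI specification generated without errors"
```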
You can then run a Podman-based container. Doing this via podman gives you a good way of validating access to the NVIDIA device from within a container, which should give confidence for doing the same with Kubernetes at a later stage.
Start a container with Podman, giving it access to the NVIDIA devices described by the CDI specification that the previous command generated, and run bash inside it.
```
# podman run --rm --device nvidia.com/gpu=all \
  --security-opt=label=disable \
  -it registry.suse.com/bci/bci-base:latest bash
```
You can now execute commands from within a temporary Podman container. It does not have access to your underlying system and is ephemeral—whatever you change in the container does not persist. Also, you cannot break anything on the underlying host.
Inside the container, install the required CUDA libraries. Identify their version from the output of the nvidia-smi command. From the above example, we are installing CUDA version 13.0 with many examples, demos and development kits to fully validate the GPU.
```
# zypper ar \
  https://developer.download.nvidia.com/compute/cuda/repos/sles15/x86_64/ \
  cuda-sle15-sp6
# zypper --gpg-auto-import-keys refresh
# zypper install -y cuda-libraries-13-0 cuda-demo-suite-12-9
```

Inside the container, run the deviceQuery CUDA example of the same version, which comprehensively validates GPU access via CUDA and from within the container itself.

# /usr/local/cuda-12.9/extras/demo_suite/deviceQuery
/usr/local/cuda-12.9/extras/demo_suite/deviceQuery Starting...

 CUDA Device Query (Runtime API)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla T4"
  CUDA Driver Version / Runtime Version          13.0 / 13.0
  CUDA Capability Major/Minor version number:    7.5
  Total amount of global memory:                 14913 MBytes (15637086208 bytes)
  (40) Multiprocessors, ( 64) CUDA Cores/MP:     2560 CUDA Cores
  GPU Max Clock rate:                            1590 MHz (1.59 GHz)
  Memory Clock rate:                             5001 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 4194304 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1024
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 30
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 13.0, CUDA Runtime Version = 13.0, NumDevs = 1, Device0 = Tesla T4
Result = PASS

From inside the container, you can continue to run any other CUDA workload—such as compilers—to run further tests. When finished, you can exit the container.

# exit

Important

Changes you have made in the container and packages you have installed inside will be lost and will not impact the underlying operating system.

3.2.2 Installing NVIDIA GPU drivers on SUSE Linux Micro #

3.2.2.1 Requirements #

This guide assumes that you already have the following available:

At least one host with SUSE Linux Micro 6.1 installed, physical or virtual.
Your hosts are attached to a subscription as this is required for package access.
A compatible NVIDIA GPU installed or fully passed through to the virtual machine in which SUSE Linux Micro is running.
Access to the root user—these instructions assume you are the root user, and not escalating your privileges via sudo.

3.2.2.2 Considerations before the installation #

3.2.2.2.1 Select the driver generation #

You must verify the driver generation for the NVIDIA GPU that your system has. For modern GPUs, the G06 driver is the most common choice. Find more details in the support database.

This section details the installation of the G06 generation of the driver.

3.2.2.2.2 Additional NVIDIA components #

Besides the NVIDIA open-driver provided by SUSE as part of SUSE Linux Micro, you might also need additional NVIDIA components. These could include OpenGL libraries, CUDA toolkits, command-line utilities such as nvidia-smi, and container-integration components such as nvidia-container-toolkit. Many of these components are not shipped by SUSE as they are proprietary NVIDIA software. This section describes how to configure additional repositories that give you access to these components and provides examples of using these tools to achieve a fully functional system.

3.2.2.3 The installation procedure #

On the remote host, run the script SUSE-AI-mirror-nvidia.sh from the air-gapped stack (see Section 2.1, “Air-gapped stack”) to download all required NVIDIA RPM packages to a local directory, for example:
```
> SUSE-AI-mirror-nvidia.sh \
  -p /LOCAL_MIRROR_DIRECTORY \
  -l https://nvidia.github.io/libnvidia-container/stable/rpm/x86_64 \
  https://developer.download.nvidia.com/compute/cuda/repos/sles15/x86_64/
```
After the download is complete, transfer the downloaded directory with all its content to each GPU-enabled local host.
On each local GPU-enabled host, open up a transactional-update shell session to create a new read/write snapshot of the underlying operating system so that we can make changes to the immutable platform.
```
# transactional-update shell
```
On each GPU-enabled local host in its transactional-update shell session, add a package repository from the safely transferred NVIDIA RPM packages directory. This allows pulling in additional utilities, for example, nvidia-smi.
```
transactional update # zypper ar --no-gpgcheck \
file://LOCAL_MIRROR_DIRECTORY \
nvidia-local-mirror
transactional update # zypper --gpg-auto-import-keys refresh
```

Install the Open Kernel driver KMP and detect the driver version.

transactional update # zypper install -y --auto-agree-with-licenses \
  nvidia-open-driver-G06-signed-cuda-kmp-default
transactional update # version=$(rpm -qa --queryformat '%{VERSION}\n' \
  nvidia-open-driver-G06-signed-cuda-kmp-default \
  | cut -d "_" -f1 | sort -u | tail -n 1)
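To see what the version-detection pipeline does, it can be tried on sample input. The following sketch wraps the same `cut`, `sort` and `tail` chain in a function. The function name and the sample version strings are hypothetical, chosen only to show how the kernel suffix after the first underscore is dropped.

```shell
# Reads RPM %{VERSION} strings from stdin, one per line, and prints the
# driver version, that is, the part before the first "_".
# If several versions are present, the last one in sort order is printed.
driver_version_from_rpm() {
  cut -d "_" -f1 | sort -u | tail -n 1
}

# Hypothetical sample input; on a real host the input comes from rpm -qa.
printf '%s\n' '580.82.07_k6.4.0_24' | driver_version_from_rpm
```

The resulting value is what `${version}` holds when the utility packages are pinned to the driver version in the next step.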

You can then install the appropriate packages for additional utilities that are useful for testing purposes.

transactional update # zypper install -y --auto-agree-with-licenses \
nvidia-compute-utils-G06=${version} \
nvidia-persistenced=${version}

Exit the transactional-update session and reboot to the new snapshot that contains the changes you have made.
```
transactional update # exit
# reboot
```

After the system has rebooted, log back in and use the nvidia-smi tool to verify that the driver is loaded successfully and that it can both access and enumerate your GPUs.

# nvidia-smi

The output of this command should show you something similar to the following output. In the example below, the system has one GPU.

Fri Aug  1 14:53:26 2025       
+------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.07     Driver Version: 580.82.07     CUDA Version: 13.0    |
|---------------------------------+---------------------+----------------------+
| GPU  Name         Persistence-M | Bus-Id       Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf  Pwr:Usage/Cap |        Memory-Usage | GPU-Util  Compute M. |
|                                 |                     |               MIG M. |
|=================================+=====================+======================|
|   0  Tesla T4               On  |00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     10W /   70W |    0MiB /  15360MiB |      0%      Default |
|                                 |                     |                  N/A |
+---------------------------------+---------------------+----------------------+
                                                                                         
+------------------------------------------------------------------------------+
| Processes:                                                                   |
|  GPU   GI   CI         PID   Type   Process name                  GPU Memory |
|        ID   ID                                                    Usage      |
|==============================================================================|
|  No running processes found                                                  |
+------------------------------------------------------------------------------+

3.2.2.4 Validation of the driver installation #

Open another transactional-update shell.
```
# transactional-update shell
```
Install the nvidia-container-toolkit package from the NVIDIA Container Toolkit repository.
```
transactional update # zypper ar \
"https://nvidia.github.io/libnvidia-container/stable/rpm/"\
nvidia-container-toolkit.repo
transactional update # zypper --gpg-auto-import-keys install \
  -y nvidia-container-toolkit
```
The nvidia-container-toolkit.repo file contains a stable repository nvidia-container-toolkit and an experimental repository nvidia-container-toolkit-experimental. Use the stable repository for production use. The experimental repository is disabled by default.
Exit the transactional-update session and reboot to the new snapshot that contains the changes you have made.
```
transactional update # exit
# reboot
```
Verify that the system can successfully enumerate the devices using the NVIDIA Container Toolkit. The output should be verbose, with INFO and WARN messages, but no ERROR messages.
```
# nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```
This ensures that any container started on the machine can employ discovered NVIDIA GPU devices.
You can then run a Podman-based container. Doing this via podman gives you a good way of validating access to the NVIDIA device from within a container, which should give confidence for doing the same with Kubernetes at a later stage.
Start a container with Podman, giving it access to the NVIDIA devices described by the CDI specification that the previous command generated, and run bash inside it.
```
# podman run --rm --device nvidia.com/gpu=all \
  --security-opt=label=disable \
  -it registry.suse.com/bci/bci-base:latest bash
```
You can now execute commands from within a temporary Podman container. It does not have access to your underlying system and is ephemeral—whatever you change in the container does not persist. Also, you cannot break anything on the underlying host.
Inside the container, install the required CUDA libraries. Identify their version from the output of the nvidia-smi command. From the above example, we are installing CUDA version 13.0 with many examples, demos and development kits to fully validate the GPU.
```
# zypper ar \
  https://developer.download.nvidia.com/compute/cuda/repos/sles15/x86_64/ \
  cuda-sle15-sp6
# zypper --gpg-auto-import-keys refresh
# zypper install -y cuda-libraries-13-0 cuda-demo-suite-12-9
```

Inside the container, run the deviceQuery CUDA example of the same version, which comprehensively validates GPU access via CUDA and from within the container itself.

# /usr/local/cuda-12.9/extras/demo_suite/deviceQuery
/usr/local/cuda-12.9/extras/demo_suite/deviceQuery Starting...

 CUDA Device Query (Runtime API)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla T4"
  CUDA Driver Version / Runtime Version          13.0 / 13.0
  CUDA Capability Major/Minor version number:    7.5
  Total amount of global memory:                 14914 MBytes (15638134784 bytes)
  (40) Multiprocessors, ( 64) CUDA Cores/MP:     2560 CUDA Cores
  GPU Max Clock rate:                            1590 MHz (1.59 GHz)
  Memory Clock rate:                             5001 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 4194304 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1024
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 30
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 13.0, CUDA Runtime Version = 13.0, NumDevs = 1, Device0 = Tesla T4
Result = PASS
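When the validation runs unattended, the final result line is the part worth checking. The following is a small sketch, assuming the output format shown above; the function name is illustrative.

```shell
# Succeeds when the deviceQuery output read from stdin contains a line
# that reads exactly "Result = PASS".
device_query_passed() {
  grep -qx 'Result = PASS'
}

# Usage inside the container:
#   /usr/local/cuda-12.9/extras/demo_suite/deviceQuery | device_query_passed \
#     && echo "GPU is usable from the container"
```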

From inside the container, you can continue to run any other CUDA workload—such as compilers—to run further tests. When finished, you can exit the container.

# exit

Important

Changes you have made in the container and packages you have installed inside will be lost and will not impact the underlying operating system.

3.3 Installing SUSE Rancher Prime: RKE2 in air-gapped environments #

This guide will help you quickly launch a cluster with default options.

SUSE Rancher Prime: RKE2 can be installed in an air-gapped environment with two different methods. You can either deploy via the rke2-airgap-images tarball artifact or by using a private registry.

Important

You can use any RKE2 Prime version listed on the Prime Artifacts URL for the assets mentioned in these steps. To learn more about the Prime Artifacts URL, see our Prime-only documentation. Authentication is required. Use your SUSE Customer Center (SCC) credentials to log in.

3.3.1 Prerequisites #

Verify that you meet the following prerequisites based on your environment before proceeding. All the steps listed on this page must be run as the root user or through sudo.

Important

If your node has NetworkManager installed and enabled, ensure you configure NetworkManager to ignore CNI-managed interfaces.

If running on an air-gapped node with SELinux enabled, you must manually install the necessary SELinux policy RPM before performing these steps. See our RPM Documentation to determine what you need.
If running on an air-gapped node with SELinux enabled, the following are required dependencies for SUSE Linux Server, CentOS or Red Hat Enterprise Linux 8 when doing an RPM install: container-selinux iptables libnetfilter_conntrack libnfnetlink libnftnl policycoreutils-python-utils rke2-common rke2-selinux.
If your nodes do not have an interface with a default route, a default route must be configured. Even a black-hole route via a dummy interface will suffice. SUSE Rancher Prime: RKE2 requires a default route to auto-detect the node's primary IP and for kube-proxy ClusterIP routing to function correctly. To add a dummy route, do the following:
```
> sudo ip link add dummy0 type dummy
> sudo ip link set dummy0 up
> sudo ip addr add 203.0.113.254/31 dev dummy0
> sudo ip route add default via 203.0.113.255 dev dummy0 metric 1000
```
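Before adding a dummy route, you may want to confirm that no default route exists. The sketch below inspects the output of `ip route show default`; the function name is hypothetical.

```shell
# Succeeds when the routing output read from stdin lists a default route.
has_default_route() {
  grep -q '^default '
}

# Usage:
#   ip route show default | has_default_route || echo "no default route configured"
```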

3.3.2 Tarball method #

Download the air-gap images tarballs from the Prime Artifacts URL for the SUSE Rancher Prime: RKE2 version, CNI and platform you are using.
- rke2-images.linux-ARCH.tar.zst, or rke2-images-core.linux-ARCH.tar.gz. This tarball contains the core container images required for RKE2 Prime to function. Zstandard offers better compression ratios and faster decompression speeds compared to gzip.
- rke2-images-CNI.linux-ARCH.tar.gz. This tarball specifically contains the container images for your Container Network Interface (CNI). The default CNI in SUSE Rancher Prime: RKE2 is Canal.
  - If you are using the default CNI, Canal (--cni=canal), you can use either the rke2-images legacy archive as described above or the rke2-images-core and rke2-images-canal archives.
  - If you are using the alternative CNI, Cilium (--cni=cilium), you must download the rke2-images-core and rke2-images-cilium archives.
  - If using your own CNI (--cni=none), download only the rke2-images-core archive.
- If enabling the vSphere CPI/CSI charts (--cloud-provider-name=rancher-vsphere), you must also download the rke2-images-vsphere archive.
Create a directory named /rke2-artifacts on the node and save the previously downloaded files there.
Continue with Section 3.3.4, “Install SUSE Rancher Prime: RKE2”.
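The archive selection rules above can be summarized in a small lookup. This is only a sketch of the list in this section: the function name is illustrative, it covers the three CNI choices named here, and it prints archive base names without the `.linux-ARCH.tar.*` suffix.

```shell
# Prints the image archive base names needed for a given CNI choice.
# Only the options described in this section are covered.
rke2_image_archives() {
  case "$1" in
    canal)  echo "rke2-images-core rke2-images-canal" ;;
    cilium) echo "rke2-images-core rke2-images-cilium" ;;
    none)   echo "rke2-images-core" ;;
    *)      echo "CNI not covered by this sketch: $1" >&2; return 1 ;;
  esac
}

# Example: list the archives for a Cilium-based cluster.
rke2_image_archives cilium
```

If you enable the vSphere CPI/CSI charts, remember to add the rke2-images-vsphere archive to the result.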

3.3.3 Private registry method #

Private registry support honors all settings from the containerd registry configuration, including endpoint override, transport protocol (HTTP/HTTPS), authentication, certificate verification and more.

Add all the required system images to your private registry. A list of images can be obtained from the .txt file corresponding to each tarball referenced above. Alternatively, you can docker load the air-gap image tarballs, then tag and push the loaded images.
Install SUSE Rancher Prime: RKE2 using the system-default-registry parameter, or use the containerd registry configuration to use your registry as a mirror for docker.io.
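Tagging images for a private registry mostly means rewriting the registry part of each image reference. The following sketch shows one way to do that in plain shell. The function name, the image names in the comments and the file name images.txt are hypothetical, and registry.example.com:5000 is only the placeholder registry used elsewhere in this guide.

```shell
# Rewrites an image reference so that it points to a private registry.
# A leading registry host (a first path component containing "." or ":",
# or "localhost") is replaced; references without one get the registry prefixed.
retarget_image() {
  registry=$1
  image=$2
  first=${image%%/*}
  case "$first" in
    *.*|*:*|localhost) image=${image#*/} ;;
  esac
  printf '%s/%s\n' "$registry" "$image"
}

# Usage with an image list (one reference per line), after "docker load":
#   while read -r img; do
#     target=$(retarget_image registry.example.com:5000 "$img")
#     docker tag "$img" "$target" && docker push "$target"
#   done < images.txt
```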

3.3.4 Install SUSE Rancher Prime: RKE2 #

The following options to install SUSE Rancher Prime: RKE2 should only be performed after completing either the Tarball method or Private registry method.

SUSE Rancher Prime: RKE2 can be installed either by running the binary directly or by using the install.sh script.

3.3.4.1 SUSE Rancher Prime: RKE2 binary install #

Obtain the SUSE Rancher Prime: RKE2 binary file rke2.linux-ARCH.tar.gz.
Ensure the binary is named rke2 and place it in /usr/local/bin. Ensure it is executable.
Run the binary with the desired parameters.
- If you are using the Rancher Prime registry, set the following values in config.yaml:
  - Set system-default-registry: registry.rancher.com.
  - If you are not using the default CNI, Canal, set cni: CNI.
    system-default-registry: registry.rancher.com
    cni: CNI
- If using the Private Registry Method, set the following values in config.yaml:
```
system-default-registry: "registry.example.com:5000"
```
  Note
  The system-default-registry parameter must specify only valid RFC 3986 URI authorities, i.e. a host and optional port.
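A quick sanity check can catch the most common mistakes in this value, such as a leading scheme or a trailing path. The sketch below is a rough approximation of the rule and not a full RFC 3986 parser; the function name is illustrative.

```shell
# Succeeds when the value looks like "host" or "host:port".
# Rejects empty values, values with a scheme ("https://..."),
# values with a path component, and values containing spaces.
is_registry_authority() {
  case "$1" in
    ''|*://*|*/*|*' '*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage:
#   is_registry_authority "registry.example.com:5000" && echo "looks valid"
```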

3.3.4.2 SUSE Rancher Prime: RKE2 install.sh script install #

install.sh may be used in an offline mode by setting the INSTALL_RKE2_ARTIFACT_PATH variable to a path containing pre-downloaded artifacts. This will run through a standard install, including creating systemd units.

Download the install script, SUSE Rancher Prime: RKE2 binaries, SUSE Rancher Prime: RKE2 images, and SHA256 checksum archives from the Prime Artifacts URL into the /rke2-artifacts directory, as in the example below:

mkdir /rke2-artifacts && cd /rke2-artifacts/
curl -OLs PRIME-ARTIFACTS-URL/rke2/VERSION/rke2-images.linux-ARCH.tar.zst
curl -OLs PRIME-ARTIFACTS-URL/rke2/VERSION/rke2.linux-ARCH.tar.gz
curl -OLs PRIME-ARTIFACTS-URL/rke2/VERSION/sha256sum-ARCH.txt
curl -sfL https://get.rke2.io --output install.sh
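Because the artifacts are carried across the air gap, it can be worth checking them against the downloaded checksum file before running the installer. The following is a minimal sketch, assuming GNU coreutils `sha256sum`; the function name is hypothetical.

```shell
# Verifies the files in a directory against a checksum list stored in
# that same directory. --ignore-missing skips list entries for files that
# were not downloaded; --quiet prints only failures.
verify_artifacts() {
  dir=$1
  sums=$2
  ( cd "$dir" && sha256sum --check --ignore-missing --quiet "$sums" )
}

# Usage:
#   verify_artifacts /rke2-artifacts sha256sum-ARCH.txt && echo "artifacts intact"
```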

Run install.sh using the directory, as in the example below:
```
INSTALL_RKE2_ARTIFACT_PATH=/rke2-artifacts sh install.sh
```
Enable and run the service as outlined in Section 3.4, “Installing SUSE Rancher Prime: RKE2”.

3.4 Installing SUSE Rancher Prime: RKE2 #

This guide will help you quickly launch a cluster with default options.

Tip

New to Kubernetes? The official Kubernetes docs already have great tutorials outlining the basics.

3.4.1 Prerequisites #

Make sure your environment fulfills the requirements. If NetworkManager is installed and enabled on your hosts, ensure that it is configured to ignore CNI-managed interfaces.
If the host kernel supports AppArmor, the AppArmor tools (usually available via the apparmor-parser package) must also be present before installing RKE2.
The RKE2 installation process must be run as the root user or through sudo.

3.4.2 Server node installation #

SUSE Rancher Prime: RKE2 provides an installation script that is a convenient way to install it as a service on systemd-based systems. This script is available at https://get.rke2.io. To install RKE2 using this method, do the following:

Run the installer, where INSTALL_RKE2_ARTIFACT_URL is the Prime Artifacts URL and INSTALL_RKE2_CHANNEL is a release channel you can subscribe to and defaults to stable. In this example, INSTALL_RKE2_CHANNEL="latest" gives you the latest version of RKE2.
```
> sudo curl -sfL https://get.rke2.io/ | \
  sudo INSTALL_RKE2_ARTIFACT_URL=PRIME-ARTIFACTS-URL/rke2 \
  INSTALL_RKE2_CHANNEL="latest" sh -
```
To specify a version, set the INSTALL_RKE2_VERSION environment variable.
```
> sudo curl -sfL https://get.rke2.io/ | \
  sudo INSTALL_RKE2_ARTIFACT_URL=PRIME-ARTIFACTS-URL/rke2 \
  INSTALL_RKE2_VERSION="VERSION" sh -
```
This will install the rke2-server service and the rke2 binary onto your machine. Due to its nature, it will fail unless it runs as the root user or through sudo.

Enable the rke2-server service.

> sudo systemctl enable rke2-server.service

To pull images from the Rancher Prime registry, set the following value in /etc/rancher/rke2/config.yaml:
```
system-default-registry: registry.rancher.com
```
This configuration tells RKE2 to use registry.rancher.com as the default location for all container images it needs to deploy within the cluster.

Start the service.

> sudo systemctl start rke2-server.service

Follow the logs with the following command:
```
> sudo journalctl -u rke2-server -f
```

After running this installation:

The rke2-server service will be installed. The rke2-server service will be configured to automatically restart after node reboots or if the process crashes or is killed.
Additional utilities will be installed at /var/lib/rancher/rke2/bin/. They include: kubectl, crictl, and ctr. Note that these are not on your path by default.
Two cleanup scripts, rke2-killall.sh and rke2-uninstall.sh, will be installed to the path at:
- /usr/local/bin for regular file systems
- /opt/rke2/bin for read-only and Btrfs file systems
- INSTALL_RKE2_TAR_PREFIX/bin if INSTALL_RKE2_TAR_PREFIX is set
A kubeconfig file will be written to /etc/rancher/rke2/rke2.yaml.
A token that can be used to register other server or agent nodes will be created at /var/lib/rancher/rke2/server/node-token.
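Because the bundled utilities are not on the path by default, a common follow-up is to extend PATH and point kubectl at the generated kubeconfig file. The sketch below uses the paths listed above; the helper function is hypothetical and only keeps the directory from being appended twice.

```shell
# Appends a directory to a PATH-style string unless it is already present.
append_path_once() {
  case ":$1:" in
    *":$2:"*) printf '%s\n' "$1" ;;
    *)        printf '%s\n' "$1:$2" ;;
  esac
}

# Usage on a server node:
#   PATH=$(append_path_once "$PATH" /var/lib/rancher/rke2/bin); export PATH
#   export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
#   kubectl get nodes
```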

Note

If you are adding additional server nodes, you must have an odd number in total. An odd number is needed to maintain a quorum. See the High Availability documentation for more details.
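The reason for the odd number becomes clear from the quorum arithmetic: a majority of server nodes must stay available, so an even node count adds no extra failure tolerance. The following sketch computes both values; the function names are illustrative.

```shell
# Majority needed among N server nodes: floor(N / 2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of server nodes that can fail while a majority remains: floor((N - 1) / 2).
fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

# Example: compare three and four server nodes.
for n in 3 4; do
  echo "$n servers: quorum $(quorum_size "$n"), tolerates $(fault_tolerance "$n") failure(s)"
done
```

Three and four server nodes both tolerate a single failure, which is why the fourth node does not improve availability.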

3.4.3 Linux agent (worker) node installation #

The steps in this section require root-level access or sudo.

Run the installer.
```
> sudo curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_TYPE="agent" sh -
```
This will install the rke2-agent service and the rke2 binary onto your machine. Due to its nature, it will fail unless it runs as the root user or through sudo.

Enable the rke2-agent service.

> sudo systemctl enable rke2-agent.service

Configure the rke2-agent service.
```
> sudo mkdir -p /etc/rancher/rke2/
> sudo vim /etc/rancher/rke2/config.yaml
```
Content for config.yaml:
```
server: https://SERVER_IP_OR_DNS:9345
token: TOKEN_FROM_SERVER_NODE
```
Note
The rke2 server process listens on port 9345 for new nodes to register. The Kubernetes API is still served on port 6443, as normal.
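When many agent nodes are prepared, the configuration file can be generated instead of edited by hand. The following is a minimal sketch: the function name is hypothetical, and it writes only the two keys shown above, using the registration port 9345.

```shell
# Writes a minimal agent configuration file.
# Arguments: target file, server IP or DNS name, registration token.
# Runs in a subshell so that the restrictive umask (the file holds a
# secret token) does not leak into the calling shell.
write_agent_config() (
  umask 077
  mkdir -p "$(dirname "$1")"
  printf 'server: https://%s:9345\ntoken: %s\n' "$2" "$3" > "$1"
)

# Usage, as root:
#   write_agent_config /etc/rancher/rke2/config.yaml SERVER_IP_OR_DNS TOKEN_FROM_SERVER_NODE
```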

Start the service.

> sudo systemctl start rke2-agent.service

Follow the logs with the following command:
```
> sudo journalctl -u rke2-agent -f
```

Note

Each machine must have a unique host name. If your machines do not have unique host names, set the node-name parameter in the config.yaml file and provide a value with a valid and unique host name for each node. To learn more about the config.yaml file, refer to the Configuration options documentation.

3.4.4 Microsoft Windows agent (worker) node installation #

Windows Support works with Calico or Flannel as the CNI for the RKE2 cluster.

Procedure 1: Prepare the Microsoft Windows agent node #

Note

The Windows Server Containers feature needs to be enabled for the RKE2 agent to work.

Open a new PowerShell window with administrator privileges.
```
powershell -Command "Start-Process PowerShell -Verb RunAs"
```
In the new PowerShell window, run the following command to install the containers feature.
```
Enable-WindowsOptionalFeature -Online -FeatureName containers -All
```
This will require a reboot for the Containers feature to function properly.

Download the install script.

Invoke-WebRequest -Uri https://raw.githubusercontent.com/rancher/rke2/master/install.ps1 -Outfile install.ps1

This script will download the rke2.exe Windows binary onto your machine.

Configure the rke2-agent for Windows.

New-Item -Type Directory c:/etc/rancher/rke2 -Force
Set-Content -Path c:/etc/rancher/rke2/config.yaml -Value @"
server: https://SERVER_IP_OR_DNS:9345
token: TOKEN_FROM_SERVER_NODE
"@

To learn more about the config.yaml file, refer to the Configuration options documentation.

Configure the PATH.

$env:PATH+=";c:\var\lib\rancher\rke2\bin;c:\usr\local\bin"

[Environment]::SetEnvironmentVariable(
    "Path",
    [Environment]::GetEnvironmentVariable("Path", [EnvironmentVariableTarget]::Machine) + ";c:\var\lib\rancher\rke2\bin;c:\usr\local\bin",
    [EnvironmentVariableTarget]::Machine)

Run the installer.
```
./install.ps1
```
Add the Windows RKE2 service.
```
rke2.exe agent service --add
```

Note

Each machine must have a unique host name.

Do not forget to start the RKE2 service with:

Start-Service rke2

If you would prefer to use CLI parameters only instead, run the binary with the desired parameters.

rke2.exe agent --token TOKEN --server SERVER_URL

4 Preparing the cluster for AI Library #

This procedure assumes that you already have the base operating system installed on cluster nodes as well as the SUSE Rancher Prime: RKE2 Kubernetes distribution installed and operational. If you are installing from scratch, refer to Section 3, “Installing the Linux and Kubernetes distribution” first.

Install SUSE Rancher Prime on the cluster.
Install the NVIDIA GPU Operator on the cluster as described in Section 4.2, “Installing the NVIDIA GPU Operator on the SUSE Rancher Prime: RKE2 cluster”.
Connect the Kubernetes cluster to SUSE Rancher Prime as described in Section 4.3, “Registering existing clusters”.
Configure the GPU-enabled nodes so that the SUSE AI containers are assigned to Pods that run on nodes equipped with NVIDIA GPU hardware. Find more details about assigning Pods to nodes in Section 4.4, “Assigning GPU nodes to applications”.
(Optional) Install SUSE Security as described in Section 4.5, “Installing SUSE Security”. Although this step is not required, we strongly encourage it to ensure data security in the production environment.
Install and configure SUSE Observability to observe the nodes used for the SUSE AI application. Refer to Section 4.6, “Setting up SUSE Observability for SUSE AI” for more details.

4.1 Installing SUSE Rancher Prime on a Kubernetes cluster in air-gapped environments #

This section is about using the Helm CLI to install the Rancher server in an air-gapped environment.

4.1.1 Installation outline #

4.1.2 Set up the infrastructure and a private registry #

In this section, you will provision the underlying infrastructure for your Rancher management server in an air-gapped environment. You will also set up the private container image registry that must be available to your Rancher node(s). The procedures below focus on installing Rancher in the RKE2 cluster.

To install the Rancher management server on a high-availability SUSE Rancher Prime: RKE2 cluster, we recommend setting up the following infrastructure:

Three Linux nodes, typically virtual machines, in an infrastructure provider such as Amazon's EC2, Google Compute Engine or vSphere.
A load balancer to direct front-end traffic to the three nodes.
A DNS record to map a URL to the load balancer. This will become the Rancher server URL, and downstream Kubernetes clusters will need to reach it.
A private image registry to distribute container images to your machines.

These nodes must be in the same region or data center. You may place these servers in separate availability zones.

4.1.2.1 Why three nodes? #

In an RKE2 cluster, the Rancher server data is stored on etcd. This etcd database runs on all three nodes.

The etcd database requires an odd number of nodes so that it can always elect a leader with a majority of the etcd cluster. If the etcd database cannot elect a leader, etcd can suffer from split brain, requiring the cluster to be restored from backup. If one of the three etcd nodes fails, the two remaining nodes can elect a leader because they have the majority of the total number of etcd nodes.
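
The majority rule can be made concrete with a little arithmetic. The following sketch is illustrative only; the function names quorum and tolerated_failures are made up for this example and are not part of etcd or RKE2. For cluster sizes 1 to 5, it prints how many members form a majority and how many member failures the cluster can survive:

```shell
#!/bin/sh
# Quorum arithmetic for an etcd cluster with N members.
# A majority is floor(N/2) + 1 members.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# The cluster keeps a majority as long as at most N - quorum members fail.
tolerated_failures() {
  echo $(( $1 - $(quorum "$1") ))
}

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

A three-node cluster tolerates one failure, and a fourth node does not raise that number, which is why an odd node count is preferred.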

4.1.2.2 Set up Linux nodes #

These hosts will be disconnected from the Internet, but they must be able to connect to your private registry.

Make sure that your nodes fulfill the general installation requirements for OS, container runtime, hardware and networking.

For an example of one way to set up Linux nodes, refer to this tutorial for setting up nodes as instances in Amazon EC2.

4.1.2.3 Set up the load balancer #

You will also need to set up a load balancer to direct traffic to the Rancher replicas on all three nodes. That will prevent the outage of any single node from taking down communications to the Rancher management server.

When Kubernetes gets set up in a later step, the RKE2 tool will deploy an NGINX Ingress controller. This controller will listen on ports 80 and 443 of the worker nodes, answering traffic destined for specific hostnames.

When Rancher is installed (also in a later step), the Rancher system creates an Ingress resource. That Ingress tells the NGINX Ingress controller to listen for traffic destined for the Rancher host name. The NGINX Ingress controller, when receiving traffic destined for the Rancher host name, will forward that traffic to the running Rancher pods in the cluster.

For your implementation, consider if you want or need to use a Layer-4 or Layer-7 load balancer:

A layer-4 load balancer is the simpler of the two choices, in which you are forwarding TCP traffic to your nodes. We recommend configuring your load balancer as a Layer 4 balancer, forwarding traffic on ports TCP/80 and TCP/443 to the Rancher management cluster nodes. The Ingress controller on the cluster will redirect HTTP traffic to HTTPS and terminate SSL/TLS on port TCP/443. The Ingress controller will forward traffic on port TCP/80 to the Ingress pod in the Rancher deployment.
A layer-7 load balancer is a bit more complicated but can offer features that you may want. For instance, a layer-7 load balancer is capable of handling TLS termination at the load balancer, as opposed to Rancher doing TLS termination itself. This can be beneficial to centralize your TLS termination in your infrastructure. Layer-7 load balancing also allows your load balancer to make decisions based on HTTP attributes such as cookies—capabilities that a layer-4 load balancer cannot handle. If you decide to terminate the SSL/TLS traffic on a layer-7 load balancer, you will need to use the --set tls=external option when installing Rancher in a later step. For more information, refer to the Rancher Helm chart options.

For an example showing how to set up an NGINX load balancer, refer to this page.

For a how-to guide for setting up an Amazon ELB Network Load Balancer, refer to this page.
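
As an illustration of the layer-4 option, the following sketch writes an NGINX configuration that passes TCP ports 80 and 443 through to three nodes. The node addresses 10.0.0.11 to 10.0.0.13, the output path and the helper function upstream_servers are assumptions for this example. The stream block requires an NGINX build that includes the stream module.

```shell
#!/bin/sh
# Hypothetical node addresses; replace them with your three Rancher nodes.
NODES="10.0.0.11 10.0.0.12 10.0.0.13"
# Hypothetical output location for the generated configuration.
OUT="${OUT:-${TMPDIR:-/tmp}/rancher-lb-nginx.conf}"

# Emit one "server" line per node for the given port.
upstream_servers() {
  for ip in $NODES; do
    echo "        server ${ip}:$1 max_fails=3 fail_timeout=5s;"
  done
}

cat > "$OUT" <<EOF
worker_processes 4;
worker_rlimit_nofile 40000;

events {
    worker_connections 8192;
}

# Layer-4 (TCP) pass-through: TLS is terminated by the Ingress controller.
stream {
    upstream rancher_servers_http {
        least_conn;
$(upstream_servers 80)
    }
    server {
        listen 80;
        proxy_pass rancher_servers_http;
    }

    upstream rancher_servers_https {
        least_conn;
$(upstream_servers 443)
    }
    server {
        listen 443;
        proxy_pass rancher_servers_https;
    }
}
EOF
echo "Wrote $OUT"
```

Because the balancer only forwards TCP, the certificate presented to clients is the one configured in the cluster, not one installed on the load balancer.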

Important

Do not use this load balancer (that is, the local cluster Ingress) to load balance applications other than Rancher following installation. Sharing this Ingress with other applications may result in WebSocket errors to Rancher following Ingress configuration reloads for other apps. We recommend dedicating the local cluster to Rancher and no other applications.

4.1.2.4 Set up the DNS record #

Once you have set up your load balancer, you will need to create a DNS record to send traffic to this load balancer.

Depending on your environment, this may be an A record pointing to the load balancer IP address, or it may be a CNAME pointing to the load balancer host name. In either case, make sure this record matches the host name you want Rancher to respond to.

You will need to specify this host name in a later step when you install Rancher, and it is not possible to change it later. Make sure that your decision is final.

For a how-to guide for setting up a DNS record to route domain traffic to an Amazon ELB load balancer, refer to the official AWS documentation.

4.1.2.5 Set up a private image registry #

Rancher supports air-gapped installations using a secure private registry. You must have your own private registry or other means of distributing container images to your machines.

In a later step, when you set up your RKE2 Kubernetes cluster, you will create a private registries configuration file with details from this registry.

If you need to create a private registry, refer to the documentation pages for your respective runtime:

4.1.3 Collect and publish images to your private registry #

This section describes how to set up your private registry so that when you install Rancher, it will pull all the required images from this registry.

By default, all images used to provision Kubernetes clusters or launch any tools in Rancher, e.g., monitoring, pipelines or alerts, are pulled from Docker Hub. In an air-gapped installation of Rancher, you will need a private registry that is located somewhere accessible by your Rancher server. You will then load every image into the registry.

Populating the private registry with images is the same process for installing Rancher with Docker and for installing Rancher on a Kubernetes cluster.

Prerequisites #

You must have a private registry available to use.
If the registry has certs, follow this K3s documentation about adding a private registry. The certs and registry configuration files need to be mounted into the Rancher container.

The following steps populate your private registry.

Find the required assets for your Rancher version
Collect the cert-manager image (unless you are bringing your own certificates or terminating TLS on a load balancer)
Save the images to your workstation
Populate the private registry

Prerequisites #

These steps expect you to use a Linux workstation that has Internet access, access to your private registry, and at least 20 GB of disk space.
If you use ARM64 hosts, the registry must support manifests. As of April 2020, Amazon Elastic Container Registry does not support manifests.

4.1.3.1 Find the required assets for your Rancher version #

Go to our releases page, find the Rancher v2.x.x release that you want to install, and click Assets. Note: Do not use releases marked rc or Pre-release, as they are not stable for production environments.

From the release's Assets section, download the following files, which are required to install Rancher in an air-gapped environment:

Table 1: Required assets #

Release File	Description
`rancher-images.txt`	This file contains a list of images needed to install Rancher, provision clusters and use Rancher tools.
`rancher-save-images.sh`	This script pulls all the images in the `rancher-images.txt` from Docker Hub and saves all the images as `rancher-images.tar.gz`.
`rancher-load-images.sh`	This script loads images from the `rancher-images.tar.gz` file and pushes them to your private registry.

4.1.3.2 Collect the cert-manager image #

Note

Skip this step if you are using your own certificates, or if you are terminating TLS on an external load balancer.

In a Kubernetes Install, if you elect to use the Rancher default self-signed TLS certificates, you must add the `cert-manager` image to rancher-images.txt as well.

Fetch the latest cert-manager Helm chart and parse the template for image details:
Note
Recent changes to cert-manager require an upgrade. If you are upgrading Rancher and using a version of cert-manager older than v0.12.0, please see our upgrade documentation.
```
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm fetch jetstack/cert-manager
helm template ./cert-manager-version.tgz | \
  awk '$1 ~ /image:/ {print $2}' | sed s/\"//g >> ./rancher-images.txt
```
Sort the image list and deduplicate it to remove any overlap between the sources:
```
> sort -u rancher-images.txt -o rancher-images.txt
```
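
To see what the awk and sed filter from the previous step extracts, you can run it on a small sample. The manifest below is a hand-written stand-in for helm template output, and the function names sample_manifest and extract_images are hypothetical; only the filter itself is taken from the command above.

```shell
#!/bin/sh
# Stand-in for rendered chart output; the content is illustrative only.
sample_manifest() {
  cat <<'EOF'
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: controller
          image: "quay.io/jetstack/cert-manager-controller:v1.11.0"
        - name: webhook
          image: "quay.io/jetstack/cert-manager-webhook:v1.11.0"
EOF
}

# Keep the second field of every line whose first field contains "image:",
# then strip the double quotes around the image reference.
extract_images() {
  awk '$1 ~ /image:/ {print $2}' | sed 's/"//g'
}

sample_manifest | extract_images
```

The filter only matches lines whose first whitespace-separated field is the image key, so a list item written as "- image: ..." would not be picked up. Review the resulting list before saving the images.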

4.1.3.3 Save the images to your workstation #

Make rancher-save-images.sh executable:
```
> chmod +x rancher-save-images.sh
```
Run rancher-save-images.sh with the rancher-images.txt image list to create a tarball of all the required images:
```
> ./rancher-save-images.sh --image-list ./rancher-images.txt
```
Result: Docker begins pulling the images used for an air-gapped install. Be patient. This process takes a few minutes. When the process completes, your current directory contains a tarball named rancher-images.tar.gz. Check that the file is present in the directory.

4.1.3.4 Populate the private registry #

Next, move the images in rancher-images.tar.gz to your private registry using the load script.

The rancher-images.txt file is expected to be on the workstation in the same directory from which you run the rancher-load-images.sh script. The rancher-images.tar.gz file should also be in the same directory.

Log in to your private registry if required:

> docker login REGISTRY.YOURDOMAIN.COM:PORT

Make rancher-load-images.sh executable:
```
> chmod +x rancher-load-images.sh
```
Use rancher-load-images.sh to extract the images from rancher-images.tar.gz, tag them and push the images listed in rancher-images.txt to your private registry:
```
> ./rancher-load-images.sh --image-list ./rancher-images.txt \
   --registry REGISTRY.YOURDOMAIN.COM:PORT
```

4.1.4 Install Kubernetes #

This section describes how to install a Kubernetes cluster according to our best practices for the Rancher server environment. This cluster should be dedicated to running only the Rancher server.

Note

Skip this section if you are installing Rancher on a single node with Docker.

Rancher can be installed on any Kubernetes cluster, including hosted Kubernetes providers. The following sections outline how to set up an air-gapped RKE2 cluster.

In this guide, we are assuming you have created your nodes in your air-gapped environment and have a secure Docker private registry on your bastion server.

4.1.4.1 Create RKE2 configuration #

Create the config.yaml file at /etc/rancher/rke2/config.yaml. This will contain all the configuration options necessary to create a highly available RKE2 cluster.

On the first server, the minimum configuration is:

token: my-shared-secret
tls-san:
  - loadbalancer-dns-domain.com

On each other server, the configuration file should contain the same token and tell RKE2 to connect to the existing first server:

server: https://ip-of-first-server:9345
token: my-shared-secret
tls-san:
  - loadbalancer-dns-domain.com

For more information, refer to the RKE2 documentation.
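
The two variants differ only in the server line, so they can be generated from the same inputs. The following is a minimal sketch under that assumption; the function name write_rke2_config and its argument order are hypothetical, and the values are the placeholders used above.

```shell
#!/bin/sh
# write_rke2_config OUT TOKEN LB_DNS [FIRST_SERVER]
# Without FIRST_SERVER, the file is meant for the first server.
# With FIRST_SERVER, the file tells an additional server to join the
# existing first server on port 9345.
write_rke2_config() {
  out=$1 token=$2 lb_dns=$3 first_server=${4:-}
  {
    if [ -n "$first_server" ]; then
      echo "server: https://${first_server}:9345"
    fi
    echo "token: ${token}"
    echo "tls-san:"
    echo "  - ${lb_dns}"
  } > "$out"
}

# Example usage (run as root on the respective node):
#   write_rke2_config /etc/rancher/rke2/config.yaml my-shared-secret loadbalancer-dns-domain.com
#   write_rke2_config /etc/rancher/rke2/config.yaml my-shared-secret loadbalancer-dns-domain.com ip-of-first-server
```

Keeping the token and tls-san entries identical on all servers avoids join failures caused by a mismatch between nodes.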

Note

RKE2 additionally provides a resolv-conf option for kubelets, which may help with configuring DNS in air-gap networks.

4.1.4.2 Create registry YAML file #

Create the registries.yaml file at /etc/rancher/rke2/registries.yaml. This will tell RKE2 the necessary details to connect to your private registry.

The registries.yaml file should look like this before plugging in the necessary information:

---
mirrors:
  customreg:
    endpoint:
      - "https://ip-to-server:5000"
configs:
  customreg:
    auth:
      username: xxxxxx # this is the registry username
      password: xxxxxx # this is the registry password
    tls:
      cert_file: path to the cert file used in the registry
      key_file:  path to the key file used in the registry
      ca_file: path to the ca file used in the registry

For more information on the private registry configuration file for RKE2, refer to the RKE2 documentation.

4.1.4.3 Install RKE2 #

Rancher needs to be installed on a supported Kubernetes version. To find out which versions of Kubernetes are supported for your Rancher version, refer to support maintenance terms.

Download the install script, rke2, rke2-images and sha256sum archives from the release and upload them into a directory on each server:

> mkdir /tmp/rke2-artifacts && cd /tmp/rke2-artifacts/
> wget https://github.com/rancher/rke2/releases/download/v1.21.5%2Brke2r2/rke2-images.linux-amd64.tar.zst
> wget https://github.com/rancher/rke2/releases/download/v1.21.5%2Brke2r2/rke2.linux-amd64.tar.gz
> wget https://github.com/rancher/rke2/releases/download/v1.21.5%2Brke2r2/sha256sum-amd64.txt
> curl -sfL https://get.rke2.io --output install.sh
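
Before copying the archives to the air-gapped servers, you may want to confirm that the downloads are intact. The sketch below assumes GNU coreutils sha256sum; the function name verify_artifacts is hypothetical. Because the checksum file can list more files than you downloaded, the --ignore-missing option skips entries whose files are absent.

```shell
#!/bin/sh
# verify_artifacts DIR CHECKSUM_FILE
# Checks every file in DIR that is named in CHECKSUM_FILE (path relative to DIR).
# Returns non-zero if a checksum does not match or if no file could be verified.
verify_artifacts() {
  dir=$1
  sums=$2
  ( cd "$dir" && sha256sum --check --ignore-missing "$sums" )
}

# Example usage with the file names from the download step:
#   verify_artifacts /tmp/rke2-artifacts sha256sum-amd64.txt
```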

Next, run install.sh using the directory on each server, as in the example below:

> INSTALL_RKE2_ARTIFACT_PATH=/tmp/rke2-artifacts sh install.sh

Then enable and start the service on all servers:

> sudo systemctl enable rke2-server.service
> sudo systemctl start rke2-server.service

For more information, refer to Section 3.3, “Installing SUSE Rancher Prime: RKE2 in air-gapped environments”.

4.1.4.4 Save and start using the kubeconfig file #

When you installed RKE2 on each Rancher server node, a kubeconfig file was created on the node at /etc/rancher/rke2/rke2.yaml. This file contains credentials for full access to the cluster, and you should save this file in a secure location.

To use this kubeconfig file:

Install kubectl, the Kubernetes command-line tool.
Copy the file at /etc/rancher/rke2/rke2.yaml and save it to the directory ~/.kube/config on your local machine.

In the kubeconfig file, the server directive is defined as localhost. Configure the server as the DNS for your load balancer, referring to port 6443. (The Kubernetes API server will be reached at port 6443, while the Rancher server will be reached at ports 80 and 443.) Here is an example rke2.yaml:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: [CERTIFICATE-DATA]
    server: [LOAD-BALANCER-DNS]:6443 # Edit this line
  name: default
contexts:
- context:
    cluster: default
    user: default
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: default
  user:
    password: [PASSWORD]
    username: admin
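
Editing the server line can also be scripted. The following sketch assumes a sed that supports the -E and -i options (GNU and BSD sed both do); the function name set_kubeconfig_server and the host name in the usage comment are hypothetical.

```shell
#!/bin/sh
# set_kubeconfig_server FILE LB_DNS
# Rewrites the value of the "server:" key to point at the load balancer on
# port 6443, keeping the original indentation. A backup is kept as FILE.bak.
set_kubeconfig_server() {
  file=$1
  lb_dns=$2
  sed -E -i.bak "s#^([[:space:]]*server:).*#\1 https://${lb_dns}:6443#" "$file"
}

# Example usage on the copied file:
#   set_kubeconfig_server ~/.kube/config rancher-lb.example.com
```

The function replaces every server line in the file, so apply it only to a kubeconfig that contains this single cluster.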

Result: You can now use kubectl to manage your RKE2 cluster. If you have more than one kubeconfig file, you can specify which one you want to use by passing in the path to the file when using kubectl:

> kubectl --kubeconfig ~/.kube/config/rke2.yaml get pods \
  --all-namespaces

For more information about the kubeconfig file, refer to the RKE2 documentation or the official Kubernetes documentation about organizing cluster access using kubeconfig files.

4.1.5 Install SUSE Rancher Prime #

This section describes how to deploy Rancher in your air-gapped environment on a high-availability Kubernetes cluster.

4.1.5.1 Privileged access for Rancher #

When the Rancher server is deployed in the Docker container, a local Kubernetes cluster is installed within the container for Rancher to use. Because many features of Rancher run as deployments, and privileged mode is required to run containers within containers, you will need to install Rancher with the --privileged option.

4.1.5.2 Kubernetes instructions #

We recommend installing Rancher on a Kubernetes cluster. A highly available Kubernetes installation consists of three nodes running the Rancher server components. The persistence layer (etcd) is also replicated on these three nodes, providing redundancy and data duplication in case one of the nodes fails.

4.1.5.2.1 Add the Helm chart repository #

From a system that has access to the Internet, fetch the latest Helm chart and copy the resulting manifests to a system that has access to the Rancher server cluster.

If you have not already, install helm locally on a workstation that has Internet access. Refer to the Helm version requirements to choose a version of Helm to install Rancher.
Use the helm repo add command to add the Helm chart repository that contains charts to install Rancher Prime.
```
> helm repo add rancher-prime helm-chart-repo-url
```
Fetch the Rancher Prime chart. This will pull down the chart and save it in the current directory as a .tgz file.
- To fetch the latest version:
```
> helm fetch rancher-prime/rancher
```
- To fetch a specific version:
  Check which versions of Rancher Prime are available.
```
> helm search repo --versions rancher-prime
```
  Fetch a specific version by specifying the --version parameter:
```
> helm fetch rancher-prime/rancher --version=version
```

4.1.5.2.2 Choose your SSL configuration #

Rancher server is designed to be secure by default and requires SSL/TLS configuration.

When Rancher is installed on an air-gapped Kubernetes cluster, there are two recommended options for the source of the certificate.

Note

To terminate SSL/TLS externally, see TLS termination on an External Load Balancer.

Table 2: SSL configuration options #

Configuration	Chart option	Description	Requires cert-manager
Rancher Generated Self-Signed Certificates	`ingress.tls.source=rancher`	Use certificates issued by Rancher's generated CA (self signed). This is the default and does not need to be added when rendering the Helm template.	yes
Certificates from Files	`ingress.tls.source=secret`	Use your own certificate files by creating Kubernetes secret(s). This option must be passed when rendering the Rancher Helm template.	no

4.1.5.2.3 Helm chart options for air-gapped installations #

When setting up the Rancher Helm template, there are several options in the Helm chart that are designed specifically for air-gapped installations.

Table 3: Air-gapped Helm chart options #

Chart option	Chart value	Description
`certmanager.version`	version	Configure the proper Rancher TLS issuer depending on the running cert-manager version.
`systemDefaultRegistry`	REGISTRY.YOURDOMAIN.COM:PORT	Configure Rancher server to always pull from your private registry when provisioning clusters.
`useBundledSystemChart`	`true`	Configure Rancher server to use the packaged copy of Helm system charts. The system charts repository contains all the catalog items required for features such as monitoring, logging, alerting and global DNS. These Helm charts are located in GitHub, but since you are in an air-gapped environment, using the charts that are bundled within Rancher is much easier than setting up a Git mirror.

4.1.5.2.4 Fetch the cert-manager chart #

Based on the choice you made in Section 4.1.5.2.2, “Choose your SSL configuration”, complete one of the procedures below.

4.1.5.2.4.1 Option A: default self-signed certificate #

By default, Rancher generates a CA and uses cert-manager to issue the certificate for access to the Rancher server interface.

Note

Recent changes to cert-manager require an upgrade. If you are upgrading Rancher and using a version of cert-manager older than v0.11.0, please see our upgrade cert-manager documentation.

From a system connected to the Internet, add the cert-manager repo to Helm:
```
> helm repo add jetstack https://charts.jetstack.io
> helm repo update
```
Fetch the cert-manager chart from the Helm chart repository. This example uses version v1.11.0.
```
> helm fetch jetstack/cert-manager --version v1.11.0
```

Download the required CRD file for cert-manager:

> curl -L -o cert-manager-crd.yaml https://github.com/cert-manager/cert-manager/releases/download/v1.11.0/cert-manager.crds.yaml

4.1.5.2.5 Install Rancher #

Copy the fetched charts to a system that has access to the Rancher server cluster to complete installation.

4.1.5.2.5.1 Install cert-manager #

Install cert-manager with the same options you would use to install the chart. Remember to set the image.repository option to pull the image from your private registry.

Note

To see options on how to customize the cert-manager install (including for cases where your cluster uses PodSecurityPolicies), see the cert-manager docs.

If you are using self-signed certificates, install cert-manager:

Create the namespace for cert-manager.
```
> kubectl create namespace cert-manager
```
Create the cert-manager CustomResourceDefinitions (CRDs).
```
> kubectl apply -f cert-manager-crd.yaml
```

Install cert-manager.

> helm install cert-manager ./cert-manager-v1.11.0.tgz \
  --namespace cert-manager \
  --set image.repository=REGISTRY.YOURDOMAIN.COM:PORT/quay.io/jetstack/cert-manager-controller \
  --set webhook.image.repository=REGISTRY.YOURDOMAIN.COM:PORT/quay.io/jetstack/cert-manager-webhook \
  --set cainjector.image.repository=REGISTRY.YOURDOMAIN.COM:PORT/quay.io/jetstack/cert-manager-cainjector \
  --set startupapicheck.image.repository=REGISTRY.YOURDOMAIN.COM:PORT/quay.io/jetstack/cert-manager-ctl

4.1.5.2.5.2 Install SUSE Rancher Prime #

Refer to Adding TLS Secrets to publish the certificate files so Rancher and the Ingress controller can use them.

Create the namespace for Rancher using kubectl:

> kubectl create namespace cattle-system

Install Rancher, declaring your chosen options. Use the reference table below to replace each placeholder. Rancher needs to be configured to use the private registry to provision any Rancher launched Kubernetes clusters or Rancher tools.

Table 4: Placeholder descriptions #

Placeholder	Description
VERSION	The version number of the output tarball.
RANCHER.YOURDOMAIN.COM	The DNS name you pointed at your load balancer.
REGISTRY.YOURDOMAIN.COM:PORT	The DNS name for your private registry.
CERTMANAGER_VERSION	cert-manager version running on k8s cluster.

> helm install rancher ./rancher-VERSION.tgz \
  --namespace cattle-system \
  --set hostname=RANCHER.YOURDOMAIN.COM \
  --set certmanager.version=CERTMANAGER_VERSION \
  --set rancherImage=REGISTRY.YOURDOMAIN.COM:PORT/rancher/rancher \
  --set systemDefaultRegistry=REGISTRY.YOURDOMAIN.COM:PORT \
  --set useBundledSystemChart=true

The systemDefaultRegistry option sets the default private registry to be used in Rancher, and useBundledSystemChart=true makes Rancher use its packaged system charts. See Table 3, “Air-gapped Helm chart options”.

Optional: to install a specific Rancher version, set the rancherImageTag value, for example: --set rancherImageTag=v2.5.8

4.1.5.2.5.3 Option B: certificates from files using Kubernetes secrets #

Create Kubernetes secrets from your own certificates for Rancher to use. The common name for the cert will need to match the hostname option in the command below, or the Ingress controller will fail to provision the site for Rancher.

Install SUSE Rancher Prime, declaring your chosen options. Use the reference table below to replace each placeholder. Rancher needs to be configured to use the private registry to provision any Rancher launched Kubernetes clusters or Rancher tools.

Table 5: Placeholder descriptions #

Placeholder	Description
VERSION	The version number of the output tarball.
RANCHER.YOURDOMAIN.COM	The DNS name you pointed at your load balancer.
REGISTRY.YOURDOMAIN.COM:PORT	The DNS name for your private registry.

> helm install rancher ./rancher-VERSION.tgz \
  --namespace cattle-system \
  --set hostname=RANCHER.YOURDOMAIN.COM \
  --set rancherImage=REGISTRY.YOURDOMAIN.COM:PORT/rancher/rancher \
  --set ingress.tls.source=secret \
  --set systemDefaultRegistry=REGISTRY.YOURDOMAIN.COM:PORT \
  --set useBundledSystemChart=true

If you are using a Private CA signed cert, add --set privateCA=true following --set ingress.tls.source=secret:

> helm install rancher ./rancher-VERSION.tgz \
  --namespace cattle-system \
  --set hostname=RANCHER.YOURDOMAIN.COM \
  --set rancherImage=REGISTRY.YOURDOMAIN.COM:PORT/rancher/rancher \
  --set ingress.tls.source=secret \
  --set privateCA=true \
  --set systemDefaultRegistry=REGISTRY.YOURDOMAIN.COM:PORT \
  --set useBundledSystemChart=true

The installation is complete.

4.1.5.3 Additional resources #

These resources could be helpful when installing Rancher:

4.2 Installing the NVIDIA GPU Operator on the SUSE Rancher Prime: RKE2 cluster #

The NVIDIA operator allows administrators of Kubernetes clusters to manage GPUs just like CPUs. It includes everything needed for pods to be able to operate GPUs.

4.2.1 Host OS requirements #

To expose the GPU to the pod correctly, the NVIDIA kernel drivers and the libnvidia-ml library must be correctly installed in the host OS. The NVIDIA Operator can automatically install drivers and libraries on specific operating systems. Check the NVIDIA documentation for information on supported operating system releases. Installation of the NVIDIA components on your host OS is out of the scope of this document. Refer to the NVIDIA documentation for instructions.

The following three commands should return a correct output if the kernel driver is correctly installed.

lsmod | grep nvidia returns a list of NVIDIA kernel modules. For example:

nvidia_uvm           2129920  0
nvidia_drm            131072  0
nvidia_modeset       1572864  1 nvidia_drm
video                  77824  1 nvidia_modeset
nvidia               9965568  2 nvidia_uvm,nvidia_modeset
ecc                    45056  1 nvidia

cat /proc/driver/nvidia/version returns the NVRM and GCC versions of the driver. For example:

NVRM version: NVIDIA UNIX Open Kernel Module for x86_64  555.42.06
  Release Build  (abuild@host)  Thu Jul 11 12:00:00 UTC 2024
  GCC version:  gcc version 7.5.0 (SUSE Linux)

find /usr/ -iname libnvidia-ml.so returns a path to the libnvidia-ml.so library. For example:
```
/usr/lib64/libnvidia-ml.so
```
This library is used by Kubernetes components to interact with the kernel driver.

4.2.2 Operator installation #

Once the OS is ready and RKE2 is running, adjust the RKE2 nodes:

On the agent nodes of RKE2, run the following command:

# echo PATH=$PATH:/usr/local/nvidia/toolkit >> /etc/default/rke2-agent

On the server nodes of RKE2, run the following command:

# echo PATH=$PATH:/usr/local/nvidia/toolkit >> /etc/default/rke2-server

Then, install the NVIDIA GPU Operator using the following YAML manifest.

apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: gpu-operator
  namespace: kube-system
spec:
  repo: https://helm.ngc.nvidia.com/nvidia
  chart: gpu-operator
  targetNamespace: gpu-operator
  createNamespace: true
  valuesContent: |-
    toolkit:
      env:
      - name: CONTAINERD_SOCKET
        value: /run/k3s/containerd/containerd.sock

Warning

The NVIDIA operator restarts containerd with a hangup call, which restarts RKE2.

After approximately one minute, you can make the following checks to verify that everything works as expected.

Assuming the drivers and libnvidia-ml.so library are installed, check if the operator detects them correctly.
```
> kubectl get node NODENAME \
  -o jsonpath='{.metadata.labels}' | grep "nvidia.com/gpu.deploy.driver"
```
You should see the value pre-installed. If you see true, the drivers are not installed correctly. If the requirements in Section 4.2.1, “Host OS requirements” are met, you may have forgotten to reboot the node after installing all packages.
You can also check other driver labels:
```
> kubectl get node NODENAME \
  -o jsonpath='{.metadata.labels}' | jq | grep "nvidia.com"
```
You should see labels specifying driver and GPU, for example, nvidia.com/gpu.machine or nvidia.com/cuda.driver.major.
Check if the GPU was added by nvidia-device-plugin-daemonset as an allocatable resource in the node.
```
> kubectl get node NODENAME \
  -o jsonpath='{.status.allocatable}' | jq
```
You should see "nvidia.com/gpu": followed by the number of GPUs in the node.
Check that the container runtime binary was installed by the operator (in particular, by the nvidia-container-toolkit-daemonset):
```
> ls /usr/local/nvidia/toolkit/nvidia-container-runtime
```
Verify whether the containerd configuration was updated to include the NVIDIA container runtime.
```
> grep nvidia /var/lib/rancher/rke2/agent/etc/containerd/config.toml
```

Run a pod to verify that the GPU resource can successfully be scheduled on a pod and the pod can detect it.

apiVersion: v1
kind: Pod
metadata:
  name: nbody-gpu-benchmark
  namespace: default
spec:
  restartPolicy: OnFailure
  runtimeClassName: nvidia
  containers:
  - name: cuda-container
    image: nvcr.io/nvidia/k8s/cuda-sample:nbody
    args: ["nbody", "-gpu", "-benchmark"]
    resources:
      limits:
        nvidia.com/gpu: 1
    env:
    - name: NVIDIA_VISIBLE_DEVICES
      value: all
    - name: NVIDIA_DRIVER_CAPABILITIES
      value: compute,utility

Note: Version gate

Available as of October 2024 releases: v1.28.15+rke2r1, v1.29.10+rke2r1, v1.30.6+rke2r1, v1.31.2+rke2r1.

RKE2 will now use PATH to find alternative container runtimes, in addition to checking the default paths used by the container runtime packages. To use this feature, you must modify the RKE2 service's PATH environment variable to add the directories containing the container runtime binaries.

We recommend modifying one of these two environment files:

/etc/default/rke2-server # or rke2-agent
/etc/sysconfig/rke2-server # or rke2-agent

This example adds the PATH in /etc/default/rke2-server:

# echo PATH=$PATH >> /etc/default/rke2-server
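
Appending with echo adds a new PATH line every time the command is run. If you script this step, a guard can keep the environment file tidy. The following sketch is one way to do that; the function name append_path_once is hypothetical, and it assumes the file uses plain PATH=... lines as in the example above.

```shell
#!/bin/sh
# append_path_once ENV_FILE EXTRA_DIR
# Appends "PATH=<current PATH>:<EXTRA_DIR>" to ENV_FILE unless a PATH line
# in that file already mentions EXTRA_DIR.
append_path_once() {
  env_file=$1
  extra_dir=$2
  if grep '^PATH=' "$env_file" 2>/dev/null | grep -qF "$extra_dir"; then
    return 0
  fi
  echo "PATH=${PATH}:${extra_dir}" >> "$env_file"
}

# Example usage (run as root):
#   append_path_once /etc/default/rke2-server /usr/local/nvidia/toolkit
```

Running the function a second time with the same directory leaves the file unchanged.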

Warning

PATH changes should be done with care to avoid placing untrusted binaries in the path of services that run as root.

4.3 Registering existing clusters #

In this section, you will learn how to register existing RKE2 clusters in SUSE Rancher Prime (Rancher).

The cluster registration feature replaced the feature for importing clusters.

The control that Rancher has to manage a registered cluster depends on the type of cluster. For details, see Section 4.3.3, “Management capabilities for registered clusters”.

4.3.1 Prerequisites #

4.3.1.1 Kubernetes node roles #

Registered RKE Kubernetes clusters must have all three node roles: etcd, controlplane and worker. A cluster with only controlplane components cannot be registered in Rancher.

For more information on RKE node roles, see the best practices.

4.3.1.2 Permissions #

To register a cluster in Rancher, you must have cluster-admin privileges within that cluster. If you do not, grant these privileges to your user by running:

> kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole cluster-admin \
  --user USER_ACCOUNT

4.3.2 Registering a cluster #

Click ☰ > Cluster Management.
On the Clusters page, click Import Existing.
Choose the type of cluster.
Use Member Roles to configure user authorization for the cluster. Click Add Member to add users who can access the cluster. Use the Role drop-down list to set permissions for each user.
If you are importing a generic Kubernetes cluster in Rancher, perform the following steps for setup:
1. Click Agent Environment Variables under Cluster Options to set environment variables for the Rancher cluster agent. The environment variables can be set using key-value pairs. If the Rancher agent requires a proxy to communicate with the Rancher server, set the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables here.
2. Enable Project Network Isolation to ensure the cluster supports Kubernetes NetworkPolicy resources. Users can select the Project Network Isolation option under the Advanced Options drop-down list to do so.
3. Configure the version management feature for imported RKE2 and K3s clusters.
Click Create.
The requirements for cluster-admin privileges are shown (see Section 4.3.1, “Prerequisites”), including an example command to fulfill them.
Copy the kubectl command to your clipboard and run it on a node where kubeconfig is configured to point to the cluster you want to import. If you are unsure it is configured correctly, run kubectl get nodes to verify before running the command shown in Rancher.
If you are using self-signed certificates, you will receive the message certificate signed by unknown authority. To work around this validation, copy the command starting with curl displayed in Rancher to your clipboard. Then run the command on a node where kubeconfig is configured to point to the cluster you want to import.
After you finish running the command(s) on your node, click Done.

Important

The NO_PROXY environment variable is not standardized, and the accepted format of the value can differ between applications. When configuring the NO_PROXY variable for Rancher, the value must adhere to the format expected by Golang.

Specifically, the value should be a comma-delimited string that contains only IP addresses, CIDR notation, domain names or special DNS labels (such as *). For a full description of the expected value format, refer to the upstream Golang documentation.
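
The following sketch assembles a comma-delimited NO_PROXY value from individual entries. The helper name `build_no_proxy` and the sample entries are illustrative only, and the checks are deliberately conservative. They do not reproduce Golang's actual parsing rules.

```shell
# Hypothetical helper: join entries into a comma-delimited NO_PROXY value.
# Rejects empty entries and entries containing whitespace, commas or a
# URL scheme. This is a conservative check, not Go's parser.
build_no_proxy() {
  result=""
  for entry in "$@"; do
    case "$entry" in
      ""|*[[:space:]]*|*://*|*,*)
        echo "unsupported NO_PROXY entry: '$entry'" >&2
        return 1 ;;
    esac
    result="${result:+$result,}$entry"
  done
  printf '%s\n' "$result"
}

# Sample entries: an IP address, a CIDR range and a domain name.
build_no_proxy 127.0.0.1 10.0.0.0/8 cluster.local
```

The resulting string can be pasted as the value of the NO_PROXY agent environment variable.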

Note: Expected results

Your cluster is registered and assigned a state of Pending. Rancher is deploying resources to manage your cluster.
You can access your cluster after its state is updated to Active.
Active clusters are assigned two projects: Default (containing the namespace default) and System (containing the namespaces cattle-system, ingress-nginx, kube-public and kube-system, if present).

Note

You cannot re-register a cluster that is currently active in a Rancher setup.

4.3.3 Management capabilities for registered clusters #

The control that Rancher has to manage a registered cluster depends on the type of cluster.

4.3.3.1 Features for all registered clusters #

After registering a cluster, the cluster owner can:

Manage cluster access through role-based access control
Enable monitoring, alerts and notifiers
Enable logging
Enable Istio
Manage projects and workloads

4.3.3.2 Additional features for registered RKE2 and SUSE Rancher Prime: K3s clusters #

SUSE Rancher Prime: K3s is a lightweight, fully compliant Kubernetes distribution for edge installations.

RKE2 is Rancher's next-generation Kubernetes distribution for data center and cloud installations.

When an RKE2 or SUSE Rancher Prime: K3s cluster is registered in Rancher, Rancher recognizes it. The Rancher UI exposes the features listed in Section 4.3.3.1, “Features for all registered clusters”, along with the following options for editing and upgrading the cluster:

Enable or disable version management.
Upgrade the Kubernetes version when version management is enabled.
Configure the upgrade strategy.
View a read-only version of the cluster’s configuration arguments and environment variables used to launch each node.

4.3.4 Configuring version management for RKE2 and SUSE Rancher Prime: K3s clusters #

Warning

When version management is enabled for an imported cluster, upgrading it outside of Rancher may lead to unexpected consequences.

The version management feature for imported RKE2 and SUSE Rancher Prime: K3s clusters can be configured using one of the following options:

Global default (default): Inherits behavior from the global imported-cluster-version-management setting.
True: Enables version management, allowing users to control the Kubernetes version and upgrade strategy of the cluster through Rancher.
False: Disables version management, enabling users to manage the cluster’s Kubernetes version independently, outside of Rancher.

You can define the default behavior for newly created clusters or existing ones set to “Global default” by modifying the imported-cluster-version-management setting.

Changes to the global imported-cluster-version-management setting take effect during the cluster’s next reconciliation cycle.

Note

If version management is enabled for a cluster, Rancher will deploy the system-upgrade-controller app, along with the associated plans and other required Kubernetes resources, to the cluster. If version management is disabled, Rancher will remove these components from the cluster.

4.3.5 Configuring RKE2 and SUSE Rancher Prime: K3s cluster upgrades #

Tip

It is a Kubernetes best practice to back up the cluster before upgrading. When upgrading a high-availability SUSE Rancher Prime: K3s cluster with an external database, back up the database in whichever way is recommended by the relational database provider.

The concurrency is the maximum number of nodes that are permitted to be unavailable during an upgrade. If the number of unavailable nodes is larger than the concurrency, the upgrade will fail. If an upgrade fails, you may need to repair or remove failed nodes before the upgrade can succeed.

Control plane concurrency: the maximum number of server nodes to upgrade at a single time; also the maximum unavailable server nodes
Worker concurrency: the maximum number of worker nodes to upgrade at the same time; also the maximum unavailable worker nodes

In the RKE2 and SUSE Rancher Prime: K3s documentation, control plane nodes are called server nodes. These nodes run the Kubernetes master, which maintains the desired state of the cluster. By default, workloads can be scheduled to these control plane nodes.

Also in the RKE2 and SUSE Rancher Prime: K3s documentation, nodes with the worker role are called agent nodes. Any workloads or pods that are deployed in the cluster can be scheduled to these nodes by default.

4.3.6 Debug logging and troubleshooting for registered RKE2 and SUSE Rancher Prime: K3s clusters #

Nodes are upgraded by the system upgrade controller running in the downstream cluster. Based on the cluster configuration, Rancher deploys two plans to upgrade nodes: one for control plane nodes and one for workers. The system upgrade controller follows the plans and upgrades the nodes.

To enable debug logging on the system upgrade controller deployment, edit the configmap to set the debug environment variable to true. Then restart the system-upgrade-controller pod.

Logs created by the system-upgrade-controller can be viewed by running this command:

> kubectl logs -n cattle-system system-upgrade-controller

The current status of the plans can be viewed with this command:

> kubectl get plans -A -o yaml

Tip

If the cluster becomes stuck during upgrading, restart the system-upgrade-controller.

To prevent issues when upgrading, the Kubernetes upgrade best practices should be followed.

4.3.7 Authorized cluster endpoint support for RKE2 and SUSE Rancher Prime: K3s clusters #

Rancher supports Authorized Cluster Endpoints (ACE) for registered RKE2 and SUSE Rancher Prime: K3s clusters. This support includes manual steps you will perform on the downstream cluster to enable the ACE. For additional information on the authorized cluster endpoint, refer to How the Authorized Cluster Endpoint Works.

Note: Notes

These steps only need to be performed on the control plane nodes of the downstream cluster. You must configure each control plane node individually.
The following steps will work on both RKE2 and SUSE Rancher Prime: K3s clusters registered in v2.6.x as well as those registered (or imported) from a previous version of Rancher with an upgrade to v2.6.x.
These steps will alter the configuration of the downstream RKE2 and SUSE Rancher Prime: K3s clusters and deploy the kube-api-authn-webhook. If a future implementation of the ACE requires an update to the kube-api-authn-webhook, then this would also have to be done manually. For more information on this webhook, see Authentication webhook documentation.

Procedure 2: Manual steps to be taken on the control plane of each downstream cluster to enable ACE #

Create a file at /var/lib/rancher/{rke2,k3s}/kube-api-authn-webhook.yaml with the following contents:

apiVersion: v1
kind: Config
clusters:
- name: Default
  cluster:
    insecure-skip-tls-verify: true
    server: http://127.0.0.1:6440/v1/authenticate
users:
- name: Default
  user:
    insecure-skip-tls-verify: true
current-context: webhook
contexts:
- name: webhook
  context:
    user: Default
    cluster: Default

Add the following to the configuration file (or create one if it does not exist). Note that the default location is /etc/rancher/{rke2,k3s}/config.yaml:
```
kube-apiserver-arg:
  - authentication-token-webhook-config-file=/var/lib/rancher/{rke2,k3s}/kube-api-authn-webhook.yaml
```

Run the following commands:

> sudo systemctl stop {rke2,k3s}-server
> sudo systemctl start {rke2,k3s}-server

Finally, go back to the Rancher UI and edit the imported cluster to complete the ACE enablement. Click ⋮ > Edit Config, then click the Networking tab under Cluster Configuration and click the Enabled button for Authorized Endpoint. Once the ACE is enabled, you have the option of entering a fully qualified domain name (FQDN) and certificate information.

Note

The FQDN field is optional, and if one is entered, it should point to the downstream cluster. Certificate information is only needed if there is a load balancer in front of the downstream cluster that is using an untrusted certificate. If you have a valid certificate, then nothing needs to be added to the CA Certificates field.
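
In Procedure 2, the {rke2,k3s} notation stands for whichever distribution the node runs. If typed literally and unquoted in bash, it brace-expands to both names. The sketch below resolves the two file paths for one distribution. The helper name `ace_paths` is hypothetical.

```shell
# Hypothetical helper: print the concrete file paths used in Procedure 2
# for a single distribution, either "rke2" or "k3s".
ace_paths() {
  distro="$1"
  case "$distro" in
    rke2|k3s) ;;
    *) echo "expected rke2 or k3s, got '$distro'" >&2; return 1 ;;
  esac
  printf 'webhook_file=/var/lib/rancher/%s/kube-api-authn-webhook.yaml\n' "$distro"
  printf 'config_file=/etc/rancher/%s/config.yaml\n' "$distro"
}

ace_paths rke2
```

The config_file path is the default location mentioned above. Adjust it if your installation uses a different configuration file.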

4.3.8 Annotating registered clusters #

For all types of registered Kubernetes clusters except for RKE2 and SUSE Rancher Prime: K3s Kubernetes clusters, Rancher does not have any information about how the cluster is provisioned or configured.

Therefore, when Rancher registers a cluster, it assumes that several capabilities are disabled by default. Rancher assumes this to avoid exposing UI options to the user even when the capabilities are not enabled in the registered cluster.

However, if the cluster has a certain capability, a user of that cluster might still want to select the capability for the cluster in the Rancher UI. To do that, the user will need to manually indicate to Rancher that certain capabilities are enabled for the cluster.

By annotating a registered cluster, it is possible to indicate to Rancher that a cluster was given additional capabilities outside of Rancher.

The following annotation indicates Ingress capabilities. Note that the values of non-primitive objects need to be JSON-encoded, with quotations escaped.

"capabilities.cattle.io/ingressCapabilities": "[
  {
    "customDefaultBackend":true,
    "ingressProvider":"asdf"
  }
]"
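
When the annotation is supplied as a string inside JSON or YAML, the inner quotation marks need escaping. The following sketch shows one way to produce such a string from a compact, single-line JSON value. The helper name `escape_json_for_annotation` is hypothetical, and the input is assumed to be valid JSON already.

```shell
# Hypothetical helper: read a single-line JSON value on stdin and escape
# backslashes and double quotes so it can be embedded in a string value.
escape_json_for_annotation() {
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

printf '%s\n' '[{"customDefaultBackend":true,"ingressProvider":"asdf"}]' \
  | escape_json_for_annotation
```

The output can then be wrapped in one pair of double quotes and used as the annotation value.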

These capabilities can be annotated for the cluster:

ingressCapabilities
loadBalancerCapabilities
nodePoolScalingSupported
nodePortRange
taintSupport

All the capabilities and their type definitions can be viewed in the Rancher API view, at RANCHER_SERVER_URL/v3/schemas/capabilities.

To annotate a registered cluster:

Click ☰ > Cluster Management.
On the Clusters page, go to the custom cluster you want to annotate and click ⋮ > Edit Config.
Expand the Labels & Annotations section.
Click Add Annotation.
Add an annotation to the cluster with the format capabilities/<capability>: <value> where value is the cluster capability that will be overridden by the annotation. In this scenario, Rancher is not aware of any capabilities of the cluster until you add the annotation.
Click Save.

Tip

The annotation does not give the capabilities to the cluster, but it does indicate to Rancher that the cluster has those capabilities.

4.4 Assigning GPU nodes to applications #

When deploying a containerized application to Kubernetes, you need to ensure that containers requiring GPU resources run on appropriate worker nodes. For example, Ollama, a core component of SUSE AI, benefits significantly from GPU acceleration. This topic describes how to satisfy this requirement by explicitly requesting GPU resources and by labeling worker nodes so that a node selector can match them.

Requirements #

A Kubernetes cluster, such as SUSE Rancher Prime: RKE2, must be available and configured with more than one worker node, where some nodes have NVIDIA GPU resources and others do not.
This document assumes that any kind of deployment to the Kubernetes cluster is done using Helm charts.

4.4.1 Labeling GPU nodes #

To distinguish nodes with the GPU support from non-GPU nodes, Kubernetes uses labels. Labels are used for relevant metadata and should not be confused with annotations that provide simple information about a resource. It is possible to manipulate labels with the kubectl command, as well as by tweaking configuration files from the nodes. If an IaC tool such as Terraform is used, labels can be inserted in the node resource configuration files.

To label a single node, use the following command:

> kubectl label node GPU_NODE_NAME accelerator=nvidia-gpu

To achieve the same result by tweaking the node.yaml node configuration, add the following content and apply the changes with kubectl apply -f node.yaml:

apiVersion: v1
kind: Node
metadata:
  name: node-name
  labels:
    accelerator: nvidia-gpu

Tip: Labeling multiple nodes

To label multiple nodes, use the following command:

> kubectl label node \
  GPU_NODE_NAME1 \
  GPU_NODE_NAME2 ... \
  accelerator=nvidia-gpu

Tip

If Terraform is being used as an IaC tool, you can add labels to a group of nodes by editing the .tf files and adding the following values to a resource:

resource "node_group" "example" {
  labels = {
    "accelerator" = "nvidia-gpu"
  }
}

To check if the labels are correctly applied, use the following command:

> kubectl get nodes --show-labels

4.4.2 Assigning GPU nodes #

The matching between a container and a node is configured by the explicit resource allocation and the use of labels and node selectors. The use cases described below focus on NVIDIA GPUs.

4.4.2.1 Enable GPU passthrough #

Containers are isolated from the host environment by default. For the containers that rely on the allocation of GPU resources, their Helm charts must enable GPU passthrough so that the container can access and use the GPU resource. Without enabling the GPU passthrough, the container may still run, but it can only use the main CPU for all computations. Refer to Ollama Helm chart for an example of the configuration required for GPU acceleration.

4.4.2.2 Assignment by resource request #

After the NVIDIA GPU Operator is configured on a node, you can instantiate applications requesting the resource nvidia.com/gpu provided by the operator. Add the following content to your values.yaml file. Specify the number of GPUs according to your setup.

resources:
  requests:
    nvidia.com/gpu: 1
  limits:
    nvidia.com/gpu: 1

4.4.2.3 Assignment by labels and node selectors #

If affected cluster nodes are labeled with a label such as accelerator=nvidia-gpu, you can configure the node selector to check for the label. In this case, use the following values in your values.yaml file.

nodeSelector:
  accelerator: nvidia-gpu
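
Both assignment methods can be used together. The sketch below writes a values fragment that combines them. The file name gpu-values.yaml is an example, and whether a chart reads `resources` and `nodeSelector` at the top level depends on the chart, so check its documentation before applying.

```shell
# Write a values fragment that requests one GPU and restricts scheduling
# to nodes labeled accelerator=nvidia-gpu. The file name is an example.
cat > gpu-values.yaml <<'EOF'
resources:
  requests:
    nvidia.com/gpu: 1
  limits:
    nvidia.com/gpu: 1
nodeSelector:
  accelerator: nvidia-gpu
EOF
```

The file can then be passed to Helm with `-f gpu-values.yaml` in addition to your other values files.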

4.4.3 Verifying Ollama GPU assignment #

If the GPU is correctly detected, the Ollama container logs this event:

[...] source=routes.go:1172 msg="Listening on :11434 (version 0.0.0)"
[...] source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama2502346830/runners
[...] source=payload.go:44 msg="Dynamic LLM libraries [cuda_v12 cpu cpu_avx cpu_avx2]"
[...] source=gpu.go:204 msg="looking for compatible GPUs"
[...] source=types.go:105 msg="inference compute" id=GPU-c9ad37d0-d304-5d2a-c2e6-d3788cd733a7 library=cuda compute
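
To check for this event without reading the whole log, a filter along the following lines may help. The helper name `ollama_uses_cuda` is hypothetical, and the match relies on the log wording shown above, which can differ between Ollama versions.

```shell
# Hypothetical helper: read Ollama log text on stdin and succeed only if
# an "inference compute" line reports the CUDA library.
ollama_uses_cuda() {
  grep 'msg="inference compute"' | grep -q 'library=cuda'
}
```

For example, `kubectl logs -n NAMESPACE POD_NAME | ollama_uses_cuda && echo "GPU detected"`, where NAMESPACE and POD_NAME are placeholders for your Ollama deployment.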

4.5 Installing SUSE Security #

This article describes how to install SUSE Security to scan SUSE AI nodes for vulnerabilities and improve data protection.

4.5.1 Installing SUSE Security in air-gapped environments #

Follow the information in this section to perform an air-gapped deployment of SUSE Security.

4.5.1.1 Required tools #

We need to install three tools for downloading all components of SUSE Security.

Helm - Application Lifecycle Manager
Skopeo - Image/Registry Tool
ZStandard - Compression Algorithm

Install Helm.

> curl -fsSL \
  https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Install Skopeo and zstd.

> sudo zypper update && sudo zypper install -y zstd skopeo

4.5.1.2 Get images and a Helm chart #

To get all the needed images, use the chart itself. Using Helm, add the repo and download the chart. Use Skopeo for downloading and uploading.

Create a directory for the images.
```
> mkdir -p neuvector/images
```

Add a SUSE Security repository.

> helm repo add neuvector \
  https://neuvector.github.io/neuvector-helm/

Download the latest SUSE Security chart.

> helm repo update \
  && helm pull neuvector/core -d neuvector

You should see a file named core-X.Y.Z.tgz. To obtain the list of required images, run the following command:

> helm template neuvector/core-*.tgz \
  | awk '$1 ~ /image:/ {print $2}' | sed -e 's/\"//g' \
  > neuvector/images/list.txt

Download the images based on the generated list.

> for i in $(cat neuvector/images/list.txt); do
  skopeo copy docker://$i docker-archive:neuvector/images/$(echo $i| awk -F/ '{print $3}'|sed 's/:/_/g').tar:$(echo $i| awk -F/ '{print $3}')
done
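
The loop derives each archive name from the third slash-separated field of the image reference. The sketch below isolates that naming step so it can be checked on its own. The helper name `archive_name_for_image` is hypothetical, and the sample reference assumes entries of the form REGISTRY/NAMESPACE/NAME:TAG in list.txt.

```shell
# Hypothetical helper reproducing the archive naming used by the loop.
# Assumes image references of the form REGISTRY/NAMESPACE/NAME:TAG.
archive_name_for_image() {
  printf '%s\n' "$1" | awk -F/ '{print $3}' | sed -e 's/:/_/g' -e 's/$/.tar/'
}

archive_name_for_image docker.io/neuvector/controller:5.3.2
```

If a reference in list.txt has fewer than three fields, the third field is empty and the resulting name is just .tar, so it is worth inspecting the list before starting the download.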

Now you have a directory similar to the following one:

> ls -lR neuvector
neuvector:
total 16
-rw-r--r--. 1 root root 15892 Jan  8 14:33 core-2.4.0.tgz
drwxr-xr-x. 2 root root   153 Jan  8 14:35 images

neuvector/images:
total 953920
-rw-r--r--. 1 root root 236693504 Jan  8 14:35 controller_5.3.2.tar
-rw-r--r--. 1 root root 226704384 Jan  8 14:35 enforcer_5.3.2.tar
-rw-r--r--. 1 root root       176 Jan  8 14:34 list.txt
-rw-r--r--. 1 root root 331550208 Jan  8 14:35 manager_5.3.2.tar
-rw-r--r--. 1 root root 169589760 Jan  8 14:35 scanner_latest.tar
-rw-r--r--. 1 root root  12265472 Jan  8 14:35 updater_latest.tar

4.5.1.3 Compress and move the images to the local network #

Use tar with Zstandard (zstd) compression to pack the neuvector directory into a single archive.

> tar -I zstd -vcf neuvector_airgap.zst neuvector

Move the created neuvector_airgap.zst archive to the isolated local network.
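
Because the archive crosses the air gap on removable media or through a transfer host, it can be useful to confirm that it arrived intact. The following sketch assumes that sha256sum is available on both sides. The helper names are hypothetical.

```shell
# Hypothetical helpers for an integrity check around the transfer.
# Record a checksum next to the archive before moving it.
write_checksum() {
  sha256sum "$1" > "$1.sha256"
}

# Verify the archive after the transfer. Run this in the directory that
# holds both the archive and its .sha256 file.
verify_checksum() {
  sha256sum -c "$1.sha256"
}
```

For example, run `write_checksum neuvector_airgap.zst` on the connected host, move both files, and run `verify_checksum neuvector_airgap.zst` on the air-gapped host before uncompressing.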

4.5.1.4 Uncompress and load the images #

Uncompress the images from the archive. The following example leaves them in the neuvector directory.

> tar -I zstd -vxf neuvector_airgap.zst

Loading the images into a local image registry requires you to understand your local network. This example uses registry.awesome.sauce as the DNS name. Loading the images is done with the skopeo command. Make sure that skopeo is installed on the air-gapped local machine. You may need to authenticate with skopeo login first.

> export REGISTRY=registry.awesome.sauce
> for file in $(ls neuvector/images | grep -v txt ); do
     skopeo copy docker-archive:neuvector/images/$file docker://$(echo $file | sed 's/\.tar$//' | awk -F_ '{print "'$REGISTRY'/neuvector/"$1":"$2}')
done

With all the images loaded in a local image registry, you can install using Helm.

4.5.1.5 Install the transferred images on a local cluster #

To install the images from the local registry, you must override REGISTRY and NEU_URL variables. Also, adjust the imagePullSecrets option to include the secret for your cluster to authenticate to the registry.

> export REGISTRY=registry.awesome.sauce
> export NEU_URL=neuvector.awesome.sauce
> helm upgrade -i neuvector --namespace neuvector neuvector/core \
--create-namespace --set imagePullSecrets=regsecret --set k3s.enabled=true \
--set k3s.runtimePath=/run/k3s/containerd/containerd.sock \
--set manager.ingress.enabled=true --set controller.pvc.enabled=true \
--set controller.pvc.capacity=10Gi --set manager.svc.type=ClusterIP \
--set registry=$REGISTRY --set tag=5.3.2 \
--set controller.image.repository=neuvector/controller \
--set enforcer.image.repository=neuvector/enforcer \
--set manager.image.repository=neuvector/manager \
--set cve.updater.image.repository=neuvector/updater \
--set manager.ingress.host=$NEU_URL

4.6 Setting up SUSE Observability for SUSE AI #

SUSE Observability provides comprehensive monitoring and insights into your infrastructure and applications. It enables efficient tracking of metrics, logs and traces, helping you maintain optimal performance and troubleshoot issues effectively. This procedure guides you through setting up SUSE Observability for the SUSE AI environment using the SUSE AI Observability Extension.

4.6.1 Deployment scenarios #

You can deploy SUSE Observability and SUSE AI in two different ways:

Single-Cluster setup: Both SUSE AI and SUSE Observability are installed in the same Kubernetes cluster. This is a simpler approach ideal for testing and proof-of-concept deployments. Communication between components can use internal cluster DNS.
Multi-Cluster setup: SUSE AI and SUSE Observability are installed on separate, dedicated Kubernetes clusters. This setup is recommended for production environments because it isolates workloads. Communication requires exposing the SUSE Observability endpoints externally, for example, via an Ingress.

This section provides instructions for both scenarios.

4.6.2 Requirements #

To set up SUSE Observability for SUSE AI, you need to meet the following requirements:

Have access to SUSE Application Collection
Have a valid SUSE AI subscription
Have a valid license for SUSE Observability in SUSE Customer Center
Instrument your applications for telemetry data acquisition with OpenTelemetry.

For details on how to collect traces and metrics from SUSE AI components and user-developed applications, refer to Monitoring SUSE AI with OpenTelemetry and SUSE Observability. It includes configurations that are essential for full observability.

Important: SUSE Application Collection not instrumented by default

Applications from the SUSE Application Collection are not instrumented by default. If you want to monitor your AI applications, you need to follow the instrumentation guidelines that we provide in the document Monitoring SUSE AI with OpenTelemetry and SUSE Observability.

4.6.3 Setup process overview #

The following chart shows the high-level steps for the setup procedure. You will first set up the SUSE Observability cluster, then configure the SUSE AI cluster, and finally instrument your applications. Execute the steps in each column from left to right and top to bottom.

Blue steps are related to Helm chart installations.
Gray steps represent another type of interaction, such as coding.

The chart showing a high-level overview of the SUSE Observability setup

Figure 17: High-level overview of the SUSE Observability setup #

Tip: Setup clusters

You can create and configure Kubernetes clusters for SUSE AI and SUSE Observability as you prefer. If you are using SUSE Rancher Prime, check its documentation. For testing purposes, you can even share one cluster for both deployments. You can skip instructions on setting up a specific cluster if you already have one configured.

The diagram below shows the result of the above steps. Two clusters are represented: one for the SUSE Observability workload and another for SUSE AI. You may use an identical setup or customize it for your environment.

The chart showing setup of separate clusters for SUSE AI and SUSE Observability

Figure 18: Separate clusters for SUSE AI and SUSE Observability #

Points to notice #

You can install SUSE AI Observability Extension alongside SUSE Observability. In that case, the extension can reach SUSE Observability through the internal Kubernetes DNS.
SUSE Observability contains several components. The following two need to be accessible by the AI cluster:
- The Collector endpoint. Refer to Exposing SUSE Observability outside of the cluster or Self-hosted SUSE Observability for details about exposing it.
- The SUSE Observability API. Refer to Exposing SUSE Observability outside of the cluster for details about exposing it.
Milvus metrics and traces can be scraped by the OpenTelemetry Collector with simple configurations, provided below. The same is true for GPU metrics.
To get information from Open WebUI, Ollama or vLLM, you must have a specific instrumentation set. It can be an application instrumented with the OpenLIT SDK or another form of instrumentation following the same patterns.

Important

Remember that in multi-cluster setups, it is critical to properly expose your endpoints. Configure TLS, be careful with the configuration, and make sure to provide the right keys and tokens. More details are provided in the respective instructions.

4.6.4 Setting up the SUSE Observability cluster #

This initial step is identical for both single-cluster and multi-cluster deployments.

Install SUSE Observability. You can follow the official SUSE Observability air-gapped installation documentation for all installation instructions. Remember to expose your APIs and collector endpoints to your SUSE AI cluster.
Important: Multi-cluster setup
For multi-cluster setups, you must expose the SUSE Observability API and collector endpoints so that the SUSE AI cluster can reach them. Refer to the guide on exposing SUSE Observability outside of the cluster.

Install the SUSE Observability extension. Create a new Helm values file named genai_values.yaml. Before creating the file, review the placeholders below.

SUSE_OBSERVABILITY_API_URL: The URL of the SUSE Observability API. For multi-cluster deployments, this is the external URL. For single-cluster deployments, this can be the internal service URL. Example: http://suse-observability-api.your-domain.com
SUSE_OBSERVABILITY_API_KEY: The API key from the baseConfig_values.yaml file used during the SUSE Observability installation.
SUSE_OBSERVABILITY_API_TOKEN_TYPE: Can be api for a token from the Web UI or service for a Service Token.
SUSE_OBSERVABILITY_TOKEN: The API or Service token itself.
OBSERVED_SERVER_NAME: The name of the cluster to observe. It must match the name used in the Kubernetes StackPack configuration. Example: suse-ai-cluster.

Create the genai_values.yaml file with the following content:

global:
  imagePullSecrets:
  - application-collection # 1
  imageRegistry: LOCAL_DOCKER_REGISTRY_URL:5043
serverUrl: SUSE_OBSERVABILITY_API_URL
apiKey: SUSE_OBSERVABILITY_API_KEY
tokenType: SUSE_OBSERVABILITY_API_TOKEN_TYPE
apiToken: SUSE_OBSERVABILITY_TOKEN
clusterName: OBSERVED_SERVER_NAME

1	Instructs Helm to use credentials from the SUSE Application Collection. For instructions on how to configure the image pull secrets for the SUSE Application Collection, refer to the official documentation.
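
Before running the install command, it may be worth confirming that no placeholder was left in the file. The sketch below searches for the upper-case placeholder names used in this section. The helper name `check_placeholders` is hypothetical.

```shell
# Hypothetical helper: list lines in a values file that still contain one
# of the placeholders used in this section. Succeeds when none remain.
check_placeholders() {
  [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
  if grep -nE 'SUSE_OBSERVABILITY_[A-Z_]+|OBSERVED_SERVER_NAME|LOCAL_DOCKER_REGISTRY_URL' "$1"; then
    echo "replace the placeholders listed above in $1" >&2
    return 1
  fi
}
```

For example, `check_placeholders genai_values.yaml` prints nothing and returns success once every placeholder has been replaced with a real value.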

Run the install command.

> helm upgrade --install ai-obs \
  charts/suse-ai-observability-extension-X.Y.Z.tgz \
  -f genai_values.yaml --namespace so-extensions --create-namespace

Note: Self-signed certificates not supported

Self-signed certificates are not supported. Consider running the extension in the same cluster as SUSE Observability and then use the internal Kubernetes address.

After the installation is complete, a new menu called GenAI is added to the Web interface. A Kubernetes cron job is also created to synchronize the topology view with the components found in the SUSE AI cluster.

Verify the SUSE Observability extension. After the installation, check that the new GenAI item appears in the side menu:
Figure 19: New GenAI Observability menu item #

4.6.5 Setting up the SUSE AI cluster #

Follow the instructions for your deployment scenario.

Single-cluster deployment: In this setup, the SUSE AI components are installed in the same cluster as SUSE Observability and can communicate using internal service DNS.
Multi-cluster deployment: In this setup, the SUSE AI cluster is separate. Communication relies on externally exposed endpoints of the SUSE Observability cluster.

The difference between deployment scenarios affects the OTEL Collector exporter configuration and the SUSE Observability Agent URL as described in the following list.

SUSE_OBSERVABILITY_API_URL

The URL of the SUSE Observability API.

Single-cluster example: http://suse-observability-otel-collector.suse-observability.svc.cluster.local:4317

Multi-cluster example: https://suse-observability-api.your-domain.com

SUSE_OBSERVABILITY_COLLECTOR_ENDPOINT

The endpoint of the SUSE Observability Collector.

Single-cluster example: http://suse-observability-router.suse-observability.svc.cluster.local:8080/receiver/stsAgent

Multi-cluster example: https://suse-observability-router.your-domain.com/receiver/stsAgent

Install NVIDIA GPU Operator. Follow the instructions in https://documentation.suse.com/cloudnative/rke2/latest/en/advanced.html#_deploy_nvidia_operator.

Install OpenTelemetry collector. Create a secret with your SUSE Observability API key in the namespace where you want to install the collector. Retrieve the API key using the Web UI or from the baseConfig_values.yaml file that you used during the SUSE Observability installation. If the namespace does not exist yet, create it.

kubectl create namespace observability
kubectl create secret generic open-telemetry-collector \
  --namespace observability \
  --from-literal=API_KEY='SUSE_OBSERVABILITY_API_KEY'

Create a new file named otel-values.yaml with the following content.

global:
  imagePullSecrets:
  - application-collection
repository: LOCAL_DOCKER_REGISTRY_URL:5043/opentelemetry-collector-k8s
extraEnvsFrom:
  - secretRef:
      name: open-telemetry-collector
mode: deployment
ports:
  metrics:
    enabled: true
presets:
  kubernetesAttributes:
    enabled: true
    extractAllPodLabels: true
config:
  receivers:
    prometheus:
      config:
        scrape_configs:
          - job_name: 'gpu-metrics'
            scrape_interval: 10s
            scheme: http
            kubernetes_sd_configs:
              - role: endpoints
                namespaces:
                  names:
                    - gpu-operator
          - job_name: 'milvus'
            scrape_interval: 15s
            metrics_path: '/metrics'
            static_configs:
              - targets: ['MILVUS_SERVICE_NAME.SUSE_AI_NAMESPACE.svc.cluster.local:9091'] 1
          - job_name: 'vllm'
            scrape_interval: 10s
            scheme: http
            kubernetes_sd_configs:
              - role: service
            relabel_configs:
              - source_labels: [__meta_kubernetes_namespace]
                action: keep
                regex: 'VLLM_NAMESPACE' 2

              - source_labels: [__meta_kubernetes_service_name]
                action: keep
                regex: '.*VLLM_RELEASE_NAME.*' 3
  exporters:
    otlp:
      endpoint: https://OPEN_TELEMETRY_COLLECTOR_NAME.suse-observability.svc.cluster.local:4317 4
      headers:
        Authorization: "SUSEObservability ${env:API_KEY}"
      tls:
        insecure: true
  processors:
    tail_sampling:
      decision_wait: 10s
      policies:
      - name: rate-limited-composite
        type: composite
        composite:
          max_total_spans_per_second: 500
          policy_order: [errors, slow-traces, rest]
          composite_sub_policy:
          - name: errors
            type: status_code
            status_code:
              status_codes: [ ERROR ]
          - name: slow-traces
            type: latency
            latency:
              threshold_ms: 1000
          - name: rest
            type: always_sample
          rate_allocation:
          - policy: errors
            percent: 33
          - policy: slow-traces
            percent: 33
          - policy: rest
            percent: 34
    resource:
      attributes:
      - key: k8s.cluster.name
        action: upsert
        value: CLUSTER_NAME 5
      - key: service.instance.id
        from_attribute: k8s.pod.uid
        action: insert
    filter/dropMissingK8sAttributes:
      error_mode: ignore
      traces:
        span:
          - resource.attributes["k8s.node.name"] == nil
          - resource.attributes["k8s.pod.uid"] == nil
          - resource.attributes["k8s.namespace.name"] == nil
          - resource.attributes["k8s.pod.name"] == nil
  connectors:
    spanmetrics:
      metrics_expiration: 5m
      namespace: otel_span
    routing/traces:
      error_mode: ignore
      table:
      - statement: route()
        pipelines: [traces/sampling, traces/spanmetrics]
  service:
    extensions:
      - health_check
    pipelines:
      traces:
        receivers: [otlp, jaeger]
        processors: [filter/dropMissingK8sAttributes, memory_limiter, resource]
        exporters: [routing/traces]
      traces/spanmetrics:
        receivers: [routing/traces]
        processors: []
        exporters: [spanmetrics]
      traces/sampling:
        receivers: [routing/traces]
        processors: [tail_sampling, batch]
        exporters: [debug, otlp]
      metrics:
        receivers: [otlp, spanmetrics, prometheus]
        processors: [memory_limiter, resource, batch]
        exporters: [debug, otlp]

1	Configure the Milvus service and namespace for the Prometheus scraper. Because Milvus is installed in a later step, you can return to this step and edit the endpoint if necessary.
2	Replace VLLM_NAMESPACE with the namespace where vLLM is deployed.
3	Replace VLLM_RELEASE_NAME with the release name of the vLLM deployment.
4	Set the exporter to your exposed SUSE Observability collector. The value differs depending on the deployment pattern. For production usage, we recommend using TLS communication.
5	Replace CLUSTER_NAME with the cluster's name.

Finally, run the installation command.

> helm upgrade --install opentelemetry-collector \
  oci://dp.apps.rancher.io/charts/opentelemetry-collector \
  -f otel-values.yaml --namespace observability

Verify the installation by checking the existence of a new deployment and service in the observability namespace.
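A minimal sketch of this check, assuming the release name `opentelemetry-collector` and the `observability` namespace used in the installation command. The function name `verify_otel_collector` is hypothetical.

```shell
# Lists deployments and services in the collector namespace.
# Assumption: the collector was installed into "observability" as shown above.
verify_otel_collector() {
  namespace="${1:-observability}"
  kubectl get deployment,service --namespace "$namespace"
}
```

Run `verify_otel_collector` and look for a deployment and a service that belong to the collector release. Pass a different namespace as the first argument if you installed the collector elsewhere.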

The GPU metrics scraper that we configure in the OTEL Collector requires custom RBAC rules. Create a file named otel-rbac.yaml with the following content:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: suse-observability-otel-scraper
rules:
  - apiGroups:
      - ""
    resources:
      - services
      - endpoints
    verbs:
      - list
      - watch

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: suse-observability-otel-scraper
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: suse-observability-otel-scraper
subjects:
  - kind: ServiceAccount
    name: opentelemetry-collector
    namespace: observability

Then apply the configuration by running the following command.

> kubectl apply -n gpu-operator -f otel-rbac.yaml
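To confirm that the role binding took effect, you can ask the API server whether the collector's service account may read the scraped resources. The following sketch uses `kubectl auth can-i` with impersonation. The function name `check_scraper_rbac` is hypothetical, and the service account name and namespaces are taken from the manifest above.

```shell
# Checks each verb/resource pair granted by the Role in the gpu-operator namespace.
# kubectl prints "yes" or "no" per check and returns non-zero on "no".
check_scraper_rbac() {
  for resource in services endpoints; do
    for verb in list watch; do
      kubectl auth can-i "$verb" "$resource" \
        --namespace gpu-operator \
        --as system:serviceaccount:observability:opentelemetry-collector || return 1
    done
  done
}
```

Running `check_scraper_rbac` requires that your own user is allowed to impersonate service accounts, which is typically the case for cluster administrators.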

Install the SUSE Observability Agent.

> helm upgrade --install \
  --namespace suse-observability --create-namespace \
  --set-string 'stackstate.apiKey'='YOUR_API_KEY'1 \
  --set-string 'stackstate.cluster.name'='CLUSTER_NAME'2 \
  --set-string 'stackstate.url'='http://suse-observability-router.suse-observability.svc.cluster.local:8080/receiver/stsAgent'3 \
  --set 'nodeAgent.skipKubeletTLSVerify'=true suse-observability-agent \
  suse-observability/suse-observability-agent

1	Retrieve the API key using the Web UI or from the `baseConfig_values.yaml` file that you used during the SUSE Observability installation.
2	Replace CLUSTER_NAME with the cluster's name.
3	Replace with your SUSE Observability server URL.

Warning: SUSE Observability version 2.6.2 and above

With SUSE Observability version 2.6.2, a change of the standard behavior broke the vLLM monitoring performed by the extension. To fix it, update otel-values.yaml to include the following additions. No changes are required if you use SUSE Observability version 2.6.1 or below.

Add a new processor.

config:
  processors:
    ... # same as before
    transform:
      metric_statements:
        - context: metric
          statements:
            - replace_pattern(name, "^vllm:", "vllm_")

Modify the metrics pipeline to perform the transformation defined above:

config:
  service:
    pipelines:
      ... # same as before
      metrics:
        receivers: [otlp, spanmetrics, prometheus]
        processors: [transform, memory_limiter, resource, batch]
        exporters: [debug, otlp]

Install SUSE AI. Refer to Section 5, “Installing applications from AI Library” for the complete procedure.

4.6.6 Instrument applications #

Instrumentation is the act of configuring your applications for telemetry data acquisition. Our stack employs OpenTelemetry standards as a vendor-neutral and open base for our telemetry. For a comprehensive guide on how to set up your instrumentation, please refer to Monitoring SUSE AI with OpenTelemetry and SUSE Observability.

By following the instructions in the document referenced above, you will be able to retrieve all relevant telemetry data from Open WebUI, Ollama, Milvus and vLLM by simply applying specific configuration to their Helm chart values. You can find links for advanced use cases (auto-instrumentation with the OTEL Operator) at the end of the document.

5 Installing applications from AI Library #

SUSE AI is delivered as a set of components that you can combine to meet specific use cases. To enable the full integrated stack, you need to deploy multiple applications in sequence. Applications with the fewest dependencies must be installed first, followed by dependent applications once their required dependencies are in place within the cluster.

5.1 Installation procedure #

This procedure includes steps to install AI Library applications in air-gapped environments.

Note

If the following steps do not specify on which part of the air-gapped architecture (local or remote) the task should be performed, assume remote. The isolated local part is always specified.

Purchase the SUSE AI entitlement. It is a separate entitlement from SUSE Rancher Prime.
Access SUSE AI via the SUSE Application Collection at https://apps.rancher.io/ to perform the check for the SUSE AI entitlement.
If the entitlement check is successful, you are given access to the SUSE AI-related Helm charts and container images, and can deploy directly from the SUSE Application Collection.
Visit the SUSE Application Collection, sign in and get the user access token as described in https://docs.apps.rancher.io/get-started/authentication/.
On the local cluster, create a Kubernetes namespace if it does not already exist. The steps in this procedure assume that all containers are deployed into the same namespace referred to as SUSE_AI_NAMESPACE. Replace its name to match your preferences.
```
> kubectl create namespace SUSE_AI_NAMESPACE
```

Create the SUSE Application Collection secret and log in to the Helm registry.

> kubectl create secret docker-registry application-collection \
  --docker-server=dp.apps.rancher.io \
  --docker-username=APPCO_USERNAME \
  --docker-password=APPCO_USER_TOKEN \
  -n SUSE_AI_NAMESPACE

> helm registry login dp.apps.rancher.io/charts \
  -u APPCO_USERNAME \
  -p APPCO_USER_TOKEN

On the remote host, download the SUSE-AI-get-images.sh script from the air-gap stack and run it.
```
> ./SUSE-AI-get-images.sh
```
This script creates a subdirectory with all necessary Helm charts plus suse-ai-containers.tgz and suse-ai-containers.txt files.
Create a Docker registry on one of the local hosts so that the local Kubernetes cluster can access it.
Securely transfer the created subdirectory with Helm charts plus suse-ai-containers.tgz and suse-ai-containers.txt files from the remote host to a local host and load all container images to the local Docker registry. Set DST_REGISTRY_USERNAME and DST_REGISTRY_PASSWORD environment variables if they are required to access the registry.
```
> ./SUSE-AI-load-images.sh \
  -d LOCAL_DOCKER_REGISTRY_URL \
  -i charts/suse-ai-containers.txt \
  -f charts/suse-ai-containers.tgz
```
Install cert-manager as described in Section 5.2, “Installing cert-manager”.
Install AI Library components.
1. Install Milvus as described in Section 5.3, “Installing Milvus”.
2. (Optional) Install Ollama as described in Section 5.4, “Installing Ollama”.
3. Install Open WebUI as described in Section 5.5, “Installing Open WebUI”.
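For the transfer step in the procedure above, one way to detect corruption or tampering is to record checksums on the remote host and verify them on the local host before loading the images. The following is a sketch, not part of the SUSE tooling: the function names and the `SHA256SUMS` file name are hypothetical, and it assumes that `sha256sum` from GNU coreutils is available on both hosts.

```shell
# On the remote host: record checksums of the image archive and image list.
# $1 is the directory created by SUSE-AI-get-images.sh, for example "charts".
make_manifest() {
  (cd "$1" && sha256sum suse-ai-containers.tgz suse-ai-containers.txt > SHA256SUMS)
}

# On the local host: verify the transferred files against the recorded checksums.
# Returns non-zero if any file is missing or differs.
verify_manifest() {
  (cd "$1" && sha256sum --check --quiet SHA256SUMS)
}
```

Run `make_manifest charts` before the transfer and `verify_manifest charts` after it. Load the images with SUSE-AI-load-images.sh only if the verification succeeds.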

5.2 Installing cert-manager #

cert-manager is an extensible X.509 certificate controller for Kubernetes workloads. It supports certificates from popular public issuers as well as private issuers. cert-manager ensures that the certificates are valid and up-to-date, and attempts to renew certificates at a configured time before expiry.

In previous releases, cert-manager was automatically installed together with Open WebUI. Currently, cert-manager is no longer part of the Open WebUI Helm chart and you need to install it separately.

5.2.1 Details about the cert-manager application #

Before deploying cert-manager, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:

helm show values oci://dp.apps.rancher.io/charts/cert-manager

Alternatively, you can also refer to the cert-manager Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/cert-manager. It contains available versions and the link to pull the cert-manager container image.

5.2.2 cert-manager installation procedure #

Tip

Before the installation, you need to get user access to the SUSE Application Collection, create a Kubernetes namespace, and log in to the Helm registry as described in Section 5.1, “Installation procedure”.

Install the cert-manager chart.

> helm upgrade \
  --install cert-manager charts/cert-manager-X.Y.Z.tgz \
  -n CERT_MANAGER_NAMESPACE \
  --set crds.enabled=true \
  --set 'global.imagePullSecrets[0].name'=application-collection \
  --set 'global.imageRegistry'=LOCAL_DOCKER_REGISTRY_URL:5043

5.2.3 Uninstalling cert-manager #

To uninstall cert-manager, run the following command:

> helm uninstall cert-manager -n CERT_MANAGER_NAMESPACE

5.3 Installing Milvus #

Milvus is a scalable, high-performance vector database designed for AI applications. It enables efficient organization and searching of massive unstructured datasets, including text, images and multi-modal content. This procedure walks you through the installation of Milvus and its dependencies.

5.3.1 Details about the Milvus application #

Before deploying Milvus, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:

helm show values oci://dp.apps.rancher.io/charts/milvus

Alternatively, you can also refer to the Milvus Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/milvus. It contains Milvus dependencies, available versions and the link to pull the Milvus container image.

Figure 20: Milvus page in the SUSE Application Collection #

5.3.2 Milvus installation procedure #

Tip

When installed as part of SUSE AI, Milvus depends on etcd, MinIO and Apache Kafka. Because the Milvus chart uses a non-default configuration, create an override file milvus_custom_overrides.yaml with the following content.

Tip

As a template, you can use the values.yaml file that is included in the charts/milvus-X.Y.Z.tgz TAR archive.

global:
  imagePullSecrets:
  - application-collection
  imageRegistry: LOCAL_DOCKER_REGISTRY_URL:5043
cluster:
  enabled: True
standalone:
  persistence:
    persistentVolumeClaim:
      storageClassName: "local-path"
etcd:
  replicaCount: 1
  persistence:
    storageClassName: "local-path"
minio:
  mode: distributed
  replicas: 4
  rootUser: "admin"
  rootPassword: "adminminio"
  persistence:
    storageClass: "local-path"
  resources:
    requests:
      memory: 1024Mi
kafka:
  enabled: true
  name: kafka
  replicaCount: 3
  broker:
    enabled: true
  cluster:
    listeners:
      client:
        protocol: 'PLAINTEXT'
      controller:
        protocol: 'PLAINTEXT'
  persistence:
    enabled: true
    annotations: {}
    labels: {}
    existingClaim: ""
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 8Gi
    storageClassName: "local-path"
extraConfigFiles:1
  user.yaml: |+
    trace:
      exporter: jaeger
      sampleFraction: 1
      jaeger:
        url: "http://opentelemetry-collector.observability.svc.cluster.local:14268/api/traces"2

1	The `extraConfigFiles` section is optional, required only to receive telemetry data from Open WebUI.
2	The URL of the OpenTelemetry Collector installed by the user.

Tip

The above example uses local storage. For production environments, we recommend using an enterprise-class storage solution such as SUSE Storage, in which case the storageClassName option must be set to longhorn.

Install the Milvus Helm chart using the milvus_custom_overrides.yaml override file.

> helm upgrade --install \
  milvus charts/milvus-X.Y.Z.tgz \
  -n SUSE_AI_NAMESPACE \
  --version X.Y.Z -f milvus_custom_overrides.yaml
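After the installation, you may want to confirm that the release was deployed and that its pods are starting. The following sketch wraps `helm status` and `kubectl get pods`. The function name `check_release` is hypothetical, and the same helper can be used for the other AI Library releases.

```shell
# Shows the Helm release status, then the pods in the release namespace.
# $1 is the release name (for example "milvus"), $2 is the namespace.
check_release() {
  release="$1"
  namespace="$2"
  helm status "$release" --namespace "$namespace" &&
    kubectl get pods --namespace "$namespace"
}
```

For example, `check_release milvus SUSE_AI_NAMESPACE` reports the Milvus release and lists the pods in your namespace, including etcd, MinIO and Apache Kafka if they are deployed there.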

5.3.2.1 Using Apache Kafka with SUSE Storage #

When Milvus is deployed in cluster mode, it uses Apache Kafka as a message queue. If Apache Kafka uses SUSE Storage as a storage back-end, you need to create an XFS storage class and make it available for the Apache Kafka deployment. Otherwise deploying Apache Kafka with a storage class of an Ext4 file system fails with the following error:

"Found directory /mnt/kafka/logs/lost+found, 'lost+found' is not
  in the form of topic-partition or topic-partition.uniqueId-delete
  (if marked for deletion)"

To introduce the XFS storage class, follow these steps:

Create a file named longhorn-xfs.yaml with the following content:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-xfs
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "30"
  fromBackup: ""
  fsType: "xfs"
  dataLocality: "disabled"
  unmapMarkSnapChainRemoved: "ignored"

Create the new storage class using the kubectl command.
```
> kubectl apply -f longhorn-xfs.yaml
```
Update the Milvus overrides YAML file to reference the Apache Kafka storage class, as in the following example:
```
[...]
kafka:
  enabled: true
  persistence:
    storageClassName: longhorn-xfs
```
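Before redeploying Milvus, you can check that the new storage class exists and uses the expected file system. This is a sketch with a hypothetical function name. It reads the `fsType` parameter defined in longhorn-xfs.yaml.

```shell
# Prints the fsType parameter of a storage class (default: longhorn-xfs).
storage_class_fs_type() {
  kubectl get storageclass "${1:-longhorn-xfs}" \
    --output jsonpath='{.parameters.fsType}'
}
```

If the storage class was created from the file above, `storage_class_fs_type` should report the XFS file system. An error from kubectl indicates that the storage class does not exist.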

5.3.3 Uninstalling Milvus #

To uninstall Milvus, run the following command:

> helm uninstall milvus -n SUSE_AI_NAMESPACE

5.4 Installing Ollama #

Ollama is a tool for running and managing language models locally on your computer. It offers a simple interface to download, run and interact with models without relying on cloud resources.

Tip

When installing SUSE AI, Ollama is installed by the Open WebUI installation by default. If you decide to install Ollama separately, disable its installation during the installation of Open WebUI as outlined in Example 4, “Open WebUI override file with Ollama installed separately”.

5.4.1 Details about the Ollama application #

Before deploying Ollama, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:

helm show values oci://dp.apps.rancher.io/charts/ollama

Alternatively, you can also refer to the Ollama Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/ollama. It contains the available versions and a link to pull the Ollama container image.

5.4.2 Ollama installation procedure #

Tip

Create the ollama_custom_overrides.yaml file to override the values of the parent Helm chart. Refer to Section 5.4.4, “Values for the Ollama Helm chart” for more details.
Install the Ollama Helm chart using the ollama_custom_overrides.yaml override file.
```
> helm upgrade \
  --install ollama charts/ollama-X.Y.Z.tgz \
  -n SUSE_AI_NAMESPACE \
  -f ollama_custom_overrides.yaml
```
Important: Downloading AI models
Ollama normally needs to have an active Internet connection to download AI models. In an air-gapped environment, you must download the models manually and copy them to your local Ollama instance, for example:
```
kubectl cp \
  PATH_TO_LOCALLY_DOWNLOADED_MODELS/blobs/* \
  OLLAMA_POD_NAME:~/.ollama/models/blobs/
```
Tip: Hugging Face models
Models downloaded from Hugging Face need to be converted before they can be used by Ollama. Refer to https://github.com/ollama/ollama/blob/main/docs/import.md for more details.
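After copying the models, you can check whether Ollama recognizes them by running `ollama list` inside the pod. The following is a sketch. The function name `list_ollama_models` is hypothetical, and it assumes that the `ollama` binary is on the container's path.

```shell
# Lists the models known to the Ollama instance running in a pod.
# $1 is the namespace, $2 is the Ollama pod name.
list_ollama_models() {
  kubectl exec --namespace "$1" "$2" -- ollama list
}
```

For example, run `list_ollama_models SUSE_AI_NAMESPACE OLLAMA_POD_NAME`. If a copied model is missing from the output, check that all of its files were copied into the Ollama models directory.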

5.4.3 Uninstalling Ollama #

To uninstall Ollama, run the following command:

> helm uninstall ollama -n SUSE_AI_NAMESPACE

5.4.4 Values for the Ollama Helm chart #

To override the default values during the Helm chart installation or update, you can create an override YAML file with custom values. Then, apply these values by specifying the path to the override file with the -f option of the helm command. Remember to replace SUSE_AI_NAMESPACE with your Kubernetes namespace.

Important: GPU section

Ollama can run optimized for NVIDIA GPUs if the following conditions are fulfilled:

The NVIDIA driver and NVIDIA GPU Operator are installed as described in Installing NVIDIA GPU Drivers on SLES or Installing NVIDIA GPU Drivers on SUSE Linux Micro.
The workloads are set to run on NVIDIA-enabled nodes as described in https://documentation.suse.com/suse-ai/1.0/html/AI-deployment-intro/index.html#ai-gpu-nodes-assigning.

If you do not want to use the NVIDIA GPU, remove the gpu section from ollama_custom_overrides.yaml or disable it.

ollama:
  [...]
  gpu:
    enabled: false
    type: 'nvidia'
    number: 1

Example 1: Basic override file with GPU and two models pulled at startup #

global:
  imagePullSecrets:
  - application-collection
ingress:
  enabled: false
defaultModel: "gemma:2b"
runtimeClassName: nvidia
ollama:
  models:
    pull:
      - "gemma:2b"
      - "llama3.1"
    run:
      - "gemma:2b"
      - "llama3.1"
  gpu:
    enabled: true
    type: 'nvidia'
    number: 1
    nvidiaResource: "nvidia.com/gpu"
persistentVolume:1
  enabled: true
  storageClass: local-path2

1	Without the `persistentVolume` option enabled, changes made to Ollama, such as downloading other LLMs, are lost when the container is restarted.
2	Use `local-path` storage only for testing purposes. For production use, we recommend using a storage solution suitable for persistent storage, such as SUSE Storage.

Example 2: Basic override file with Ingress and no GPU #

ollama:
  models:
    pull:
      - llama2
    run:
      - llama2
persistentVolume:
  enabled: true
  storageClass: local-path1
ingress:
  enabled: true
  hosts:
  - host: OLLAMA_API_URL
    paths:
      - path: /
        pathType: Prefix

1	Use `local-path` storage (requires installing the corresponding provisioner) only for testing purposes. For production use, we recommend using a storage solution suitable for persistent storage, such as SUSE Storage.

Table 6: Override file options for the Ollama Helm chart #

Key	Type	Default	Description
affinity	object	{}	Affinity for pod assignment
autoscaling.enabled	bool	false	Enable autoscaling
autoscaling.maxReplicas	int	100	Number of maximum replicas
autoscaling.minReplicas	int	1	Number of minimum replicas
autoscaling.targetCPUUtilizationPercentage	int	80	CPU usage to target replica
extraArgs	list	[]	Additional arguments on the output Deployment definition.
extraEnv	list	[]	Additional environment variables on the output Deployment definition.
fullnameOverride	string	""	String to fully override template
global.imagePullSecrets	list	[]	Global override for container image registry pull secrets
global.imageRegistry	string	""	Global override for container image registry
hostIPC	bool	false	Use the host’s IPC namespace
hostNetwork	bool	false	Use the host's network namespace
hostPID	bool	false	Use the host's PID namespace.
image.pullPolicy	string	"IfNotPresent"	Image pull policy to use for the Ollama container
image.registry	string	"dp.apps.rancher.io"	Image registry to use for the Ollama container
image.repository	string	"containers/ollama"	Image repository to use for the Ollama container
image.tag	string	"0.3.6"	Image tag to use for the Ollama container
imagePullSecrets	list	[]	Docker registry secret names as an array
ingress.annotations	object	{}	Additional annotations for the Ingress resource
ingress.className	string	""	IngressClass that is used to implement the Ingress (Kubernetes 1.18+)
ingress.enabled	bool	false	Enable Ingress controller resource
ingress.hosts[0].host	string	"ollama.local"
ingress.hosts[0].paths[0].path	string	"/"
ingress.hosts[0].paths[0].pathType	string	"Prefix"
ingress.tls	list	[]	The TLS configuration for host names to be covered with this Ingress record
initContainers	list	[]	Init containers to add to the pod
knative.containerConcurrency	int	0	Knative service container concurrency
knative.enabled	bool	false	Enable Knative integration
knative.idleTimeoutSeconds	int	300	Knative service idle timeout seconds
knative.responseStartTimeoutSeconds	int	300	Knative service response start timeout seconds
knative.timeoutSeconds	int	300	Knative service timeout seconds
livenessProbe.enabled	bool	true	Enable livenessProbe
livenessProbe.failureThreshold	int	6	Failure threshold for livenessProbe
livenessProbe.initialDelaySeconds	int	60	Initial delay seconds for livenessProbe
livenessProbe.path	string	"/"	Request path for livenessProbe
livenessProbe.periodSeconds	int	10	Period seconds for livenessProbe
livenessProbe.successThreshold	int	1	Success threshold for livenessProbe
livenessProbe.timeoutSeconds	int	5	Timeout seconds for livenessProbe
nameOverride	string	""	String to partially override template (maintains the release name)
nodeSelector	object	{}	Node labels for pod assignment
ollama.gpu.enabled	bool	false	Enable GPU integration
ollama.gpu.number	int	1	Specify the number of GPUs
ollama.gpu.nvidiaResource	string	"nvidia.com/gpu"	Only for NVIDIA cards; change to `nvidia.com/mig-1g.10gb` to use MIG slice
ollama.gpu.type	string	"nvidia"	GPU type: “nvidia” or “amd”. If “ollama.gpu.enabled” is enabled, the default value is “nvidia”. If set to “amd”, this adds the “rocm” suffix to the image tag if “image.tag” is not overridden. This is because AMD and CPU/CUDA are different images.
ollama.insecure	bool	false	Add insecure flag for pulling at container startup
ollama.models	list	[]	List of models to pull at container startup. The more models you add, the longer the container takes to start if the models are not already present. Example: models: [llama2, mistral]
ollama.mountPath	string	""	Override ollama-data volume mount path, default: "/root/.ollama"
persistentVolume.accessModes	list	["ReadWriteOnce"]	Ollama server data Persistent Volume access modes. Must match those of existing PV or dynamic provisioner, see https://kubernetes.io/docs/concepts/storage/persistent-volumes/.
persistentVolume.annotations	object	{}	Ollama server data Persistent Volume annotations
persistentVolume.enabled	bool	false	Enable persistence using PVC
persistentVolume.existingClaim	string	""	If you want to bring your own PVC for persisting Ollama state, pass the name of the created + ready PVC here. If set, this Chart does not create the default PVC. Requires `server.persistentVolume.enabled: true`
persistentVolume.size	string	"30Gi"	Ollama server data Persistent Volume size
persistentVolume.storageClass	string	""	If persistentVolume.storageClass is present, and is set to either a dash (“-”) or an empty string (“”), dynamic provisioning is disabled. Otherwise, the storageClassName for the persistent volume claim is set to the value specified by persistentVolume.storageClass. If persistentVolume.storageClass is absent, the default storage class is used for dynamic provisioning whenever possible. See https://kubernetes.io/docs/concepts/storage/storage-classes/ for more details.
persistentVolume.subPath	string	""	Subdirectory of Ollama server data Persistent Volume to mount. Useful if the volume's root directory is not empty.
persistentVolume.volumeMode	string	""	Ollama server data Persistent Volume Binding Mode. If empty (the default) or set to null, no volumeBindingMode specification is set, choosing the default mode.
persistentVolume.volumeName	string	""	Ollama server Persistent Volume name. It can be used to force-attach the created PVC to a specific PV.
podAnnotations	object	{}	Map of annotations to add to the pods
podLabels	object	{}	Map of labels to add to the pods
podSecurityContext	object	{}	Pod Security Context
readinessProbe.enabled	bool	true	Enable readinessProbe
readinessProbe.failureThreshold	int	6	Failure threshold for readinessProbe
readinessProbe.initialDelaySeconds	int	30	Initial delay seconds for readinessProbe
readinessProbe.path	string	"/"	Request path for readinessProbe
readinessProbe.periodSeconds	int	5	Period seconds for readinessProbe
readinessProbe.successThreshold	int	1	Success threshold for readinessProbe
readinessProbe.timeoutSeconds	int	3	Timeout seconds for readinessProbe
replicaCount	int	1	Number of replicas
resources.limits	object	{}	Pod limit
resources.requests	object	{}	Pod requests
runtimeClassName	string	""	Specify runtime class
securityContext	object	{}	Container Security Context
service.annotations	object	{}	Annotations to add to the service
service.nodePort	int	31434	Service node port when service type is “NodePort”
service.port	int	11434	Service port
service.type	string	"ClusterIP"	Service type
serviceAccount.annotations	object	{}	Annotations to add to the service account
serviceAccount.automount	bool	true	Whether to automatically mount a ServiceAccount's API credentials
serviceAccount.create	bool	true	Whether a service account should be created
serviceAccount.name	string	""	The name of the service account to use. If not set and “create” is “true”, a name is generated using the full name template.
tolerations	list	[]	Tolerations for pod assignment
topologySpreadConstraints	object	{}	Topology Spread Constraints for pod assignment
updateStrategy	object	{"type":""}	How to replace existing pods.
updateStrategy.type	string	""	Can be “Recreate” or “RollingUpdate”; default is “RollingUpdate”
volumeMounts	list	[]	Additional volumeMounts on the output Deployment definition
volumes	list	[]	Additional volumes on the output Deployment definition

5.5 Installing Open WebUI #

Open WebUI is a Web-based user interface designed for interacting with AI models.

5.5.1 Details about the Open WebUI application #

Before deploying Open WebUI, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:

helm show values oci://dp.apps.rancher.io/charts/open-webui

Alternatively, you can also refer to the Open WebUI Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/open-webui. It contains available versions and the link to pull the Open WebUI container image.

5.5.2 Open WebUI installation procedure #

Tip

Requirements #

To install Open WebUI, you need to have the following:

An installed cert-manager. If cert-manager is not installed from previous Open WebUI releases, install it by following the steps in Section 5.2, “Installing cert-manager”.

Create the owui_custom_overrides.yaml file to override the values of the parent Helm chart. The file contains URLs for Milvus and Ollama and specifies whether a stand-alone Ollama deployment is used or whether Ollama is installed as part of the Open WebUI installation. Find more details in Section 5.5.4, “Examples of Open WebUI Helm chart override files”. For a list of all installation options with examples, refer to Section 5.5.5, “Values for the Open WebUI Helm chart”.

Install the Open WebUI Helm chart using the owui_custom_overrides.yaml override file.

> helm upgrade --install \
  open-webui charts/open-webui-X.Y.Z.tgz \
  -n SUSE_AI_NAMESPACE \
  --version X.Y.Z -f owui_custom_overrides.yaml

5.5.3 Uninstalling Open WebUI #

To uninstall Open WebUI, run the following command:

> helm uninstall open-webui -n SUSE_AI_NAMESPACE

5.5.4 Examples of Open WebUI Helm chart override files #

Example 3: Open WebUI override file with Ollama included #

The following override file installs Ollama during the Open WebUI installation.

global:
  imagePullSecrets:
  - application-collection
  imageRegistry: LOCAL_DOCKER_REGISTRY_URL:5043
ollamaUrls:
- http://open-webui-ollama.SUSE_AI_NAMESPACE.svc.cluster.local:11434
persistence:
  enabled: true
  storageClass: local-path1
ollama:
  enabled: true
  ingress:
    enabled: false
  defaultModel: "gemma:2b"
  ollama:
    models:2
      - "gemma:2b"
      - "llama3.1"
    gpu:3
      enabled: true
      type: 'nvidia'
      number: 1
    persistentVolume:4
      enabled: true
      storageClass: local-path
pipelines:
  enabled: true
  persistence:
    storageClass: local-path
  extraEnvVars: 5
    - name: PIPELINES_URLS 6
      value: "https://raw.githubusercontent.com/SUSE/suse-ai-observability-extension/refs/heads/main/integrations/oi-filter/suse_ai_filter.py"
    - name: OTEL_SERVICE_NAME 7
      value: "Open WebUI"
    - name: OTEL_EXPORTER_HTTP_OTLP_ENDPOINT 8
      value: "http://opentelemetry-collector.suse-observability.svc.cluster.local:4318"
    - name: PRICING_JSON 9
      value: "https://raw.githubusercontent.com/SUSE/suse-ai-observability-extension/refs/heads/main/integrations/oi-filter/pricing.json"
ingress:
  enabled: true
  class: ""
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "1024m"
  host: suse-ollama-webui10
  tls: true
extraEnvVars:
- name: DEFAULT_MODELS11
  value: "gemma:2b"
- name: DEFAULT_USER_ROLE
  value: "user"
- name: WEBUI_NAME
  value: "SUSE AI"
- name: GLOBAL_LOG_LEVEL
  value: INFO
- name: RAG_EMBEDDING_MODEL
  value: "sentence-transformers/all-MiniLM-L6-v2"
- name: VECTOR_DB
  value: "milvus"
- name: MILVUS_URI
  value: http://milvus.SUSE_AI_NAMESPACE.svc.cluster.local:19530
- name: INSTALL_NLTK_DATASETS12
  value: "true"
- name: OMP_NUM_THREADS
  value: "1"
- name: OPENAI_API_KEY 13
  value: "0p3n-w3bu!"

1	Use `local-path` storage only for testing purposes. For production use, we recommend using a storage solution more suitable for persistent storage. To use SUSE Storage, specify `longhorn`.
2	Specifies that two large language models (LLMs) will be loaded in Ollama when the container starts.
3	Enables GPU support for Ollama. The `type` must be `nvidia` because NVIDIA GPUs are the only supported devices. `number` must be between 1 and the number of NVIDIA GPUs present on the system.
4	Without the `persistentVolume` option enabled, changes made to Ollama, such as downloading other LLMs, are lost when the container is restarted.
5	The environment variables that you are making available for the pipeline's runtime container.
6	A list of pipeline URLs to be downloaded and installed by default. Individual URLs are separated by a semicolon `;`. For air-gapped deployments, you need to provide the pipelines at URLs that are accessible from the local host, such as an internal GitLab instance.
7	The service name that appears in traces and topological representations in SUSE Observability.
8	The endpoint for the OpenTelemetry collector. Make sure to use the HTTP port of your collector.
9	A JSON file with the model multipliers used for cost estimation. You can adjust it experimentally to match your actual infrastructure. For air-gapped deployments, you need to provide the file at a URL that is accessible from the local host, such as an internal GitLab instance.
10	Specifies the host name for the Open WebUI Web UI.
11	Specifies the default LLM for Ollama.
12	Installs the natural language toolkit (NLTK) datasets for Ollama. Refer to https://www.nltk.org/index.html for licensing information.
13	API key value for communication between Open WebUI and Open WebUI Pipelines. The default value is “0p3n-w3bu!”.
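
For production use, callout 1 recommends replacing `local-path` with a storage class such as `longhorn`. The following fragment is a minimal sketch of that change. It only repeats the `storageClass` keys from the example above and assumes that SUSE Storage is already installed and exposes a storage class named `longhorn`. Verify the class name on your cluster before using it.

```yaml
# Sketch only: production storage classes for the Example 3 override file.
# Assumption: a storage class named "longhorn" exists on the cluster.
# The key paths mirror the example above.
persistence:
  enabled: true
  storageClass: longhorn
ollama:
  ollama:
    persistentVolume:
      enabled: true
      storageClass: longhorn
pipelines:
  persistence:
    storageClass: longhorn
```

All other keys of the override file stay unchanged.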

Example 4: Open WebUI override file with Ollama installed separately #

The following override file installs Ollama separately from the Open WebUI installation.

global:
  imagePullSecrets:
  - application-collection
  imageRegistry: LOCAL_DOCKER_REGISTRY_URL:5043
ollamaUrls:
- http://ollama.SUSE_AI_NAMESPACE.svc.cluster.local:11434
persistence:
  enabled: true
  storageClass: local-path  # (1)
ollama:
  enabled: false
pipelines:
  enabled: false
  persistence:
    storageClass: local-path  # (2)
ingress:
  enabled: true
  class: ""
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  host: suse-ollama-webui
  tls: true
extraEnvVars:
- name: DEFAULT_MODELS  # (3)
  value: "gemma:2b"
- name: DEFAULT_USER_ROLE
  value: "user"
- name: WEBUI_NAME
  value: "SUSE AI"
- name: GLOBAL_LOG_LEVEL
  value: INFO
- name: RAG_EMBEDDING_MODEL
  value: "sentence-transformers/all-MiniLM-L6-v2"
- name: VECTOR_DB
  value: "milvus"
- name: MILVUS_URI
  value: http://milvus.SUSE_AI_NAMESPACE.svc.cluster.local:19530
- name: ENABLE_OTEL  # (4)
  value: "true"
- name: OTEL_EXPORTER_OTLP_ENDPOINT  # (5)
  value: http://opentelemetry-collector.observability.svc.cluster.local:4317  # (6)
- name: OMP_NUM_THREADS
  value: "1"

1 2	Use `local-path` storage only for testing purposes. For production use, we recommend using a storage solution suitable for persistent storage, such as SUSE Storage.
3	Specifies the default LLM for Ollama.
4 5	These values are optional. They are required only if you want to receive telemetry data from Open WebUI.
6	The URL of the OpenTelemetry Collector installed by the user.

Example 5: Open WebUI override file with pipelines enabled #

The following override file installs Ollama separately and enables Open WebUI pipelines. The pipeline used in this example is a simple filter that limits the number of question-and-answer turns in an LLM chat.

Tip

Pipelines normally require additional configuration provided either via environment variables or specified in the Open WebUI Web UI.

global:
  imagePullSecrets:
  - application-collection
  imageRegistry: LOCAL_DOCKER_REGISTRY_URL:5043
ollamaUrls:
- http://ollama.SUSE_AI_NAMESPACE.svc.cluster.local:11434
persistence:
  enabled: true
  storageClass: local-path
ollama:
  enabled: false
pipelines:
  enabled: true
  persistence:
    storageClass: local-path
  extraEnvVars:
  - name: PIPELINES_URLS  # (1)
    value: "https://raw.githubusercontent.com/SUSE/suse-ai-observability-extension/refs/heads/main/integrations/oi-filter/conversation_turn_limit_filter.py"
ingress:
  enabled: true
  class: ""
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  host: suse-ollama-webui
  tls: true
[...]

1	A list of pipeline URLs to be downloaded and installed by default. Individual URLs are separated by a semicolon `;`. For air-gapped deployments, you need to provide the pipelines at URLs that are accessible from the local host, such as an internal GitLab instance.
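
The following fragment sketches how the semicolon-separated list described in callout 1 might look in an air-gapped deployment. The host name `gitlab.internal.example` and the repository path are hypothetical placeholders. The two file names are taken from the examples in this section and are assumed to be mirrored to the internal server beforehand.

```yaml
# Sketch only: two pipelines served from an internal mirror.
# Assumption: gitlab.internal.example and the repository path are
# placeholders for a server that is reachable from inside the cluster.
pipelines:
  enabled: true
  extraEnvVars:
  - name: PIPELINES_URLS
    # Individual URLs are separated by a semicolon.
    value: "https://gitlab.internal.example/ai/pipelines/-/raw/main/conversation_turn_limit_filter.py;https://gitlab.internal.example/ai/pipelines/-/raw/main/suse_ai_filter.py"
```

Quoting the whole value keeps the semicolon and the URLs together as a single string.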

5.5.5 Values for the Open WebUI Helm chart #

Table 7: Available options for the Open WebUI Helm chart #

Key	Type	Default	Description
affinity	object	{}	Affinity for pod assignment
annotations	object	{}
cert-manager.enabled	bool	true
clusterDomain	string	"cluster.local"	Value of cluster domain
containerSecurityContext	object	{}	Configure container security context, see https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-containe.
extraEnvVars	list	[{"name":"OPENAI_API_KEY", "value":"0p3n-w3bu!"}]	Environment variables added to the Open WebUI deployment. The most up-to-date list of environment variables can be found at https://docs.openwebui.com/getting-started/env-configuration/.
extraEnvVars[0]	object	{"name":"OPENAI_API_KEY","value":"0p3n-w3bu!"}	Default API key value for Pipelines. It should be updated in a production deployment and changed to the required API key if not using Pipelines.
global.imagePullSecrets	list	[]	Global override for container image registry pull secrets
global.imageRegistry	string	""	Global override for container image registry
global.tls.additionalTrustedCAs	bool	false
global.tls.issuerName	string	"suse-private-ai"
global.tls.letsEncrypt.email	string	"none@example.com"
global.tls.letsEncrypt.environment	string	"staging"
global.tls.letsEncrypt.ingress.class	string	""
global.tls.source	string	"suse-private-ai"	The source of Open WebUI TLS keys, see Section 5.5.5.1, “TLS sources”.
image.pullPolicy	string	"IfNotPresent"	Image pull policy to use for the Open WebUI container
image.registry	string	"dp.apps.rancher.io"	Image registry to use for the Open WebUI container
image.repository	string	"containers/open-webui"	Image repository to use for the Open WebUI container
image.tag	string	"0.3.32"	Image tag to use for the Open WebUI container
imagePullSecrets	list	[]	Configure imagePullSecrets to use private registry, see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry.
ingress.annotations	object	{"nginx.ingress.kubernetes.io/ssl-redirect":"true"}	Use appropriate annotations for your Ingress controller, such as `nginx.ingress.kubernetes.io/rewrite-target: /` for NGINX.
ingress.class	string	""
ingress.enabled	bool	true
ingress.existingSecret	string	""
ingress.host	string	""
ingress.tls	bool	true
nameOverride	string	""
nodeSelector	object	{}	Node labels for pod assignment
ollama.enabled	bool	true	Automatically install Ollama Helm chart from https://otwld.github.io/ollama-helm/. Configure the following Helm values.
ollama.fullnameOverride	string	"open-webui-ollama"	If enabling embedded Ollama, update fullnameOverride to your desired Ollama name value, or else it will use the default ollama.name value from the Ollama chart.
ollamaUrls	list	[]	A list of Ollama API endpoints. These can be added instead of automatically installing the Ollama Helm chart, or in addition to it.
openaiBaseApiUrl	string	""	OpenAI base API URL to use. Defaults to the Pipelines service endpoint when Pipelines are enabled, or to `https://api.openai.com/v1` if Pipelines are not enabled and this value is blank.
persistence.accessModes	list	["ReadWriteOnce"]	If using multiple replicas, you must update accessModes to ReadWriteMany.
persistence.annotations	object	{}
persistence.enabled	bool	true
persistence.existingClaim	string	""	Use existingClaim to reuse an existing Open WebUI PVC instead of creating a new one.
persistence.selector	object	{}
persistence.size	string	"2Gi"
persistence.storageClass	string	""
pipelines.enabled	bool	false	Automatically install Pipelines chart to extend Open WebUI functionality using Pipelines, see https://github.com/open-webui/pipelines.
pipelines.extraEnvVars	list	[]	This section can be used to pass the required environment variables to your pipelines (such as the Langfuse host name).
podAnnotations	object	{}
podSecurityContext	object	{}	Configure pod security context, see https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-containe.
replicaCount	int	1
resources	object	{}
service	object	{"annotations":{},"containerPort":8080, "labels":{},"loadBalancerClass":"", "nodePort":"","port":80,"type":"ClusterIP"}	Service values to expose Open WebUI pods to cluster
tolerations	list	[]	Tolerations for pod assignment
topologySpreadConstraints	list	[]	Topology Spread Constraints for pod assignment
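
As an illustration of how the options in Table 7 combine, the following sketch runs two Open WebUI replicas. According to the `persistence.accessModes` row, multiple replicas require the `ReadWriteMany` access mode. The storage class name is a placeholder, and the sketch assumes that your storage back-end supports `ReadWriteMany` volumes.

```yaml
# Sketch only: two Open WebUI replicas sharing one persistent volume.
# Assumption: RWX_CAPABLE_STORAGE_CLASS is a placeholder for a storage
# class that can provision ReadWriteMany volumes.
replicaCount: 2
persistence:
  enabled: true
  accessModes:
  - ReadWriteMany
  storageClass: RWX_CAPABLE_STORAGE_CLASS
```

Dotted keys in the table, such as `persistence.accessModes`, correspond to nested keys in the override file.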

5.5.5.1 TLS sources #

There are three recommended sources from which Open WebUI can obtain TLS certificates for secure communication.

Self-Signed TLS certificate

This is the default method. You need to install cert-manager on the cluster to issue and maintain the certificates. This method generates a CA and signs the Open WebUI certificate using the CA. cert-manager then manages the signed certificate.

For this method, use the following Helm chart option:

global.tls.source=suse-private-ai

Let's Encrypt

This method also uses cert-manager, but it is combined with a special issuer for Let's Encrypt that performs all actions—including request and validation—to get the Let's Encrypt certificate issued. This configuration uses HTTP validation (HTTP-01) and therefore the load balancer must have a public DNS record and be accessible from the Internet.

For this method, use the following Helm chart option:

global.tls.source=letsEncrypt
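
In an override file, the Let's Encrypt method might be expressed as in the following sketch. It uses only the `global.tls.letsEncrypt` keys listed in Table 7. The e-mail address, host name and Ingress class are placeholder assumptions. Because HTTP-01 validation requires access from the Internet, this method is usually not applicable inside an isolated air-gapped network.

```yaml
# Sketch only: Let's Encrypt as the TLS source.
# Assumptions: the e-mail address, the host name and the Ingress class
# "nginx" are placeholders for values from your environment.
global:
  tls:
    source: letsEncrypt
    letsEncrypt:
      email: admin@example.com
      environment: staging   # chart default, see Table 7
      ingress:
        class: nginx
ingress:
  enabled: true
  host: webui.example.com
  tls: true
```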

Provide your own certificate

This method allows you to bring your own signed certificate to secure the HTTPS traffic. In this case, you must upload this certificate and associated key as PEM-encoded files named tls.crt and tls.key.

For this method, use the following Helm chart option:

global.tls.source=secret
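
One way to upload the certificate and key is a standard Kubernetes TLS Secret, sketched below. The Secret name `open-webui-tls` is a hypothetical placeholder. This guide does not state which Secret name the chart expects, so check the chart documentation. The `ingress.existingSecret` option from Table 7 is presumably where the name is referenced.

```yaml
# Sketch only: a Kubernetes TLS Secret holding your own certificate.
# Assumptions: the Secret name is hypothetical; the data values are
# placeholders for the base64-encoded content of tls.crt and tls.key.
apiVersion: v1
kind: Secret
metadata:
  name: open-webui-tls
  namespace: SUSE_AI_NAMESPACE
type: kubernetes.io/tls
data:
  tls.crt: BASE64_ENCODED_CERTIFICATE
  tls.key: BASE64_ENCODED_PRIVATE_KEY
```

The `kubernetes.io/tls` Secret type requires exactly the two data keys `tls.crt` and `tls.key`, which matches the file names mentioned above.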

6 Steps after the installation is complete #

Once the SUSE AI installation is finished, perform the following tasks to complete the initial setup and configuration.

Log in to SUSE AI Open WebUI using the default credentials.
After you have logged in, update the administrator password for SUSE AI.
From the available language models, configure the one you prefer. Optionally, install a custom language model. Refer to the sections Setting base AI models and Setting the default AI model for more details.
Configure user management with role-based access control (RBAC) as described in https://documentation.suse.com/suse-ai/1.0/html/openwebui-configuring/index.html#openwebui-managing-user-roles.
Integrate a single sign-on authentication manager, such as Okta, with Open WebUI as described in https://documentation.suse.com/suse-ai/1.0/html/openwebui-configuring/index.html#openwebui-authentication-via-okta.
Configure retrieval-augmented generation (RAG) to let the model process content relevant to the customer.

7 Legal Notice #

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”.

For SUSE trademarks, see https://www.suse.com/company/legal/. All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks.

All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors, nor the translators shall be held liable for possible errors or the consequences thereof.