Deploying and Installing SUSE AI in Air-Gapped Environments|Installing applications from AI Library
Applies to SUSE AI 1.0

5 Installing applications from AI Library

SUSE AI is delivered as a set of components that you can combine to meet specific use cases. To enable the full integrated stack, you need to deploy multiple applications in sequence. Applications with the fewest dependencies must be installed first, followed by dependent applications once their required dependencies are in place within the cluster.

Important
Important: Separate clusters for specific SUSE AI components

For production deployments, we strongly recommend deploying Rancher, SUSE Observability, and workloads from the AI Library to separate Kubernetes clusters.

5.1 Installation procedure

This procedure describes how to install AI Library applications in an air-gapped environment.

Note
Note

If the following steps do not specify on which part of the air-gapped architecture (local or remote) a task should be performed, assume remote. Tasks for the isolated local part are always specified explicitly.

  1. Purchase the SUSE AI entitlement. It is a separate entitlement from SUSE Rancher Prime.

  2. Access SUSE AI via the SUSE Application Collection at https://apps.rancher.io/, where the check for the SUSE AI entitlement is performed.

  3. If the entitlement check is successful, you are given access to the SUSE AI-related Helm charts and container images, and can deploy directly from the SUSE Application Collection.

  4. Visit the SUSE Application Collection, sign in and get the user access token as described in https://docs.apps.rancher.io/get-started/authentication/.

  5. On the local cluster, create a Kubernetes namespace if it does not already exist. The steps in this procedure assume that all containers are deployed into the same namespace, referred to as SUSE_AI_NAMESPACE. Replace <SUSE_AI_NAMESPACE> with a name that matches your preferences. Helm charts of AI applications are hosted in the private SUSE-trusted registries: the SUSE Application Collection and the SUSE Registry.

    > kubectl create namespace <SUSE_AI_NAMESPACE>
  6. Create the SUSE Application Collection secret.

    > kubectl create secret docker-registry application-collection \
      --docker-server=dp.apps.rancher.io \
      --docker-username=<APPCO_USERNAME> \
      --docker-password=<APPCO_USER_TOKEN> \
      -n <SUSE_AI_NAMESPACE>
  7. Create the SUSE Registry secret.

    > kubectl create secret docker-registry suse-ai-registry \
      --docker-server=registry.suse.com \
      --docker-username=regcode \
      --docker-password=<SCC_REG_CODE> \
      -n <SUSE_AI_NAMESPACE>
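    The two secrets created above are of type kubernetes.io/dockerconfigjson. As an illustrative sketch, the following snippet builds the same .dockerconfigjson payload that such a secret stores; the user name and token are placeholder values:

```shell
# Build the .dockerconfigjson document stored by a docker-registry secret.
# APPCO_USERNAME and APPCO_USER_TOKEN are placeholder values.
APPCO_USERNAME="user@example.com"
APPCO_USER_TOKEN="token123"
# The "auth" field is the base64-encoded "user:password" pair.
AUTH=$(printf '%s:%s' "$APPCO_USERNAME" "$APPCO_USER_TOKEN" | base64 | tr -d '\n')
DOCKERCONFIG=$(printf '{"auths":{"dp.apps.rancher.io":{"username":"%s","password":"%s","auth":"%s"}}}' \
  "$APPCO_USERNAME" "$APPCO_USER_TOKEN" "$AUTH")
echo "$DOCKERCONFIG"
```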
  8. Log in to the SUSE Application Collection Helm registry.

    > helm registry login dp.apps.rancher.io/charts \
      -u <APPCO_USERNAME> \
      -p <APPCO_USER_TOKEN>
  9. Log in to the SUSE Registry Helm registry. The user name is regcode and the password is the SCC registration code of your SUSE AI subscription.

    > helm registry login registry.suse.com \
      -u regcode \
      -p <SCC_REG_CODE>
  10. On the remote host, download the SUSE-AI-get-images.sh script from the air-gap stack (Section 2.1, “SUSE AI air-gapped stack”) and run it.

    > ./SUSE-AI-get-images.sh

    This script creates a subdirectory with all necessary Helm charts plus suse-ai-containers.tgz and suse-ai-containers.txt files.

  11. Create a Docker registry on one of the local hosts so that the local Kubernetes cluster can access it.

  12. Securely transfer the created subdirectory with Helm charts plus suse-ai-containers.tgz and suse-ai-containers.txt files from the remote host to a local host and load all container images to the local Docker registry. Set DST_REGISTRY_USERNAME and DST_REGISTRY_PASSWORD environment variables if they are required to access the registry.

    > ./SUSE-AI-load-images.sh \
      -d <LOCAL_DOCKER_REGISTRY_URL> \
      -i charts/suse-ai-containers.txt \
      -f charts/suse-ai-containers.tgz
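    Conceptually, the load step rewrites each image reference listed in suse-ai-containers.txt to point at the local registry before pushing. The following is a hypothetical sketch of that mapping; the registry URL and image reference are example values and the actual script may differ:

```shell
# Map a source image reference to the local registry by replacing the
# registry host while keeping the repository path and tag.
# LOCAL_REGISTRY is a placeholder for your <LOCAL_DOCKER_REGISTRY_URL>.
LOCAL_REGISTRY="registry.local:5043"
map_image() {
  # Strip everything up to and including the first "/" (the source host).
  printf '%s/%s\n' "$LOCAL_REGISTRY" "${1#*/}"
}
map_image "dp.apps.rancher.io/containers/ollama:0.3.6"
# → registry.local:5043/containers/ollama:0.3.6
```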
  13. Install cert-manager as described in Section 5.2, “Installing cert-manager”.

  14. Install AI Library components.

    1. Install Milvus as described in Section 5.3, “Installing Milvus”.

    2. (Optional) Install Ollama as described in Section 5.4, “Installing Ollama”.

    3. Install Open WebUI as described in Section 5.5, “Installing Open WebUI”.

5.2 Installing cert-manager

cert-manager is an extensible X.509 certificate controller for Kubernetes workloads. It supports certificates from popular public issuers as well as private issuers. cert-manager ensures that the certificates are valid and up-to-date, and attempts to renew certificates at a configured time before expiry.
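For illustration, renewal timing is configured per Certificate resource through the duration and renewBefore fields. The following is a generic example with placeholder names, not a SUSE AI default:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-tls
spec:
  secretName: example-tls     # Secret where the signed certificate is stored
  dnsNames:
    - example.suse.internal   # placeholder host name
  duration: 2160h             # certificate lifetime (90 days)
  renewBefore: 360h           # renew 15 days before expiry
  issuerRef:
    name: suse-private-ai     # example issuer name
    kind: ClusterIssuer
```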

In previous releases, cert-manager was automatically installed together with Open WebUI. Currently, cert-manager is no longer part of the Open WebUI Helm chart and you need to install it separately.

5.2.1 Details about the cert-manager application

Before deploying cert-manager, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:

> helm show values oci://dp.apps.rancher.io/charts/cert-manager

Alternatively, you can also refer to the cert-manager Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/cert-manager. It contains available versions and the link to pull the cert-manager container image.

5.2.2 cert-manager installation procedure

Tip
Tip

Before the installation, you need to get user access to the SUSE Application Collection and SUSE Registry, create a Kubernetes namespace, and log in to the Helm registry as described in Section 5.1, “Installation procedure”.

Install the cert-manager chart.

> helm upgrade \
--install cert-manager charts/cert-manager-<X.Y.Z>.tgz \
  -n <CERT_MANAGER_NAMESPACE> \
  --set crds.enabled=true \
  --set 'global.imagePullSecrets[0].name'=application-collection \
  --set 'global.imageRegistry'=<LOCAL_DOCKER_REGISTRY_URL>:5043
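Before cert-manager can sign certificates, at least one issuer must exist. If your setup does not already provide one, a minimal self-signed ClusterIssuer can look as follows; the name suse-private-ai is an example chosen to match the default issuer name referenced by the Open WebUI chart:

```yaml
# Minimal self-signed ClusterIssuer; the name is an example and must
# match the issuer name expected by your workloads.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: suse-private-ai
spec:
  selfSigned: {}
```

Apply it with kubectl apply -f.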

5.2.3 Uninstalling cert-manager

To uninstall cert-manager, run the following command:

> helm uninstall cert-manager -n <CERT_MANAGER_NAMESPACE>

5.3 Installing Milvus

Milvus is a scalable, high-performance vector database designed for AI applications. It enables efficient organization and searching of massive unstructured datasets, including text, images and multi-modal content. This procedure walks you through the installation of Milvus and its dependencies.

5.3.1 Details about the Milvus application

Before deploying Milvus, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:

> helm show values oci://dp.apps.rancher.io/charts/milvus

Alternatively, you can also refer to the Milvus Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/milvus. It contains Milvus dependencies, available versions and the link to pull the Milvus container image.

Milvus page in the SUSE Application Collection
Figure 5.1: Milvus page in the SUSE Application Collection

5.3.2 Milvus installation procedure

Tip
Tip

Before the installation, you need to get user access to the SUSE Application Collection and SUSE Registry, create a Kubernetes namespace, and log in to the Helm registry as described in Section 5.1, “Installation procedure”.

  1. When installed as part of SUSE AI, Milvus depends on etcd, MinIO and Apache Kafka. Because the Milvus chart uses a non-default configuration, create an override file milvus_custom_overrides.yaml with the following content.

    Tip
    Tip

    As a template, you can use the values.yaml file that is included in the charts/milvus-<X.Y.Z>.tgz TAR archive.

    global:
      imagePullSecrets:
      - application-collection
      imageRegistry: <LOCAL_DOCKER_REGISTRY_URL>:5043
    cluster:
      enabled: true
    standalone:
      persistence:
        persistentVolumeClaim:
          storageClassName: "local-path"
    etcd:
      replicaCount: 1
      persistence:
        storageClassName: "local-path"
    minio:
      mode: distributed
      replicas: 4
      rootUser: "admin"
      rootPassword: "adminminio"
      persistence:
        storageClass: "local-path"
      resources:
        requests:
          memory: 1024Mi
    kafka:
      enabled: true
      name: kafka
      replicaCount: 3
      broker:
        enabled: true
      cluster:
        listeners:
          client:
            protocol: 'PLAINTEXT'
          controller:
            protocol: 'PLAINTEXT'
      persistence:
        enabled: true
        annotations: {}
        labels: {}
        existingClaim: ""
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 8Gi
        storageClassName: "local-path"
    extraConfigFiles: 1
      user.yaml: |+
        trace:
          exporter: jaeger
          sampleFraction: 1
          jaeger:
            url: "http://opentelemetry-collector.observability.svc.cluster.local:14268/api/traces" 2

    1

    The extraConfigFiles section is optional, required only to receive telemetry data from Open WebUI.

    2

    The URL of the OpenTelemetry Collector installed by the user.

    Tip
    Tip

The above example uses local storage. For production environments, we recommend an enterprise-class storage solution such as SUSE Storage, in which case the storageClassName option must be set to longhorn.

  2. Install the Milvus Helm chart using the milvus_custom_overrides.yaml override file.

    > helm upgrade --install \
      milvus charts/milvus-<X.Y.Z>.tgz \
      -n <SUSE_AI_NAMESPACE> \
      --version <X.Y.Z> -f <milvus_custom_overrides.yaml>

5.3.2.1 Using Apache Kafka with SUSE Storage

When Milvus is deployed in cluster mode, it uses Apache Kafka as a message queue. If Apache Kafka uses SUSE Storage as a storage back-end, you need to create an XFS storage class and make it available for the Apache Kafka deployment. Otherwise deploying Apache Kafka with a storage class of an Ext4 file system fails with the following error:

"Found directory /mnt/kafka/logs/lost+found, 'lost+found' is not
  in the form of topic-partition or topic-partition.uniqueId-delete
  (if marked for deletion)"
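The error comes from Kafka's log-directory naming rule: every entry must look like <topic>-<partition>, which the lost+found directory created by mkfs.ext4 does not. The following is a simplified sketch of that check, not Kafka's actual implementation:

```shell
# Simplified sketch of Kafka's log-directory name validation: entries
# must match <topic>-<partition>; 'lost+found' (created at the root of
# every Ext4 file system) does not, so Kafka refuses to start.
is_valid_log_dir() {
  case "$1" in
    *-[0-9]*) return 0 ;;   # name contains "-<digit>", accepted
    *) return 1 ;;          # anything else is rejected
  esac
}
is_valid_log_dir "mytopic-0" && echo "mytopic-0: ok"
is_valid_log_dir "lost+found" || echo "lost+found: rejected"
```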

To introduce the XFS storage class, follow these steps:

  1. Create a file named longhorn-xfs.yaml with the following content:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: longhorn-xfs
    provisioner: driver.longhorn.io
    allowVolumeExpansion: true
    reclaimPolicy: Delete
    volumeBindingMode: Immediate
    parameters:
      numberOfReplicas: "3"
      staleReplicaTimeout: "30"
      fromBackup: ""
      fsType: "xfs"
      dataLocality: "disabled"
      unmapMarkSnapChainRemoved: "ignored"
  2. Create the new storage class using the kubectl command.

    > kubectl apply -f longhorn-xfs.yaml
  3. Update the Milvus overrides YAML file to reference the Apache Kafka storage class, as in the following example:

      [...]
      kafka:
        enabled: true
        persistence:
          storageClassName: longhorn-xfs

5.3.3 Uninstalling Milvus

To uninstall Milvus, run the following command:

> helm uninstall milvus -n <SUSE_AI_NAMESPACE>

5.4 Installing Ollama

Ollama is a tool for running and managing language models locally on your computer. It offers a simple interface to download, run and interact with models without relying on cloud resources.

Tip
Tip

When installing SUSE AI, Ollama is by default installed as part of the Open WebUI installation. If you decide to install Ollama separately, disable its installation during the installation of Open WebUI as outlined in Example 5.4, “Open WebUI override file with Ollama installed separately”.

5.4.1 Details about the Ollama application

Before deploying Ollama, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:

> helm show values oci://dp.apps.rancher.io/charts/ollama

Alternatively, you can also refer to the Ollama Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/ollama. It contains the available versions and a link to pull the Ollama container image.

5.4.2 Ollama installation procedure

Tip
Tip

Before the installation, you need to get user access to the SUSE Application Collection and SUSE Registry, create a Kubernetes namespace, and log in to the Helm registry as described in Section 5.1, “Installation procedure”.

  1. Create the ollama_custom_overrides.yaml file to override the values of the parent Helm chart. Refer to Section 5.4.4, “Values for the Ollama Helm chart” for more details.

  2. Install the Ollama Helm chart using the ollama_custom_overrides.yaml override file.

    > helm upgrade \
      --install ollama charts/ollama-<X.Y.Z>.tgz \
      -n <SUSE_AI_NAMESPACE> \
      -f ollama_custom_overrides.yaml
    Important
    Important: Downloading AI models

    Ollama normally needs to have an active Internet connection to download AI models. In an air-gapped environment, you must download the models manually and copy them to your local Ollama instance, for example:

    > kubectl cp \
      <PATH_TO_LOCALLY_DOWNLOADED_MODELS>/blobs/* \
      <OLLAMA_POD_NAME>:~/.ollama/models/blobs/
    Tip
    Tip: Hugging Face models

    Models downloaded from Hugging Face need to be converted before they can be used by Ollama. Refer to https://github.com/ollama/ollama/blob/main/docs/import.md for more details.

5.4.3 Uninstalling Ollama

To uninstall Ollama, run the following command:

> helm uninstall ollama -n <SUSE_AI_NAMESPACE>

5.4.4 Values for the Ollama Helm chart

To override the default values during the Helm chart installation or update, you can create an override YAML file with custom values. Then, apply these values by specifying the path to the override file with the -f option of the helm command. Remember to replace <SUSE_AI_NAMESPACE> with your Kubernetes namespace.

Important
Important: GPU section

Ollama can run optimized for NVIDIA GPUs when the required NVIDIA drivers and GPU support are available in the cluster.

If you do not want to use the NVIDIA GPU, remove the gpu section from ollama_custom_overrides.yaml or disable it:

 ollama:
  [...]
  gpu:
    enabled: false
    type: 'nvidia'
    number: 1
Example 5.1: Basic override file with GPU and two models pulled at startup
global:
  imagePullSecrets:
  - application-collection
ingress:
  enabled: false
defaultModel: "gemma:2b"
runtimeClassName: nvidia
ollama:
  models:
    pull:
      - "gemma:2b"
      - "llama3.1"
    run:
      - "gemma:2b"
      - "llama3.1"
  gpu:
    enabled: true
    type: 'nvidia'
    number: 1
    nvidiaResource: "nvidia.com/gpu"
persistentVolume: 1
  enabled: true
  storageClass: local-path 2

1

Without the persistentVolume option enabled, changes made to Ollama, such as downloaded LLMs, are lost when the container is restarted.

2

Use local-path storage only for testing purposes. For production use, we recommend using a storage solution suitable for persistent storage, such as SUSE Storage.

Example 5.2: Basic override file with Ingress and no GPU
ollama:
  models:
    pull:
      - llama2
    run:
      - llama2
  persistentVolume:
    enabled: true
    storageClass: local-path 1
ingress:
  enabled: true
  hosts:
  - host: <OLLAMA_API_URL>
    paths:
      - path: /
        pathType: Prefix

1

Use local-path storage (requires installing the corresponding provisioner) only for testing purposes. For production use, we recommend using a storage solution suitable for persistent storage, such as SUSE Storage.

Table 5.1: Override file options for the Ollama Helm chart
KeyTypeDefaultDescription

affinity

object

{}

Affinity for pod assignment

autoscaling.enabled

bool

false

Enable autoscaling

autoscaling.maxReplicas

int

100

Number of maximum replicas

autoscaling.minReplicas

int

1

Number of minimum replicas

autoscaling.targetCPUUtilizationPercentage

int

80

CPU usage to target replica

extraArgs

list

[]

Additional arguments on the output Deployment definition.

extraEnv

list

[]

Additional environment variables on the output Deployment definition.

fullnameOverride

string

""

String to fully override template

global.imagePullSecrets

list

[]

Global override for container image registry pull secrets

global.imageRegistry

string

""

Global override for container image registry

hostIPC

bool

false

Use the host’s IPC namespace

hostNetwork

bool

false

Use the host’s network namespace

hostPID

bool

false

Use the host’s PID namespace.

image.pullPolicy

string

"IfNotPresent"

Image pull policy to use for the Ollama container

image.registry

string

"dp.apps.rancher.io"

Image registry to use for the Ollama container

image.repository

string

"containers/ollama"

Image repository to use for the Ollama container

image.tag

string

"0.3.6"

Image tag to use for the Ollama container

imagePullSecrets

list

[]

Docker registry secret names as an array

ingress.annotations

object

{}

Additional annotations for the Ingress resource

ingress.className

string

""

IngressClass that is used to implement the Ingress (Kubernetes 1.18+)

ingress.enabled

bool

false

Enable Ingress controller resource

ingress.hosts[0].host

string

"ollama.local"

 

ingress.hosts[0].paths[0].path

string

"/"

 

ingress.hosts[0].paths[0].pathType

string

"Prefix"

 

ingress.tls

list

[]

The TLS configuration for host names to be covered with this Ingress record

initContainers

list

[]

Init containers to add to the pod

knative.containerConcurrency

int

0

Knative service container concurrency

knative.enabled

bool

false

Enable Knative integration

knative.idleTimeoutSeconds

int

300

Knative service idle timeout seconds

knative.responseStartTimeoutSeconds

int

300

Knative service response start timeout seconds

knative.timeoutSeconds

int

300

Knative service timeout seconds

livenessProbe.enabled

bool

true

Enable livenessProbe

livenessProbe.failureThreshold

int

6

Failure threshold for livenessProbe

livenessProbe.initialDelaySeconds

int

60

Initial delay seconds for livenessProbe

livenessProbe.path

string

"/"

Request path for livenessProbe

livenessProbe.periodSeconds

int

10

Period seconds for livenessProbe

livenessProbe.successThreshold

int

1

Success threshold for livenessProbe

livenessProbe.timeoutSeconds

int

5

Timeout seconds for livenessProbe

nameOverride

string

""

String to partially override template (maintains the release name)

nodeSelector

object

{}

Node labels for pod assignment

ollama.gpu.enabled

bool

false

Enable GPU integration

ollama.gpu.number

int

1

Specify the number of GPUs

ollama.gpu.nvidiaResource

string

"nvidia.com/gpu"

Only for NVIDIA cards; change to nvidia.com/mig-1g.10gb to use MIG slice

ollama.gpu.type

string

"nvidia"

GPU type: 'nvidia' or 'amd'. If 'ollama.gpu.enabled' is set, the default value is 'nvidia'. If set to 'amd', the 'rocm' suffix is added to the image tag unless 'image.tag' is overridden, because AMD and CPU/CUDA use different images.

ollama.insecure

bool

false

Add insecure flag for pulling at container startup

ollama.models

list

[]

List of models to pull at container startup, for example, llama2 and mistral. The more models you add, the longer the container takes to start if they are not already present.

ollama.mountPath

string

""

Override ollama-data volume mount path, default: "/root/.ollama"

persistentVolume.accessModes

list

["ReadWriteOnce"]

Ollama server data Persistent Volume access modes. Must match those of existing PV or dynamic provisioner, see https://kubernetes.io/docs/concepts/storage/persistent-volumes/.

persistentVolume.annotations

object

{}

Ollama server data Persistent Volume annotations

persistentVolume.enabled

bool

false

Enable persistence using PVC

persistentVolume.existingClaim

string

""

If you want to bring your own PVC for persisting Ollama state, pass the name of an existing, ready PVC here. If set, this chart does not create the default PVC. Requires persistentVolume.enabled: true.

persistentVolume.size

string

"30Gi"

Ollama server data Persistent Volume size

persistentVolume.storageClass

string

""

If persistentVolume.storageClass is present, and is set to either a dash ('-') or empty string (''), dynamic provisioning is disabled. Otherwise, the storageClassName for persistent volume claim is set to the given value specified by persistentVolume.storageClass. If persistentVolume.storageClass is absent, the default storage class is used for dynamic provisioning whenever possible. See https://kubernetes.io/docs/concepts/storage/storage-classes/ for more details.

persistentVolume.subPath

string

""

Subdirectory of Ollama server data Persistent Volume to mount. Useful if the volume’s root directory is not empty.

persistentVolume.volumeMode

string

""

Ollama server data Persistent Volume mode. If empty (the default) or set to null, no volumeMode specification is set and the default mode is used.

persistentVolume.volumeName

string

""

Ollama server Persistent Volume name. It can be used to force-attach the created PVC to a specific PV.

podAnnotations

object

{}

Map of annotations to add to the pods

podLabels

object

{}

Map of labels to add to the pods

podSecurityContext

object

{}

Pod Security Context

readinessProbe.enabled

bool

true

Enable readinessProbe

readinessProbe.failureThreshold

int

6

Failure threshold for readinessProbe

readinessProbe.initialDelaySeconds

int

30

Initial delay seconds for readinessProbe

readinessProbe.path

string

"/"

Request path for readinessProbe

readinessProbe.periodSeconds

int

5

Period seconds for readinessProbe

readinessProbe.successThreshold

int

1

Success threshold for readinessProbe

readinessProbe.timeoutSeconds

int

3

Timeout seconds for readinessProbe

replicaCount

int

1

Number of replicas

resources.limits

object

{}

Pod limit

resources.requests

object

{}

Pod requests

runtimeClassName

string

""

Specify runtime class

securityContext

object

{}

Container Security Context

service.annotations

object

{}

Annotations to add to the service

service.nodePort

int

31434

Service node port when service type is 'NodePort'

service.port

int

11434

Service port

service.type

string

"ClusterIP"

Service type

serviceAccount.annotations

object

{}

Annotations to add to the service account

serviceAccount.automount

bool

true

Whether to automatically mount a ServiceAccount’s API credentials

serviceAccount.create

bool

true

Whether a service account should be created

serviceAccount.name

string

""

The name of the service account to use. If not set and 'create' is 'true', a name is generated using the full name template.

tolerations

list

[]

Tolerations for pod assignment

topologySpreadConstraints

object

{}

Topology Spread Constraints for pod assignment

updateStrategy

object

{"type":""}

How to replace existing pods.

updateStrategy.type

string

""

Can be 'Recreate' or 'RollingUpdate'; default is 'RollingUpdate'

volumeMounts

list

[]

Additional volumeMounts on the output Deployment definition

volumes

list

[]

Additional volumes on the output Deployment definition

5.5 Installing Open WebUI

Open WebUI is a user-friendly web interface for interacting with Large Language Models (LLMs). It supports various LLM runners, including Ollama and vLLM.

5.5.1 Details about the Open WebUI application

Before deploying Open WebUI, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:

> helm show values oci://dp.apps.rancher.io/charts/open-webui

Alternatively, you can also refer to the Open WebUI Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/open-webui. It contains available versions and the link to pull the Open WebUI container image.

5.5.2 Open WebUI installation procedure

Tip
Tip

Before the installation, you need to get user access to the SUSE Application Collection and SUSE Registry, create a Kubernetes namespace, and log in to the Helm registry as described in Section 5.1, “Installation procedure”.

  1. Create the owui_custom_overrides.yaml file to override the values of the parent Helm chart. The file contains URLs for Milvus and Ollama, and specifies whether a stand-alone Ollama deployment is used or whether Ollama is installed as part of the Open WebUI installation. Find more details in Section 5.5.4, “Examples of Open WebUI Helm chart override files”. For a list of all installation options with examples, refer to Section 5.5.5, “Values for the Open WebUI Helm chart”.

  2. Install the Open WebUI Helm chart using the owui_custom_overrides.yaml override file.

    > helm upgrade --install \
      open-webui charts/open-webui-<X.Y.Z>.tgz \
      -n <SUSE_AI_NAMESPACE> \
      --version <X.Y.Z> -f <owui_custom_overrides.yaml>

5.5.3 Uninstalling Open WebUI

To uninstall Open WebUI, run the following command:

> helm uninstall open-webui -n <SUSE_AI_NAMESPACE>

5.5.4 Examples of Open WebUI Helm chart override files

To override the default values during the Helm chart installation or update, you can create an override YAML file with custom values. Then, apply these values by specifying the path to the override file with the -f option of the helm command. Remember to replace <SUSE_AI_NAMESPACE> with your Kubernetes namespace.

Example 5.3: Open WebUI override file with Ollama included

The following override file installs Ollama during the Open WebUI installation.

global:
  imagePullSecrets:
  - application-collection
  imageRegistry: <LOCAL_DOCKER_REGISTRY_URL>:5043
ollamaUrls:
- http://open-webui-ollama.<SUSE_AI_NAMESPACE>.svc.cluster.local:11434
persistence:
  enabled: true
  storageClass: local-path 1
ollama:
  enabled: true
  ingress:
    enabled: false
  defaultModel: "gemma:2b"
  ollama:
    models: 2
      pull:
        - "gemma:2b"
        - "llama3.1"
    gpu: 3
      enabled: true
      type: 'nvidia'
      number: 1
    persistentVolume: 4
      enabled: true
      storageClass: local-path
pipelines:
  enabled: true
  persistence:
    storageClass: local-path
  extraEnvVars: 5
    - name: PIPELINES_URLS 6
      value: "https://raw.githubusercontent.com/SUSE/suse-ai-observability-extension/refs/heads/main/integrations/oi-filter/suse_ai_filter.py"
    - name: OTEL_SERVICE_NAME 7
      value: "Open WebUI"
    - name: OTEL_EXPORTER_HTTP_OTLP_ENDPOINT 8
      value: "http://opentelemetry-collector.suse-observability.svc.cluster.local:4318"
    - name: PRICING_JSON 9
      value: "https://raw.githubusercontent.com/SUSE/suse-ai-observability-extension/refs/heads/main/integrations/oi-filter/pricing.json"
ingress:
  enabled: true
  class: ""
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "1024m"
  host: suse-ollama-webui 10
  tls: true
extraEnvVars:
- name: DEFAULT_MODELS 11
  value: "gemma:2b"
- name: DEFAULT_USER_ROLE
  value: "user"
- name: WEBUI_NAME
  value: "SUSE AI"
- name: GLOBAL_LOG_LEVEL
  value: INFO
- name: RAG_EMBEDDING_MODEL
  value: "sentence-transformers/all-MiniLM-L6-v2"
- name: VECTOR_DB
  value: "milvus"
- name: MILVUS_URI
  value: http://milvus.<SUSE_AI_NAMESPACE>.svc.cluster.local:19530
- name: INSTALL_NLTK_DATASETS 12
  value: "true"
- name: OMP_NUM_THREADS
  value: "1"
- name: OPENAI_API_KEY 13
  value: "0p3n-w3bu!"

1

Use local-path storage only for testing purposes. For production use, we recommend using a storage solution more suitable for persistent storage. To use SUSE Storage, specify longhorn.

2

Specifies that two large language models (LLM) will be loaded in Ollama when the container starts.

3

Enables GPU support for Ollama. The type must be nvidia because NVIDIA GPUs are the only supported devices. number must be between 1 and the number of NVIDIA GPUs present on the system.

4

Without the persistentVolume option enabled, changes made to Ollama, such as downloaded LLMs, are lost when the container is restarted.

5

The environment variables that you are making available for the pipeline’s runtime container.

6

A list of pipeline URLs to be downloaded and installed by default. Individual URLs are separated by a semicolon ;. For air-gapped deployments, you need to provide the pipelines at URLs that are accessible from the local host, such as an internal GitLab instance.

7

The service name that appears in traces and topological representations in SUSE Observability.

8

The endpoint for the OpenTelemetry collector. Make sure to use the HTTP port of your collector.

9

A file with model multipliers used for cost estimation. You can adjust it experimentally to match your actual infrastructure. For air-gapped deployments, you need to provide the file at a URL that is accessible from the local host, such as an internal GitLab instance.

10

Specifies the host name for the Open WebUI Web UI.

11

Specifies the default LLM for Ollama.

12

Installs the natural language toolkit (NLTK) datasets for Ollama. Refer to https://www.nltk.org/index.html for licensing information.

13

API key value for communication between Open WebUI and Open WebUI Pipelines. The default value is '0p3n-w3bu!'.

Example 5.4: Open WebUI override file with Ollama installed separately

The following override file installs Ollama separately from the Open WebUI installation.

global:
  imagePullSecrets:
  - application-collection
  imageRegistry: <LOCAL_DOCKER_REGISTRY_URL>:5043
ollamaUrls:
- http://ollama.<SUSE_AI_NAMESPACE>.svc.cluster.local:11434
persistence:
  enabled: true
  storageClass: local-path 1
ollama:
  enabled: false
pipelines:
  enabled: false
  persistence:
    storageClass: local-path 2
ingress:
  enabled: true
  class: ""
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  host: suse-ollama-webui
  tls: true
extraEnvVars:
- name: DEFAULT_MODELS 3
  value: "gemma:2b"
- name: DEFAULT_USER_ROLE
  value: "user"
- name: WEBUI_NAME
  value: "SUSE AI"
- name: GLOBAL_LOG_LEVEL
  value: INFO
- name: RAG_EMBEDDING_MODEL
  value: "sentence-transformers/all-MiniLM-L6-v2"
- name: VECTOR_DB
  value: "milvus"
- name: MILVUS_URI
  value: http://milvus.<SUSE_AI_NAMESPACE>.svc.cluster.local:19530
- name: ENABLE_OTEL 4
  value: "true"
- name: OTEL_EXPORTER_OTLP_ENDPOINT 5
  value: http://opentelemetry-collector.observability.svc.cluster.local:4317 6
- name: OMP_NUM_THREADS
  value: "1"

1

Use local-path storage only for testing purposes. For production use, we recommend using a storage solution suitable for persistent storage, such as SUSE Storage.

2

Use local-path storage only for testing purposes. For production use, we recommend using a storage solution suitable for persistent storage, such as SUSE Storage.

3

Specifies the default LLM for Ollama.

4

These values are optional, required only to receive telemetry data from Open WebUI.

5

These values are optional, required only to receive telemetry data from Open WebUI.

6

The URL of the OpenTelemetry Collector installed by the user.

Example 5.5: Open WebUI override file with pipelines enabled

The following override file installs Ollama separately and enables Open WebUI pipelines. This simple filter adds a limit to the number of question and answer turns during the LLM chat.

Tip
Tip

Pipelines normally require additional configuration provided either via environment variables or specified in the Open WebUI Web UI.

global:
  imagePullSecrets:
  - application-collection
ollamaUrls:
- http://ollama.<SUSE_AI_NAMESPACE>.svc.cluster.local:11434
persistence:
  enabled: true
  storageClass: local-path
ollama:
  enabled: false
pipelines:
  enabled: true
  persistence:
    storageClass: local-path
  extraEnvVars:
  - name: PIPELINES_URLS 1
    value: "https://raw.githubusercontent.com/SUSE/suse-ai-observability-extension/refs/heads/main/integrations/oi-filter/conversation_turn_limit_filter.py"
ingress:
  enabled: true
  class: ""
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  host: suse-ollama-webui
  tls: true
[...]

1

A list of pipeline URLs to be downloaded and installed by default. Individual URLs are separated by a semicolon (;). For air-gapped deployments, you need to host the pipelines at URLs that are accessible from the local network, such as an internal GitLab instance.
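The conversation-turn-limit filter referenced above follows the Open WebUI pipelines filter convention: a Pipeline class whose inlet method inspects each chat request before it reaches the LLM. The following is a minimal sketch of such a filter; the class layout, the max_turns parameter and the error message are illustrative assumptions, not the contents of the actual file hosted in the repository.

```python
# Hypothetical sketch of a conversation-turn-limit filter in the
# Open WebUI pipelines style (names and defaults are assumptions).

class Pipeline:
    def __init__(self, max_turns: int = 8):
        self.type = "filter"   # filters intercept requests before the LLM
        self.name = "Conversation Turn Limit Filter"
        self.max_turns = max_turns  # maximum user turns allowed per chat

    def inlet(self, body: dict) -> dict:
        """Reject the request once the chat exceeds the configured turn count."""
        user_turns = sum(
            1 for m in body.get("messages", []) if m.get("role") == "user"
        )
        if user_turns > self.max_turns:
            raise Exception(
                f"Conversation turn limit ({self.max_turns}) exceeded"
            )
        return body  # pass the unmodified request on to the model
```

In a real deployment, the limit would typically be exposed through the filter's valves so that it can be adjusted from the Open WebUI administration interface.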

5.5.5 Values for the Open WebUI Helm chart

To override the default values during the Helm chart installation or update, you can create an override YAML file with custom values. Then, apply these values by specifying the path to the override file with the -f option of the helm command. Remember to replace <SUSE_AI_NAMESPACE> with your Kubernetes namespace.
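For example, an installation using a custom override file might look as follows. The release name, chart location and override file name are placeholders; adjust them to match your deployment and registry mirror.

```shell
# Install or upgrade Open WebUI, applying custom values from an override file.
# Replace SUSE_AI_NAMESPACE and owui_custom_overrides.yaml with your own values.
helm upgrade --install open-webui \
  oci://dp.apps.rancher.io/charts/open-webui \
  --namespace SUSE_AI_NAMESPACE \
  -f owui_custom_overrides.yaml
```

Values passed with -f take precedence over the chart defaults listed in the table below.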

Table 5.2: Available options for the Open WebUI Helm chart
Key | Type | Default | Description

affinity

object

{}

Affinity for pod assignment

annotations

object

{}

 

cert-manager.enabled

bool

true

 

clusterDomain

string

"cluster.local"

Value of cluster domain

containerSecurityContext

object

{}

Configure container security context, see https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container.

extraEnvVars

list

[{"name":"OPENAI_API_KEY", "value":"0p3n-w3bu!"}]

Environment variables added to the Open WebUI deployment. The most up-to-date list of environment variables can be found in Environment Variable Configuration.

extraEnvVars[0]

object

{"name":"OPENAI_API_KEY","value":"0p3n-w3bu!"}

Default API key value for Pipelines. It should be updated in a production deployment and changed to the required API key if not using Pipelines.

global.imagePullSecrets

list

[]

Global override for container image registry pull secrets

global.imageRegistry

string

""

Global override for container image registry

global.tls.additionalTrustedCAs

bool

false

 

global.tls.issuerName

string

"suse-private-ai"

 

global.tls.letsEncrypt.email

string

"none@example.com"

 

global.tls.letsEncrypt.environment

string

"staging"

 

global.tls.letsEncrypt.ingress.class

string

""

 

global.tls.source

string

"suse-private-ai"

The source of Open WebUI TLS keys, see Section 5.5.5.1, “TLS sources”.

image.pullPolicy

string

"IfNotPresent"

Image pull policy to use for the Open WebUI container

image.registry

string

"dp.apps.rancher.io"

Image registry to use for the Open WebUI container

image.repository

string

"containers/open-webui"

Image repository to use for the Open WebUI container

image.tag

string

"0.3.32"

Image tag to use for the Open WebUI container

imagePullSecrets

list

[]

Configure imagePullSecrets to use private registry, see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/.

ingress.annotations

object

{"nginx.ingress.kubernetes.io/ssl-redirect":"true"}

Use appropriate annotations for your Ingress controller, such as nginx.ingress.kubernetes.io/rewrite-target: / for NGINX.

ingress.class

string

""

 

ingress.enabled

bool

true

 

ingress.existingSecret

string

""

 

ingress.host

string

""

 

ingress.tls

bool

true

 

nameOverride

string

""

 

nodeSelector

object

{}

Node labels for pod assignment

ollama.enabled

bool

true

Automatically install the Ollama Helm chart from oci://dp.apps.rancher.io/charts/ollama. Configure the following Helm values.

ollama.fullnameOverride

string

"open-webui-ollama"

If enabling embedded Ollama, update fullnameOverride to your desired Ollama name value, or else it will use the default ollama.name value from the Ollama chart.

ollamaUrls

list

[]

A list of Ollama API endpoints. These can be added instead of automatically installing the Ollama Helm chart, or in addition to it.

openaiBaseApiUrl

string

""

OpenAI base API URL to use. Defaults to the Pipelines service endpoint when Pipelines are enabled, or to https://api.openai.com/v1 if Pipelines are not enabled and this value is blank.

persistence.accessModes

list

["ReadWriteOnce"]

If using multiple replicas, you must update accessModes to ReadWriteMany.

persistence.annotations

object

{}

 

persistence.enabled

bool

true

 

persistence.existingClaim

string

""

Use existingClaim to reuse an existing Open WebUI PVC instead of creating a new one.

persistence.selector

object

{}

 

persistence.size

string

"2Gi"

 

persistence.storageClass

string

""

 

pipelines.enabled

bool

false

Automatically install Pipelines chart to extend Open WebUI functionality using Pipelines.

pipelines.extraEnvVars

list

[]

This section can be used to pass the required environment variables to your pipelines (such as the Langfuse host name).

podAnnotations

object

{}

 

podSecurityContext

object

{}

Configure pod security context, see https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container.

replicaCount

int

1

 

resources

object

{}

 

service

object

{"annotations":{},"containerPort":8080, "labels":{},"loadBalancerClass":"", "nodePort":"","port":80,"type":"ClusterIP"}

Service values to expose Open WebUI pods to the cluster

tolerations

list

[]

Tolerations for pod assignment

topologySpreadConstraints

list

[]

Topology Spread Constraints for pod assignment

5.5.5.1 TLS sources

There are three recommended ways for Open WebUI to obtain TLS certificates for secure communication.

Self-Signed TLS certificate

This is the default method. You need to install cert-manager on the cluster to issue and maintain the certificates. This method generates a CA and uses it to sign the Open WebUI certificate, which cert-manager then manages. For this method, use the following Helm chart option:

global.tls.source=suse-private-ai
Let’s Encrypt

This method also uses cert-manager, but combines it with a special issuer for Let's Encrypt that performs all actions, including request and validation, to get the Let's Encrypt certificate issued. This configuration uses HTTP validation (HTTP-01), so the load balancer must have a public DNS record and be accessible from the Internet. For this method, use the following Helm chart option:

global.tls.source=letsEncrypt
Provide your own certificate

This method allows you to bring your own signed certificate to secure the HTTPS traffic. In this case, you must upload the certificate and its associated key as PEM-encoded files named tls.crt and tls.key. For this method, use the following Helm chart option:

global.tls.source=secret
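For example, the certificate and key can be uploaded as a Kubernetes TLS secret. The secret name below is a placeholder; make sure it matches the name you configure in ingress.existingSecret.

```shell
# Create a TLS secret from PEM-encoded certificate and key files.
# Replace the secret name and namespace to match your deployment.
kubectl create secret tls open-webui-tls \
  --cert=tls.crt --key=tls.key \
  --namespace SUSE_AI_NAMESPACE
```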