5 Installing applications from AI Library #
SUSE AI is delivered as a set of components that you can combine to meet specific use cases. To enable the full integrated stack, you need to deploy multiple applications in sequence. Applications with the fewest dependencies must be installed first, followed by dependent applications once their required dependencies are in place within the cluster.
For production deployments, we strongly recommend deploying Rancher, SUSE Observability, and workloads from the AI Library to separate Kubernetes clusters.
5.1 Installation procedure #
This procedure includes steps to install AI Library applications in an air-gapped environment.
If a step does not specify whether to perform it on the local or the remote part of the air-gapped architecture, assume the remote part. Tasks for the isolated local part are always specified explicitly.
Purchase the SUSE AI entitlement. It is a separate entitlement from SUSE Rancher Prime.
Access SUSE AI via the SUSE Application Collection at https://apps.rancher.io/, which performs the check for the SUSE AI entitlement.
If the entitlement check is successful, you are given access to the SUSE AI-related Helm charts and container images, and can deploy directly from the SUSE Application Collection.
Visit the SUSE Application Collection, sign in and get the user access token as described in https://docs.apps.rancher.io/get-started/authentication/.
On the local cluster, create a Kubernetes namespace if it does not already exist. The steps in this procedure assume that all containers are deployed into the same namespace, referred to as <SUSE_AI_NAMESPACE>. Replace its name to match your preferences. Helm charts of AI applications are hosted in the private SUSE-trusted registries: SUSE Application Collection and SUSE Registry.

> kubectl create namespace <SUSE_AI_NAMESPACE>

Create the SUSE Application Collection secret.
> kubectl create secret docker-registry application-collection \
  --docker-server=dp.apps.rancher.io \
  --docker-username=<APPCO_USERNAME> \
  --docker-password=<APPCO_USER_TOKEN> \
  -n <SUSE_AI_NAMESPACE>

Create the SUSE Registry secret.
> kubectl create secret docker-registry suse-ai-registry \
  --docker-server=registry.suse.com \
  --docker-username=regcode \
  --docker-password=<SCC_REG_CODE> \
  -n <SUSE_AI_NAMESPACE>

Log in to the SUSE Application Collection Helm registry.
> helm registry login dp.apps.rancher.io/charts \
  -u <APPCO_USERNAME> \
  -p <APPCO_USER_TOKEN>

Log in to the SUSE Registry Helm registry. The user name is regcode and the password is the SCC registration code of your SUSE AI subscription.
> helm registry login registry.suse.com \
  -u regcode \
  -p <SCC_REG_CODE>

On the remote host, download the SUSE-AI-get-images.sh script from the air-gap stack (Section 2.1, “SUSE AI air-gapped stack”) and run it.

> ./SUSE-AI-get-images.sh

This script creates a subdirectory with all necessary Helm charts plus the suse-ai-containers.tgz and suse-ai-containers.txt files.

Create a Docker registry on one of the local hosts so that the local Kubernetes cluster can access it.
Securely transfer the created subdirectory with Helm charts plus the suse-ai-containers.tgz and suse-ai-containers.txt files from the remote host to a local host, and load all container images to the local Docker registry. Set the DST_REGISTRY_USERNAME and DST_REGISTRY_PASSWORD environment variables if they are required to access the registry.

> ./SUSE-AI-load-images.sh \
  -d <LOCAL_DOCKER_REGISTRY_URL> \
  -i charts/suse-ai-containers.txt \
  -f charts/suse-ai-containers.tgz

Install cert-manager as described in Section 5.2, “Installing cert-manager”.
Install AI Library components.
Install Milvus as described in Section 5.3, “Installing Milvus”.
(Optional) Install Ollama as described in Section 5.4, “Installing Ollama”.
Install Open WebUI as described in Section 5.5, “Installing Open WebUI”.
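Conceptually, the image-loading step above rewrites each source reference from suse-ai-containers.txt to point at the local Docker registry, keeping the repository path and tag. The following is a minimal sketch of that rewrite; the actual internals of SUSE-AI-load-images.sh may differ, and the registry URL is a placeholder.

```shell
# Sketch: map an upstream image reference to the local registry.
# Only the registry host is replaced; repository path and tag are kept.
retarget_image() {
  local image="$1" local_registry="$2"
  # Strip the original registry host (everything up to the first slash).
  echo "$local_registry/${image#*/}"
}

# Example:
retarget_image "dp.apps.rancher.io/containers/ollama:0.3.6" "registry.local:5043"
# -> registry.local:5043/containers/ollama:0.3.6
```

The same mapping is what makes the global.imageRegistry override in the later Helm examples resolve to mirrored images.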
5.2 Installing cert-manager #
cert-manager is an extensible X.509 certificate controller for Kubernetes workloads. It supports certificates from popular public issuers as well as private issuers. cert-manager ensures that the certificates are valid and up-to-date, and attempts to renew certificates at a configured time before expiry.
In previous releases, cert-manager was automatically installed together with Open WebUI. Currently, cert-manager is no longer part of the Open WebUI Helm chart and you need to install it separately.
5.2.1 Details about the cert-manager application #
Before deploying cert-manager, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:
> helm show values oci://dp.apps.rancher.io/charts/cert-manager

Alternatively, you can refer to the cert-manager Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/cert-manager. It contains the available versions and the link to pull the cert-manager container image.
5.2.2 cert-manager installation procedure #
Before the installation, you need to get user access to the SUSE Application Collection and SUSE Registry, create a Kubernetes namespace, and log in to the Helm registry as described in Section 5.1, “Installation procedure”.
Install the cert-manager chart.
> helm upgrade \
--install cert-manager charts/cert-manager-<X.Y.Z>.tgz \
-n <CERT_MANAGER_NAMESPACE> \
--set crds.enabled=true \
--set 'global.imagePullSecrets[0].name'=application-collection \
--set 'global.imageRegistry'=<LOCAL_DOCKER_REGISTRY_URL>:5043

5.2.3 Uninstalling cert-manager #
To uninstall cert-manager, run the following command:
> helm uninstall cert-manager -n <CERT_MANAGER_NAMESPACE>

5.3 Installing Milvus #
Milvus is a scalable, high-performance vector database designed for AI applications. It enables efficient organization and searching of massive unstructured datasets, including text, images and multi-modal content. This procedure walks you through the installation of Milvus and its dependencies.
5.3.1 Details about the Milvus application #
Before deploying Milvus, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:
> helm show values oci://dp.apps.rancher.io/charts/milvus

Alternatively, you can refer to the Milvus Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/milvus. It contains the Milvus dependencies, the available versions and the link to pull the Milvus container image.
5.3.2 Milvus installation procedure #
Before the installation, you need to get user access to the SUSE Application Collection and SUSE Registry, create a Kubernetes namespace, and log in to the Helm registry as described in Section 5.1, “Installation procedure”.
When installed as part of SUSE AI, Milvus depends on etcd, MinIO and Apache Kafka. Because the Milvus chart uses a non-default configuration, create an override file milvus_custom_overrides.yaml with the following content.

Tip: As a template, you can use the values.yaml file that is included in the charts/milvus-<X.Y.Z>.tgz TAR archive.

global:
  imagePullSecrets:
    - application-collection
  imageRegistry: <LOCAL_DOCKER_REGISTRY_URL>:5043
cluster:
  enabled: true
standalone:
  persistence:
    persistentVolumeClaim:
      storageClassName: "local-path"
etcd:
  replicaCount: 1
  persistence:
    storageClassName: "local-path"
minio:
  mode: distributed
  replicas: 4
  rootUser: "admin"
  rootPassword: "adminminio"
  persistence:
    storageClass: "local-path"
  resources:
    requests:
      memory: 1024Mi
kafka:
  enabled: true
  name: kafka
  replicaCount: 3
  broker:
    enabled: true
  cluster:
    listeners:
      client:
        protocol: 'PLAINTEXT'
      controller:
        protocol: 'PLAINTEXT'
  persistence:
    enabled: true
    annotations: {}
    labels: {}
    existingClaim: ""
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 8Gi
    storageClassName: "local-path"
extraConfigFiles: 1
  user.yaml: |+
    trace:
      exporter: jaeger
      sampleFraction: 1
      jaeger:
        url: "http://opentelemetry-collector.observability.svc.cluster.local:14268/api/traces" 2

1. The extraConfigFiles section is optional, required only to receive telemetry data from Open WebUI.
2. The URL of the OpenTelemetry Collector installed by the user.

Tip: The above example uses local storage. For production environments, we recommend using an enterprise-class storage solution such as SUSE Storage, in which case the storageClassName option must be set to longhorn.

Install the Milvus Helm chart using the milvus_custom_overrides.yaml override file.

> helm upgrade --install \
  milvus charts/milvus-<X.Y.Z>.tgz \
  -n <SUSE_AI_NAMESPACE> \
  --version <X.Y.Z> -f milvus_custom_overrides.yaml
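For production deployments, the storage-related Tip above translates into an override fragment such as the following sketch. The key paths follow the example override file, with the longhorn storage class provided by SUSE Storage substituted for local-path.

```yaml
# Production storage sketch: replace local-path with the longhorn
# storage class provided by SUSE Storage. Key paths follow the
# milvus_custom_overrides.yaml example above.
standalone:
  persistence:
    persistentVolumeClaim:
      storageClassName: "longhorn"
etcd:
  persistence:
    storageClassName: "longhorn"
minio:
  persistence:
    storageClass: "longhorn"
kafka:
  persistence:
    storageClassName: "longhorn"
```

Note that when Apache Kafka stores its data on SUSE Storage, it requires an XFS-backed storage class (longhorn-xfs) instead of plain longhorn, as described in Section 5.3.2.1.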
5.3.2.1 Using Apache Kafka with SUSE Storage #
When Milvus is deployed in cluster mode, it uses Apache Kafka as a message queue. If Apache Kafka uses SUSE Storage as a storage back-end, you need to create an XFS storage class and make it available for the Apache Kafka deployment. Otherwise deploying Apache Kafka with a storage class of an Ext4 file system fails with the following error:
"Found directory /mnt/kafka/logs/lost+found, 'lost+found' is not in the form of topic-partition or topic-partition.uniqueId-delete (if marked for deletion)"
To introduce the XFS storage class, follow these steps:
Create a file named longhorn-xfs.yaml with the following content:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-xfs
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "30"
  fromBackup: ""
  fsType: "xfs"
  dataLocality: "disabled"
  unmapMarkSnapChainRemoved: "ignored"

Create the new storage class using the kubectl command.

> kubectl apply -f longhorn-xfs.yaml

Update the Milvus overrides YAML file to reference the Apache Kafka storage class, as in the following example:

[...]
kafka:
  enabled: true
  persistence:
    storageClassName: longhorn-xfs
5.3.3 Uninstalling Milvus #
To uninstall Milvus, run the following command:
> helm uninstall milvus -n <SUSE_AI_NAMESPACE>

5.4 Installing Ollama #
Ollama is a tool for running and managing language models locally on your computer. It offers a simple interface to download, run and interact with models without relying on cloud resources.
When installing SUSE AI, Ollama is installed by the Open WebUI installation by default. If you decide to install Ollama separately, disable its installation during the installation of Open WebUI as outlined in Example 5.4, “Open WebUI override file with Ollama installed separately”.
5.4.1 Details about the Ollama application #
Before deploying Ollama, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:
> helm show values oci://dp.apps.rancher.io/charts/ollama

Alternatively, you can refer to the Ollama Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/ollama. It contains the available versions and a link to pull the Ollama container image.
5.4.2 Ollama installation procedure #
Before the installation, you need to get user access to the SUSE Application Collection and SUSE Registry, create a Kubernetes namespace, and log in to the Helm registry as described in Section 5.1, “Installation procedure”.
Create the ollama_custom_overrides.yaml file to override the values of the parent Helm chart. Refer to Section 5.4.4, “Values for the Ollama Helm chart” for more details.

Install the Ollama Helm chart using the ollama_custom_overrides.yaml override file.

> helm upgrade \
  --install ollama charts/ollama-<X.Y.Z>.tgz \
  -n <SUSE_AI_NAMESPACE> \
  -f ollama_custom_overrides.yaml

Important: Downloading AI models
Ollama normally needs an active Internet connection to download AI models. In an air-gapped environment, you must download the models manually and copy them to your local Ollama instance, for example:

> kubectl cp <PATH_TO_LOCALLY_DOWNLOADED_MODELS>/blobs/* \
  <OLLAMA_POD_NAME>:~/.ollama/models/blobs/

Tip: Hugging Face models
Models downloaded from Hugging Face need to be converted before they can be used by Ollama. Refer to https://github.com/ollama/ollama/blob/main/docs/import.md for more details.
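The manual model transfer can be sketched as a loop over the downloaded blob files. The function below only prints the kubectl cp commands so you can review them before running; the model directory and pod name are placeholders you must adjust.

```shell
# Sketch: emit one "kubectl cp" command per locally downloaded model blob.
# The commands are echoed (dry run), not executed.
gen_copy_commands() {
  local model_dir="$1" pod="$2" blob
  for blob in "$model_dir"/blobs/*; do
    [ -e "$blob" ] || continue  # skip when the glob matches nothing
    echo "kubectl cp $blob $pod:/root/.ollama/models/blobs/${blob##*/}"
  done
}

# Example (dry run); review the output, then execute the printed commands:
gen_copy_commands ./models "<OLLAMA_POD_NAME>"
```

This assumes the default Ollama data path /root/.ollama inside the pod; adjust it if you override ollama.mountPath in the Helm chart.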
5.4.3 Uninstalling Ollama #
To uninstall Ollama, run the following command:
> helm uninstall ollama -n <SUSE_AI_NAMESPACE>

5.4.4 Values for the Ollama Helm chart #
To override the default values during the Helm chart installation or update, you can create an override YAML file with custom values. Then, apply these values by specifying the path to the override file with the -f option of the helm command.
Remember to replace <SUSE_AI_NAMESPACE> with your Kubernetes namespace.
Ollama can run optimized for NVIDIA GPUs if the following conditions are fulfilled:
The NVIDIA driver and NVIDIA GPU Operator are installed as described in Installing NVIDIA GPU Drivers on SLES or Installing NVIDIA GPU Drivers on SUSE Linux Micro.
The workloads are set to run on NVIDIA-enabled nodes as described in https://documentation.suse.com/suse-ai/1.0/html/AI-deployment-intro/index.html#ai-gpu-nodes-assigning.
If you do not want to use the NVIDIA GPU, remove the gpu section from ollama_custom_overrides.yaml or disable it.
ollama:
  [...]
  gpu:
    enabled: false
    type: 'nvidia'
    number: 1

The following is an example ollama_custom_overrides.yaml file with NVIDIA GPU support enabled:

global:
  imagePullSecrets:
    - application-collection
ingress:
  enabled: false
defaultModel: "gemma:2b"
runtimeClassName: nvidia
ollama:
  models:
    pull:
      - "gemma:2b"
      - "llama3.1"
    run:
      - "gemma:2b"
      - "llama3.1"
  gpu:
    enabled: true
    type: 'nvidia'
    number: 1
    nvidiaResource: "nvidia.com/gpu"
persistentVolume: 1
  enabled: true
  storageClass: local-path 2

1. Without the persistentVolume option enabled, pulled models do not persist and must be downloaded again after a pod restart.
2. Use the local-path storage class for testing purposes. For production environments, use an enterprise-class storage solution such as SUSE Storage with the longhorn storage class.
The following example override file configures Ollama without GPU support and with Ingress enabled:

ollama:
  models:
    pull:
      - llama2
    run:
      - llama2
persistentVolume:
  enabled: true
  storageClass: local-path 1
ingress:
  enabled: true
  hosts:
    - host: <OLLAMA_API_URL>
      paths:
        - path: /
          pathType: Prefix

1. Use the local-path storage class for testing purposes. For production environments, use an enterprise-class storage solution such as SUSE Storage with the longhorn storage class.
| Key | Type | Default | Description |
|---|---|---|---|
affinity | object | {} | Affinity for pod assignment |
autoscaling.enabled | bool | false | Enable autoscaling |
autoscaling.maxReplicas | int | 100 | Number of maximum replicas |
autoscaling.minReplicas | int | 1 | Number of minimum replicas |
autoscaling.targetCPUUtilizationPercentage | int | 80 | CPU usage to target replica |
extraArgs | list | [] | Additional arguments on the output Deployment definition. |
extraEnv | list | [] | Additional environment variables on the output Deployment definition. |
fullnameOverride | string | "" | String to fully override template |
global.imagePullSecrets | list | [] | Global override for container image registry pull secrets |
global.imageRegistry | string | "" | Global override for container image registry |
hostIPC | bool | false | Use the host’s IPC namespace |
hostNetwork | bool | false | Use the host’s network namespace |
hostPID | bool | false | Use the host’s PID namespace. |
image.pullPolicy | string | "IfNotPresent" | Image pull policy to use for the Ollama container |
image.registry | string | "dp.apps.rancher.io" | Image registry to use for the Ollama container |
image.repository | string | "containers/ollama" | Image repository to use for the Ollama container |
image.tag | string | "0.3.6" | Image tag to use for the Ollama container |
imagePullSecrets | list | [] | Docker registry secret names as an array |
ingress.annotations | object | {} | Additional annotations for the Ingress resource |
ingress.className | string | "" | IngressClass that is used to implement the Ingress (Kubernetes 1.18+) |
ingress.enabled | bool | false | Enable Ingress controller resource |
ingress.hosts[0].host | string | "ollama.local" | |
ingress.hosts[0].paths[0].path | string | "/" | |
ingress.hosts[0].paths[0].pathType | string | "Prefix" | |
ingress.tls | list | [] | The TLS configuration for host names to be covered with this Ingress record |
initContainers | list | [] | Init containers to add to the pod |
knative.containerConcurrency | int | 0 | Knative service container concurrency |
knative.enabled | bool | false | Enable Knative integration |
knative.idleTimeoutSeconds | int | 300 | Knative service idle timeout seconds |
knative.responseStartTimeoutSeconds | int | 300 | Knative service response start timeout seconds |
knative.timeoutSeconds | int | 300 | Knative service timeout seconds |
livenessProbe.enabled | bool | true | Enable livenessProbe |
livenessProbe.failureThreshold | int | 6 | Failure threshold for livenessProbe |
livenessProbe.initialDelaySeconds | int | 60 | Initial delay seconds for livenessProbe |
livenessProbe.path | string | "/" | Request path for livenessProbe |
livenessProbe.periodSeconds | int | 10 | Period seconds for livenessProbe |
livenessProbe.successThreshold | int | 1 | Success threshold for livenessProbe |
livenessProbe.timeoutSeconds | int | 5 | Timeout seconds for livenessProbe |
nameOverride | string | "" | String to partially override template (maintains the release name) |
nodeSelector | object | {} | Node labels for pod assignment |
ollama.gpu.enabled | bool | false | Enable GPU integration |
ollama.gpu.number | int | 1 | Specify the number of GPUs |
ollama.gpu.nvidiaResource | string | "nvidia.com/gpu" | Only for NVIDIA cards; change to |
ollama.gpu.type | string | "nvidia" | GPU type: 'nvidia' or 'amd'. If 'ollama.gpu.enabled' is enabled, the default value is 'nvidia'. If set to 'amd', the chart adds the 'rocm' suffix to the image tag unless 'image.tag' is overridden, because AMD and CPU/CUDA use different images. |
ollama.insecure | bool | false | Add insecure flag for pulling at container startup |
ollama.models | list | [] | List of models to pull at container startup. The more models you add, the longer the container takes to start if they are not already present. Example: models: [llama2, mistral] |
ollama.mountPath | string | "" | Override ollama-data volume mount path, default: "/root/.ollama" |
persistentVolume.accessModes | list | ["ReadWriteOnce"] | Ollama server data Persistent Volume access modes. Must match those of existing PV or dynamic provisioner, see https://kubernetes.io/docs/concepts/storage/persistent-volumes/. |
persistentVolume.annotations | object | {} | Ollama server data Persistent Volume annotations |
persistentVolume.enabled | bool | false | Enable persistence using PVC |
persistentVolume.existingClaim | string | "" | If you want to bring your own PVC for persisting Ollama state, pass the name of the created + ready PVC here. If set, this Chart does not create the default PVC. Requires |
persistentVolume.size | string | "30Gi" | Ollama server data Persistent Volume size |
persistentVolume.storageClass | string | "" | If persistentVolume.storageClass is present, and is set to either a dash ('-') or empty string (''), dynamic provisioning is disabled. Otherwise, the storageClassName for persistent volume claim is set to the given value specified by persistentVolume.storageClass. If persistentVolume.storageClass is absent, the default storage class is used for dynamic provisioning whenever possible. See https://kubernetes.io/docs/concepts/storage/storage-classes/ for more details. |
persistentVolume.subPath | string | "" | Subdirectory of Ollama server data Persistent Volume to mount. Useful if the volume’s root directory is not empty. |
persistentVolume.volumeMode | string | "" | Ollama server data Persistent Volume Binding Mode. If empty (the default) or set to null, no volumeBindingMode specification is set, choosing the default mode. |
persistentVolume.volumeName | string | "" | Ollama server Persistent Volume name. It can be used to force-attach the created PVC to a specific PV. |
podAnnotations | object | {} | Map of annotations to add to the pods |
podLabels | object | {} | Map of labels to add to the pods |
podSecurityContext | object | {} | Pod Security Context |
readinessProbe.enabled | bool | true | Enable readinessProbe |
readinessProbe.failureThreshold | int | 6 | Failure threshold for readinessProbe |
readinessProbe.initialDelaySeconds | int | 30 | Initial delay seconds for readinessProbe |
readinessProbe.path | string | "/" | Request path for readinessProbe |
readinessProbe.periodSeconds | int | 5 | Period seconds for readinessProbe |
readinessProbe.successThreshold | int | 1 | Success threshold for readinessProbe |
readinessProbe.timeoutSeconds | int | 3 | Timeout seconds for readinessProbe |
replicaCount | int | 1 | Number of replicas |
resources.limits | object | {} | Pod limit |
resources.requests | object | {} | Pod requests |
runtimeClassName | string | "" | Specify runtime class |
securityContext | object | {} | Container Security Context |
service.annotations | object | {} | Annotations to add to the service |
service.nodePort | int | 31434 | Service node port when service type is 'NodePort' |
service.port | int | 11434 | Service port |
service.type | string | "ClusterIP" | Service type |
serviceAccount.annotations | object | {} | Annotations to add to the service account |
serviceAccount.automount | bool | true | Whether to automatically mount a ServiceAccount’s API credentials |
serviceAccount.create | bool | true | Whether a service account should be created |
serviceAccount.name | string | "" | The name of the service account to use. If not set and 'create' is 'true', a name is generated using the full name template. |
tolerations | list | [] | Tolerations for pod assignment |
topologySpreadConstraints | object | {} | Topology Spread Constraints for pod assignment |
updateStrategy | object | {"type":""} | How to replace existing pods. |
updateStrategy.type | string | "" | Can be 'Recreate' or 'RollingUpdate'; default is 'RollingUpdate' |
volumeMounts | list | [] | Additional volumeMounts on the output Deployment definition |
volumes | list | [] | Additional volumes on the output Deployment definition |
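As an illustration of the values in the table above, the following override fragment exposes the Ollama API through a NodePort service and sets explicit resource requests. All concrete values here are illustrative, not recommendations.

```yaml
# Sketch built from the values table; all numbers are illustrative.
service:
  type: "NodePort"   # service.type, default "ClusterIP"
  nodePort: 31434    # service.nodePort, used when type is NodePort
  port: 11434        # service.port, the Ollama API port
resources:
  requests:
    cpu: "2"
    memory: 8Gi
replicaCount: 1
```

Apply the fragment together with your other overrides via the -f option of the helm command.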
5.5 Installing Open WebUI #
Open WebUI is a user-friendly web interface for interacting with Large Language Models (LLMs). It supports various LLM runners, including Ollama and vLLM.
5.5.1 Details about the Open WebUI application #
Before deploying Open WebUI, it is important to know more about the supported configurations and documentation. The following command provides the corresponding details:
> helm show values oci://dp.apps.rancher.io/charts/open-webui

Alternatively, you can refer to the Open WebUI Helm chart page on the SUSE Application Collection site at https://apps.rancher.io/applications/open-webui. It contains the available versions and the link to pull the Open WebUI container image.
5.5.2 Open WebUI installation procedure #
Before the installation, you need to get user access to the SUSE Application Collection and SUSE Registry, create a Kubernetes namespace, and log in to the Helm registry as described in Section 5.1, “Installation procedure”.
cert-manager must be installed. If it is not already present from a previous Open WebUI release, install it by following the steps in Section 5.2, “Installing cert-manager”.
Create the owui_custom_overrides.yaml file to override the values of the parent Helm chart. The file contains URLs for Milvus and Ollama, and specifies whether a stand-alone Ollama deployment is used or whether Ollama is installed as part of the Open WebUI installation. Find more details in Section 5.5.4, “Examples of Open WebUI Helm chart override files”. For a list of all installation options with examples, refer to Section 5.5.5, “Values for the Open WebUI Helm chart”.

Install the Open WebUI Helm chart using the owui_custom_overrides.yaml override file.

> helm upgrade --install \
  open-webui charts/open-webui-<X.Y.Z>.tgz \
  -n <SUSE_AI_NAMESPACE> \
  --version <X.Y.Z> -f owui_custom_overrides.yaml
5.5.3 Uninstalling Open WebUI #
To uninstall Open WebUI, run the following command:
> helm uninstall open-webui -n <SUSE_AI_NAMESPACE>

5.5.4 Examples of Open WebUI Helm chart override files #
To override the default values during the Helm chart installation or update, you can create an override YAML file with custom values. Then, apply these values by specifying the path to the override file with the -f option of the helm command.
Remember to replace <SUSE_AI_NAMESPACE> with your Kubernetes namespace.
The following override file installs Ollama during the Open WebUI installation.
global:
imagePullSecrets:
- application-collection
imageRegistry: <LOCAL_DOCKER_REGISTRY_URL>:5043
ollamaUrls:
- http://open-webui-ollama.<SUSE_AI_NAMESPACE>.svc.cluster.local:11434
persistence:
enabled: true
storageClass: local-path 1
ollama:
enabled: true
ingress:
enabled: false
defaultModel: "gemma:2b"
ollama:
models: 2
pull:
- "gemma:2b"
- "llama3.1"
gpu: 3
enabled: true
type: 'nvidia'
number: 1
persistentVolume: 4
enabled: true
storageClass: local-path
pipelines:
enabled: true
persistence:
storageClass: local-path
extraEnvVars: 5
- name: PIPELINES_URLS 6
value: "https://raw.githubusercontent.com/SUSE/suse-ai-observability-extension/refs/heads/main/integrations/oi-filter/suse_ai_filter.py"
- name: OTEL_SERVICE_NAME 7
value: "Open WebUI"
- name: OTEL_EXPORTER_HTTP_OTLP_ENDPONT 8
value: "http://opentelemetry-collector.suse-observability.svc.cluster.local:4318"
- name: PRICING_JSON 9
value: "https://raw.githubusercontent.com/SUSE/suse-ai-observability-extension/refs/heads/main/integrations/oi-filter/pricing.json"
ingress:
enabled: true
class: ""
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "1024m"
host: suse-ollama-webui 10
tls: true
extraEnvVars:
- name: DEFAULT_MODELS 11
value: "gemma:2b"
- name: DEFAULT_USER_ROLE
value: "user"
- name: WEBUI_NAME
value: "SUSE AI"
- name: GLOBAL_LOG_LEVEL
value: INFO
- name: RAG_EMBEDDING_MODEL
value: "sentence-transformers/all-MiniLM-L6-v2"
- name: VECTOR_DB
value: "milvus"
- name: MILVUS_URI
value: http://milvus.<SUSE_AI_NAMESPACE>.svc.cluster.local:19530
- name: INSTALL_NLTK_DATASETS 12
value: "true"
- name: OMP_NUM_THREADS
value: "1"
- name: OPENAI_API_KEY 13
    value: "0p3n-w3bu!"

1. Use the local-path storage class for testing purposes. For production environments, use an enterprise-class storage solution such as SUSE Storage with the longhorn storage class.
2. Specifies that two large language models (LLM) will be loaded in Ollama when the container starts.
3. Enables GPU support for Ollama.
4. Without the persistentVolume option enabled, pulled models do not persist and must be downloaded again after a pod restart.
5. The environment variables that you are making available for the pipeline’s runtime container.
6. A list of pipeline URLs to be downloaded and installed by default. Individual URLs are separated by a semicolon.
7. The service name that appears in traces and topological representations in SUSE Observability.
8. The endpoint for the OpenTelemetry collector. Make sure to use the HTTP port of your collector.
9. A file for the model multipliers in cost estimation. You can customize it to match your actual infrastructure experimentally. For air-gapped deployments, you need to provide the pipelines at URLs that are accessible from the local host, such as an internal GitLab instance.
10. Specifies the host name for the Open WebUI Web UI.
11. Specifies the default LLM for Ollama.
12. Installs the natural language toolkit (NLTK) datasets for Open WebUI. Refer to https://www.nltk.org/index.html for licensing information.
13. API key value for communication between Open WebUI and Open WebUI Pipelines. The default value is '0p3n-w3bu!'.
The following override file installs Ollama separately from the Open WebUI installation.
global:
imagePullSecrets:
- application-collection
imageRegistry: <LOCAL_DOCKER_REGISTRY_URL>:5043
ollamaUrls:
- http://ollama.<SUSE_AI_NAMESPACE>.svc.cluster.local:11434
persistence:
enabled: true
storageClass: local-path 1
ollama:
enabled: false
pipelines:
  enabled: false
persistence:
storageClass: local-path 2
ingress:
enabled: true
class: ""
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
host: suse-ollama-webui
tls: true
extraEnvVars:
- name: DEFAULT_MODELS 3
value: "gemma:2b"
- name: DEFAULT_USER_ROLE
value: "user"
- name: WEBUI_NAME
value: "SUSE AI"
- name: GLOBAL_LOG_LEVEL
value: INFO
- name: RAG_EMBEDDING_MODEL
value: "sentence-transformers/all-MiniLM-L6-v2"
- name: VECTOR_DB
value: "milvus"
- name: MILVUS_URI
value: http://milvus.<SUSE_AI_NAMESPACE>.svc.cluster.local:19530
- name: ENABLE_OTEL 4
value: "true"
- name: OTEL_EXPORTER_OTLP_ENDPOINT 5
value: http://opentelemetry-collector.observability.svc.cluster.local:4317 6
- name: OMP_NUM_THREADS
    value: "1"

1. Use the local-path storage class for testing purposes. For production environments, use an enterprise-class storage solution such as SUSE Storage with the longhorn storage class.
2. Use the local-path storage class for testing purposes. For production environments, set it to the storage class of your enterprise storage solution, such as longhorn.
3. Specifies the default LLM for Ollama.
4. These values are optional, required only to receive telemetry data from Open WebUI.
5. These values are optional, required only to receive telemetry data from Open WebUI.
6. The URL of the OpenTelemetry Collector installed by the user.
The following override file installs Ollama separately and enables Open WebUI pipelines. This simple filter adds a limit to the number of question and answer turns during the LLM chat.
Pipelines normally require additional configuration provided either via environment variables or specified in the Open WebUI Web UI.
global:
imagePullSecrets:
- application-collection
ollamaUrls:
- http://ollama.<SUSE_AI_NAMESPACE>.svc.cluster.local:11434
persistence:
enabled: true
storageClass: local-path
ollama:
enabled: false
pipelines:
enabled: true
persistence:
storageClass: local-path
extraEnvVars:
- name: PIPELINES_URLS 1
value: "https://raw.githubusercontent.com/SUSE/suse-ai-observability-extension/refs/heads/main/integrations/oi-filter/conversation_turn_limit_filter.py"
ingress:
enabled: true
class: ""
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
host: suse-ollama-webui
tls: true
[...]

1. A list of pipeline URLs to be downloaded and installed by default. Individual URLs are separated by a semicolon.
5.5.5 Values for the Open WebUI Helm chart #
To override the default values during the Helm chart installation or update, you can create an override YAML file with custom values. Then, apply these values by specifying the path to the override file with the -f option of the helm command.
Remember to replace <SUSE_AI_NAMESPACE> with your Kubernetes namespace.
| Key | Type | Default | Description |
|---|---|---|---|
affinity | object | {} | Affinity for pod assignment |
annotations | object | {} | |
cert-manager.enabled | bool | true | |
clusterDomain | string | "cluster.local" | Value of cluster domain |
containerSecurityContext | object | {} | Configure container security context, see https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container. |
extraEnvVars | list | [{"name":"OPENAI_API_KEY", "value":"0p3n-w3bu!"}] | Environment variables added to the Open WebUI deployment. Most up-to-date environment variables can be found in Environment Variable Configuration. |
extraEnvVars[0] | object | {"name":"OPENAI_API_KEY","value":"0p3n-w3bu!"} | Default API key value for Pipelines. It should be updated in a production deployment and changed to the required API key if not using Pipelines. |
global.imagePullSecrets | list | [] | Global override for container image registry pull secrets |
global.imageRegistry | string | "" | Global override for container image registry |
global.tls.additionalTrustedCAs | bool | false | |
global.tls.issuerName | string | "suse-private-ai" | |
global.tls.letsEncrypt.email | string | "" | |
global.tls.letsEncrypt.environment | string | "staging" | |
global.tls.letsEncrypt.ingress.class | string | "" | |
global.tls.source | string | "suse-private-ai" | The source of Open WebUI TLS keys, see Section 5.5.5.1, “TLS sources”. |
image.pullPolicy | string | "IfNotPresent" | Image pull policy to use for the Open WebUI container |
image.registry | string | "dp.apps.rancher.io" | Image registry to use for the Open WebUI container |
image.repository | string | "containers/open-webui" | Image repository to use for the Open WebUI container |
image.tag | string | "0.3.32" | Image tag to use for the Open WebUI container |
imagePullSecrets | list | [] | Configure imagePullSecrets to use private registry, see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/. |
ingress.annotations | object | {"nginx.ingress.kubernetes.io/ssl-redirect":"true"} | Use appropriate annotations for your Ingress controller, such as |
ingress.class | string | "" | |
ingress.enabled | bool | true | |
ingress.existingSecret | string | "" | |
ingress.host | string | "" | |
ingress.tls | bool | true | |
nameOverride | string | "" | |
nodeSelector | object | {} | Node labels for pod assignment |
ollama.enabled | bool | true | Automatically install the Ollama Helm chart from oci://dp.apps.rancher.io/charts/ollama. Configure the following Helm values. |
ollama.fullnameOverride | string | "open-webui-ollama" | If enabling embedded Ollama, update fullnameOverride to your desired Ollama name value, or else it will use the default ollama.name value from the Ollama chart. |
ollamaUrls | list | [] | A list of Ollama API endpoints. These can be added instead of automatically installing the Ollama Helm chart, or in addition to it. |
openaiBaseApiUrl | string | "" | OpenAI base API URL to use. Defaults to the Pipelines service endpoint when Pipelines are enabled, or to the public OpenAI API URL otherwise. |
persistence.accessModes | list | ["ReadWriteOnce"] | If using multiple replicas, you must update accessModes to ReadWriteMany. |
persistence.annotations | object | {} | |
persistence.enabled | bool | true | |
persistence.existingClaim | string | "" | Use existingClaim to reuse an existing Open WebUI PVC instead of creating a new one. |
persistence.selector | object | {} | |
persistence.size | string | "2Gi" | |
persistence.storageClass | string | "" | |
pipelines.enabled | bool | false | Automatically install Pipelines chart to extend Open WebUI functionality using Pipelines. |
pipelines.extraEnvVars | list | [] | This section can be used to pass the required environment variables to your pipelines (such as the Langfuse host name). |
podAnnotations | object | {} | |
podSecurityContext | object | {} | Configure pod security context, see https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container. |
replicaCount | int | 1 | |
resources | object | {} | |
service | object | {"annotations":{},"containerPort":8080, "labels":{},"loadBalancerClass":"", "nodePort":"","port":80,"type":"ClusterIP"} | Service values to expose Open WebUI pods to cluster |
tolerations | list | [] | Tolerations for pod assignment |
topologySpreadConstraints | list | [] | Topology Spread Constraints for pod assignment |
5.5.5.1 TLS sources #
There are three recommended ways in which Open WebUI can obtain TLS certificates for secure communication.
- Self-signed TLS certificate

  This is the default method. You need to install cert-manager on the cluster to issue and maintain the certificates. This method generates a CA and signs the Open WebUI certificate using the CA. cert-manager then manages the signed certificate. For this method, use the following Helm chart option:

  global.tls.source=suse-private-ai

- Let’s Encrypt

  This method also uses cert-manager, but combines it with a special issuer for Let’s Encrypt that performs all actions, including request and validation, to get the Let’s Encrypt certificate issued. This configuration uses HTTP validation (HTTP-01), and therefore the load balancer must have a public DNS record and be accessible from the Internet. For this method, use the following Helm chart option:

  global.tls.source=letsEncrypt

- Provide your own certificate

  This method allows you to bring your own signed certificate to secure the HTTPS traffic. In this case, you must upload this certificate and associated key as PEM-encoded files named tls.crt and tls.key. For this method, use the following Helm chart option:

  global.tls.source=secret
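For the global.tls.source=secret method, the certificate and key are typically supplied as a Kubernetes TLS secret. The following is a minimal sketch; the secret name shown here is an assumption for illustration, so check the ingress.existingSecret value your deployment expects.

```yaml
# Sketch: a TLS secret holding the PEM-encoded certificate and key.
# The secret name "open-webui-tls" is an illustrative assumption.
apiVersion: v1
kind: Secret
metadata:
  name: open-webui-tls
  namespace: <SUSE_AI_NAMESPACE>
type: kubernetes.io/tls
data:
  tls.crt: <BASE64_ENCODED_CERTIFICATE>
  tls.key: <BASE64_ENCODED_KEY>
```

Alternatively, the same secret can be created directly from the PEM files with kubectl create secret tls.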
