Applies to SUSE AI 1.0

3 SUSE AI architecture

SUSE AI is a cloud native solution that comprises multiple software building blocks. These blocks include the Linux operating system, a Kubernetes cluster with a Web UI management layer, tools for GPU support, and other containerized applications for monitoring and security. The SUSE Application Collection provides a set of AI-related workloads called AI Library.

3.1 SUSE AI building blocks

Linux operating system

The underlying operating system with the optional NVIDIA GPU driver installed. SUSE Linux Enterprise Server is the preferred choice. If you require an immutable operating system, SLE Micro is the recommended alternative.

Kubernetes cluster

A Kubernetes cluster managed by SUSE Rancher Prime, which provides container and application lifecycle management. We recommend using the SUSE Rancher Prime: RKE2 (https://documentation.suse.com/cloudnative/rke2/) distribution.

NVIDIA GPU Operator

Automates the management of the NVIDIA software components, such as drivers and the device plug-in, that make the NVIDIA GPU's computing power available to AI-related workloads in the cluster.

SUSE Security

Secures the platform with container security, vulnerability scanning and compliance enforcement.

SUSE Observability

Provides advanced performance monitoring and observability for the platform and the workloads running on it.

SUSE Storage

An enterprise-grade, cloud native storage solution that provides persistent storage for containerized workloads.

SUSE Virtualization

Enables running virtual machine workloads alongside containers.

SUSE Multi-Linux Manager

Manages updates, patches and configuration for multiple Linux distributions from a single place.

SUSE Application Collection

Provides curated, trusted, compliant and up-to-date applications for Kubernetes. Learn more on its dedicated Web site and in the product summary.

Figure 3.1: Basic schema of SUSE AI

3.2 AI Library workloads

The following is a list of AI applications that you can find in the SUSE Application Collection. For a complete and up-to-date list, refer to https://apps.rancher.io/stacks/suse-ai.

cert-manager

An extensible X.509 certificate controller for Kubernetes workloads.
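
For illustration, the following Python sketch creates a self-signed Issuer and a matching Certificate through the Kubernetes API, assuming cert-manager is already installed in the cluster and kubeconfig access is available. The object names, namespace and DNS name are placeholders, not part of SUSE AI itself.

  # Hedged sketch: create cert-manager custom resources with the
  # official Kubernetes Python client. Names and namespace are
  # placeholders for illustration.
  from kubernetes import client, config

  config.load_kube_config()
  api = client.CustomObjectsApi()

  issuer = {
      "apiVersion": "cert-manager.io/v1",
      "kind": "Issuer",
      "metadata": {"name": "selfsigned-issuer", "namespace": "default"},
      "spec": {"selfSigned": {}},
  }
  certificate = {
      "apiVersion": "cert-manager.io/v1",
      "kind": "Certificate",
      "metadata": {"name": "demo-cert", "namespace": "default"},
      "spec": {
          "secretName": "demo-cert-tls",
          "dnsNames": ["demo.example.com"],
          "issuerRef": {"name": "selfsigned-issuer", "kind": "Issuer"},
      },
  }

  for obj in (issuer, certificate):
      api.create_namespaced_custom_object(
          group="cert-manager.io", version="v1", namespace="default",
          plural=obj["kind"].lower() + "s", body=obj,
      )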

OpenSearch

A search and analytics suite for searching, analyzing and visualizing data.
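
As a minimal illustration, the following Python sketch indexes and searches a document with the opensearch-py client. The host, credentials and index name are assumptions for a local test instance.

  # Hedged sketch using the opensearch-py client. Host, port,
  # credentials and index name are placeholders for illustration.
  from opensearchpy import OpenSearch

  client = OpenSearch(
      hosts=[{"host": "localhost", "port": 9200}],
      http_auth=("admin", "admin"),   # replace with your credentials
      use_ssl=True,
      verify_certs=False,             # local test instance only
  )

  # Index a document, then run a full-text search against it.
  client.index(index="ai-notes", id="1",
               body={"title": "SUSE AI", "text": "Cloud native AI stack"})
  results = client.search(index="ai-notes",
                          body={"query": {"match": {"text": "AI"}}})
  print(results["hits"]["hits"])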

Milvus

A vector database built for generative AI applications that scales with minimal performance loss.
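
The following Python sketch shows the basic workflow with the pymilvus client, assuming a Milvus instance on its default port. The collection name and vectors are toy placeholders.

  # Hedged sketch using the pymilvus MilvusClient.
  from pymilvus import MilvusClient

  client = MilvusClient(uri="http://localhost:19530")
  client.create_collection(collection_name="demo", dimension=4)

  # Insert a few toy vectors, then run a similarity search.
  client.insert(collection_name="demo", data=[
      {"id": 1, "vector": [0.1, 0.2, 0.3, 0.4]},
      {"id": 2, "vector": [0.4, 0.3, 0.2, 0.1]},
  ])
  hits = client.search(collection_name="demo",
                       data=[[0.1, 0.2, 0.3, 0.4]], limit=1)
  print(hits)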

Ollama

A platform that simplifies the installation and management of large language models (LLM) on local devices.
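
As an illustration, the following Python sketch queries Ollama's REST API, assuming Ollama listens on its default port 11434 and that the referenced model has already been pulled.

  # Hedged sketch: one non-streaming generation request against the
  # Ollama REST API. The model name is a placeholder.
  import requests

  response = requests.post(
      "http://localhost:11434/api/generate",
      json={"model": "llama3.2",
            "prompt": "Summarize SUSE AI in one sentence.",
            "stream": False},
  )
  print(response.json()["response"])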

Open WebUI

An extensible Web user interface for the Ollama LLM runner.

vLLM

A high-performance inference and serving engine for large language models (LLMs).
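
The following Python sketch shows offline batch inference with vLLM. The model name is a placeholder; any model supported by vLLM works the same way.

  # Hedged sketch of vLLM offline inference with a small toy model.
  from vllm import LLM, SamplingParams

  llm = LLM(model="facebook/opt-125m")
  params = SamplingParams(temperature=0.8, max_tokens=64)

  outputs = llm.generate(["What is a vector database?"], params)
  for output in outputs:
      print(output.outputs[0].text)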

mcpo

The MCP-to-OpenAPI proxy server provided by Open WebUI.
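
Because mcpo generates its REST endpoints from whatever MCP server it proxies, there is no fixed API to show. The following Python sketch is therefore hypothetical: it assumes mcpo runs on port 8000 in front of an MCP time server that exposes a get_current_time tool.

  # Hypothetical sketch of calling a tool through a running mcpo proxy.
  # The port, tool path and payload depend entirely on the MCP server
  # that mcpo wraps; none of them are fixed by mcpo itself.
  import requests

  resp = requests.post(
      "http://localhost:8000/get_current_time",   # hypothetical endpoint
      json={"timezone": "Europe/Berlin"},         # hypothetical argument
  )
  print(resp.json())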

PyTorch

An open source machine learning framework.
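
As a minimal illustration, the following sketch performs one gradient-descent step on a toy linear model, using PyTorch's autograd and optimizer APIs.

  # Minimal sketch: one training step on random toy data.
  import torch

  model = torch.nn.Linear(3, 1)
  optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
  loss_fn = torch.nn.MSELoss()

  x = torch.randn(8, 3)            # toy input batch
  y = torch.randn(8, 1)            # toy targets

  optimizer.zero_grad()
  loss = loss_fn(model(x), y)
  loss.backward()                  # compute gradients via autograd
  optimizer.step()                 # update the weights
  print(loss.item())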

MLflow

An open source platform to manage the machine learning lifecycle, including experimentation, reproducibility, deployment and a central model registry.
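
The following minimal sketch logs a parameter and a metric to a tracking run. By default, MLflow writes to a local ./mlruns directory; a remote tracking server can be configured with mlflow.set_tracking_uri() instead.

  # Minimal sketch of MLflow experiment tracking.
  import mlflow

  with mlflow.start_run():
      mlflow.log_param("learning_rate", 0.1)
      mlflow.log_metric("loss", 0.42)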

Qdrant

A vector database and similarity search engine for storing, searching and managing high-dimensional vectors.
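
The following Python sketch shows the basic workflow with qdrant-client, assuming a Qdrant instance on its default port. The collection name, vectors and payload are toy placeholders.

  # Hedged sketch using qdrant-client.
  from qdrant_client import QdrantClient
  from qdrant_client.models import Distance, PointStruct, VectorParams

  client = QdrantClient(url="http://localhost:6333")
  client.create_collection(
      collection_name="demo",
      vectors_config=VectorParams(size=4, distance=Distance.COSINE),
  )

  # Upsert a point, then search for its nearest neighbor.
  client.upsert(collection_name="demo", points=[
      PointStruct(id=1, vector=[0.1, 0.2, 0.3, 0.4], payload={"doc": "hello"}),
  ])
  hits = client.search(collection_name="demo",
                       query_vector=[0.1, 0.2, 0.3, 0.4], limit=1)
  print(hits)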

LiteLLM

An open source LLM proxy and abstraction layer that lets you interact with many large language model providers through a single, OpenAI-compatible API.
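
As an illustration, the following sketch sends an OpenAI-style completion() call through LiteLLM; only the model string changes between providers. Pointing it at a local Ollama instance is an assumption made for this example.

  # Hedged sketch of LiteLLM's unified completion interface.
  from litellm import completion

  response = completion(
      model="ollama/llama3.2",                 # provider/model placeholder
      messages=[{"role": "user", "content": "Hello!"}],
      api_base="http://localhost:11434",       # assumed local Ollama
  )
  print(response.choices[0].message.content)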