Applies to SUSE AI 1.0

3 SUSE AI architecture

SUSE AI is a cloud native solution that comprises multiple software building blocks. These blocks include the Linux operating system, a Kubernetes cluster with a Web UI management layer, tools for GPU support, and other containerized applications for monitoring and security. The SUSE Application Collection provides a set of AI-related workloads called AI Library.

3.1 SUSE AI building blocks

Linux operating system

The underlying operating system with the optional NVIDIA GPU driver installed. SUSE Linux Enterprise Server is the preferred choice. If you require an immutable operating system, SLE Micro is the recommended alternative.

Kubernetes cluster

A Kubernetes cluster managed by SUSE Rancher Prime, which provides container and application lifecycle management. We recommend using the SUSE Rancher Prime: RKE2 (https://documentation.suse.com/cloudnative/rke2/) distribution.

NVIDIA GPU Operator

Automates the management of the NVIDIA software components, such as drivers and the device plug-in, that make the NVIDIA GPU's computing power available to AI-related workloads in the cluster.

SUSE Security

Secures the platform with container security, vulnerability scanning and compliance enforcement.

SUSE Observability

Provides advanced performance monitoring and observability for the platform and the workloads running on it.

SUSE Storage

An enterprise-grade, cloud native storage solution that provides persistent storage for containerized workloads.

SUSE Virtualization

Enables running virtual machine workloads alongside containers.

SUSE Multi-Linux Manager

Manages updates, patches and configuration for multiple Linux distributions from a single place.

SUSE Application Collection

Provides curated, trusted, compliant and up-to-date applications for Kubernetes. Learn more on its dedicated Web site and in the product summary.

Figure 3.1: Basic schema of SUSE AI

3.2 AI Library workloads

The following is a list of AI applications that you can find in the SUSE Application Collection. For a complete and up-to-date list, refer to https://apps.rancher.io/stacks/suse-ai.

cert-manager

An extensible X.509 certificate controller for Kubernetes workloads.
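
For illustration, the following Python sketch creates a self-signed Issuer and a matching Certificate through the Kubernetes API, assuming cert-manager is already installed in the cluster and kubeconfig access is available. The object names, namespace and DNS name are placeholders, not part of SUSE AI itself.

  # Hedged sketch: create cert-manager custom resources with the
  # official Kubernetes Python client. Names and namespace are
  # placeholders for illustration.
  from kubernetes import client, config

  config.load_kube_config()
  api = client.CustomObjectsApi()

  issuer = {
      "apiVersion": "cert-manager.io/v1",
      "kind": "Issuer",
      "metadata": {"name": "selfsigned-issuer", "namespace": "default"},
      "spec": {"selfSigned": {}},
  }
  certificate = {
      "apiVersion": "cert-manager.io/v1",
      "kind": "Certificate",
      "metadata": {"name": "demo-cert", "namespace": "default"},
      "spec": {
          "secretName": "demo-cert-tls",
          "dnsNames": ["demo.example.com"],
          "issuerRef": {"name": "selfsigned-issuer", "kind": "Issuer"},
      },
  }

  for obj in (issuer, certificate):
      api.create_namespaced_custom_object(
          group="cert-manager.io", version="v1", namespace="default",
          plural=obj["kind"].lower() + "s", body=obj,
      )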

OpenSearch

A search and analytics suite for searching, analyzing and visualizing data.
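
As a minimal illustration, the following Python sketch indexes and searches a document with the opensearch-py client. The host, credentials and index name are assumptions for a local test instance.

  # Hedged sketch using the opensearch-py client. Host, port,
  # credentials and index name are placeholders for illustration.
  from opensearchpy import OpenSearch

  client = OpenSearch(
      hosts=[{"host": "localhost", "port": 9200}],
      http_auth=("admin", "admin"),   # replace with your credentials
      use_ssl=True,
      verify_certs=False,             # local test instance only
  )

  # Index a document, then run a full-text search against it.
  client.index(index="ai-notes", id="1",
               body={"title": "SUSE AI", "text": "Cloud native AI stack"})
  results = client.search(index="ai-notes",
                          body={"query": {"match": {"text": "AI"}}})
  print(results["hits"]["hits"])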

Milvus

A vector database built for generative AI applications that scales with minimal performance loss.
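
The following Python sketch shows the basic workflow with the pymilvus client, assuming a Milvus instance on its default port. The collection name and vectors are toy placeholders.

  # Hedged sketch using the pymilvus MilvusClient.
  from pymilvus import MilvusClient

  client = MilvusClient(uri="http://localhost:19530")
  client.create_collection(collection_name="demo", dimension=4)

  # Insert a few toy vectors, then run a similarity search.
  client.insert(collection_name="demo", data=[
      {"id": 1, "vector": [0.1, 0.2, 0.3, 0.4]},
      {"id": 2, "vector": [0.4, 0.3, 0.2, 0.1]},
  ])
  hits = client.search(collection_name="demo",
                       data=[[0.1, 0.2, 0.3, 0.4]], limit=1)
  print(hits)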

Ollama

A platform that simplifies the installation and management of large language models (LLM) on local devices.
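
As an illustration, the following Python sketch queries Ollama's REST API, assuming Ollama listens on its default port 11434 and that the referenced model has already been pulled.

  # Hedged sketch: one non-streaming generation request against the
  # Ollama REST API. The model name is a placeholder.
  import requests

  response = requests.post(
      "http://localhost:11434/api/generate",
      json={"model": "llama3.2",
            "prompt": "Summarize SUSE AI in one sentence.",
            "stream": False},
  )
  print(response.json()["response"])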

Open WebUI

An extensible Web user interface for the Ollama LLM runner.

vLLM

A high-performance inference and serving engine for large language models (LLMs).
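
The following Python sketch shows offline batch inference with vLLM. The model name is a placeholder; any model supported by vLLM works the same way.

  # Hedged sketch of vLLM offline inference with a small toy model.
  from vllm import LLM, SamplingParams

  llm = LLM(model="facebook/opt-125m")
  params = SamplingParams(temperature=0.8, max_tokens=64)

  outputs = llm.generate(["What is a vector database?"], params)
  for output in outputs:
      print(output.outputs[0].text)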

mcpo

The MCP-to-OpenAPI proxy server provided by Open WebUI.
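
Because mcpo generates its REST endpoints from whatever MCP server it proxies, there is no fixed API to show. The following Python sketch is therefore hypothetical: it assumes mcpo runs on port 8000 in front of an MCP time server that exposes a get_current_time tool.

  # Hypothetical sketch of calling a tool through a running mcpo proxy.
  # The port, tool path and payload depend entirely on the MCP server
  # that mcpo wraps; none of them are fixed by mcpo itself.
  import requests

  resp = requests.post(
      "http://localhost:8000/get_current_time",   # hypothetical endpoint
      json={"timezone": "Europe/Berlin"},         # hypothetical argument
  )
  print(resp.json())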

PyTorch

An open source machine learning framework.
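
As a minimal illustration, the following sketch performs one gradient-descent step on a toy linear model, using PyTorch's autograd and optimizer APIs.

  # Minimal sketch: one training step on random toy data.
  import torch

  model = torch.nn.Linear(3, 1)
  optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
  loss_fn = torch.nn.MSELoss()

  x = torch.randn(8, 3)            # toy input batch
  y = torch.randn(8, 1)            # toy targets

  optimizer.zero_grad()
  loss = loss_fn(model(x), y)
  loss.backward()                  # compute gradients via autograd
  optimizer.step()                 # update the weights
  print(loss.item())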

MLflow

An open source platform to manage the machine learning lifecycle, including experimentation, reproducibility, deployment and a central model registry.
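
The following minimal sketch logs a parameter and a metric to a tracking run. By default, MLflow writes to a local ./mlruns directory; a remote tracking server can be configured with mlflow.set_tracking_uri() instead.

  # Minimal sketch of MLflow experiment tracking.
  import mlflow

  with mlflow.start_run():
      mlflow.log_param("learning_rate", 0.1)
      mlflow.log_metric("loss", 0.42)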

Qdrant

A vector database and similarity search engine for storing, searching and managing high-dimensional vectors.
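
The following Python sketch shows the basic workflow with qdrant-client, assuming a Qdrant instance on its default port. The collection name, vectors and payload are toy placeholders.

  # Hedged sketch using qdrant-client.
  from qdrant_client import QdrantClient
  from qdrant_client.models import Distance, PointStruct, VectorParams

  client = QdrantClient(url="http://localhost:6333")
  client.create_collection(
      collection_name="demo",
      vectors_config=VectorParams(size=4, distance=Distance.COSINE),
  )

  # Upsert a point, then search for its nearest neighbor.
  client.upsert(collection_name="demo", points=[
      PointStruct(id=1, vector=[0.1, 0.2, 0.3, 0.4], payload={"doc": "hello"}),
  ])
  hits = client.search(collection_name="demo",
                       query_vector=[0.1, 0.2, 0.3, 0.4], limit=1)
  print(hits)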

LiteLLM

An open source LLM proxy and abstraction layer that lets you interact with many large language model providers through a single, OpenAI-compatible API.
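
As an illustration, the following sketch sends an OpenAI-style completion() call through LiteLLM; only the model string changes between providers. Pointing it at a local Ollama instance is an assumption made for this example.

  # Hedged sketch of LiteLLM's unified completion interface.
  from litellm import completion

  response = completion(
      model="ollama/llama3.2",                 # provider/model placeholder
      messages=[{"role": "user", "content": "Hello!"}],
      api_base="http://localhost:11434",       # assumed local Ollama
  )
  print(response.choices[0].message.content)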