3 SUSE AI architecture
SUSE AI is a cloud native solution that comprises multiple software building blocks. These blocks include the Linux operating system, a Kubernetes cluster with a Web UI management layer, tools for GPU support, and other containerized applications for monitoring and security. The SUSE Application Collection includes a set of AI-related workloads called AI Library.
3.1 SUSE AI building blocks
- Linux operating system
The underlying operating system with the optional NVIDIA GPU driver installed. SUSE Linux Enterprise Server is the preferred choice. If you require an immutable operating system, SLE Micro is the recommended alternative.
- Kubernetes cluster
A Kubernetes cluster managed by SUSE Rancher Prime, which provides container and application lifecycle management. We recommend using the SUSE Rancher Prime: RKE2 (https://documentation.suse.com/cloudnative/rke2/) distribution.
- NVIDIA GPU Operator
Makes the computing power of NVIDIA GPUs available to the cluster for processing AI-related tasks.
- SUSE Security
For security and compliance.
- SUSE Observability
Provides advanced performance and data monitoring.
- SUSE Storage
Enterprise-grade storage solution.
- SUSE Virtualization
For virtualized workloads.
- SUSE Multi-Linux Manager
For managing multiple Linux distributions.
- SUSE Application Collection
Provides curated, trusted, compliant and up-to-date applications for Kubernetes. Learn more on its dedicated Web site and in the product summary.
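As a sketch of how the recommended Kubernetes layer from the list above is brought up, the RKE2 quick-start installation on a server node can be summarized as follows. This follows the RKE2 quick-start flow; consult the linked RKE2 documentation for the channel and version appropriate to your release:

```shell
# Download and run the RKE2 installer on the first server node.
curl -sfL https://get.rke2.io | sh -

# Enable and start the RKE2 server service.
systemctl enable --now rke2-server.service

# Point kubectl at the kubeconfig that RKE2 generates.
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
kubectl get nodes
```

Once the cluster is running, it can be imported into SUSE Rancher Prime for lifecycle management, and the remaining building blocks are deployed on top of it.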
3.2 AI Library workloads
The following is a list of AI applications that you can find in the SUSE Application Collection. For a complete and up-to-date list, refer to https://apps.rancher.io/stacks/suse-ai.
- cert-manager
An extensible X.509 certificate controller for Kubernetes workloads.
- OpenSearch
A search and analytics suite for analyzing and visualizing search data.
- Milvus
A vector database built for generative AI applications that scales with minimal performance loss.
- Ollama
A platform that simplifies the installation and management of large language models (LLM) on local devices.
- Open WebUI
An extensible Web user interface for the Ollama LLM runner.
- vLLM
A high-performance inference and serving engine for large language models (LLMs).
- mcpo
The MCP-to-OpenAPI proxy server provided by Open WebUI.
- PyTorch
An open source machine learning framework.
- MLflow
An open source platform to manage the machine learning lifecycle, including experimentation, reproducibility, deployment and a central model registry.
- Qdrant
A vector database and similarity search engine for storing, searching and managing high-dimensional vectors.
- LiteLLM
An open source LLM proxy and abstraction layer that lets you interact with many large language model providers through a single, OpenAI-compatible API.
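The workloads above are distributed as Helm charts through the SUSE Application Collection's OCI registry. A minimal sketch of installing one of them (Ollama) is shown below; the registry host `dp.apps.rancher.io`, the chart path and the namespace name are assumptions based on the collection's documented layout, so verify them at https://apps.rancher.io before use:

```shell
# Authenticate against the Application Collection registry.
# Credentials come from your apps.rancher.io account; the
# registry host is an assumption -- verify it in the collection UI.
helm registry login dp.apps.rancher.io -u USERNAME -p ACCESS_TOKEN

# Install the Ollama chart into its own namespace. The chart path
# shown here is an assumption -- confirm it in the collection UI.
helm install ollama oci://dp.apps.rancher.io/charts/ollama \
  --namespace suse-ai --create-namespace
```

Other AI Library workloads, such as Open WebUI or Milvus, follow the same pattern with their respective chart names and values overrides.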
