Applies to SUSE AI 1.0

4 Monitoring Milvus

Milvus is monitored by scraping its Prometheus-compatible metrics endpoint. The SUSE Observability Extension uses these metrics to visualize Milvus’s status and activity.

4.1 Scraping the metrics (recommended)

Add the following job to the scrape_configs section of your OpenTelemetry Collector’s configuration. It instructs the collector to scrape the /metrics endpoint of the Milvus service every 15 seconds.

config:
  receivers:
    prometheus:
      config:
        scrape_configs:
          - job_name: 'milvus'
            scrape_interval: 15s
            metrics_path: '/metrics'
            static_configs:
            - targets: ['milvus.SUSE_AI_NAMESPACE.svc.cluster.local:9091']

The target is your Milvus service metrics endpoint. The example milvus.SUSE_AI_NAMESPACE.svc.cluster.local:9091 is a common default, but verify that it matches the service name and namespace of your installation.
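The scrape job only takes effect if the prometheus receiver is part of a metrics pipeline. The fragment below is a minimal sketch of such a pipeline, not the configuration shipped with SUSE AI. The batch processor and the otlp exporter are assumptions standing in for whatever processors and exporters your collector already defines.

```yaml
config:
  service:
    pipelines:
      metrics:
        # The prometheus receiver carries the 'milvus' scrape job.
        receivers: [prometheus]
        # Assumption: a batch processor is defined under config.processors.
        processors: [batch]
        # Assumption: an otlp exporter is defined under config.exporters
        # and points at your observability back end.
        exporters: [otlp]
```

If your collector already has a metrics pipeline, keep its existing processors and exporters and only make sure that prometheus appears in the receivers list.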

4.2 Tracing (advanced)

Milvus can also export detailed tracing data.

Important: High data volume

Enabling tracing in Milvus can generate a large amount of data. We recommend configuring sampling at the collector level to avoid performance issues and high storage costs.
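One way to sample at the collector level is the probabilistic_sampler processor from the OpenTelemetry Collector contrib distribution. The fragment below is a sketch under assumptions: the 10 percent rate is an arbitrary illustration, and the jaeger receiver, batch processor and otlp exporter names stand in for the components defined in your own collector configuration.

```yaml
config:
  processors:
    probabilistic_sampler:
      # Illustrative value: keep roughly 10% of traces.
      sampling_percentage: 10
  service:
    pipelines:
      traces:
        # Assumption: a jaeger receiver accepts the spans sent by Milvus.
        receivers: [jaeger]
        # Sample before batching so that dropped spans are never batched.
        processors: [probabilistic_sampler, batch]
        # Assumption: an otlp exporter is defined under config.exporters.
        exporters: [otlp]
```

Start with a low percentage and raise it only if the retained traces are too sparse to diagnose problems.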

To enable tracing, configure the following settings in your Milvus Helm chart values:

extraConfigFiles:
  user.yaml: |+
    trace:
      exporter: jaeger
      sampleFraction: 1
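      jaeger:
        url: "http://opentelemetry-collector.observability.svc.cluster.local:14268/api/traces"

Set url to the address of the OpenTelemetry Collector that you installed. The host name in the example assumes a service named opentelemetry-collector in the observability namespace.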
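The URL above uses port 14268 and the /api/traces path, which correspond to the Jaeger Thrift over HTTP protocol. The collector therefore needs a Jaeger receiver listening on that port. The fragment below is a sketch that assumes the contrib distribution of the collector, which includes the jaeger receiver. The receiver must also be listed in a traces pipeline, and the collector's Kubernetes service must expose port 14268.

```yaml
config:
  receivers:
    jaeger:
      protocols:
        thrift_http:
          # Listen on all interfaces so that Milvus pods can reach the receiver.
          endpoint: 0.0.0.0:14268
```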