Applies to SUSE CaaS Platform 4.5.2

8 Monitoring

8.1 Monitoring Stack

Important

The monitoring approach described in this document is a generalized example of one way of monitoring a SUSE CaaS Platform cluster.

Please apply best practices to develop your own monitoring approach, using the described examples and available health checking endpoints.

8.1.1 Introduction

This document aims to describe monitoring in a Kubernetes cluster.

The monitoring stack consists of a monitoring/trending system and a visualization platform. Additionally, you can use the in-memory metrics-server to perform automatic scaling (refer to Section 8.3, “Horizontal Pod Autoscaler”).

  • Prometheus

    Prometheus is an open-source monitoring and trending system with a dimensional data model, flexible query language, efficient time series database and modern alerting approach. The time series collection happens via a pull model over HTTP.

    Prometheus consists of multiple components:

    • Prometheus server: scrapes and stores data in the time series database

    • Alertmanager handles client alerts, sanitizes duplicates and noise and routes them to configurable receivers.

    • Pushgateway is an intermediate service which allows you to push metrics from jobs which cannot be scraped.

    Note

    Deploying Prometheus Pushgateway is out of the scope of this document.

    • Exporters are libraries that help export existing metrics from third-party systems as Prometheus metrics.

  • Grafana

    Grafana is an open-source system for querying, analyzing and visualizing metrics.

8.1.2 Prerequisites

  1. NGINX Ingress Controller

    Please refer to Section 6.8, “NGINX Ingress Controller” on how to configure ingress in your cluster. Deploying the NGINX Ingress Controller also allows us to provide TLS termination for our services and basic authentication for the Prometheus Expression browser/API.

  2. Monitoring namespace

    We will deploy the monitoring stack in its own namespace, so first create one.

    kubectl create namespace monitoring
  3. Configure Authentication

    We need to create a basic-auth secret so the NGINX Ingress Controller can perform authentication.

    Install apache2-utils, which contains htpasswd, on your local workstation.

    zypper in apache2-utils

    Create the secret file auth

    htpasswd -c auth admin
    New password:
    Re-type new password:
    Adding password for user admin
    Important

    It is very important that the filename is auth. During creation, a key named after the filename is created inside the secret; the ingress controller will expect a key named auth. When you access the monitoring web UI, you will need to enter this username and password.

    Create the secret in the Kubernetes cluster

    kubectl create secret generic -n monitoring prometheus-basic-auth --from-file=auth
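
    To confirm that the key inside the secret is named auth as expected, you can optionally inspect the secret data (a quick sanity check, not part of the original procedure):

    kubectl -n monitoring get secret prometheus-basic-auth -o jsonpath='{.data}'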

8.1.3 Installation

There are two different ways of using ingress to access the monitoring system: via subdomains or via subpaths.

8.1.3.1 Installation For Subdomains

This installation example shows how to install and configure Prometheus and Grafana using subdomains such as prometheus.example.com, prometheus-alertmanager.example.com, and grafana.example.com.

Important

In order to provide additional security by using TLS certificates, please make sure you have the NGINX Ingress Controller (Section 6.8) installed and configured.

If you do not need TLS, you may use other methods to expose these web services, such as native LBaaS in OpenStack, an HAProxy service, or Kubernetes-native methods such as port-forwarding or NodePort, but these are out of the scope of this document.

8.1.3.2 Create DNS entries

In this example, we will use a master node with the IP address 10.86.4.158, assuming the Ingress Controller is exposed via a NodePort service.

Note

You should configure proper DNS names in any production environment. These values are only for example purposes.

  1. Configure the DNS server

    monitoring.example.com                      IN  A       10.86.4.158
    prometheus.example.com                      IN  CNAME   monitoring.example.com
    prometheus-alertmanager.example.com         IN  CNAME   monitoring.example.com
    grafana.example.com                         IN  CNAME   monitoring.example.com
  2. Configure the management workstation /etc/hosts (optional)

    10.86.4.158 prometheus.example.com prometheus-alertmanager.example.com grafana.example.com
8.1.3.2.1 TLS Certificate

You must configure your certificates for the components as secrets in the Kubernetes cluster. Get certificates from your certificate authority.

  1. Individual certificate

    A single-name TLS certificate protects a single subdomain, which means each subdomain has its own private key. From a security perspective, individual certificates are recommended; however, you have to manage the private key and the certificate rotation for each one separately.

    Important: Note Down Secret Names For Configuration

    When you choose to secure each service with an individual certificate, you must repeat the step below for each component and adjust the name for the individual secret each time. Please note down the names of the secrets you have created.

    In this example, the secret name is monitoring-tls.

  2. Wildcard certificate

    A wildcard TLS certificate allows you to secure multiple subdomains with one certificate, which means those subdomains share the same private key. You can then add more subdomains without having to redeploy the certificate and, moreover, save the cost of additional certificates.

Refer to Section 6.9.9.1.1, “Trusted Server Certificate” on how to sign a trusted certificate, or to Section 6.9.9.2.2, “Self-signed Server Certificate” on how to sign a self-signed certificate. For individual certificates, set DNS.1 in server.conf to prometheus.example.com, prometheus-alertmanager.example.com, and grafana.example.com respectively. For a wildcard certificate, set DNS.1 to *.example.com.
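
As an illustration, and assuming the server.conf layout from Section 6.9.9, the alternative names section for the wildcard variant would contain something like:

[ alt_names ]
DNS.1 = *.example.com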

Then, import your certificate and key pair into the Kubernetes cluster as a secret named monitoring-tls. In this example, the certificate and key are monitoring.crt and monitoring.key.

kubectl create -n monitoring secret tls monitoring-tls  \
--key  ./monitoring.key \
--cert ./monitoring.crt
8.1.3.2.2 Prometheus
  1. Create a configuration file prometheus-config-values.yaml

    We need to configure the storage for our deployment. Choose among the options and uncomment the line in the config file. In production environments you must configure persistent storage.

    • Use an existing PersistentVolumeClaim

    • Use a StorageClass (preferred)
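
    If you opt for a StorageClass, the my-storage-class value in the configuration below is a placeholder; you can list the storage classes actually available in your cluster with:

    kubectl get storageclass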

    # Alertmanager configuration
    alertmanager:
      enabled: true
      ingress:
        enabled: true
        hosts:
        -  prometheus-alertmanager.example.com
        annotations:
          kubernetes.io/ingress.class: nginx
          nginx.ingress.kubernetes.io/auth-type: basic
          nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
          nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
        tls:
          - hosts:
            - prometheus-alertmanager.example.com
            secretName: monitoring-tls
      persistentVolume:
        enabled: true
        ## Use a StorageClass
        storageClass: my-storage-class
        ## Create a PersistentVolumeClaim of 2Gi
        size: 2Gi
        ## Use an existing PersistentVolumeClaim (my-pvc)
        #existingClaim: my-pvc
    
    ## Alertmanager is configured through alertmanager.yml. This file and any others
    ## listed in alertmanagerFiles will be mounted into the alertmanager pod.
    ## See configuration options https://prometheus.io/docs/alerting/configuration/
    #alertmanagerFiles:
    #  alertmanager.yml:
    
    # Create a specific service account
    serviceAccounts:
      nodeExporter:
        name: prometheus-node-exporter
    
    # Node tolerations for node-exporter scheduling to nodes with taints
    # Allow scheduling of node-exporter on master nodes
    nodeExporter:
      hostNetwork: false
      hostPID: false
      podSecurityPolicy:
        enabled: true
        annotations:
          apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
          apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
          seccomp.security.alpha.kubernetes.io/allowedProfileNames: runtime/default
          seccomp.security.alpha.kubernetes.io/defaultProfileName: runtime/default
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
    
    # Disable Pushgateway
    pushgateway:
      enabled: false
    
    # Prometheus configuration
    server:
      ingress:
        enabled: true
        hosts:
        - prometheus.example.com
        annotations:
          kubernetes.io/ingress.class: nginx
          nginx.ingress.kubernetes.io/auth-type: basic
          nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
          nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
        tls:
          - hosts:
            - prometheus.example.com
            secretName: monitoring-tls
      persistentVolume:
        enabled: true
        ## Use a StorageClass
        storageClass: my-storage-class
        ## Create a PersistentVolumeClaim of 8Gi
        size: 8Gi
        ## Use an existing PersistentVolumeClaim (my-pvc)
        #existingClaim: my-pvc
    
    ## Prometheus is configured through prometheus.yml. This file and any others
    ## listed in serverFiles will be mounted into the server pod.
    ## See configuration options
    ## https://prometheus.io/docs/prometheus/latest/configuration/configuration/
    #serverFiles:
    #  prometheus.yml:
  2. Add SUSE helm charts repository

    helm repo add suse https://kubernetes-charts.suse.com
  3. Deploy the SUSE prometheus helm chart and pass our configuration values file.

    helm install prometheus suse/prometheus \
    --namespace monitoring \
    --values prometheus-config-values.yaml

    All of the following pods need to be running (there are 3 node-exporter pods because this example cluster has 3 nodes).

    kubectl -n monitoring get pod | grep prometheus
    NAME                                             READY     STATUS    RESTARTS   AGE
    prometheus-alertmanager-5487596d54-kcdd6         2/2       Running   0          2m
    prometheus-kube-state-metrics-566669df8c-krblx   1/1       Running   0          2m
    prometheus-node-exporter-jnc5w                   1/1       Running   0          2m
    prometheus-node-exporter-qfwp9                   1/1       Running   0          2m
    prometheus-node-exporter-sc4ls                   1/1       Running   0          2m
    prometheus-server-6488f6c4cd-5n9w8               2/2       Running   0          2m

    There need to be 2 ingresses configured:

    kubectl get ingress -n monitoring
    NAME                      HOSTS                                 ADDRESS   PORTS     AGE
    prometheus-alertmanager   prometheus-alertmanager.example.com             80, 443   87s
    prometheus-server         prometheus.example.com                          80, 443   87s
  4. At this stage, the Prometheus Expression browser/API should be accessible, depending on your network configuration

    • NodePort: https://prometheus.example.com:32443

    • External IPs: https://prometheus.example.com

    • LoadBalancer: https://prometheus.example.com

8.1.3.2.3 Alertmanager Configuration Example

The configuration example sets one "receiver" that gets notified by email when one of the conditions below is met:

  • Node is unschedulable: severity is critical because the node cannot accept new pods

  • Node runs out of disk space: severity is critical because the node cannot accept new pods

  • Node has memory pressure: severity is warning

  • Node has disk pressure: severity is warning

  • A certificate is going to expire in 7 days: severity is critical

  • A certificate is going to expire in 30 days: severity is warning

  • A certificate is going to expire in 3 months: severity is info

    1. Configure alerting receiver in Alertmanager

      The Alertmanager handles alerts sent by the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration, such as email. It also takes care of silencing and inhibition of alerts.

      Add the alertmanagerFiles section to your Prometheus configuration file prometheus-config-values.yaml.

      For more information on how to configure Alertmanager, refer to Prometheus: Alerting - Configuration.

      alertmanagerFiles:
        alertmanager.yml:
          global:
            # The smarthost and SMTP sender used for mail notifications.
            smtp_from: alertmanager@example.com
            smtp_smarthost: smtp.example.com:587
            smtp_auth_username: admin@example.com
            smtp_auth_password: <PASSWORD>
            smtp_require_tls: true
      
          route:
            # The labels by which incoming alerts are grouped together.
            group_by: ['node']
      
            # When a new group of alerts is created by an incoming alert, wait at
            # least 'group_wait' to send the initial notification.
            # This way ensures that you get multiple alerts for the same group that start
            # firing shortly after another are batched together on the first
            # notification.
            group_wait: 30s
      
            # When the first notification was sent, wait 'group_interval' to send a batch
            # of new alerts that started firing for that group.
            group_interval: 5m
      
            # If an alert has successfully been sent, wait 'repeat_interval' to
            # resend them.
            repeat_interval: 3h
      
            # A default receiver
            receiver: admin-example
      
          receivers:
          - name: 'admin-example'
            email_configs:
            - to: 'admin@example.com'
    2. Configure alerting rules in the Prometheus server

      Replace the serverFiles section of the Prometheus configuration file prometheus-config-values.yaml.

      For more information on how to configure alerts, refer to: Prometheus: Alerting - Notification Template Examples

      serverFiles:
        alerts: {}
        rules:
          groups:
          - name: caasp.node.rules
            rules:
            - alert: NodeIsNotReady
              expr: kube_node_status_condition{condition="Ready",status="false"} == 1 or kube_node_status_condition{condition="Ready",status="unknown"} == 1
              for: 1m
              labels:
                severity: critical
              annotations:
                description: '{{ $labels.node }} is not ready'
            - alert: NodeIsOutOfDisk
              expr: kube_node_status_condition{condition="OutOfDisk",status="true"} == 1
              labels:
                severity: critical
              annotations:
                description: '{{ $labels.node }} has insufficient free disk space'
            - alert: NodeHasDiskPressure
              expr: kube_node_status_condition{condition="DiskPressure",status="true"} == 1
              labels:
                severity: warning
              annotations:
                description: '{{ $labels.node }} has insufficient available disk space'
            - alert: NodeHasInsufficientMemory
              expr: kube_node_status_condition{condition="MemoryPressure",status="true"} == 1
              labels:
                severity: warning
              annotations:
                description: '{{ $labels.node }} has insufficient available memory'
          - name: caasp.certs.rules
            rules:
            - alert: KubernetesCertificateExpiry3Months
              expr: (cert_exporter_cert_expires_in_seconds / 86400) < 90
              labels:
                severity: info
              annotations:
                description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 3 months'
            - alert: KubernetesCertificateExpiry30Days
              expr: (cert_exporter_cert_expires_in_seconds / 86400) < 30
              labels:
                severity: warning
              annotations:
                description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 30 days'
            - alert: KubernetesCertificateExpiry7Days
              expr: (cert_exporter_cert_expires_in_seconds / 86400) < 7
              labels:
                severity: critical
              annotations:
                description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 7 days'
            - alert: KubeconfigCertificateExpiry3Months
              expr: (cert_exporter_kubeconfig_expires_in_seconds / 86400) < 90
              labels:
                severity: info
              annotations:
                description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 3 months'
            - alert: KubeconfigCertificateExpiry30Days
              expr: (cert_exporter_kubeconfig_expires_in_seconds / 86400) < 30
              labels:
                severity: warning
              annotations:
                description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 30 days'
            - alert: KubeconfigCertificateExpiry7Days
              expr: (cert_exporter_kubeconfig_expires_in_seconds / 86400) < 7
              labels:
                severity: critical
              annotations:
                description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 7 days'
            - alert: AddonCertificateExpiry3Months
              expr: (cert_exporter_secret_expires_in_seconds / 86400) < 90
              labels:
                severity: info
              annotations:
                description: 'The cert for {{ $labels.secret_name }} is going to expire in 3 months'
            - alert: AddonCertificateExpiry30Days
              expr: (cert_exporter_secret_expires_in_seconds / 86400) < 30
              labels:
                severity: warning
              annotations:
                description: 'The cert for {{ $labels.secret_name }} is going to expire in 30 days'
            - alert: AddonCertificateExpiry7Days
              expr: (cert_exporter_secret_expires_in_seconds / 86400) < 7
              labels:
                severity: critical
              annotations:
                description: 'The cert for {{ $labels.secret_name }} is going to expire in 7 days'
    3. To apply the changed configuration, run:

      helm upgrade prometheus suse/prometheus --namespace monitoring --values prometheus-config-values.yaml
    4. You should now be able to see your Alertmanager, depending on your network configuration

  • NodePort: https://prometheus-alertmanager.example.com:32443

  • External IPs: https://prometheus-alertmanager.example.com

  • LoadBalancer: https://prometheus-alertmanager.example.com

8.1.3.2.4 Recording Rules Configuration Example

Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their result as a new set of time series. Querying the precomputed result will then often be much faster than executing the original expression every time it is needed. This is especially useful for dashboards, which need to query the same expression repeatedly every time they refresh. Another common use case is federation where precomputed metrics are scraped from one Prometheus instance by another.

For more information on how to configure recording rules, refer to Prometheus: Recording Rules - Configuration.

  1. Configuring recording rules

    Add the following group of rules in the serverFiles section of the prometheus-config-values.yaml configuration file.

    serverFiles:
      alerts: {}
      rules:
        groups:
        - name: node-exporter.rules
          rules:
          - expr: count by (instance) (count without (mode) (node_cpu_seconds_total{component="node-exporter"}))
            record: instance:node_num_cpu:sum
          - expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{component="node-exporter",mode="idle"}[5m]))
            record: instance:node_cpu_utilisation:rate5m
          - expr: node_load1{component="node-exporter"} / on (instance) instance:node_num_cpu:sum
            record: instance:node_load1_per_cpu:ratio
          - expr: node_memory_MemAvailable_bytes / on (instance) node_memory_MemTotal_bytes
            record: instance:node_memory_utilisation:ratio
          - expr: rate(node_vmstat_pgmajfault{component="node-exporter"}[5m])
            record: instance:node_vmstat_pgmajfault:rate5m
          - expr: rate(node_disk_io_time_seconds_total{component="node-exporter", device=~"nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+"}[5m])
            record: instance_device:node_disk_io_time_seconds:rate5m
          - expr: rate(node_disk_io_time_weighted_seconds_total{component="node-exporter", device=~"nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+"}[5m])
            record: instance_device:node_disk_io_time_weighted_seconds:rate5m
          - expr: sum by (instance) (rate(node_network_receive_bytes_total{component="node-exporter", device!="lo"}[5m]))
            record: instance:node_network_receive_bytes_excluding_lo:rate5m
          - expr: sum by (instance) (rate(node_network_transmit_bytes_total{component="node-exporter", device!="lo"}[5m]))
            record: instance:node_network_transmit_bytes_excluding_lo:rate5m
          - expr: sum by (instance) (rate(node_network_receive_drop_total{component="node-exporter", device!="lo"}[5m]))
            record: instance:node_network_receive_drop_excluding_lo:rate5m
          - expr: sum by (instance) (rate(node_network_transmit_drop_total{component="node-exporter", device!="lo"}[5m]))
            record: instance:node_network_transmit_drop_excluding_lo:rate5m
  2. To apply the changed configuration, run:

    helm upgrade prometheus suse/prometheus --namespace monitoring --values prometheus-config-values.yaml
  3. You should now be able to see your configured rules, depending on your network configuration

    • NodePort: https://prometheus.example.com:32443/rules

    • External IPs: https://prometheus.example.com/rules

    • LoadBalancer: https://prometheus.example.com/rules
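
Once loaded, the precomputed series can be queried like any other metric. For example, the following expression (using one of the rules recorded above) returns the nodes whose CPU utilization exceeded 80% over the last five minutes:

instance:node_cpu_utilisation:rate5m > 0.8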

8.1.3.2.5 Grafana

Starting with Grafana 5.0, it is possible to dynamically provision data sources and dashboards via files. In a Kubernetes cluster, these files are provided via ConfigMaps; editing a ConfigMap modifies the configuration without having to delete and recreate the pod.

  1. Configure Grafana provisioning

    Create the default datasource configuration file grafana-datasources.yaml, which points to our Prometheus server

    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: grafana-datasources
      namespace: monitoring
      labels:
         grafana_datasource: "1"
    data:
      datasource.yaml: |-
        apiVersion: 1
        deleteDatasources:
          - name: Prometheus
            orgId: 1
        datasources:
        - name: Prometheus
          type: prometheus
          url: http://prometheus-server.monitoring.svc.cluster.local:80
          access: proxy
          orgId: 1
          isDefault: true
  2. Create the ConfigMap in the Kubernetes cluster

    kubectl create -f grafana-datasources.yaml
  3. Configure storage for the deployment

    Choose among the options and uncomment the line in the config file. In production environments you must configure persistent storage.

    • Use an existing PersistentVolumeClaim

    • Use a StorageClass (preferred)

      Create a file grafana-config-values.yaml with the appropriate values

      # Configure admin password
      adminPassword: <PASSWORD>
      
      # Ingress configuration
      ingress:
        enabled: true
        annotations:
          kubernetes.io/ingress.class: nginx
        hosts:
          - grafana.example.com
        tls:
          - hosts:
            - grafana.example.com
            secretName: monitoring-tls
      
      # Configure persistent storage
      persistence:
        enabled: true
        accessModes:
          - ReadWriteOnce
        ## Use a StorageClass
        storageClassName: my-storage-class
        ## Create a PersistentVolumeClaim of 10Gi
        size: 10Gi
        ## Use an existing PersistentVolumeClaim (my-pvc)
        #existingClaim: my-pvc
      
      # Enable sidecar for provisioning
      sidecar:
        datasources:
          enabled: true
          label: grafana_datasource
        dashboards:
          enabled: true
          label: grafana_dashboard
  4. Add SUSE helm charts repository

    helm repo add suse https://kubernetes-charts.suse.com
  5. Deploy the SUSE grafana helm chart and pass our configuration values file

    helm install grafana suse/grafana \
    --namespace monitoring \
    --values grafana-config-values.yaml
  6. The result should be a running Grafana pod

    kubectl -n monitoring get pod | grep grafana
    NAME                                             READY     STATUS    RESTARTS   AGE
    grafana-dbf7ddb7d-fxg6d                          3/3       Running   0          2m
  7. At this stage, Grafana should be accessible, depending on your network configuration

    • NodePort: https://grafana.example.com:32443

    • External IPs: https://grafana.example.com

    • LoadBalancer: https://grafana.example.com

  8. Now you can add Grafana dashboards.

8.1.3.2.6 Adding Grafana Dashboards

There are three ways to add dashboards to Grafana:

  • Deploy an existing dashboard from Grafana dashboards

    1. Open the deployed Grafana in your browser and log in.

    2. On the home page of Grafana, hover your mouse cursor over the + button on the left sidebar and click on the Import menu item.

    3. Select an existing dashboard for your purpose from Grafana dashboards. Copy the URL to the clipboard.

    4. Paste the URL (for example, https://grafana.com/dashboards/3131) into the first input field to import the "Kubernetes All Nodes" Grafana dashboard. After pasting in the URL, the view will change to another form.

    5. Now select the "Prometheus" datasource in the prometheus field and click on the Import button.

    6. The browser will redirect you to your newly created dashboard.

  • Use our pre-built dashboards to monitor the SUSE CaaS Platform system

    # monitor SUSE CaaS Platform cluster
    kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-cluster.yaml
    # monitor SUSE CaaS Platform etcd cluster
    kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-etcd-cluster.yaml
    # monitor SUSE CaaS Platform nodes
    kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-nodes.yaml
    # monitor SUSE CaaS Platform namespaces
    kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-namespaces.yaml
    # monitor SUSE CaaS Platform pods
    kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-pods.yaml
    # monitor SUSE CaaS Platform certificates
    kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-certificates.yaml
  • Build your own dashboard: deploy your own dashboard with a configuration file containing the dashboard definition.

    1. Create your dashboard definition file as a ConfigMap, for example grafana-dashboards-caasp-cluster.yaml.

      ---
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: grafana-dashboards-caasp-cluster
        namespace: monitoring
        labels:
           grafana_dashboard: "1"
      data:
        caasp-cluster.json: |-
          {
            "__inputs": [
              {
                "name": "DS_PROMETHEUS",
                "label": "Prometheus",
                "description": "",
                "type": "datasource",
                "pluginId": "prometheus",
                "pluginName": "Prometheus"
              }
            ],
            "__requires": [
              {
                "type": "grafana",
      [...]
      continues with definition of dashboard JSON
      [...]
    2. Apply the ConfigMap to the cluster.

      kubectl apply -f grafana-dashboards-caasp-cluster.yaml

8.1.3.3 Installation For Subpaths

This installation example shows how to install and configure Prometheus and Grafana using subpaths such as example.com/prometheus, example.com/alertmanager, and example.com/grafana.

Important

Instructions that overlap with the subdomain installation are omitted below. Refer to the subdomain instructions above.

8.1.3.4 Create DNS entries

In this example, we will use a master node with the IP address 10.86.4.158, assuming the Ingress Controller is exposed via a NodePort service.

Note

You should configure proper DNS names in any production environment. These values are only for example purposes.

  1. Configure the DNS server

    example.com                      IN  A       10.86.4.158
  2. Configure the management workstation /etc/hosts (optional)

    10.86.4.158 example.com
8.1.3.4.1 TLS Certificate

You must configure your certificates for the components as secrets in the Kubernetes cluster. Get certificates from your certificate authority.

Refer to Section 6.9.9.1.1, “Trusted Server Certificate” on how to sign a trusted certificate, or to Section 6.9.9.2.2, “Self-signed Server Certificate” on how to sign a self-signed certificate. Set DNS.1 in server.conf to example.com.

Then, import your certificate and key pair into the Kubernetes cluster as a secret named monitoring-tls. In this example, the certificate and key are monitoring.crt and monitoring.key.

kubectl create -n monitoring secret tls monitoring-tls  \
--key  ./monitoring.key \
--cert ./monitoring.crt
8.1.3.4.2 Prometheus
  1. Create a configuration file prometheus-config-values.yaml

    We need to configure the storage for our deployment. Choose among the options and uncomment the line in the config file. In production environments you must configure persistent storage.

    # Alertmanager configuration
    alertmanager:
      enabled: true
      baseURL: https://example.com:32443/alertmanager
      prefixURL: /alertmanager
      ingress:
        enabled: true
        annotations:
          kubernetes.io/ingress.class: nginx
          nginx.ingress.kubernetes.io/auth-type: basic
          nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
          nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
        hosts:
          - example.com/alertmanager
        tls:
          - secretName: monitoring-tls
            hosts:
            - example.com
      persistentVolume:
        enabled: true
        ## Use a StorageClass
        storageClass: my-storage-class
        ## Create a PersistentVolumeClaim of 2Gi
        size: 2Gi
        ## Use an existing PersistentVolumeClaim (my-pvc)
        #existingClaim: my-pvc
    
    ## Alertmanager is configured through alertmanager.yml. This file and any others
    ## listed in alertmanagerFiles will be mounted into the alertmanager pod.
    ## See configuration options https://prometheus.io/docs/alerting/configuration/
    #alertmanagerFiles:
    #  alertmanager.yml:
    
    # Create a specific service account
    serviceAccounts:
      nodeExporter:
        name: prometheus-node-exporter
    
    # Node tolerations for node-exporter scheduling to nodes with taints
    # Allow scheduling of node-exporter on master nodes
    nodeExporter:
      hostNetwork: false
      hostPID: false
      podSecurityPolicy:
        enabled: true
        annotations:
          apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
          apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
          seccomp.security.alpha.kubernetes.io/allowedProfileNames: runtime/default
          seccomp.security.alpha.kubernetes.io/defaultProfileName: runtime/default
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
    
    # Disable Pushgateway
    pushgateway:
      enabled: false
    
    # Prometheus configuration
    server:
      baseURL: https://example.com:32443/prometheus
      prefixURL: /prometheus
      ingress:
        enabled: true
        annotations:
          kubernetes.io/ingress.class: nginx
          nginx.ingress.kubernetes.io/auth-type: basic
          nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
          nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
        hosts:
          - example.com/prometheus
        tls:
          - secretName: monitoring-tls
            hosts:
            - example.com
      persistentVolume:
        enabled: true
        ## Use a StorageClass
        storageClass: my-storage-class
        ## Create a PersistentVolumeClaim of 8Gi
        size: 8Gi
        ## Use an existing PersistentVolumeClaim (my-pvc)
        #existingClaim: my-pvc
    
    ## Prometheus is configured through prometheus.yml. This file and any others
    ## listed in serverFiles will be mounted into the server pod.
    ## See configuration options
    ## https://prometheus.io/docs/prometheus/latest/configuration/configuration/
    #serverFiles:
    #  prometheus.yml:
  2. Add SUSE helm charts repository

    helm repo add suse https://kubernetes-charts.suse.com
  3. Deploy the SUSE prometheus helm chart and pass our configuration values file.

    helm install prometheus suse/prometheus \
    --namespace monitoring \
    --values prometheus-config-values.yaml

    All of the following pods need to be running (there are 3 node-exporter pods because this example cluster has 3 nodes).

    kubectl -n monitoring get pod | grep prometheus
    NAME                                             READY     STATUS    RESTARTS   AGE
    prometheus-alertmanager-5487596d54-kcdd6         2/2       Running   0          2m
    prometheus-kube-state-metrics-566669df8c-krblx   1/1       Running   0          2m
    prometheus-node-exporter-jnc5w                   1/1       Running   0          2m
    prometheus-node-exporter-qfwp9                   1/1       Running   0          2m
    prometheus-node-exporter-sc4ls                   1/1       Running   0          2m
    prometheus-server-6488f6c4cd-5n9w8               2/2       Running   0          2m
8.1.3.4.3 Alertmanager Configuration Example

Refer to Section 8.1.3.2.3, “Alertmanager Configuration Example”

8.1.3.4.4 Recording Rules Configuration Example

Refer to Section 8.1.3.2.4, “Recording Rules Configuration Example”

8.1.3.4.5 Grafana

Starting with Grafana 5.0, it is possible to dynamically provision data sources and dashboards via files. In a Kubernetes cluster, these files are provided via ConfigMaps; editing a ConfigMap modifies the configuration without having to delete and recreate the pod.

  1. Configure Grafana provisioning

    Create the default datasource configuration file grafana-datasources.yaml, which points to our Prometheus server

    ---
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: grafana-datasources
      namespace: monitoring
      labels:
         grafana_datasource: "1"
    data:
      datasource.yaml: |-
        apiVersion: 1
        deleteDatasources:
          - name: Prometheus
            orgId: 1
        datasources:
        - name: Prometheus
          type: prometheus
          url: http://prometheus-server.monitoring.svc.cluster.local:80
          access: proxy
          orgId: 1
          isDefault: true
  2. Create the ConfigMap in the Kubernetes cluster

    kubectl create -f grafana-datasources.yaml
  3. Configure storage for the deployment

    Choose among the options and uncomment the line in the config file. In production environments you must configure persistent storage.

    Create a file grafana-config-values.yaml with the appropriate values

    # Configure admin password
    adminPassword: <PASSWORD>
    
    # Ingress configuration
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
        nginx.ingress.kubernetes.io/rewrite-target: /
      hosts:
        - example.com
      path: /grafana
      tls:
        - secretName: monitoring-tls
          hosts:
          - example.com
    
    # subpath for grafana
    grafana.ini:
      server:
        root_url: https://example.com:32443/grafana
    
    # Configure persistent storage
    persistence:
      enabled: true
      accessModes:
        - ReadWriteOnce
      ## Use a StorageClass
      storageClassName: my-storage-class
      ## Create a PersistentVolumeClaim of 10Gi
      size: 10Gi
      ## Use an existing PersistentVolumeClaim (my-pvc)
      #existingClaim: my-pvc
    
    # Enable sidecar for provisioning
    sidecar:
      datasources:
        enabled: true
        label: grafana_datasource
      dashboards:
        enabled: true
        label: grafana_dashboard
  4. Add SUSE helm charts repository

    helm repo add suse https://kubernetes-charts.suse.com
  5. Deploy the SUSE grafana helm chart and pass our configuration values file

    helm install grafana suse/grafana \
    --namespace monitoring \
    --values grafana-config-values.yaml
  6. The result should be a running Grafana pod

    kubectl -n monitoring get pod | grep grafana
    NAME                                             READY     STATUS    RESTARTS   AGE
    grafana-dbf7ddb7d-fxg6d                          3/3       Running   0          2m
  7. Access Prometheus, Alertmanager, and Grafana

    At this stage, the Prometheus Expression browser/API, Alertmanager, and Grafana should be accessible, depending on your network configuration

    • Prometheus Expression browser/API

      • NodePort: https://example.com:32443/prometheus

      • External IPs: https://example.com/prometheus

      • LoadBalancer: https://example.com/prometheus

    • Alertmanager

      • NodePort: https://example.com:32443/alertmanager

      • External IPs: https://example.com/alertmanager

      • LoadBalancer: https://example.com/alertmanager

    • Grafana

      • NodePort: https://example.com:32443/grafana

      • External IPs: https://example.com/grafana

      • LoadBalancer: https://example.com/grafana

  8. Now you can add the Grafana dashboards.

8.1.3.4.6 Adding Grafana Dashboards

Refer to Section 8.1.3.2.6, “Adding Grafana Dashboards”

8.1.4 Monitoring

8.1.4.1 Prometheus Jobs

The Prometheus SUSE helm chart includes the following predefined jobs, which scrape metrics from their targets using service discovery:

  • prometheus: Get metrics from prometheus server

  • kubernetes-apiservers: Get metrics from Kubernetes apiserver

  • kubernetes-nodes: Get metrics from Kubernetes nodes

  • kubernetes-service-endpoints: Get metrics from Services that have the annotation prometheus.io/scrape=true in their metadata

  • kubernetes-pods: Get metrics from Pods that have the annotation prometheus.io/scrape=true in their metadata

If you want to monitor new pods and services, you do not need to change prometheus.yaml. Instead, add the annotations prometheus.io/scrape=true, prometheus.io/port=<TARGET_PORT> and prometheus.io/path=<METRIC_ENDPOINT> to your pods' and services' metadata. Prometheus will then automatically scrape these targets.
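
For illustration, here is a minimal sketch of a Service annotated for scraping; the name, port, and path are placeholders for your own workload:

apiVersion: v1
kind: Service
metadata:
  name: my-app                      # placeholder service name
  annotations:
    prometheus.io/scrape: "true"    # opt this service in for scraping
    prometheus.io/port: "8080"      # port serving the metrics
    prometheus.io/path: "/metrics"  # path of the metrics endpoint
spec:
  selector:
    app: my-app
  ports:
  - port: 8080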

8.1.4.2 ETCD Cluster

The etcd server exposes metrics on the /metrics endpoint, which the predefined Prometheus jobs do not scrape by default. Edit the prometheus-config-values.yaml file if you want to monitor the etcd cluster. Since the etcd cluster is served over HTTPS only, we need to create a client certificate to access the endpoint.

  1. Create a new etcd client certificate signed by etcd CA cert/key pair:

    cat << EOF > <CLUSTER_NAME>/pki/etcd/openssl-monitoring-client.conf
    [req]
    distinguished_name = req_distinguished_name
    req_extensions = v3_req
    prompt = no
    
    [v3_req]
    keyUsage = digitalSignature,keyEncipherment
    extendedKeyUsage = clientAuth
    
    [req_distinguished_name]
    O = system:masters
    CN = kube-etcd-monitoring-client
    EOF
    
    openssl req -nodes -new -newkey rsa:2048 -config <CLUSTER_NAME>/pki/etcd/openssl-monitoring-client.conf -out <CLUSTER_NAME>/pki/etcd/monitoring-client.csr -keyout <CLUSTER_NAME>/pki/etcd/monitoring-client.key
    openssl x509 -req -days 365 -CA <CLUSTER_NAME>/pki/etcd/ca.crt -CAkey <CLUSTER_NAME>/pki/etcd/ca.key -CAcreateserial -in <CLUSTER_NAME>/pki/etcd/monitoring-client.csr -out <CLUSTER_NAME>/pki/etcd/monitoring-client.crt -sha256 -extfile <CLUSTER_NAME>/pki/etcd/openssl-monitoring-client.conf -extensions v3_req
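
    As an optional sanity check before importing the new client certificate, you can verify that it chains to the etcd CA:

    openssl verify -CAfile <CLUSTER_NAME>/pki/etcd/ca.crt <CLUSTER_NAME>/pki/etcd/monitoring-client.crt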
  2. Store the etcd client certificate in a secret in the monitoring namespace:

    kubectl -n monitoring create secret generic etcd-certs --from-file=<CLUSTER_NAME>/pki/etcd/ca.crt --from-file=<CLUSTER_NAME>/pki/etcd/monitoring-client.crt --from-file=<CLUSTER_NAME>/pki/etcd/monitoring-client.key
  3. Get the private IP addresses of all etcd cluster members:

    kubectl get pods -n kube-system -l component=etcd -o wide
    NAME           READY   STATUS    RESTARTS   AGE   IP             NODE      NOMINATED NODE   READINESS GATES
    etcd-master0   1/1     Running   2          21h   192.168.0.6    master0   <none>           <none>
    etcd-master1   1/1     Running   2          21h   192.168.0.20   master1   <none>           <none>
  4. Edit the configuration file prometheus-config-values.yaml: add the extraSecretMounts and extraScrapeConfigs parts, change the extraScrapeConfigs target IP addresses to match your environment, and adjust the number of targets if your etcd cluster has a different number of members:

    # Prometheus configuration
    server:
      ...
      extraSecretMounts:
      - name: etcd-certs
        mountPath: /etc/secrets
        secretName: etcd-certs
        readOnly: true
    
    extraScrapeConfigs: |
      - job_name: etcd
        static_configs:
        - targets: ['192.168.0.32:2379','192.168.0.17:2379','192.168.0.5:2379']
        scheme: https
        tls_config:
          ca_file: /etc/secrets/ca.crt
          cert_file: /etc/secrets/monitoring-client.crt
          key_file: /etc/secrets/monitoring-client.key
  5. Upgrade prometheus helm deployment:

    helm upgrade prometheus suse/prometheus \
    --namespace monitoring \
    --values prometheus-config-values.yaml

8.2 Health Checks

Although a Kubernetes cluster takes care of a lot of the traditional deployment problems on its own, it is good practice to monitor the availability and health of your services and applications in order to react to problems, should they go beyond the automated measures.

There are three levels of health checks.

  • Cluster

  • Node

  • Service / Application

8.2.1 Cluster Health Checks

The basic check whether a cluster is working correctly is based on a few criteria:

  • Are all services running as expected?

  • Is there at least one Kubernetes master fully working? Even if the deployment is configured to be highly available, it’s useful to know if kube-controller-manager is down on one of the machines.

Note

For a further understanding of cluster health information, consider reading https://v1-18.docs.kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/

8.2.1.1 Kubernetes master

All components in a Kubernetes cluster expose a /healthz endpoint. The expected (healthy) HTTP response status code is 200.

The minimal services for the master to work properly are:

  • kube-apiserver:

    The component that receives your requests from kubectl and from the rest of the Kubernetes components. The URL is https://<CONTROL_PLANE_IP/FQDN>:6443/healthz

    • Local Check

      curl -k -i https://localhost:6443/healthz
    • Remote Check

      curl -k -i https://<CONTROL_PLANE_IP/FQDN>:6443/healthz
  • kube-controller-manager:

    The component that contains the control loop, driving current state to the desired state. The URL is http://<CONTROL_PLANE_IP/FQDN>:10252/healthz

    • Local Check

      curl -i http://localhost:10252/healthz
    • Remote Check

      Make sure the firewall allows port 10252.

      curl -i http://<CONTROL_PLANE_IP/FQDN>:10252/healthz
  • kube-scheduler:

    The component that schedules workloads to nodes. The URL is http://<CONTROL_PLANE_IP/FQDN>:10251/healthz

    • Local Check

      curl -i http://localhost:10251/healthz
    • Remote Check

      Make sure the firewall allows port 10251.

      curl -i http://<CONTROL_PLANE_IP/FQDN>:10251/healthz
Note: High-Availability Environments

In an HA environment you can monitor kube-apiserver on https://<LOAD_BALANCER_IP/FQDN>:6443/healthz.

If any one of the master nodes is running correctly, you will receive a valid response.

This does, however, not mean that all master nodes necessarily work correctly. To ensure that all master nodes work properly, the health checks must be repeated individually for each deployed master node.

This endpoint will return a successful HTTP response if the cluster is operational; otherwise it will fail. It will for example check that it can access etcd. This should not be used to infer that the overall cluster health is ideal. It will return a successful response even when only minimal operational cluster health exists.

To probe for full cluster health, you must perform individual health checking for all machines.
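
For example, a simple shell loop over the individual master nodes covers this; the node names here are placeholders for your environment:

for node in master0 master1 master2; do
    curl -k -i https://$node:6443/healthz
done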

8.2.1.2 ETCD Cluster

The etcd cluster exposes an endpoint /health. The expected (healthy) HTTP response body is {"health":"true"}. The etcd cluster is accessed through HTTPS only, so be sure to have etcd certificates.

  • Local Check

    curl --cacert /etc/kubernetes/pki/etcd/ca.crt \
    --cert /etc/kubernetes/pki/etcd/healthcheck-client.crt \
    --key /etc/kubernetes/pki/etcd/healthcheck-client.key https://localhost:2379/health
  • Remote Check

    Make sure the firewall allows port 2379.

    curl --cacert <ETCD_ROOT_CA_CERT> --cert <ETCD_CLIENT_CERT> \
    --key <ETCD_CLIENT_KEY> https://<CONTROL_PLANE_IP/FQDN>:2379/health

8.2.2 Node Health Checks

This basic node health check consists of two parts. It checks:

  1. The kubelet endpoint

  2. CNI (Container Networking Interface) pod state

8.2.2.1 kubelet

First, determine if kubelet is up and working on the node.

Kubelet has two ports exposed on all machines:

  • Port https/10250: exposes kubelet services to the entire cluster and is available from all nodes through authentication.

  • Port http/10248: is only available on localhost.

You can send an HTTP request to the endpoint to find out if kubelet is healthy on that machine. The expected (healthy) HTTP response status code is 200.

8.2.2.1.1 Local Check

If there is an agent running on each node, this agent can simply fetch the local healthz port:

curl -i http://localhost:10248/healthz
8.2.2.1.2 Remote Check

There are two ways to fetch endpoints remotely (metrics, healthz, etc.). Both methods use HTTPS and a token.

The first method is executed against the APIServer and mostly used with Prometheus and the Kubernetes discovery mechanism kubernetes_sd_config. It allows automatic discovery of the nodes and avoids the task of defining monitoring for each node. For more information see the Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config

The second method directly talks to kubelet and can be used in more traditional monitoring where one must configure each node to be checked.

  • Configuration and Token retrieval:

    Create a Service Account (monitoring) with an associated secondary Token (monitoring-secret-token). The token will be used in HTTP requests to authenticate against the API server.

    This Service Account can only fetch information about nodes and pods. Best practice is not to use the default token that has been created; using a secondary token is also easier to manage. Create a file kubelet.yaml with the following content.

    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: monitoring
      namespace: kube-system
    secrets:
    - name: monitoring-secret-token
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: monitoring-secret-token
      namespace: kube-system
      annotations:
        kubernetes.io/service-account.name: monitoring
    type: kubernetes.io/service-account-token
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: monitoring-clusterrole
      namespace: kube-system
    rules:
    - apiGroups: [""]
      resources:
      - nodes/metrics
      - nodes/proxy
      - pods
      verbs: ["get", "list"]
    - nonResourceURLs: ["/metrics", "/healthz", "/healthz/*"]
      verbs: ["get"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: monitoring-clusterrole-binding
      namespace: kube-system
    roleRef:
      kind: ClusterRole
      name: monitoring-clusterrole
      apiGroup: rbac.authorization.k8s.io
    subjects:
    - kind: ServiceAccount
      name: monitoring
      namespace: kube-system

    Apply the yaml file:

    kubectl apply -f kubelet.yaml

    Export the token to an environment variable:

    TOKEN=$(kubectl -n kube-system get secrets monitoring-secret-token \
    -o jsonpath='{.data.token}' | base64 -d)

    This token can now be passed through the --header argument as: "Authorization: Bearer $TOKEN".

    Now export important values as environment variables:

  • Environment Variables Setup

    1. Choose a Kubernetes master node or worker node. The NODE_IP_FQDN here must be a node’s IP address or FQDN. The NODE_NAME here must be a node name in your Kubernetes cluster. Export the variables NODE_IP_FQDN and NODE_NAME so it can be reused.

      NODE_IP_FQDN="10.86.4.158"
      NODE_NAME=worker0
    2. Retrieve the TOKEN with kubectl.

      TOKEN=$(kubectl -n kube-system get secrets monitoring-secret-token \
      -o jsonpath='{.data.token}' | base64 -d)
    3. Get the control plane <IP/FQDN> from the configuration file. You can skip this step if you only want to use the kubelet endpoint.

      CONTROL_PLANE=$(kubectl config view | grep server | cut -f 2- -d ":" | tr -d " ")

      Now the key information to retrieve data from the endpoints should be available in the environment and you can poll the endpoints.

  • Fetching Information from kubelet Endpoint

    1. Make sure the firewall allows port 10250.

    2. Fetching metrics

      curl -k https://$NODE_IP_FQDN:10250/metrics --header "Authorization: Bearer $TOKEN"
    3. Fetching healthz

      curl -k https://$NODE_IP_FQDN:10250/healthz --header "Authorization: Bearer $TOKEN"
  • Fetching Information from APISERVER Endpoint

    1. Fetching metrics

      curl -k $CONTROL_PLANE/api/v1/nodes/$NODE_NAME/proxy/metrics \
      --header "Authorization: Bearer $TOKEN"
    2. Fetching healthz

      curl -k $CONTROL_PLANE/api/v1/nodes/$NODE_NAME/proxy/healthz \
      --header "Authorization: Bearer $TOKEN"

8.2.2.2 CNI

You can check if the CNI (Container Networking Interface) is working as expected by checking whether the coredns deployment is running. If the CNI has some kind of trouble, coredns will not be able to start:

kubectl get deployments -n kube-system
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
cilium-operator   1/1     1            1           8d
coredns           2/2     2            2           8d
oidc-dex          1/1     1            1           8d
oidc-gangway      1/1     1            1           8d

If coredns is running and you are able to create pods, you can be certain that the CNI and your CNI plugin are working correctly.
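
As an additional quick check, you can create a throwaway pod and confirm that it reaches the Running state (the pod name is arbitrary, and the busybox image must be reachable from your cluster):

kubectl run cni-test --image=busybox --restart=Never -- sleep 30
kubectl get pod cni-test
kubectl delete pod cni-test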

There is also the Monitor Node Health check: a DaemonSet that runs on every node and reports back to the apiserver in the form of NodeConditions and Events.

8.2.3 Service/Application Health Checks

If the deployed services contain a health endpoint, or if they contain an endpoint that can be used to determine if the service is up, you can use livenessProbes and/or readinessProbes.

Note: Health check endpoints vs. functional endpoints

A proper health check is always preferred if designed correctly.

Despite the fact that any endpoint could potentially be used to infer if your application is up, it is better to have an endpoint specifically for health in your application. Such an endpoint will only respond affirmatively when all your setup code on the server has finished and the application is running in a desired state.

The livenessProbes and readinessProbes share configuration options and probe types.

initialDelaySeconds

Number of seconds to wait before performing the very first liveness probe.

periodSeconds

Number of seconds that the kubelet should wait between liveness probes.

successThreshold

Number of minimum consecutive successes for the probe to be considered successful (Default: 1).

failureThreshold

Number of times this probe is allowed to fail in order to assume that the service is not responding (Default: 3).

timeoutSeconds

Number of seconds after which the probe times out (Default: 1).

There are different probe types available for livenessProbes and readinessProbes:

Command

A command executed within a container; a return code of 0 means success. All other return codes mean failure.

TCP

If a TCP connection can be established, the probe is considered successful.

HTTP

Any HTTP response status code of at least 200 and below 400 indicates success.

8.2.3.1 livenessProbe

livenessProbes are used to detect running but misbehaving pods: a service might still be running (the process did not die) but not responding as expected. You can find out more about livenessProbes here: https://v1-18.docs.kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/

Probes are executed by each kubelet against the pods that define them and that are running in that specific node. When a livenessProbe fails, Kubernetes will automatically restart the pod and increase the RESTARTS count for that pod. These probes will be executed every periodSeconds starting from initialDelaySeconds.

8.2.3.2 readinessProbe

readinessProbes are used to wait for processes that take some time to start. Find out more about readinessProbes here: https://v1-18.docs.kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-readiness-probes

Despite the container running, it might be performing some time-consuming initialization operations. During this time, you do not want Kubernetes to route traffic to that specific pod. You also do not want that container to be restarted, because it would appear unresponsive.

These probes will be executed every periodSeconds starting from initialDelaySeconds until the service is ready.

Both probe types can be used at the same time. If a service is running, but misbehaving, the livenessProbe will ensure that it’s restarted, and the readinessProbe will ensure that Kubernetes won’t route traffic to that specific pod until it’s considered to be fully functional and running again.
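
A minimal sketch of a pod that combines both probe types; the image, port, and health endpoint path are placeholders to adjust for your application:

apiVersion: v1
kind: Pod
metadata:
  name: probe-example
spec:
  containers:
  - name: app
    image: my-app:latest            # placeholder image
    ports:
    - containerPort: 8080
    readinessProbe:                 # gate traffic until the app has initialized
      httpGet:
        path: /healthz              # placeholder health endpoint
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:                  # restart the container if it stops responding
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20
      failureThreshold: 3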

8.2.4 General Health Checks

We recommend applying other best practices from system administration to your monitoring and health checking approach. These steps are not specific to SUSE CaaS Platform and are beyond the scope of this document.

8.3 Horizontal Pod Autoscaler

Horizontal Pod Autoscaler (HPA) is a tool that automatically increases or decreases the number of pods in a replication controller, deployment, replica set or stateful set, based on metrics collected from pods.

In order to leverage HPA, skuba supports the metrics-server addon. The metrics-server addon is first installed into the Kubernetes cluster. After that, HPA fetches metrics from the aggregated API metrics.k8s.io and, according to the user configuration, determines whether to increase or decrease the scale of a replication controller, deployment, replica set or stateful set.

The HPA metrics.target.type can be one of the following (a manifest sketch follows the list):

  • Utilization: the value returned from the metrics server API is calculated as the average resource utilization across all relevant pods and subsequently compared with the metrics.target.averageUtilization.

  • AverageValue: the value returned from the metrics server API is divided by the number of all relevant pods, then compared to the metrics.target.averageValue.

  • Value: the value returned from the metrics server API is directly compared to the metrics.target.value.
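
For illustration, a minimal sketch of an autoscaling/v2beta2 manifest using the Utilization target type; the HPA and deployment names are placeholders:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa                # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment      # placeholder workload to scale
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization          # compared against averageUtilization
        averageUtilization: 50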

Note

The metrics supported by metrics-server are the CPU and memory of a pod or node.

Important

API versions supported by the HPA:

  • CPU metric: autoscaling/v1, autoscaling/v2beta2

  • Memory metric: autoscaling/v2beta2

8.3.1 Usage

It is useful to first find out about the available resources of your cluster.

  • To display resource (CPU/Memory) usage for nodes, run:

    $ kubectl top node

    The expected output should look like the following:

    NAME        CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
    master000   207m         10%    1756Mi          45%
    worker000   100m         10%    602Mi           31%
  • To display resource (CPU/Memory) usage for pods, run:

    $ kubectl top pod

    The expected output should look like the following:

    NAME                                CPU(cores)   MEMORY(bytes)
    cilium-9fjw2                        32m          216Mi
    cilium-cqnq5                        43m          227Mi
    cilium-operator-7d6ddddbf5-2jwgr    1m           46Mi
    coredns-69c4947958-2br4b            2m           11Mi
    coredns-69c4947958-kb6dq            3m           11Mi
    etcd-master000                      21m          584Mi
    kube-apiserver-master000            20m          325Mi
    kube-controller-manager-master000   6m           105Mi
    kube-proxy-x2965                    0m           24Mi
    kube-proxy-x9zlv                    0m           19Mi
    kube-scheduler-master000            2m           46Mi
    kured-45rc2                         1m           25Mi
    kured-cptk4                         0m           25Mi
    metrics-server-79b8658cd7-gjvhs     1m           21Mi
    oidc-dex-55fc689dc-f6cfg            1m           20Mi
    oidc-gangway-7b7fbbdbdf-85p6t       1m           18Mi
    Note

    The option flags --sort-by=cpu and --sort-by=memory have a sorting issue at the moment. It will be fixed in a future release.

8.3.1.1 Using Horizontal Pod Autoscaler (HPA)

You can set the HPA to scale according to various metrics. These include average CPU utilization, average CPU value, average memory utilization and average memory value. The following sections show the recommended configuration for each of the aforementioned options.

8.3.1.1.1 Creating an HPA Using Average CPU Utilization

The following code is an example of what this type of HPA can look like. Run the commands on your admin node or local workstation. Note that you need a kubeconfig file with RBAC permissions that allow setting up autoscaling rules in your Kubernetes cluster.

# deployment
kubectl autoscale deployment <DEPLOYMENT_NAME> \
    --min=<MIN_REPLICAS_NUMBER> \
    --max=<MAX_REPLICAS_NUMBER> \
    --cpu-percent=<PERCENT>

# replication controller
kubectl autoscale replicationcontrollers <REPLICATIONCONTROLLERS_NAME> \
    --min=<MIN_REPLICAS_NUMBER> \
    --max=<MAX_REPLICAS_NUMBER> \
    --cpu-percent=<PERCENT>

You could for example use the following values:

kubectl autoscale deployment oidc-dex \
    --min=1 \
    --max=10 \
    --cpu-percent=50

This creates an HPA for the oidc-dex deployment that keeps at least 1 pod running and scales up to a maximum of 10 pods if the average CPU utilization of the pods reaches 50%. For more details about the inner workings of the scaling, refer to the Kubernetes documentation on the horizontal pod autoscaler algorithm.

To check the current status of the HPA run:

kubectl get hpa

Example output:

NAME       REFERENCE             TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
oidc-dex   Deployment/oidc-dex   0%/50%          1         10        3          115s
Note

To calculate pod CPU utilization, HPA divides the total CPU usage of all containers by the total CPU requests of all containers:

POD CPU UTILIZATION = TOTAL CPU USAGE OF ALL CONTAINERS / TOTAL CPU REQUESTS OF ALL CONTAINERS

For example:

  • Container1 requests 0.5 CPU and uses 0 CPU.

  • Container2 requests 1 CPU and uses 2 CPU.

The CPU utilization will be (0 + 2) / (0.5 + 1) * 100% ≈ 133%.

If a replication controller, deployment, replica set or stateful set does not specify the CPU request, the TARGETS column of kubectl get hpa will show <unknown>.
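
Therefore, make sure every container of the workload you want to scale declares resource requests. The following container spec fragment is only a sketch; the name, image and values are placeholders:

containers:
- name: app                                 # hypothetical container name
  image: registry.example.com/app:latest    # hypothetical image
  resources:
    requests:
      cpu: 500m                             # needed for CPU-based HPA targets
      memory: 128Mi                         # needed for memory-based HPA targets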

8.3.1.1.2 Creating an HPA Using the Average CPU Value
  1. Create a yaml manifest file hpa-avg-cpu-value.yaml with the following content:

    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: avg-cpu-value    # 1
      namespace: kube-system # 2
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment     # 3
        name: example        # 4
      minReplicas: 1         # 5
      maxReplicas: 10        # 6
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: AverageValue
            averageValue: 500m # 7

    1. Name of the HPA.
    2. Namespace of the HPA.
    3. Specifies the kind of object to scale (a replication controller, deployment, replica set or stateful set).
    4. Specifies the name of the object to scale.
    5. Specifies the minimum number of replicas.
    6. Specifies the maximum number of replicas.
    7. The average value of the requested CPU that each pod uses.

  2. Apply the yaml manifest by running:

    kubectl apply -f hpa-avg-cpu-value.yaml
  3. Check the current status of the HPA:

    kubectl get hpa
    
    NAME            REFERENCE            TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
    avg-cpu-value   Deployment/example   1m/500m   1         10        1          39s
8.3.1.1.3 Creating an HPA Using Average Memory Utilization
  1. Create a yaml manifest file hpa-avg-memory-util.yaml with the following content:

    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: avg-memory-util  # 1
      namespace: kube-system # 2
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment     # 3
        name: example        # 4
      minReplicas: 1         # 5
      maxReplicas: 10        # 6
      metrics:
      - type: Resource
        resource:
          name: memory
          target:
            type: Utilization
            averageUtilization: 50 # 7

    1. Name of the HPA.
    2. Namespace of the HPA.
    3. Specifies the kind of object to scale (a replication controller, deployment, replica set or stateful set).
    4. Specifies the name of the object to scale.
    5. Specifies the minimum number of replicas.
    6. Specifies the maximum number of replicas.
    7. The average utilization of the requested memory that each pod uses.

  2. Apply the yaml manifest by running:

    kubectl apply -f hpa-avg-memory-util.yaml
  3. Check the current status of the HPA:

    kubectl get hpa
    
    NAME              REFERENCE            TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
    avg-memory-util   Deployment/example   5%/50%           1         10        1          4m54s
    Note

    HPA calculates pod memory utilization as: total memory usage of all containers / total memory requests of all containers. If a deployment or replication controller does not specify the memory request, the output of kubectl get hpa TARGETS is <unknown>.

8.3.1.1.4 Creating an HPA Using Average Memory Value
  1. Create a yaml manifest file hpa-avg-memory-value.yaml with the following content:

    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: avg-memory-value # 1
      namespace: kube-system # 2
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment     # 3
        name: example        # 4
      minReplicas: 1         # 5
      maxReplicas: 10        # 6
      metrics:
      - type: Resource
        resource:
          name: memory
          target:
            type: AverageValue
            averageValue: 500Mi # 7

    1. Name of the HPA.
    2. Namespace of the HPA.
    3. Specifies the kind of object to scale (a replication controller, deployment, replica set or stateful set).
    4. Specifies the name of the object to scale.
    5. Specifies the minimum number of replicas.
    6. Specifies the maximum number of replicas.
    7. The average value of the requested memory that each pod uses.

  2. Apply the yaml manifest by running:

    kubectl apply -f hpa-avg-memory-value.yaml
  3. Check the current status of the HPA:

    kubectl get hpa
    
    NAME                     REFERENCE            TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
    avg-memory-value         Deployment/example   11603968/500Mi   1         10        1          6m24s

8.4 Stratos Web Console

Important

This feature is offered as a "tech preview".

We release this as a tech-preview in order to get early feedback from our customers. Tech previews are largely untested, unsupported, and thus not ready for production use.

That said, we strongly believe this technology is useful at this stage in order to make the right improvements based on your feedback. A fully supported, production-ready release is planned for a later point in time.

Note

If you plan to deploy SUSE Cloud Application Platform on your SUSE CaaS Platform cluster, please skip this section of the documentation and refer to the official SUSE Cloud Application Platform instructions, which include Stratos:

https://documentation.suse.com/suse-cap/1.5.2/single-html/cap-guides/#cha-cap-depl-caasp

8.4.1 Introduction

The Stratos user interface (UI) is a modern web-based management application for Kubernetes and for Kubernetes-based Cloud Foundry distributions such as SUSE Cloud Application Platform.

Stratos provides a graphical management console for both developers and system administrators.

A single Stratos instance can be used to monitor multiple Kubernetes clusters as long as it is granted access to their Kubernetes API endpoint.

This document describes how to install Stratos on a SUSE CaaS Platform cluster that will not run any SUSE Cloud Application Platform components.

The Stratos stack is deployed using helm charts and consists of a web UI pod and a MariaDB pod that stores configuration values.

8.4.2 Prerequisites

8.4.2.1 Helm

The deployment of Stratos is performed using a helm chart. Your remote administration machine must have Helm installed.

8.4.2.2 Persistent Storage

The MariaDB instance used by Stratos requires persistent storage for its data.

The cluster must have a Kubernetes Storage Class defined.
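
You can check which Storage Classes are available, and which one is the default, by running:

kubectl get storageclass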

8.4.3 Installation

8.4.3.1 Adding helm chart repository and default values

  1. Add SUSE helm charts repository

    helm repo add suse https://kubernetes-charts.suse.com
  2. Obtain the default values.yaml file of the helm chart

    helm inspect values suse/console > stratos-values.yaml
  3. Create the stratos namespace

    kubectl create namespace stratos

8.4.3.2 Define admin user password

Create a secure password for your admin user and write it into the stratos-values.yaml file as the value of the console.localAdminPassword key.

Important

This step is required to allow the installation of Stratos without having any SUSE Cloud Application Platform components deployed on the cluster.
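
The relevant part of stratos-values.yaml would then look like the following sketch; the password is a placeholder to be replaced with your own:

console:
  localAdminPassword: <YOUR_SECURE_PASSWORD>   # placeholder, choose a strong password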

8.4.3.3 Define the Storage Class to be used

If your cluster does not have a default storage class configured, or you want to use a different one, follow these instructions.

Open the stratos-values.yaml file and look for the storageClass entry defined at the global level, uncomment the line and provide the name of your Storage Class.

The values file will then contain something like this:

# Specify which storage class should be used for PVCs
storageClass: default
Note
Note

The file has other storageClass keys defined inside of some of its resources. These can be left empty to rely on the global Storage Class that has just been defined.

8.4.3.4 Exposing the Web UI

The web interface of Stratos can be exposed either via an Ingress resource, via a Service of type LoadBalancer, or both at the same time.

An Ingress controller must be deployed on the cluster to be able to expose the service using an Ingress resource.

To use a LoadBalancer service, the cluster must be deployed on a platform that can handle LoadBalancer objects and must have the Cloud Provider Integration (CPI) enabled. This can be achieved, for example, by deploying SUSE CaaS Platform on top of OpenStack.

The behavior is defined inside of the console.service stanza of the yaml file:

console:
  service:
    annotations: []
    externalIPs: []
    loadBalancerIP:
    loadBalancerSourceRanges: []
    servicePort: 443
    # nodePort: 30000
    type: ClusterIP
    externalName:
    ingress:
      ## If true, Ingress will be created
      enabled: false

      ## Additional annotations
      annotations: {}

      ## Additional labels
      extraLabels: {}

      ## Host for the ingress
      # Defaults to console.[env.Domain] if env.Domain is set and host is not
      host:

      # Name of secret containing TLS certificate
      secretName:

      # crt and key for TLS Certificate (this chart will create the secret based on these)
      tls:
        crt:
        key:
8.4.3.4.1 Expose the web UI using a LoadBalancer

The service can be exposed as a LoadBalancer by setting the value of console.service.type to LoadBalancer.

The LoadBalancer resource can be tuned by changing the values of the other loadBalancer* parameters specified inside of the console.service stanza.
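
A minimal sketch of the corresponding change in stratos-values.yaml; the pinned IP address is a placeholder:

console:
  service:
    type: LoadBalancer
    loadBalancerIP: 192.0.2.10   # optional, placeholder address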

8.4.3.4.2 Expose the web UI using an Ingress

The Ingress resource can be created by setting console.service.ingress.enabled to true.

Stratos is exposed by the Ingress using a dedicated host rule. Hence you must specify the FQDN of the host as the value of the console.service.ingress.host key.

The behavior of the Ingress object can be fine-tuned by using the other keys inside of the console.service.ingress stanza.
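
For example, a minimal sketch that enables the Ingress; the hostname is a placeholder:

console:
  service:
    ingress:
      enabled: true
      host: stratos.example.com   # placeholder FQDN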

8.4.3.5 Securing Stratos

It’s highly recommended to secure Stratos' web interface using TLS encryption.

This can be done by creating a TLS certificate for Stratos.

8.4.3.5.1 Secure Stratos web UI

Securing the web UI with TLS is most easily done when exposing the web interface using an Ingress resource.

Inside of the console.service.ingress stanza ensure the Ingress resource is enabled and then specify values for console.service.ingress.tls.crt and console.service.ingress.tls.key. These keys hold the base64 encoded TLS certificate and key.

The TLS certificate and key can be base64 encoded by using the following command:

base64 tls.crt
base64 tls.key

The output produced by the two commands has to be copied into the stratos-values.yaml file, resulting in something like this:

console:
  service:
    ingress:
      enabled: true
      tls:
        crt: |
          <output of base64 tls.crt>
        key: |
          <output of base64 tls.key>
8.4.3.5.2 Change MariaDB password

The helm chart provisions the MariaDB database with a default weak password. A stronger password can be specified by altering the value of mariadb.mariadbPassword.
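
In stratos-values.yaml this looks like the following sketch; the password is a placeholder:

mariadb:
  mariadbPassword: <YOUR_STRONG_DB_PASSWORD>   # placeholder, choose a strong password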

8.4.3.6 Enable tech preview features

You can enable tech preview features of Stratos by changing the value of console.techPreview from false to true.

8.4.3.7 Deploying Stratos

Now Stratos can be deployed using helm and the values specified inside of the stratos-values.yaml file:

helm install stratos-console suse/console \
  --namespace stratos \
  --values stratos-values.yaml

You can monitor the status of your Stratos deployment with the watch command:

watch --color 'kubectl get pods --namespace stratos'

When Stratos is successfully deployed, the following is observed:

  • For the volume-migration pod, the STATUS is Completed and the READY column is at 0/1.

  • All other pods have a Running STATUS and a READY value of n/n.

Press Ctrl+C to exit the watch command.

At this stage the Stratos web UI should be accessible. You can log in using the admin user and the password you specified in your stratos-values.yaml file.

8.4.4 Stratos configuration

Now that Stratos is up and running, you can log into it and configure it to connect to your Kubernetes cluster(s).

Please refer to the SUSE Cloud Application Platform documentation for more information.
