SUSE Linux Enterprise Server for SAP Applications 15 SP5

SAP Monitoring

SUSE® Linux Enterprise Server for SAP Applications · SUSE Linux Enterprise High Availability

Publication Date: November 14, 2024

This article describes monitoring solutions that help SAP administrators monitor their SAP systems efficiently. The solutions described here work for SUSE® Linux Enterprise Server 12 SP3 to 15 SP2.

1 Conceptual overview

Starting from the idea of improving user experience, SUSE engineering worked on how to monitor High Availability clusters that manage SAP workloads (SAP HANA and SAP NetWeaver).

The exporters described here expose metrics that can be combined and integrated with Prometheus and Grafana to produce complex dashboards.

SUSE supports Prometheus and Grafana through SUSE Manager 4.0. Some Grafana dashboards for SAP HANA, SAP S/4HANA, SAP NetWeaver, and the cluster monitoring are provided by SUSE via Grafana community dashboards.

2 Terminology

Grafana

An interactive visualization and analytics Web application. It provides methods to visualize, explore, and query your metrics, and trigger alerts.

Prometheus

A systems monitoring and alerting toolkit. It collects and evaluates metrics, displays the result, and triggers possible alerts when an observed condition is true. Metrics can be collected from different targets at given intervals.

3 Installing exporters

Installing an exporter always follows the same pattern. Execute the following steps:

Procedure 1: General way
  1. Install the package. All packages are available in SUSE Linux Enterprise Server for SAP Applications.

  2. (Optional) Copy the configuration file to /etc/EXPORTER_DIR. The exact folder name is different for each exporter. This step depends on the exporter. If you skip this step, the default configuration is used.

  3. Start the daemon:

    systemctl start NAME_OF_DAEMON

The above procedure is automatically done by each of the Salt formulas described in Article “SAP Automation”.

4 SAP HANA database exporter

SAP HANA database exporter makes it possible to export SAP HANA database metrics. The tool can export metrics from more than one database and tenant if the multi_tenant option is enabled in the configuration file (enabled by default).

The labels sid (system identifier), insnr (instance number), database_name (database name) and host (machine hostname) are exported for all the metrics.

4.1 Prerequisites

  • A running and reachable SAP HANA database (single or multi-container). It is recommended to run the exporter on the same machine as the SAP HANA database. Ideally, each database should be monitored by one exporter.

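  • One of the following SAP HANA connectors: dbapi (the Python package shipped with the SAP HANA client, see Section 4.5, “Using the stored user key”) or PyHDB (see Section 4.3, “Installing the SAP HANA database exporter”).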

  • Certain metrics are collected in the SAP HANA monitoring views by the SAP Host agent. To have access to all the monitoring metrics, make sure that the SAP Host agent is installed and running.

4.2 Metrics file

The exporter relies on a metrics file to determine what metrics to export. When the metrics file uses the JSON format, you can use the options listed below.

  • enabled (boolean, optional). Determines whether the query is executed or not. If set to false, the query is skipped and its metrics are not exported.

  • hana_version_range (list, optional). The SAP HANA database versions range where the query is available ([1.0.0] by default). If the current database version is not within the specified range, the query is not executed. If the list has only one element, all versions beyond the specified value (including the defined one) are queried.

  • metrics (list) A list of metrics for the query.

  • name (string) A name for the exported metrics.

  • description (string) A description of the metrics.

  • labels (list) A list of labels used to split the records.

  • value (string) A name of the column for the exported value (must match with one of the columns of the query).

  • unit (string) The unit of the exported value (`mb` for example).

  • type (enum{gauge}) Defines the type of the exported metric (gauge is the only available option).
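The hana_version_range rule described above can be illustrated with a minimal Python sketch. This is not the exporter's own code: the function names are hypothetical, versions are assumed to be dot-separated numbers, and a second list element is assumed to be an inclusive upper bound.

```python
def parse_version(text):
    """Turn a version such as '2.00.040' or '1.0' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def _pad(a, b):
    """Pad two version tuples with zeros so that they have equal length."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def query_applies(current, version_range=("1.0.0",)):
    """Return True if a query should run for the current database version."""
    cur = parse_version(current)
    low_cur, low = _pad(cur, parse_version(version_range[0]))
    if low_cur < low:
        return False
    if len(version_range) == 1:
        # Single element: every version from that value upwards qualifies.
        return True
    # Assumption: a second element is an inclusive upper bound.
    high_cur, high = _pad(cur, parse_version(version_range[1]))
    return high_cur <= high
```

For example, query_applies("2.0.4", ["1.0"]) is true because a single-element range covers all later versions, while query_applies("1.0.0", ["2.0"]) is false.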

Below is an example of a metrics file:

{
  "SELECT TOP 10 host, LPAD(port, 5) port, SUBSTRING(REPLACE_REGEXPR('\n' IN statement_string WITH ' ' OCCURRENCE ALL), 1,30) sql_string, statement_hash sql_hash, execution_count, total_execution_time + total_preparation_time total_elapsed_time FROM sys.m_sql_plan_cache ORDER BY total_elapsed_time, execution_count DESC;":
  {
    "enabled": true,
    "hana_version_range": ["1.0"],
    "metrics": [
      {
        "name": "hanadb_sql_top_time_consumers",
        "description": "Top statements time consumers. Sum of the time consumed in all executions in Microseconds",
        "labels": ["HOST", "PORT", "SQL_STRING", "SQL_HASH"],
        "value": "TOTAL_ELAPSED_TIME",
        "unit": "mu",
        "type": "gauge"
      },
      {
        "name": "hanadb_sql_top_time_consumers",
        "description": "Top statements time consumers. Number of total executions of the SQL Statement",
        "labels": ["HOST", "PORT", "SQL_STRING", "SQL_HASH"],
        "value": "EXECUTION_COUNT",
        "unit": "count",
        "type": "gauge"
      }
    ]
  }
}
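Before handing a metrics file to the exporter, it can help to check that it is valid JSON and follows the options listed above. The following Python sketch shows one way to do that. It is not part of the exporter. The function name is hypothetical, and treating every option that is not marked optional as required is an assumption.

```python
import json

# Assumption: options not marked "optional" in the list above are required.
REQUIRED_METRIC_KEYS = {"name", "description", "labels", "value", "unit", "type"}


def check_metrics_document(text):
    """Return a list of problems found in a metrics file given as JSON text."""
    problems = []
    document = json.loads(text)  # raises json.JSONDecodeError on syntax errors
    for query, spec in document.items():
        label = query[:40]  # shorten long SQL statements in messages
        metrics = spec.get("metrics")
        if not isinstance(metrics, list) or not metrics:
            problems.append(f"{label}: 'metrics' must be a non-empty list")
            continue
        for metric in metrics:
            missing = REQUIRED_METRIC_KEYS - set(metric)
            if missing:
                problems.append(f"{label}: metric lacks {sorted(missing)}")
            if metric.get("type", "gauge") != "gauge":
                problems.append(f"{label}: only the gauge type is available")
    return problems
```

An empty result means that no problem was detected. A syntax error such as a missing comma between two options surfaces as a json.JSONDecodeError.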

4.3 Installing the SAP HANA database exporter

Use the zypper install prometheus-hanadb_exporter command to install the exporter.

You can find the latest development repositories at SUSE's Open Build Service.

To install the exporter from the source code, make sure you have Git and Python 3 installed on your system. Run the following commands to install the exporter with the PyHDB SAP HANA connector:

git clone https://github.com/SUSE/hanadb_exporter
cd hanadb_exporter # project root folder
virtualenv virt
source virt/bin/activate
pip install pyhdb
pip install .

4.4 Configuring the exporter

Use the following example of the config.json configuration file as a starting point.

{
  "listen_address": "0.0.0.0",
  "exposition_port": 9668,
  "multi_tenant": true,
  "timeout": 30,
  "hana": {
    "host": "localhost",
    "port": 30013,
    "user": "SYSTEM",
    "password": "PASSWORD",
    "ssl": false,
    "ssl_validate_cert": false
  },
  "logging": {
    "config_file": "./logging_config.ini",
    "log_file": "hanadb_exporter.log"
  }
}

Below is a list of key configuration options.

  • listen_address IP address of the Prometheus exporter (0.0.0.0 by default).

  • exposition_port Port through which the Prometheus exporter is accessible (9668 by default).

  • multi_tenant Export the metrics from other tenants. This requires a connection to the system database (port 30013).

  • timeout Timeout to connect to the database. The exporter fails if a connection is not established within the specified time (even in daemon mode).

  • hana.host Address of the SAP HANA database.

  • hana.port Port through which the SAP HANA database is accessible.

  • hana.userkey Stored user key (see Section 4.5, “Using the stored user key”). Use this option if you do not want to store the password in the configuration file. The userkey and user/password are mutually exclusive. If both are set, hana.userkey takes priority.

  • hana.user Existing user with access rights to the SAP HANA database.

  • hana.password Password of an existing user.

  • hana.ssl Enable SSL connection (false by default). Only available for the dbapi connector.

  • hana.ssl_validate_cert Enable SSL certificate validation. This option is required by SAP HANA Cloud. Only available for the dbapi connector.

  • hana.aws_secret_name Secret name containing the username and password (see Section 4.6, “Using AWS Secrets Manager”). Use this option when the SAP HANA database runs on AWS. aws_secret_name and user/password are mutually exclusive. If both are set, aws_secret_name takes priority.

  • logging.config_file Python logging system configuration file (by default, WARN and ERROR level messages are sent to the syslog).

  • logging.log_file Logging file (/var/log/hanadb_exporter.log by default).

The logging configuration file follows the Python standard logging system style.

The default logging configuration file redirects the logs to the file specified in the JSON configuration file and to the syslog. The syslog receives only messages of the level WARNING and higher.
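The priority rules for the credential options can be summarized in a short Python sketch. It only illustrates the rules stated above and is not the exporter's implementation. The function name is hypothetical, and the relative order of hana.userkey and hana.aws_secret_name is an assumption, because only their priority over user/password is documented.

```python
def credential_source(hana):
    """Name the credential source used for the 'hana' configuration block."""
    if hana.get("userkey"):
        # Documented: userkey takes priority over user/password.
        return "userkey"
    if hana.get("aws_secret_name"):
        # Documented: aws_secret_name takes priority over user/password.
        # Assumption: it is checked after userkey when both are set.
        return "aws_secret_name"
    if hana.get("user") and hana.get("password"):
        return "user/password"
    raise ValueError("no usable credentials in the 'hana' section")
```

With the example config.json above, credential_source(config["hana"]) returns "user/password". Adding a userkey entry to the same block changes the result to "userkey".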

4.5 Using the stored user key

Use this option to keep the database secure (you can use user/password with the SYSTEM user for development, as it is faster to set up). To use the userkey option, the dbapi must be installed (normally stored in /hana/shared/SID/hdbclient/hdbcli-N.N.N.tar.gz and installable with pip3). The key is stored in the client itself. To use a different client, you must create a new stored user key for the user running Python. To do that, use the following command (note that the hdbclient is the same as the dbapi Python package):

/hana/shared/PRD/hdbclient/hdbuserstore set USER_KEY host:30013@SYSTEMDB hanadb_exporter pass

4.6 Using AWS Secrets Manager

Use the AWS Secrets Manager to store the login credentials outside the configuration file when the SAP HANA database runs on an AWS EC2 instance.

  • Create a JSON secret file that contains two key-value pairs. The first pair contains the username key and the actual database user as the value. The second pair has the password key and the actual password as the value. For example:

    {
    "username": "DATABASE_USER",
    "password": "DATABASE_PASSWORD"
    }

    Pass the name of this secret in the configuration file as the value of the aws_secret_name entry.

  • Configure read-only access from EC2 IAM role to the secret by attaching a resource-based policy to the secret. For example:

    {
    "Version" : "2012-10-17",
    "Statement" : [
      {
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:role/EC2RoleToAccessSecrets"},
        "Action": "secretsmanager:GetSecretValue",
        "Resource": "*"
      }
    ]
    }
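The structure of the secret can be checked with a few lines of Python before it is stored. This is a minimal sketch with a hypothetical function name. It only parses the JSON text of the secret. How the text is retrieved from AWS Secrets Manager is outside the scope of the sketch.

```python
import json


def credentials_from_secret(secret_string):
    """Extract (username, password) from the JSON text of a secret."""
    secret = json.loads(secret_string)
    try:
        return secret["username"], secret["password"]
    except KeyError as missing:
        # Report which of the two expected keys is absent.
        raise ValueError(f"secret lacks the {missing} key") from None
```

A secret that follows the example above yields the tuple ("DATABASE_USER", "DATABASE_PASSWORD"). A secret without one of the two keys raises a ValueError.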

Tips and recommendations:

  • Set SYSTEMDB as the default database for the exporter to get the tenants data.

  • Do not use the stored user key created for the backup, because the key is created using the sidadm user.

  • Instead of the SYSTEM user, use an account limited to accessing the monitoring tables only.

  • In case you use a user account with the monitoring role, this user must exist in all the databases (SYSTEMDB and tenants).

4.7 Create a new user with the monitoring role

Run the following commands to create a user with the monitoring roles (the commands must be executed in all the databases):

su - prdadm
hdbsql -u SYSTEM -p pass -d SYSTEMDB #(PRD for the tenant in this example)
CREATE USER HANADB_EXPORTER_USER PASSWORD MyExporterPassword NO FORCE_FIRST_PASSWORD_CHANGE;
CREATE ROLE HANADB_EXPORTER_ROLE;
GRANT MONITORING TO HANADB_EXPORTER_ROLE;
GRANT HANADB_EXPORTER_ROLE TO HANADB_EXPORTER_USER;

4.8 Running the exporter

Start the exporter with the hanadb_exporter -c config.json -m metrics.json command.

If the config.json configuration file is stored in the /etc/hanadb_exporter directory, the exporter can be started with the following command (note that the identifier matches with the config.json file without extension):

hanadb_exporter --identifier config

4.9 Running as a service

To run the hanadb_exporter as a systemd service, install the exporter using the RPM package as described in Section 4.3, “Installing the SAP HANA database exporter”.

Next, create the configuration file as /etc/hanadb_exporter/my-exporter.json. You can use the example file above as a starting point (the example file is also available in the /usr/etc/hanadb_exporter directory).

You can use the example /usr/etc/hanadb_exporter/metrics.json metrics file.

Adjust the default logging configuration file /usr/etc/hanadb_exporter/logging_config.ini.

Start the exporter as a daemon. Because multiple hanadb_exporter instances can run on one machine, you need to specify the name of the created configuration file (without the extension), for example:

# systemctl start prometheus-hanadb_exporter@my-exporter
# systemctl status prometheus-hanadb_exporter@my-exporter
# systemctl enable prometheus-hanadb_exporter@my-exporter
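The relationship between the configuration file name, the identifier, and the systemd instance name can be sketched in Python as follows. The function name is hypothetical. The directory and the unit name prefix are taken from the examples above.

```python
from pathlib import PurePosixPath

CONFIG_DIR = PurePosixPath("/etc/hanadb_exporter")


def instance_names(config_file):
    """Map a configuration file name to its identifier and systemd unit."""
    path = PurePosixPath(config_file)
    if path.suffix != ".json":
        raise ValueError("expected a .json configuration file")
    identifier = path.stem  # file name without the extension
    return {
        "identifier": identifier,
        "config": str(CONFIG_DIR / path.name),
        "unit": f"prometheus-hanadb_exporter@{identifier}",
    }
```

For my-exporter.json, the sketch returns the identifier my-exporter, the path /etc/hanadb_exporter/my-exporter.json, and the unit name prometheus-hanadb_exporter@my-exporter.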
Important: Configure the Prometheus server

The exporter only exposes a port, without pushing the data to the Prometheus server. This means that the Prometheus server must be configured to periodically pull the data from the exporter. This is done by either adding the hanadb_exporter job to the Prometheus server configuration, or by adding hanadb_exporter to an existing job. For example:

- job_name: hana_db
  static_configs:
    - targets:
      - "HOSTNAME:PORT"
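To check that an exporter answers before pointing Prometheus at it, you can pull the data manually. The following Python sketch fetches and parses the text output of one target. It is a simplified illustration, not a complete parser for the Prometheus exposition format: escape sequences in label values are not decoded. The function names are hypothetical, and the /metrics path is assumed to be the path the exporter serves.

```python
import re
from urllib.request import urlopen

SAMPLE = re.compile(
    r'^(?P<name>[A-Za-z_:][\w:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Parse exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE.match(line)
        if match:
            labels = dict(LABEL.findall(match.group("labels") or ""))
            value = float(match.group("value"))
            samples.append((match.group("name"), labels, value))
    return samples


def scrape(target, path="/metrics", timeout=10):
    """Fetch and parse the metrics of one target given as 'HOSTNAME:PORT'."""
    with urlopen(f"http://{target}{path}", timeout=timeout) as response:
        return parse_samples(response.read().decode("utf-8"))
```

Calling scrape("HOSTNAME:PORT") with the actual host name and port returns the same samples that the Prometheus server pulls at each scrape interval.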
Important: Configure firewall

Use the following commands to open the port for hanadb_exporter.

# firewall-cmd --zone=ZONE --add-port=PORT/tcp --permanent
# firewall-cmd --reload
# firewall-cmd --list-all --zone=ZONE

Replace ZONE with the firewall zone assigned to the network interface used by the exporter, and PORT with the actual port number of hanadb_exporter (default is 9668).

5 High Availability cluster exporter

Enables monitoring of Pacemaker, Corosync, SBD, DRBD and other components of High Availability clusters. Collects metrics to easily monitor cluster status and health.

Link: https://github.com/ClusterLabs/ha_cluster_exporter.

Exports metrics in the Prometheus format:
  • Pacemaker cluster summary, nodes and resources stats

  • Corosync ring errors and quorum votes

  • Health status of SBD devices

  • DRBD resources and connections status

5.1 Installation

To install the High Availability cluster exporter on SUSE Linux Enterprise, run the zypper install prometheus-ha_cluster_exporter command.

5.1.1 Enabling systemd service

The High Availability cluster exporter RPM package comes with the prometheus-ha_cluster_exporter.service systemd service. To enable and start it, use the following command:

systemctl --now enable prometheus-ha_cluster_exporter

5.2 Using High Availability cluster exporter

You can run the exporter on any of the cluster nodes. Although it is not strictly required, it is advisable to run the exporter on all nodes.

The generated metrics are served under the /metrics path. By default, they can be accessed over HTTP on port 9664.

Although the exporter can run outside a High Availability cluster node, it cannot export any metric it is not able to collect. In this case, the exporter displays a warning message.

5.3 Configuring High Availability cluster exporter

Before you proceed, make sure that the Prometheus server and the firewall are configured as described in Important: Configure the Prometheus server and Important: Configure firewall.

The provided default configuration is designed specifically for the latest version of SUSE Linux Enterprise. If necessary, any of the supported parameters can be modified either via command-line flags or via a configuration file. Use the ha_cluster_exporter --help command for more details on configuring parameters from the command line. Refer to the ha_cluster_exporter.yaml file for an example configuration.

It is also possible to specify CLI flags via the /etc/sysconfig/prometheus-ha_cluster_exporter file.

General flags
web.listen-address

Address to listen on for the web interface and telemetry (default :9664).

web.telemetry-path

Path under which the metrics are exposed (default /metrics).

web.config.file

Path to the web configuration file (default /etc/ha_cluster_exporter.web.yaml).

log.level

Logging verbosity (default info).

version

Print version information.

Collector flags
crm-mon-path

Path to the crm_mon executable (default /usr/sbin/crm_mon).

cibadmin-path

Path to the cibadmin executable (default /usr/sbin/cibadmin).

corosync-cfgtoolpath-path

Path to the corosync-cfgtool executable (default /usr/sbin/corosync-cfgtool).

corosync-quorumtool-path

Path to the corosync-quorumtool executable (default /usr/sbin/corosync-quorumtool).

sbd-path

Path to the sbd executable (default /usr/sbin/sbd).

sbd-config-path

Path to the sbd configuration (default /etc/sysconfig/sbd/).

drbdsetup-path

Path to the drbdsetup executable (default /sbin/drbdsetup).

drbdsplitbrain-path

Path to the drbd splitbrain hooks temporary files (default /var/run/drbd/splitbrain).

5.4 TLS and basic authentication

The High Availability cluster exporter supports TLS and basic authentication. To use TLS or basic authentication, specify a configuration file using the --web.config.file parameter. The format of the file is described in https://github.com/prometheus/exporter-toolkit/blob/master/docs/web-configuration.md.

5.5 Metrics specification

The following provides an overview of metrics generated by the High Availability cluster exporter.

Pacemaker.  The Pacemaker subsystem collects an atomic snapshot of the High Availability cluster directly from the XML CIB of Pacemaker using crm_mon.

Pacemaker
ha_cluster_pacemaker_config_last_change

A Unix timestamp in seconds, converted to a floating-point number, that corresponds to the last time the Pacemaker configuration changed.

ha_cluster_pacemaker_fail_count

The fail count per node and resource ID.

ha_cluster_pacemaker_location_constraints

Resource location constraints.

Labels
  • constraint A unique string identifier of the constraint

  • node The node the constraint applies to

  • resource The resource the constraint applies to

  • role The resource role the constraint applies to (if any)

ha_cluster_pacemaker_migration_threshold

The migration threshold for each node and resource ID, as set by the Pacemaker cluster.

ha_cluster_pacemaker_nodes

The status of each node in the cluster (one line for the status of every node). 1 indicates the node is in the status specified by the status label, 0 means it is not.

Labels
  • node The name of the node (normally the hostname)

  • status Possible values: standby, standby_onfail, maintenance, pending, unclean, shutdown, expected_up, dc

  • type Possible values: member, ping, remote
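Because every combination of node and status is a separate line with the value 1 or 0, the current state of a node is the set of status labels whose line has the value 1. A minimal Python sketch of that interpretation follows. The function name and the input format, a list of (labels, value) pairs, are assumptions made for illustration.

```python
def node_statuses(samples):
    """Group the active status labels by node.

    samples is a list of (labels, value) pairs taken from the
    ha_cluster_pacemaker_nodes metric.
    """
    active = {}
    for labels, value in samples:
        statuses = active.setdefault(labels["node"], set())
        if value == 1:
            # The node is in the status named by the status label.
            statuses.add(labels["status"])
    return active
```

A node that appears only with 0 values is still listed, with an empty set of statuses.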

ha_cluster_pacemaker_node_attributes

This metric exposes raw, opaque cluster metadata, called node attributes, in its labels. Node attributes are often leveraged by resource agents. The value of each line is always 1.

Labels
  • node The name of the node (normally the hostname)

  • name The name of the attribute

  • value The value of the attribute

ha_cluster_pacemaker_resources

The status of each resource in the cluster (one line for the status of each resource). 1 means the resource is in the status specified by the status label, 0 means that it is not.

Labels
  • agent The name of the resource agent for the resource

  • clone The name of the clone this resource belongs to (if any)

  • group The name of the group this resource belongs to (if any)

  • managed Can be either true or false

  • node The name of the node hosting the resource

  • resource The unique resource name

  • role Possible values: started, stopped, master, slave or one of starting, stopping, migrating, promoting, demoting

ha_cluster_pacemaker_stonith_enabled

Whether or not stonith is enabled in the cluster. The value is either 1 or 0.

Corosync.  The Corosync subsystem collects cluster quorum votes and ring status by parsing the output of corosync-quorumtool and corosync-cfgtool.

Corosync
ha_cluster_corosync_member_votes

The number of votes each member node has contributed to the current quorum.

Labels
  • node_id The internal corosync identifier associated with the node

  • node The name of the node (normally the hostname)

  • local Indicates whether the node is local

ha_cluster_corosync_quorate

Indicates whether the cluster is quorate. The value is either 1 or 0.

ha_cluster_corosync_quorum_votes

Cluster quorum votes (one line per type).

Labels
  • type Possible values: expected_votes, highest_expected, total_votes, quorum.

ha_cluster_corosync_ring_errors

The total number of faulty Corosync rings.

ha_cluster_corosync_rings

The status of each Corosync ring. 1 is healthy, 0 is faulty.

Labels
  • ring_id The internal Corosync ring identifier (normally corresponds to the first member node to join)

  • node_id The internal Corosync identifier of the local node

  • number The ring number

  • address The IP address locally linked to this ring

SBD.  The SBD subsystem collects statistics of each device by parsing its configuration and the output of sbd --dump.

SBD
ha_cluster_sbd_devices

The SBD devices in the cluster (one line per device). The line is either absent or has the value of 1.

Labels
  • device The path of the SBD device

  • status Possible values: healthy, unhealthy

ha_cluster_sbd_timeouts

The SBD timeouts for each SBD device.

Labels
  • device The path of the SBD device

  • type Possible values: watchdog, msgwait

DRBD.  The DRBD subsystem runs a special drbdsetup command to get the current status of a DRBD cluster in the JSON format.

DRBD
ha_cluster_drbd_connections

The DRBD resource connections (one line per resource and per peer_node_id). The line is either absent or has the value of 1.

Labels
  • resource The resource the connection is for

  • peer_node_id The id of the node this connection is for

  • peer_role Possible values: primary, secondary, unknown

  • volume The volume number

  • peer_disk_state Possible values: attaching, failed, negotiating, inconsistent, outdated, unknown, consistent, uptodate

The total number of lines for this metric is the cardinality of resource multiplied by the cardinality of peer_node_id.

ha_cluster_drbd_connections_sync

The in-sync percentage of the DRBD disk connections. Values are floating-point numbers between 0 and 100.00.

Labels
  • resource The resource the connection is for

  • peer_node_id The id of the node this connection is for

  • volume The volume number

ha_cluster_drbd_connections_received

Volume of net data received from the partner via the network connection in KiB (one line per resource and per peer_node_id). The value is an integer greater than or equal to 0.

Labels
  • resource The resource the connection is for

  • peer_node_id The id of the node this connection is for

  • volume The volume number

ha_cluster_drbd_connections_pending

Number of requests sent to the partner that have not yet been received (one line per resource and per peer_node_id). The value is an integer greater than or equal to 0.

Labels
  • resource The resource the connection is for

  • peer_node_id The id of the node this connection is for

  • volume The volume number

ha_cluster_drbd_connections_unacked

Number of requests received by the partner that have not yet been acknowledged (one line per resource and per peer_node_id). The value is an integer greater than or equal to 0.

Labels
  • resource The resource the connection is for

  • peer_node_id The id of the node this connection is for

  • volume The volume number

ha_cluster_drbd_resources

The DRBD resources (one line per name and per volume). The line is either absent or has the value of 1.

Labels
  • resource The name of the resource

  • role Possible values: primary, secondary, unknown

  • volume The volume number

  • disk_state Possible values: attaching, failed, negotiating, inconsistent, outdated, unknown, consistent, uptodate

The total number of lines for the metric is the cardinality of name multiplied by the cardinality of volume.

ha_cluster_drbd_written

Amount of data in KiB written to the DRBD resource (one line per resource and per volume). The value is an integer greater than or equal to 0.

Labels
  • resource The name of the resource

  • volume The volume number

ha_cluster_drbd_read

Amount of data in KiB read from the DRBD resource (one line per resource and per volume). The value is an integer greater than or equal to 0.

Labels
  • resource The name of the resource

  • volume The volume number

ha_cluster_drbd_al_writes

Number of updates of the activity log area of the metadata (one line per resource and per volume). The value is an integer greater than or equal to 0.

Labels
  • resource The name of the resource

  • volume The volume number

ha_cluster_drbd_bm_writes

Number of updates of the bitmap area of the metadata (one line per resource and per volume). The value is an integer greater than or equal to 0.

Labels
  • resource The name of the resource

  • volume The volume number

ha_cluster_drbd_upper_pending

Number of block I/O requests forwarded to DRBD but not yet answered by DRBD (one line per resource and per volume). The value is an integer greater than or equal to 0.

Labels
  • resource The name of the resource

  • volume The volume number

ha_cluster_drbd_lower_pending

Number of open requests to the local I/O sub-system issued by DRBD (one line per resource and per volume). The value is an integer greater than or equal to 0.

Labels
  • resource The name of the resource

  • volume The volume number

ha_cluster_drbd_quorum

Quorum status of the DRBD resource according to the configured quorum policies (one line per resource and per volume). The value is 1 when quorate, or 0 when inquorate.

Labels
  • resource The name of the resource

  • volume The volume number

ha_cluster_drbd_split_brain

Signals when a split brain occurs, per resource and volume. The line is either absent or has the value of 1. To make this metric work, you must set up a custom DRBD split-brain handler.

Labels
  • resource The name of the resource

  • volume The volume number

Scrape.  The scrape subsystem is a generic namespace dedicated to internal instrumentation of the exporter itself.

Scrape
ha_cluster_scrape_duration_seconds

The duration of a collector scrape in seconds.

Labels
  • collector collector names that correspond to the subsystem they collect metrics from

Example:

# TYPE ha_cluster_scrape_duration_seconds gauge
ha_cluster_scrape_duration_seconds{collector="pacemaker"} 1.234

ha_cluster_scrape_success

Indicates whether a collector succeeded: the value is 1 on success and 0 if certain metrics could not be scraped. Collectors fail gracefully, so a failing collector does not prevent the exporter from running. If the value is 0, check the exporter logs for more details.

Labels
  • collector collector names that correspond to the subsystem they collect metrics from

    Example:

    # TYPE ha_cluster_scrape_success gauge
    ha_cluster_scrape_success{collector="pacemaker"} 1
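The ha_cluster_scrape_success metric lends itself to a quick health check of the exporter. The following Python sketch lists the collectors that reported a failure in a given metrics output. The function name is hypothetical, and the input is assumed to be the plain text served by the exporter.

```python
import re

PATTERN = re.compile(
    r'^ha_cluster_scrape_success\{collector="([^"]+)"\}\s+(\S+)',
    re.MULTILINE,
)


def failed_collectors(exposition_text):
    """Return the sorted names of collectors whose scrape reported 0."""
    return sorted(
        name
        for name, value in PATTERN.findall(exposition_text)
        if float(value) == 0
    )
```

An empty list means that every collector present in the output succeeded. Any name in the list is a starting point for searching the exporter logs.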

6 SAP host exporter

Enables the monitoring of SAP NetWeaver, SAP HANA, and other applications. The gathered metrics are the data that can be obtained by running the sapcontrol command.

Link: https://github.com/SUSE/sap_host_exporter.

Exports metrics (for SAP S/4HANA, SAP NetWeaver, or SAP HANA hosts) in the Prometheus format:
  • SAP start service process list

  • SAP enqueue server metrics

  • SAP application server dispatcher metrics

  • SAP internal alerts

7 For more information

8 Legal notice

Copyright © 2006–2024 SUSE LLC and contributors. All rights reserved.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled GNU Free Documentation License.

For SUSE trademarks, see https://www.suse.com/company/legal/. All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks.

All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors, nor the translators shall be held liable for possible errors or the consequences thereof.