Applies to SUSE OpenStack Cloud Monitoring

3 Operation and Maintenance

Regular operation and maintenance includes:

  • Configuring data retention for the InfluxDB database. This can be configured in the Monasca barclamp. For details, see the SUSE OpenStack Cloud Deployment Guide.

  • Configuring data retention for the Elasticsearch database. This can be configured in the Monasca barclamp. For details, see the SUSE OpenStack Cloud Deployment Guide.

  • Removing metrics data from the InfluxDB database.

  • Removing log data from the Elasticsearch database.

  • Handling log files of agents and services.

  • Backup and recovery of databases, configuration files, and dashboards.

3.1 Removing Metrics Data

Metrics data is stored in the Metrics and Alarms InfluxDB Database. InfluxDB features an SQL-like query language for querying data and performing aggregations on that data.

The Metrics Agent configuration defines the metrics and types of measurement for which data is stored. For each measurement, a so-called series is written to the InfluxDB database. A series consists of a timestamp, the metric, and the measured value.

Every series can be assigned key tags. In the case of SUSE OpenStack Cloud Monitoring, this is the _tenant_id tag. This tag identifies the OpenStack project for which the metrics data has been collected.

From time to time, you may want to delete outdated or unnecessary metrics data from the Metrics and Alarms Database, for example, to save space or remove data for metrics you are no longer interested in. To delete data, you use the InfluxDB command line interface, the interactive shell that is provided for the InfluxDB database.

Proceed as follows to delete metrics data from the database:

  1. Create a backup of the database. For details, refer to Section 3.4, “Backup and Recovery”.

  2. Determine the ID of the OpenStack project for the data to be deleted:

    Log in to the OpenStack dashboard and go to Identity > Projects. The monasca project initially provides all metrics data related to SUSE OpenStack Cloud Monitoring.

    During production operation of SUSE OpenStack Cloud Monitoring, additional projects may be created, for example, for application operators.

    The Project ID field shows the relevant tenant ID.

  3. Log in to the host where the Monitoring Service is installed.

  4. Go to the directory where InfluxDB is installed:

    cd /usr/bin
  5. Connect to InfluxDB using the InfluxDB command line interface as follows:

    ./influx -host <host_ip>

    Replace <host_ip> with the IP address of the machine on which SUSE OpenStack Cloud Monitoring is installed.

    The output of this command is, for example, as follows:

    Connected to http://localhost:8086 version 1.1.1
    InfluxDB shell version: 1.1.1
  6. Connect to the InfluxDB database of SUSE OpenStack Cloud Monitoring (mon):

    > show databases
    name: databases
    name
    ----
    mon
    _internal
    
    > use mon
    Using database mon
  7. Check the outdated or unnecessary data to be deleted.

    • You can view all measurements for a specific project as follows:

      SHOW MEASUREMENTS WHERE _tenant_id = '<project ID>'
    • You can view the series for a specific metric and project, for example, as follows:

      SHOW SERIES FROM "cpu.user_perc" WHERE _tenant_id = '<project ID>'
  8. Delete the desired data.

    • When a project is no longer relevant or a specific tenant is no longer used, delete all series for the project as follows:

      DROP SERIES WHERE _tenant_id = '<project ID>'

      Example:

      DROP SERIES WHERE _tenant_id = '27620d7ee6e948e29172f1d0950bd6f4'
    • When a metric is no longer relevant for a project, delete all series for the specific project and metric as follows:

      DROP SERIES FROM "<metrics>" WHERE _tenant_id = '<project ID>'

      Example:

      DROP SERIES FROM "cpu.user_perc" WHERE _tenant_id = '27620d7ee6e948e29172f1d0950bd6f4'
  9. Restart the influxdb service, for example, as follows:

    sudo systemctl restart influxdb
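
The DROP SERIES statements from step 8 can be generated by a small shell helper, which reduces the risk of typing errors in the project ID. This is only a sketch: the function names valid_tenant_id and drop_series_stmt are hypothetical, and the check assumes that project IDs are 32-character lowercase hexadecimal strings, as in the first example above. The helper only prints the statement; it does not contact the database.

```shell
# Return success if the argument looks like an OpenStack project ID.
# Assumption: IDs are 32 lowercase hexadecimal characters.
valid_tenant_id() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{32}$'
}

# Print a DROP SERIES statement for a project, optionally limited to one metric.
# Usage: drop_series_stmt <project ID> [<metric>]
drop_series_stmt() {
  tenant_id=$1
  metric=${2:-}
  if ! valid_tenant_id "$tenant_id"; then
    echo "invalid project ID: $tenant_id" >&2
    return 1
  fi
  if [ -n "$metric" ]; then
    printf "DROP SERIES FROM \"%s\" WHERE _tenant_id = '%s'\n" "$metric" "$tenant_id"
  else
    printf "DROP SERIES WHERE _tenant_id = '%s'\n" "$tenant_id"
  fi
}
```

After reviewing the printed statement, you could pass it to the InfluxDB shell non-interactively, for example with influx -host <host_ip> -database mon -execute "$(drop_series_stmt <project ID>)".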

3.2 Removing Log Data

Log data is stored in the Elasticsearch database. Elasticsearch stores the data in indices. One index per day is created for every OpenStack project.

By default, the indices are stored in the following directory on the host where the Monitoring Service is installed:

/var/data/elasticsearch/<cluster-name>/nodes/<node-name>

Example:

/var/data/elasticsearch/elasticsearch/nodes/0

Note

If your system uses a different directory, look up the path.data parameter in the Elasticsearch configuration file, /etc/elasticsearch/elasticsearch.yml.

If you want to delete outdated or unnecessary log data from the Elasticsearch database, proceed as follows:

  1. Make sure that curl is installed. If this is not the case, install the package with

    sudo zypper in curl
  2. Create a backup of the Elasticsearch database. For details, refer to Section 3.4, “Backup and Recovery”.

  3. Determine the ID of the OpenStack project for the data to be deleted:

    Log in to the OpenStack dashboard and go to Identity > Projects. The monasca project initially provides all metrics data related to SUSE OpenStack Cloud Monitoring.

    During production operation of SUSE OpenStack Cloud Monitoring, additional projects may be created.

    The Project ID field shows the relevant ID.

  4. Log in to the host where the Monitoring Service is installed.

  5. Make sure that the data you want to delete exists by executing the following command:

    curl -XHEAD -i 'http://localhost:<port>/<projectID-date>'

    For example, if Elasticsearch is listening on port 9200 (default), the ID of the OpenStack project is abc123, and you want to check the index for July 1, 2015, the command is as follows:

    curl -XHEAD -i 'http://localhost:9200/abc123-2015-07-01'

    If the HTTP response is 200, the index exists; if the response is 404, it does not exist.

  6. Delete the index as follows:

    curl -XDELETE -i 'http://localhost:<port>/<projectID-date>'

    Example:

    curl -XDELETE -i 'http://localhost:9200/abc123-2015-07-01'

    This command either returns an error, such as IndexMissingException, or acknowledges the successful deletion of the index.

Note

Be aware that the -XDELETE command immediately deletes the index file!

For both -XHEAD and -XDELETE, you can use wildcards to process several indices at once. For example, you can delete all indices of a specific project for the whole month of July 2015:

curl -XDELETE -i 'http://localhost:9200/abc123-2015-07-*'
Note

Take extreme care when using wildcards for the deletion of indices. You could delete all existing indices with one single command!
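
One way to reduce the wildcard risk is to build the index name with a helper that accepts only a single day or a single month of one project. The following is a sketch under assumptions: the function name index_name is hypothetical, and project IDs are assumed to consist of lowercase letters and digits only, as in the abc123 example.

```shell
# Build the name of a daily index from a project ID and a date pattern.
# Rejects patterns that could match the indices of other projects.
# Usage: index_name <project ID> <YYYY-MM-DD | YYYY-MM-*>
index_name() {
  project_id=$1
  date_part=$2
  case $project_id in
    ''|*[!0-9a-z]*)
      echo "project ID must be alphanumeric without wildcards" >&2
      return 1 ;;
  esac
  case $date_part in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;   # a single day
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-\*) ;;           # a whole month
    *)
      echo "date must be YYYY-MM-DD or YYYY-MM-*" >&2
      return 1 ;;
  esac
  printf '%s-%s\n' "$project_id" "$date_part"
}
```

A deletion could then be written as curl -XDELETE -i "http://localhost:9200/$(index_name abc123 '2015-07-*')". Because the helper prints nothing when it rejects its input, check its exit status before running curl.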

3.3 Log File Handling

In case of trouble with the SUSE OpenStack Cloud Monitoring services, you can study their log files to find the reason. The log files are also useful if you need to contact your support organization. For storing the log files, the default installation uses the /var/log directory on the hosts where the agents or services are installed.

You can use systemd, a system and session manager for Linux, and journald, a Linux logging interface, for addressing dispersed log files.

The SUSE OpenStack Cloud Monitoring installer automatically puts all SUSE OpenStack Cloud Monitoring services under the control of systemd. journald provides a centralized management solution for the logging of all processes that are controlled by systemd. The logs are collected and managed in a so-called journal controlled by the journald daemon.

For details on the systemd and journald utilities, refer to the SUSE Linux Enterprise Server Administration Guide (https://documentation.suse.com/sles/12-SP5/single-html/SLES-admin/#part-system).
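
As a minimal sketch of querying the journal, the following helper lists the error messages that one service has logged since a given point in time. The function name journal_errors is hypothetical, and the unit name you pass must match a systemd unit on your host, such as the influxdb service used elsewhere in this chapter.

```shell
# Show journal entries of priority "err" or higher for one systemd unit.
# Usage: journal_errors <unit> [<since>]   (<since> defaults to "today")
journal_errors() {
  journalctl --no-pager -u "$1" -p err --since "${2:-today}"
}
```

For example, journal_errors influxdb yesterday would show the errors logged by the influxdb service since the start of the previous day.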

3.4 Backup and Recovery

A typical task of the Monitoring Service operator is to make regular backups, particularly of the data created during operation.

At regular intervals, you should make a backup of all:

  • Databases.

  • Configuration files of the individual agents and services.

  • Monitoring and log dashboards you have created and saved.

SUSE OpenStack Cloud Monitoring does not offer integrated backup and recovery mechanisms. Instead, use the mechanisms and procedures of the individual components.

3.4.1 Databases

You need to create regular backups of the following databases on the host where the Monitoring Service is installed:

  • Elasticsearch database for historic log data.

  • InfluxDB database for historic metrics data.

  • MariaDB database for historic configuration information.

It is recommended that backup and restore operations for databases are carried out by experienced operators only.

Preparations

Before backing up and restoring a database, we recommend stopping the Monitoring API and the Log API on the monasca-server node, and checking that all data has been processed. This ensures that no data is written to a database during a backup and restore operation. After backing up and restoring a database, restart the APIs.

To stop the Monitoring API and the Log API, use the following command:

systemctl stop apache2

To check that all Kafka queues are empty, list the existing consumer groups and check the LAG column for each group. It should be 0. For example:

kafka-consumer-groups.sh --zookeeper 192.168.56.81:2181 --list
kafka-consumer-groups.sh --zookeeper 192.168.56.81:2181 --describe \
 --group 1_metrics | column -t -s ','
kafka-consumer-groups.sh --zookeeper 192.168.56.81:2181 --describe \
 --group transformer-logstash-consumer | column -t -s ','
kafka-consumer-groups.sh --zookeeper 192.168.56.81:2181 --describe \
 --group thresh-metric | column -t -s ','
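
Checking the LAG column by eye can be automated with a small filter. The sketch below assumes that the describe output is comma-separated and starts with a header line that names a LAG column, which matches the column -t -s ',' formatting used above but may differ between Kafka versions. The function name lag_is_zero is hypothetical.

```shell
# Read comma-separated "kafka-consumer-groups.sh --describe" output from
# standard input. Succeed only if a LAG column exists, at least one data
# row is present, and every LAG value is 0.
lag_is_zero() {
  awk -F',' '
    BEGIN { bad = 0; rows = 0; col = 0 }
    NR == 1 {
      # Locate the LAG column by its header name.
      for (i = 1; i <= NF; i++) {
        name = $i
        gsub(/^[ \t]+|[ \t]+$/, "", name)
        if (name == "LAG") col = i
      }
      if (col == 0) { bad = 1; exit }
      next
    }
    NF >= col {
      lag = $col
      gsub(/^[ \t]+|[ \t]+$/, "", lag)
      rows++
      if (lag != "0") bad = 1
    }
    END { exit ((bad || rows == 0) ? 1 : 0) }
  '
}
```

You could then pipe the output of one of the describe commands above into the filter, for example ... --describe --group 1_metrics | lag_is_zero && echo "queue empty".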

To restart the Monitoring API and the Log API, use the following command:

systemctl start apache2

Elasticsearch Database

For backing up and restoring your Elasticsearch database, use the Snapshot and Restore module of Elasticsearch.

To create a backup of the database, proceed as follows:

  1. Make sure that curl is installed. If this is not the case, install the package with zypper in curl.

  2. Log in to the host where the Monitoring Service is installed.

  3. Create a snapshot repository. You need the Elasticsearch bind address for all commands. Run grep network.bind_host /etc/elasticsearch/elasticsearch.yml to find the bind address, and replace IP in the following commands with this address. For example:

    curl -XPUT http://IP:9200/_snapshot/my_backup -d '{
      "type": "fs",
      "settings": {
           "location": "/mount/backup/elasticsearch1/my_backup",
           "compress": true
      }
    }'

    The example registers a shared file system repository ("type": "fs") that stores snapshots in the /mount/backup/elasticsearch1/my_backup directory.

    Note

    The directory for storing snapshots must be configured in the elasticsearch/repo_dir setting in the Monasca barclamp (see the Deployment Guide (https://documentation.suse.com/soc/8/single-html/suse-openstack-cloud-deployment/#sec-depl-ostack-monasca)). The directory must be manually mounted before creating the snapshot. The elasticsearch user must be specified as the owner of the directory.

    compress is turned on to compress the metadata files.

  4. Check whether the repository was created successfully:

    curl -XGET http://IP:9200/_snapshot/my_backup

    This example response shows a successfully created repository:

    {
      "my_backup": {
        "type": "fs",
        "settings": {
          "compress": "true",
          "location": "/mount/backup/elasticsearch1/my_backup"
        }
      }
    }
  5. Create a snapshot of your database that contains all indices. A repository can contain multiple snapshots of the same database. The name of a snapshot must be unique within the snapshots created for your database, for example:

    curl -XPUT 'http://IP:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true'

    The example creates a snapshot named snapshot_1 for all indices in the my_backup repository.
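
Because snapshot names must be unique, regular backups need a naming scheme. One possible scheme, sketched here, derives the name from the current UTC time. The function names snapshot_name and snapshot_url are hypothetical, and port 9200 is taken from the examples above.

```shell
# Print a snapshot name based on the current UTC time (unique per second).
snapshot_name() {
  date -u +snapshot_%Y%m%d_%H%M%S
}

# Print the URL for creating a snapshot and waiting for its completion.
# Usage: snapshot_url <bind address> <repository> <snapshot name>
snapshot_url() {
  printf 'http://%s:9200/_snapshot/%s/%s?wait_for_completion=true\n' "$1" "$2" "$3"
}
```

A scheduled backup could then run curl -XPUT "$(snapshot_url IP my_backup "$(snapshot_name)")", with IP replaced by the bind address.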

To restore the database instance, proceed as follows:

  1. Close all indices of your database, for example:

    curl -XPOST http://IP:9200/_all/_close
  2. Restore all indices from the snapshot you have created, for example:

    curl -XPOST http://IP:9200/_snapshot/my_backup/snapshot_1/_restore

    The example restores all indices from snapshot_1 that is stored in the my_backup repository.

For additional information on backing up and restoring an Elasticsearch database, refer to the Elasticsearch documentation (https://www.elastic.co/guide/en/elasticsearch/reference/2.3/modules-snapshots.html).

InfluxDB Database

For backing up and restoring your InfluxDB database, you can use the InfluxDB shell. The shell is part of your InfluxDB distribution. If you installed InfluxDB via a package manager, the shell is, by default, installed in the /usr/bin directory.

To create a backup of the database, proceed as follows:

  1. Switch to a user who is allowed to run the influxdb service, for example:

    su influxdb -s /bin/bash
  2. Back up the database, for example:

    influxd backup -database mon /mount/backup/mysnapshot

    Monasca uses mon as the name of the database. The example creates the backup of the database in /mount/backup/mysnapshot.
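
After a backup run, it can be worth confirming that the target directory actually received files before you rely on it. The following check is a minimal sketch; the function name backup_has_files is hypothetical, and the check makes no assumption about the names of the files that influxd writes.

```shell
# Succeed if the given directory exists and contains at least one regular file.
backup_has_files() {
  [ -d "$1" ] && [ -n "$(find "$1" -type f | head -n 1)" ]
}
```

For example: influxd backup -database mon /mount/backup/mysnapshot && backup_has_files /mount/backup/mysnapshot || echo "backup incomplete" >&2.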

Before restoring the database, make sure that all database processes are shut down. To restore the database, you can then proceed as follows:

  1. If required, delete all files not included in the backup by dropping the database before you carry out the restore operation. A restore operation restores all files included in the backup. Files created or merged at a later point in time are not affected. For example:

    influx -host IP -execute 'drop database mon;'

    Replace IP with the IP address that the database is listening on. You can run influxd config and look up the IP address in the [http] section.

  2. Stop the InfluxDB database service:

    systemctl stop influxdb
  3. Switch to a user who is allowed to run the influxdb service:

    su influxdb -s /bin/bash
  4. Restore the metastore:

    influxd restore -metadir /var/opt/influxdb/meta /mount/backup/mysnapshot
  5. Restore the database, for example:

    influxd restore -database mon -datadir /var/opt/influxdb/data /mount/backup/mysnapshot

    The example restores the backup from /mount/backup/mysnapshot to /var/opt/influxdb/data.

  6. Ensure that the file permissions for the restored database are set correctly:

    chown -R influxdb:influxdb /var/opt/influxdb
  7. Start the InfluxDB database service:

    systemctl start influxdb

For additional information on backing up and restoring an InfluxDB database, refer to the InfluxDB documentation (https://docs.influxdata.com/influxdb/v1.1/administration/backup_and_restore/).

MariaDB Database

For backing up and restoring your MariaDB database, you can use the mysqldump utility program. mysqldump performs a logical backup that produces a set of SQL statements. These statements can later be executed to restore the database.

To back up your MariaDB database, you must be the owner of the database or a user with superuser privileges, for example:

mysqldump -u root -p mon > dumpfile.sql

In addition to the name of the database, you have to specify the name and location of the file in which mysqldump stores its output.
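
An interrupted dump produces a truncated file that fails only at restore time. With its default settings, mysqldump ends its output with a comment line that starts with "-- Dump completed", so a simple check of the last line can catch this early. The sketch below assumes that these default comments have not been disabled; the function name dump_is_complete is hypothetical.

```shell
# Succeed if the last line of the dump file is the completion comment
# that mysqldump writes by default.
dump_is_complete() {
  tail -n 1 "$1" | grep -q '^-- Dump completed'
}
```

For example: mysqldump -u root -p mon > dumpfile.sql && dump_is_complete dumpfile.sql || echo "dump incomplete" >&2.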

To restore your MariaDB database, proceed as follows:

  1. Log in to the host where the Monitoring Service is installed as a user with root privileges.

  2. Make sure that the mariadb service is running:

    systemctl start mariadb
  3. Log in to the database you have backed up as a user with root privileges, for example:

    mysql -u root -p mon
  4. Remove and then re-create the database:

    DROP DATABASE mon;
    CREATE DATABASE mon;
  5. Exit mariadb:

    \q
  6. Restore the database, for example:

    mysql -u root -p mon < dumpfile.sql

For additional information on backing up and restoring a MariaDB database with mysqldump, refer to the MariaDB documentation (https://mariadb.com/kb/en/mariadb/mysqldump/).

3.4.2 Configuration Files

The following list shows the configuration files of the agents and the individual services included in the Monitoring Service. Back up these files at least after you have installed and configured SUSE OpenStack Cloud Monitoring and after each change in the configuration.

/etc/influxdb/influxdb.conf
/etc/kafka/server.properties
/etc/my.cnf
/etc/my.cnf.d/client.cnf
/etc/my.cnf.d/mysql-clients.cnf
/etc/my.cnf.d/server.cnf
/etc/monasca/agent/agent.yaml
/etc/monasca/agent/conf.d/*
/etc/monasca/agent/supervisor.conf
/etc/monasca/api-config.conf
/etc/monasca/log-api-config.conf
/etc/monasca/log-api-config.ini
/etc/monasca-log-persister/monasca-log-persister.conf
/etc/monasca-log-transformer/monasca-log-transformer.conf
/etc/monasca-log-agent/agent.conf
/etc/monasca-notification/monasca-notification.yaml
/etc/monasca-persister/monasca-persister.yaml
/etc/monasca-thresh/thresh.yaml
/etc/elasticsearch/elasticsearch.yml
/etc/elasticsearch/logging.yml
/etc/kibana/kibana.yml
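
Not every host has every file in this list, so a backup script should skip missing paths instead of failing. The following sketch collects the existing paths into one compressed archive. The function names existing_paths and backup_configs are hypothetical, and reading the file list from standard input with -T - assumes GNU tar or a compatible implementation.

```shell
# Print those of the given paths that exist, one per line.
existing_paths() {
  for path in "$@"; do
    [ -e "$path" ] && printf '%s\n' "$path"
  done
  return 0
}

# Archive all existing paths from the list into a compressed tar file.
# Usage: backup_configs <archive.tar.gz> <path>...
backup_configs() {
  archive=$1
  shift
  existing_paths "$@" | tar -czf "$archive" -T -
}
```

For example, backup_configs /mount/backup/monasca-configs.tar.gz /etc/influxdb/influxdb.conf /etc/monasca/agent/agent.yaml /etc/monasca/agent/conf.d/* archives whichever of these files are present. The archive name is only an illustration.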

Recovery

If you need to recover the configuration of one or more agents or services, the recommended procedure is as follows:

  1. If necessary, uninstall the agents or services, and install them again.

  2. Stop the agents or services.

  3. Copy the backup of your configuration files to the correct location according to the list above.

  4. Start the agents or services again.

3.4.3 Dashboards

Kibana can persist customized log dashboard designs to the Elasticsearch database, and allows you to recall them. For details on saving, loading, and sharing log management dashboards, refer to the Kibana documentation (https://www.elastic.co/guide/en/kibana/4.5/dashboard.html#saving-dashboards).

Grafana allows you to export a monitoring dashboard to a JSON file, and to re-import it when necessary. For backing up and restoring the exported dashboards, use the standard mechanisms of your file system. For details on exporting monitoring dashboards, refer to the Getting Started (https://grafana.com/docs/guides/getting_started/) tutorial of Grafana.
