Applies to SUSE OpenStack Cloud 9

16 Operations Console

The Operations Console, often referred to as the Ops Console, is a web-based graphical user interface (GUI) that you can use to view data about your cloud infrastructure and ensure your cloud is operating correctly.

16.1 Using the Operations Console

16.1.1 Operations Console Overview

You can use the Operations Console for SUSE OpenStack Cloud 9 to view data about your SUSE OpenStack Cloud infrastructure in a web-based graphical user interface (GUI) and ensure your cloud is operating correctly. By logging on to the console, SUSE OpenStack Cloud administrators can manage data in the following ways:

Triage alarm notifications.

  • Alarm Definitions and notifications now have their own screens and are collected under the Alarm Explorer menu item which can be accessed from the Central Dashboard. Central Dashboard now allows you to customize the view in the following ways:

    • Rename or re-configure existing alarm cards to include services different from the defaults

    • Create a new alarm card with the services you want to select

    • Reorder alarm cards using drag and drop

    • View all alarms that have no service dimension now grouped in an Uncategorized Alarms card

    • View all alarms that have a service dimension that does not match any of the other cards, now grouped in an Other Alarms card

  • You can also easily access alarm data for a specific component. On the Summary page for the following components, a link is provided to an alarms screen specifically for that component:

16.1.1.1 Monitor the environment by giving priority to alarms that take precedence.

Alarm Explorer now allows you to manage alarms in the following ways:

  • Refine the monitoring environment by creating new alarms to specify a combination of metrics, services, and hosts that match the triggers unique to an environment

  • Filter alarms in one place using an enumerated filter box instead of service badges

  • Specify full alarm IDs as dimension key-value pairs in the form of dimension=value
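The dimension=value form used by the alarm filter can be illustrated with a short sketch. This is not the console's implementation; the function names and the sample alarm are hypothetical, and the matching rule (an alarm matches when it carries every requested dimension value) is an assumption.

```python
def parse_dimension_filter(text):
    """Split a 'dimension=value' filter string into a (key, value) pair."""
    key, sep, value = text.partition("=")
    key, value = key.strip(), value.strip()
    if not sep or not key or not value:
        raise ValueError(f"expected dimension=value, got {text!r}")
    return key, value


def matches(alarm_dimensions, filters):
    """Return True if the alarm carries every requested dimension value."""
    return all(alarm_dimensions.get(key) == value for key, value in filters)


# Hypothetical alarm dimensions, for illustration only.
alarm = {"service": "compute", "hostname": "host-01"}
filters = [parse_dimension_filter("service=compute")]
print(matches(alarm, filters))
```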

16.1.1.2 Support Changes

  • To resolve scalability issues, plain text search through alarm sets is no longer supported

The Business Logic Layer of Operations Console is a middleware component that serves as a single point of contact for the user interface to communicate with OpenStack services such as monasca, nova, and others.

16.1.2 Connecting to the Operations Console

Instructions for accessing the Operations Console through a web browser.

To connect to Operations Console, perform the following:

Operations Console is always accessed over port 9095.

16.1.2.1 Required Access Credentials

In previous versions of Operations Console you were required to have only the password for the Administrator account (admin by default). Now the Administrator user account must also have all of the following credentials:

Project         Domain          Role                   Description
*All projects*  *not specific*  Admin                  Admin role on at least one project
*All projects*  *not specific*  Admin                  Admin role in default domain
Admin           default         Admin or monasca-user  Admin or monasca-user role on admin project
Important

If your login account has the administrator role on the administrator project, then you only need to make sure you have the administrator role on the default domain.

Administrator account

During installation, an administrator account called admin is created by default.

Administrator password

During installation, an administrator password is randomly created by default. It is not recommended that you change the default password.

To find the randomized password:

  1. To display the password, log on to the Cloud Lifecycle Manager and run:

    cat ~/service.osrc
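If you prefer to extract a single value rather than read the whole file, the following is a minimal sketch. It assumes the file follows the common OpenStack RC convention of `export NAME=value` lines and that the password is stored in OS_PASSWORD; neither detail is stated in this guide, and the sample text below is invented.

```python
import shlex


def read_osrc_value(osrc_text, name):
    """Return the value of an `export NAME=value` line, or None if absent."""
    for line in osrc_text.splitlines():
        # shlex handles shell-style quoting and drops trailing comments.
        tokens = shlex.split(line, comments=True)
        if len(tokens) == 2 and tokens[0] == "export":
            key, sep, value = tokens[1].partition("=")
            if sep and key == name:
                return value
    return None


# Invented sample content; on a real system read the text of ~/service.osrc.
sample = "unset OS_DOMAIN_NAME\nexport OS_USERNAME=admin\nexport OS_PASSWORD='example-only'\n"
print(read_osrc_value(sample, "OS_USERNAME"))
```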

16.1.2.2 Connect Through a Browser

The following instructions show you how to find the URL to access Operations Console. You will use SSH (Secure Shell), which provides administrators with a secure way to access a remote computer.

To access Operations Console:

  1. Log in to the Cloud Lifecycle Manager.

  2. Locate the URL or IP address for the Operations Console with the following command:

    source ~/service.osrc && openstack endpoint list | grep opsconsole | grep admin

    Sample output:

    | 8ef10dd9c00e4abdb18b5b22adc93e87 | region1 | opsconsole | opsconsole | True | admin | https://192.168.24.169:9095/api/v1/

    To access Operations Console, in the sample output, remove everything after port 9095 (api/v1/) and in a browser, type:

    https://192.168.24.169:9095
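The step of trimming the endpoint down to the console address can be sketched as follows. The function names are hypothetical, and the row format is assumed to match the sample output shown above.

```python
from urllib.parse import urlsplit


def endpoint_from_row(row):
    """Return the last cell (the URL) of one `openstack endpoint list` row."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    return cells[-1]


def console_url(endpoint_url):
    """Keep only scheme, host, and port; drop the /api/v1/ path."""
    parts = urlsplit(endpoint_url)
    return f"{parts.scheme}://{parts.netloc}"


row = ("| 8ef10dd9c00e4abdb18b5b22adc93e87 | region1 | opsconsole | opsconsole "
       "| True | admin | https://192.168.24.169:9095/api/v1/")
print(console_url(endpoint_from_row(row)))  # https://192.168.24.169:9095
```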

16.1.2.3 Optionally use a Hostname OR virtual IP address to access Operations Console

Important

If you can access Operations Console using the above instructions, then you can skip this section. These steps provide an alternate way to access Operations Console if the above steps do not work for you.

To find your hostname OR IP address:

  1. Navigate to and open in a text editor the following file:

    network_groups.yml
  2. Find the following entry:

    external-name
  3. If your administrator set a hostname value in the external-name field, you will use that hostname when logging in to Operations Console. For example, in a browser you would type:

    https://VIP:9095
  4. If your administrator did not set a hostname value, then to determine the IP address to use, from your Cloud Lifecycle Manager, run:

    grep HZN-WEB /etc/hosts

    The output of that command will show you the virtual IP address you should use. For example, in a browser you would type:

    https://VIP:9095

16.1.3 Managing Compute Hosts

Operations Console (Ops Console) provides a graphical interface for you to add and delete compute hosts.

As your deployment grows and changes, you may need to add more compute hosts to increase your capacity for VMs, or delete a host to reallocate hardware for a different use. To accomplish these tasks, in previous versions of SUSE OpenStack Cloud you had to use the command line to update configuration files and run ansible playbooks. Now Operations Console provides a graphical interface for you to complete the same tasks quickly using menu items in the console.

Important

Do not refresh the Operations Console page or open Operations Console in another window during the following tasks. If you do, you will not see any notifications or be able to review the error log for more information. This would make troubleshooting difficult since you would not know the error that was encountered, or why it occurred.

Use Operations Console to perform the following tasks:

Important

To use Operations Console, you need to have the correct permissions and know the URL or VIP connected to Operations Console during installation.

16.1.3.1 Create a Compute Host

If you need to create additional compute hosts for more virtual machine capacity, you can do this easily on the Compute Hosts screen.

To add a compute host:

  1. To open Operations Console, in a browser, enter either the URL or Virtual IP connected to Operations Console.

    For example:

    https://myardana.test:9095
    https://VIP:9095
  2. On the Home screen, click the menu represented by 3 horizontal lines (Three-Line Icon).

  3. From the menu that slides in on the left side, click Compute, and then Compute Hosts.

  4. On the Compute Hosts page, click Create Host.

  5. On the Add & Activate Compute Host tab that slides in from the right, enter the following information:

    Host ID

    Cloud Lifecycle Manager model's server ID

    Host Role

    Defined in the Cloud Lifecycle Manager model and cannot be modified in Operations Console

    Host Group

    Defined in the Cloud Lifecycle Manager model and cannot be modified in Operations Console

    Host NIC Mapping

    Defined in the Cloud Lifecycle Manager model and cannot be modified in Operations Console

    Encryption Key

    If the configuration is encrypted, enter the encryption key here

  6. Click Create Host, and in the confirmation screen that opens, click Confirm.

  7. Wait for SUSE OpenStack Cloud to complete the pre-deployment steps. This process can take up to 2 minutes.

  8. If pre-deployment is successful, you will see a notification that deployment has started.

    Important

    If you receive a notice that pre-deployment did not complete successfully, read the notification explaining at which step the error occurred. You can click on the error notification and see the ansible log for the configuration processor playbook. Then you can click Create Host in step 4 again and correct the mistake.

  9. Wait for SUSE OpenStack Cloud to complete the deployment steps. This process can take up to 20 minutes.

  10. If deployment is successful, you will see a notification and a new entry will appear in the compute hosts table.

    Important

    If you receive a notice that deployment did not complete successfully, read the notification explaining at which step the error occurred. You can click on the error notification for more details.

16.1.3.2 Deactivate a Compute Host

If you have multiple compute hosts and for debugging reasons you want to disable them all except one, you may need to deactivate and then activate a compute host. If you want to delete a host, you will also have to deactivate it first. This can be done easily in the Operations Console.

Important

The host must be in the following state: ACTIVATED

To deactivate a compute host:

  1. To open Operations Console, in a browser, enter either the URL or Virtual IP connected to Operations Console.

    For example:

    https://myardana.test:9095
    https://VIP:9095
  2. On the Home screen, click the menu represented by 3 horizontal lines (Three-Line Icon).

  3. From the menu that slides in on the left side, click Compute, and then Compute Hosts.

  4. On the Compute Hosts page, in the row for the host you want to deactivate, click the details button (Ellipsis Icon).

  5. Click Deactivate, and in the confirmation screen that opens, click Confirm.

  6. Wait for SUSE OpenStack Cloud to complete the operation. This process can take up to 2 minutes.

  7. If deactivation is successful, you will see a notification and in the compute hosts table the STATE will change to DEACTIVATED.

    Important

    If you receive a notice that the operation did not complete successfully, read the notification explaining at which step the error occurred. You can click on the link in the error notification for more details. In the compute hosts table the STATE will remain ACTIVATED.

16.1.3.3 Activate a Compute Host

Important

The host must be in the following state: DEACTIVATED

To activate a compute host:

  1. To open Operations Console, in a browser, enter either the URL or Virtual IP connected to Operations Console.

    For example:

    https://myardana.test:9095
    https://VIP:9095
  2. On the Home screen, click the menu represented by 3 horizontal lines (Three-Line Icon).

  3. From the menu that slides in on the left side, click Compute, and then Compute Hosts.

  4. On the Compute Hosts page, in the row for the host you want to activate, click the details button (Ellipsis Icon).

  5. Click Activate, and in the confirmation screen that opens, click Confirm.

  6. Wait for SUSE OpenStack Cloud to complete the operation. This process can take up to 2 minutes.

  7. If activation is successful, you will see a notification and in the compute hosts table the STATE will change to ACTIVATED.

    Important

    If you receive a notice that the operation did not complete successfully, read the notification explaining at which step the error occurred. You can click on the link in the error notification for more details. In the compute hosts table the STATE will remain DEACTIVATED.

16.1.3.4 Delete a Compute Host

If you need to scale down the size of your current deployment to use the hardware for other purposes, you may want to delete a compute host.

Important

Complete the following steps before deleting a host:

  • The host must be in the following state: DEACTIVATED

  • Optionally, you can migrate the instances off the host to be deleted. To do this, complete the following steps in Section 15.1.3.5, “Removing a Compute Node”:

    1. Disable provisioning on the compute host.

    2. Use live migration to move any instances on this host to other hosts.

To delete a compute host:

  1. To open Operations Console, in a browser, enter either the URL or Virtual IP connected to Operations Console.

    For example:

    https://myardana.test:9095
    https://VIP:9095
  2. On the Home screen, click the menu represented by 3 horizontal lines (Three-Line Icon).

  3. From the menu that slides in on the left side, click Compute, and then Compute Hosts.

  4. On the Compute Hosts page, in the row for the host you want to delete, click the details button (Ellipsis Icon).

  5. Click Delete, and if the configuration is encrypted, enter the encryption key.

  6. In the confirmation screen that opens, click Confirm.

  7. In the compute hosts table you will see the STATE change to Deleting.

  8. Wait for SUSE OpenStack Cloud to complete the operation. This process can take up to 2 minutes.

  9. If deletion is successful, you will see a notification and in the compute hosts table the host will not be listed.

    Important

    If you receive a notice that the operation did not complete successfully, read the notification explaining at which step the error occurred. You can click on the link in the error notification for more details. In the compute hosts table the STATE will remain DEACTIVATED.

16.1.3.5 For More Information

For more information on how to complete these tasks through the command line, see the following topics:

16.1.4 Managing Swift Performance

In Operations Console you can monitor your swift cluster to ensure long-term data protection as well as sufficient performance.

OpenStack swift is an object storage solution with a focus on availability. While there are various mechanisms inside swift to protect stored data and ensure a high availability, you must still closely monitor your swift cluster to ensure long-term data protection as well as sufficient performance. The best way to manage swift is to collect useful data that will detect possible performance impacts early on.

The new Object Summary Dashboard in Operations Console provides an overview of your swift environment.

Important

If swift is not installed and configured, you will not be able to access this dashboard. The swift endpoint must be present in keystone for the Object Summary to be present in the menu.

In Operations Console's object storage dashboard, you can easily review the following information:

16.1.4.1 Performance Summary

View a comprehensive summary of current performance values.

To access the object storage performance dashboard:

  1. To open Operations Console, in a browser enter either the URL or Virtual IP connected to Operations Console.

    For example:

    https://myardana.test:9095
    https://VIP:9095
  2. On the Home screen, click the menu represented by 3 horizontal lines (Three-Line Icon).

  3. In the menu, click Storage › Object Storage Summary.

Performance data includes:

Healthcheck Latency from monasca

This latency is the average time it takes for swift to respond to a healthcheck, or ping, request. The swiftlm-uptime monitor program reports the value. A large difference between average and maximum may indicate a problem with one node.

Operational Latency from monasca

Operational latency is the average time it takes for swift to respond to an upload, download, or object delete request. The swiftlm-uptime monitor program reports the value. A large difference between average and maximum may indicate a problem with one node.

Service Availability

This is the availability over the last 24 hours as a percentage.

  • 100% - No outages in the last 24 hours

  • 50% - swift was unavailable for a total of 12 hours in the last 24-hour period
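The two examples above are consistent with a simple ratio of available time to the 24-hour window. The sketch below assumes that formula; how the console itself derives the value is not documented here, and the function name is hypothetical.

```python
def availability_percent(outage_hours, window_hours=24):
    """Percentage of the window during which the service was available."""
    if not 0 <= outage_hours <= window_hours:
        raise ValueError("outage_hours must be between 0 and window_hours")
    return 100.0 * (window_hours - outage_hours) / window_hours


print(availability_percent(0))   # 100.0
print(availability_percent(12))  # 50.0
```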

Graph of Performance Over Time

Create a visual representation of performance data to see when swift encountered longer-than-normal response times.

To create a graph:

  1. Choose the length of time you want to graph in Date Range. This sets the length of time for the x-axis which counts backwards until it reaches the present time. In the example below, 1 day is selected, and so the x axis shows performance starting from 24 hours ago (-24) until the present time.

  2. Look at the y-axis to understand the range of response times. The first number is the smallest value in the data collected from the backend, and the last number is the longest amount of time it took swift to respond to a request. In the example below, the shortest time for a response from swift was 16.1 milliseconds.

  3. Look for spikes which represent longer than normal response times. In the example below, swift experienced long response times 21 hours ago and again 1 hour ago.

  4. Look for the latency value at the present time. The line running across the x-axis at 16.1 milliseconds shows you what the response time is currently.

Image

16.1.4.2 Inventory Summary

Monitor details about all the swift resources deployed in your cloud.

To access the object storage inventory screen:

  1. To open Operations Console, in a browser enter either the URL or Virtual IP connected to Operations Console.

    For example:

    https://myardana.test:9095
    https://VIP:9095
  2. On the Home screen, click the menu represented by 3 horizontal lines (Three-Line Icon).

  3. In the menu, click Storage › Object Storage Summary.

  4. On the Summary page, click Inventory Summary.

Image

General swift metrics are available for the following attributes:

  • Time to replicate: The average time in seconds it takes all hosts to complete a replication cycle.

  • Oldest replication: The time in seconds that has elapsed since the object replication process completed its last replication cycle.

  • Async Pending: This is the number of failed requests to add an entry in the container server's database. There is one async queue per swift disk, and a cron job queries all swift servers to calculate the total. When an object is uploaded into swift, and it is successfully stored, a request is sent to the container server to add a new entry for the object in the database. If the container update fails, the request is stored in what swift calls an Async Pending Queue.

    Important

    On a public cloud deployment, this value can reach millions. If it continues to grow, it means that the container updates are not keeping up with the requests. It is also normal for this number to grow if a node hosting the swift container service is down.

  • Total number of alarms: This number includes all nodes that host swift services, including proxy, account, container, and object storage services.

  • Total nodes: This number includes all nodes that host swift services, including proxy, account, container, and object storage services. The number in the colored box represents the number of alarms in that state. The following colors are used to show the most severe alarm triggered on all nodes:

    Green

    Indicates all alarms are in a known and untriggered state. For example, if there are 5 nodes and they are all known with no alarms, you will see the number 5 in the green box, and a zero in all the other colored boxes.

    Yellow

    Indicates that some low or medium alarms have been triggered but no critical or high alarms. For example, if there are 5 nodes, and there are 3 nodes with untriggered alarms and 2 nodes with medium severity alarms, you will see the number 3 in the green box, the number 2 in the yellow box, and zeros in all the other colored boxes.

    Red

    Indicates at least one critical or high severity alarm has been triggered on a node. For example, if there are 5 nodes, and there are 3 nodes with untriggered alarms, 1 node with a low severity, and 1 node with a critical alarm, you will see the number 3 in the green box, the number 1 in the yellow box, the number 1 in the red box, and a zero in the gray box.

    Gray

    Indicates that all alarms on the nodes are unknown. For example, if there are 5 nodes with no data reported, you will see the number 5 in the gray box, and zeros in all the other colored boxes.

  • Cluster breakdown of nodes: In the example screen above, the cluster consists of 2 nodes named SWPAC and SWOBJ. Click a node name to bring up more detailed information about that node.
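The color rules described under Total nodes can be sketched as a small classifier that assigns each node the color of its most severe alarm. The state and severity strings ("OK", "ALARM", "UNDETERMINED", "low", "critical", and so on) and the function names are assumptions for illustration. The text does not say how a node with a mix of known and unknown alarms is shown; this sketch counts it as gray.

```python
from collections import Counter

HIGH_SEVERITIES = {"critical", "high"}


def node_color(alarms):
    """Color for one node, given a list of (state, severity) pairs."""
    triggered = [severity for state, severity in alarms if state == "ALARM"]
    if any(severity in HIGH_SEVERITIES for severity in triggered):
        return "red"
    if triggered:
        return "yellow"  # only low or medium alarms are triggered
    if alarms and all(state == "OK" for state, _ in alarms):
        return "green"
    return "gray"  # no data, or some alarm is in an unknown state


def summarize(nodes):
    """Count nodes per color, as in the colored boxes."""
    counts = Counter(node_color(alarms) for alarms in nodes.values())
    return {color: counts[color] for color in ("green", "yellow", "red", "gray")}


# Mirrors the Red example: 3 untriggered, 1 low, 1 critical.
nodes = {
    "node1": [("OK", "low")],
    "node2": [("OK", "medium")],
    "node3": [("OK", "high")],
    "node4": [("ALARM", "low")],
    "node5": [("ALARM", "critical"), ("OK", "low")],
}
print(summarize(nodes))
```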

16.1.4.3 Capacity Summary

Use this screen to view the size of the file system space on all nodes and disk drives assigned to swift. Also shown is the remaining space available and the total size of all file systems used by swift. Values are given in megabytes (MB).

To access the object storage capacity summary screen:

  1. To open Operations Console, in a browser enter either the URL or Virtual IP connected to Operations Console.

    For example:

    https://myardana.test:9095
    https://VIP:9095
  2. On the Home screen, click the menu represented by 3 horizontal lines (Three-Line Icon).

  3. In the menu, click Storage › Object Storage Summary.

  4. On the Summary page, click Capacity Summary.

Image

16.1.4.4 Alarm Summary

Use this page to quickly see the most recent alarms and triage all alarms related to object storage.

To access the object storage alarm summary screen:

  1. To open Operations Console, in a browser enter either the URL or Virtual IP connected to Operations Console.

    For example:

    https://myardana.test:9095
    https://VIP:9095
  2. On the Home screen, click the menu represented by 3 horizontal lines (Three-Line Icon).

  3. In the menu, click Storage › Object Storage Summary.

  4. On the Summary page, click Alarm Summary.

Image

Each row has a checkbox to allow you to select multiple alarms and set the same condition on them.

The State column displays a graphical indicator representing the state of each alarm:

  • Green indicator: OK. Good operating state.

  • Yellow indicator: Warning. Low severity, not requiring immediate action.

  • Red indicator: Alarm. Varying severity levels and must be addressed.

  • Gray indicator: Undetermined.

The Alarm column identifies the alarm by the name it was given when it was originally created.

The Last Check column displays the date and time of the most recent occurrence of the alarm.

The Dimension column describes the components to check in order to clear the alarm.

The last column, depicted by three dots, reveals an Actions menu that allows you to choose:

  • View Details, which opens a separate window that shows all the information from the table view and the alarm history.

    Comments can be updated by clicking Update Comment. Click View Alarm Definition to go to the Alarm Definition tab showing that specific alarm definition.

16.1.5 Visualizing Data in Charts

Operations Console allows you to create a new chart and select the time range and the metric you want to chart, based on monasca metrics.

Present data in a pictorial or graphical format to enable administrators and decision makers to grasp difficult concepts or identify new patterns.

Create new time-series graphs from My Dashboard.

My Dashboard also allows you to customize the view in the following ways:

  • Include alarm cards from the Central Dashboard

  • Customize graphs in new ways

  • Reorder items using drag and drop

Plan for future storage

  • Track capacity over time to predict with some degree of reliability the amount of additional storage needed.

Charts and graphs provide a quick way to visualize large amounts of complex data. They are especially useful when you are trying to find relationships and understand your data, which could include thousands or even millions of variables. You can create a new chart in Operations Console from My Dashboard.

The charts in Operations Console are based on monasca data. When you create a new chart you will be able to select the time range and the metric you want to chart. The list of Metrics you can choose from is equivalent to using the monasca metric-name-list on the command line. After you select a metric, you can then specify a dimension, which is derived from the monasca metric-list --name <metric_name> command line results. The dimension list changes based on the selected metric.
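The relationship between the metric list and the dimension list can be pictured with a small sketch. The data structure below (a list of dictionaries with name and dimensions keys) and the hostnames are hypothetical stand-ins for what the monasca commands return; this is not the console's own code.

```python
def metric_names(metrics):
    """Unique metric names, as offered in the Metric drop-down."""
    return sorted({metric["name"] for metric in metrics})


def dimensions_for(metrics, name):
    """Dimension keys and the values seen for one selected metric."""
    result = {}
    for metric in metrics:
        if metric["name"] == name:
            for key, value in metric["dimensions"].items():
                result.setdefault(key, set()).add(value)
    return result


# Hypothetical sample data for illustration only.
metrics = [
    {"name": "cinderlm.cinder.backend.total.size", "dimensions": {"hostname": "ctrl-01"}},
    {"name": "cinderlm.cinder.backend.total.size", "dimensions": {"hostname": "ctrl-02"}},
    {"name": "cinderlm.cinder.backend.total.avail", "dimensions": {"hostname": "ctrl-01"}},
]
print(metric_names(metrics))
print(dimensions_for(metrics, "cinderlm.cinder.backend.total.size"))
```

Selecting a different metric name changes the result of dimensions_for, which is why the dimension list in the console depends on the selected metric.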

This topic provides instructions on how to create a basic chart, and how to create a chart specifically to visualize your cinder capacity.

16.1.5.1 Create a Chart

Create a chart to visually display data for up to 6 metrics over a period of time.

To create a chart:

  1. To open Operations Console, in a browser, enter either the URL or Virtual IP connected to Operations Console.

    For example:

    https://myardana.test:9095
    https://VIP:9095
  2. On the Home screen, click the menu represented by 3 horizontal lines (Three-Line Icon).

  3. From the menu that slides in on the left, select Home, and then select My Dashboard.

  4. On the My Dashboard screen, select Create New Chart.

  5. On the Add New Time Series Chart screen, in Chart Definition complete any of the optional fields:

    Name

    Short description of chart.

    Time Range

    Specifies the interval between metric collection. The default is 1 hour. Can be set to hours (1,2,4,8,24) or days (7,30,45).

    Chart Update Rate

    Collects metric data and adds it to the chart at the specified interval. The default is 1 minute. Can be set to minutes (1,5,10,30) or 1 hour.

    Chart Type

    Determines how the data is displayed. The default type is Line. Can be set to the following values:

    • Line

    • Bar

    • Stacked Bar

    • Area

    • Stacked Area

    Chart Size

    This controls the visual display of the chart width as it appears on My Dashboard. The default is Small. This field can be set to Small to display it at 50% or Large for 100%.

  6. On the Add New Time Series Chart screen, in Added Chart Data complete the following fields:

    Metric

    In monasca, a metric is a multi-dimensional description that consists of the following fields: name, dimensions, timestamp, value and value_meta. The pre-populated list is equivalent to using the monasca metric-name-list on the command line.

    Dimension

    The set of unique dimensions that are defined for a specific metric. Dimensions are a dictionary of key-value pairs. This pre-populated list is equivalent to using the monasca metric-list --name <metric_name> on the command line.

    Function

    Operations Console uses monasca to provide the results of all mathematical functions. monasca in turn uses Graphite to perform the mathematical calculations and return the results. The Function field can be set to AVG (default), MIN, MAX, or COUNT. For more information on these functions, see the Graphite documentation at http://www.aosabook.org/en/graphite.html.

  7. Click Add Data To Chart. To add another metric to the chart, repeat the previous step until all metrics are added. The maximum you can have in one chart is 6 metrics.

  8. To create the chart, click Create New Chart.

After you click Create New Chart, you will be returned to My Dashboard where the new chart will be shown. From the My Dashboard screen you can use the menu in the top-right corner of the card to delete or edit the chart. You can also select an option to create a comma-delimited file of the data in the chart.

16.1.5.2 Chart cinder Capacity

To visualize the use of storage capacity over time, you can create a chart that graphs the total block storage backend capacity. To find out how much of that total is being used, you can also create a chart that graphs the available block storage capacity.
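The used capacity is the difference between the two charted values. The sketch below shows that arithmetic; the function name is hypothetical, and both inputs are assumed to be reported in the same unit.

```python
def used_capacity(total, available):
    """Return (used amount, used percentage) from total and available capacity."""
    if total <= 0:
        raise ValueError("total capacity must be positive")
    used = total - available
    return used, 100.0 * used / total


# Example with made-up values in the same unit.
print(used_capacity(1000, 250))  # (750, 75.0)
```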

Visualizing cinder:

Important

The total and free capacity values are based on the available capacity reported by the cinder backend. Be aware that some backends can be configured to thinly provision.

16.1.5.3 Chart Total Capacity

To chart the total block-storage backend capacity:

  1. Log in to Operations Console.

  2. Follow the steps in the previous instructions to start creating a chart.

  3. To chart the total backend capacity, on the Add New Time Series Chart screen, in Chart Definition use the following settings:

    Field      Setting
    Metrics    cinderlm.cinder.backend.total.size
    Dimension  Any hostname. If multiple backends are available, select any one. The backends will all return the same metric data.

  4. Add the data to the chart and click Create.

Example of a cinder Total Capacity Chart:

16.1.5.4 Chart Available Capacity

To chart the available block-storage backend capacity:

  1. Log in to Operations Console.

  2. Follow the steps in the previous instructions to start creating a chart.

  3. To chart the available backend capacity, on the Add New Time Series Chart screen, in Chart Definition use the following settings:

    Field      Setting
    Metrics    cinderlm.cinder.backend.total.avail
    Dimension  Any hostname. If multiple backends are available, select any one. The backends will all return the same metric data.

  4. Add the data to the chart and click Create.

Example of a chart showing cinder Available Capacity:

Important

The source data for the Capacity Summary pages is only refreshed at the top of each hour. This affects the latency of the displayed data on those pages.

16.1.6 Getting Help with the Operations Console

Each Operations Console page has a help menu that takes you to a help page specific to the page you are currently viewing.

To reach the help page:

  1. Click the help menu option in the upper-right corner of the page, depicted by the question mark seen in the screenshot below.

  2. Click the Get Help For This Page link which will open the help page in a new tab in your browser.

Image

16.2 Alarm Definition

The Alarm Definition section under Monitoring allows you to define alarms that are useful in generating notifications and metrics required by your organization. By default, alarm definitions are sorted by name and in a table format.

16.2.1 Filter and Sort

The search feature allows you to search and filter alarm entries by name and description.

The check box above the top left of the table is used to select all alarm definitions on the current page.

To sort the table, click the desired column header. To reverse the sort order, click the column again.

16.2.2 Create Alarm Definitions

The Create Alarm Definition button next to the search bar allows you to create a new alarm definition.

To create a new alarm definition:

  1. Click Create Alarm Definition to open the Create Alarm Definition dialog.

  2. In the Create Alarm Definition window, type a name for the alarm in the Name text field. The name is mandatory and can be up to 255 characters long. The name can include letters, numbers, and special characters.

  3. Provide a short description of the alarm in the Description text field (optional).

  4. Select the desired severity level of the alarm from the Severity drop-down box. The severity level is subjective, so choose the level appropriate for prioritizing the handling of alarms when they occur.

  5. Optionally, to specify how to receive notifications, select the method(s) of notification (Email, Web, API, etc.) from the list of options in the Alarm Notifications area. If none are available to choose from, you must first configure them in the Notifications Methods window. Refer to the Notification Methods help page for further instructions.

  6. To enable notifications for the alarm, enable the check box next to the desired alarm notification method.

  7. Apply the following rules to your alarm by using the Alarm Expression form:

    • Function: determines the output value from a supplied input value.

    • Metric: specifies the pre-defined measurement that the alarm evaluates.

    • Dimension(s): identifies which aspect (Hostname, Region, and Service) of the alarm you want to monitor.

    • Comparator: specifies the operator for how you want the alarm to trigger.

    • Threshold: determines the numeric threshold associated with the operator you specified.

  8. Match By (optional): group results by a specific dimension that is not part of the Dimension(s) selection.

  9. To save the changes and add the new alarm definition to the table, click Create Alarm Definition.
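The fields of the Alarm Expression form can be read as the parts of a single expression string. The following is a minimal Python sketch of that idea. It assumes the monasca-style layout `function(metric{dimension=value}) comparator threshold`. The helper `build_alarm_expression`, the list of comparators, and the example metric and host names are hypothetical and are not part of the Operations Console.

```python
from typing import Dict

# Comparators assumed for this sketch; the console's drop-down may differ.
COMPARATORS = {"<", "<=", ">", ">="}


def build_alarm_expression(function: str, metric: str,
                           dimensions: Dict[str, str],
                           comparator: str, threshold: float) -> str:
    """Combine the Alarm Expression form fields into one expression string."""
    if comparator not in COMPARATORS:
        raise ValueError(f"unsupported comparator: {comparator!r}")
    # Sort dimensions so the same input always yields the same string.
    dims = ",".join(f"{key}={value}" for key, value in sorted(dimensions.items()))
    selector = f"{metric}{{{dims}}}" if dims else metric
    return f"{function}({selector}) {comparator} {threshold:g}"


# Hypothetical example: alarm when average idle CPU on one host drops below 10.
expression = build_alarm_expression(
    "avg", "cpu.idle_perc", {"hostname": "compute-01"}, "<", 10)
print(expression)
```

Read this way, the Function wraps the Metric, the Dimension(s) narrow the metric to specific hosts or services, and the Comparator and Threshold decide when the alarm triggers.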

16.3 Alarm Explorer

This page displays the alarms for all services and appliances. By default, alarms are sorted by their state.

16.3.1 Filter and Sort

Using the Filter Alarms button, you can filter the alarms by their IDs and dimensions. The Filter Alarms dialog lets you configure a filtering rule using the Alarm ID field and options in the Dimension(s) section.

You can display the alarms in grid, list, or table view by selecting the corresponding icon next to the Sort By control.

To sort the alarm list, click the desired column header. To reverse the sort order, click the column again.

16.3.2 Alarm Table

Each row has a check box that lets you select multiple alarms and set the same condition on all of them.

The Status column displays a graphical indicator that shows the state of each alarm:

  • Green indicator: OK. Good operating state.

  • Yellow indicator: Warning. Low severity, not requiring immediate action.

  • Red indicator: Alarm. Severity varies, but the alarm must be addressed.

  • Gray indicator: Unknown.

The Alarm column identifies the alarm by the name it was given when it was originally created.

The Last Check column displays the date and time of the most recent occurrence of the alarm.

The Dimension column describes the components to check in order to clear the alarm.

16.3.3 Notification Methods

The Notification Methods section of the Alarm Explorer allows you to define notification methods that are used by the alarms. By default, notification methods are sorted by name.

16.3.3.1 Filter and Sort

The filter bar allows you to filter the notification methods by specifying filter criteria. You can sort the available notification methods by clicking the desired column header in the table.

16.3.3.2 Create Notification Methods

The Create Notification Method button beside the search bar allows you to create a new notification method.

To create a new notification method:

  1. Click the Create Notification Method button.

  2. In the Create Notification Method window, specify a name for the notification in the Name text field. The name is required, and it can be up to 255 characters in length, consisting of letters, numbers, or special characters.

  3. In the Type drop-down list, select the desired option:

    • Web Hook allows you to enter an internet address, also referred to as a Web Hook.

    • Email allows you to enter an email address. For this method to work, you need to have an SMTP server specified.

    • PagerDuty allows you to enter a PagerDuty address.

  4. In the Address/Key text field, provide the required values.

  5. Click Create Notification Method. The new notification method appears in the table.
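The values collected by the Create Notification Method window can be pictured as a small record with a name, a type, and an address or key. The following is a minimal Python sketch of the checks described in the steps above (a required name of at most 255 characters, one of the three listed types, and an address). The `NotificationMethod` class, the `make_notification_method` helper, and the upper-case type identifiers are assumptions made for illustration and do not describe the console's internal data model.

```python
from dataclasses import dataclass

MAX_NAME_LENGTH = 255  # limit stated in step 2

# Type labels from step 3, mapped to identifiers assumed for this sketch.
TYPES = {"Web Hook": "WEBHOOK", "Email": "EMAIL", "PagerDuty": "PAGERDUTY"}


@dataclass(frozen=True)
class NotificationMethod:
    name: str
    type: str
    address: str


def make_notification_method(name: str, type_label: str,
                             address: str) -> NotificationMethod:
    """Validate the form values and return them as one record."""
    if not name or len(name) > MAX_NAME_LENGTH:
        raise ValueError("name is required and limited to 255 characters")
    if type_label not in TYPES:
        raise ValueError(f"unknown notification type: {type_label!r}")
    if not address:
        raise ValueError("an address or key is required")
    return NotificationMethod(name, TYPES[type_label], address)


# Hypothetical example values.
method = make_notification_method("ops-team", "Email", "ops@example.com")
print(method)
```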

16.4 Compute Hosts

The Compute Hosts page in the Compute section allows you to view your Compute Host resources.

16.4.1 Filter and Sort

The filter bar at the top of the page lets you filter alarm entries using the available filtering options.

Figure 16.1: Compute Hosts

Click the Filter icon to select one of the available options:

  • Any Column enables plain search across all columns.

  • Status filters alarm entries by status.

  • Type enables filtering by host type, including Hyper-V, KVM, ESXi, and VMware vCenter server.

  • State filters alarm entries by nova state (for example, Activated, Activating, Imported, etc.).

  • Alarm State filters entries by status of the alarms that are triggered on the host.

  • Cluster returns a filtered list of configured clusters that Compute Hosts belong to.

The alarm entries can be sorted by clicking on the appropriate column header, such as Name, Status, Type, State, etc.

To view detailed information (including alarm counts and utilization metrics) about a specific host in the list, click the host's name in the list.
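The difference between the Any Column option and the column-specific filter options can be illustrated with a short Python sketch. It assumes that a filter is a case-insensitive substring match, which is a simplification and may not match the console's actual behavior. The `filter_hosts` helper and the sample host data are hypothetical.

```python
from typing import Dict, List

Host = Dict[str, str]


def filter_hosts(hosts: List[Host], text: str,
                 column: str = "Any Column") -> List[Host]:
    """Return hosts whose selected column (or any column) contains the text."""
    needle = text.lower()
    if column == "Any Column":
        # Plain search: a match in any column keeps the row.
        return [h for h in hosts
                if any(needle in value.lower() for value in h.values())]
    # Column filter: only the named column is checked.
    return [h for h in hosts if needle in h.get(column, "").lower()]


# Hypothetical sample rows using the column names from this section.
hosts = [
    {"Name": "compute-01", "Status": "OK", "Type": "KVM", "State": "Activated"},
    {"Name": "compute-02", "Status": "Warning", "Type": "ESXi", "State": "Activating"},
    {"Name": "compute-03", "Status": "OK", "Type": "KVM", "State": "Imported"},
]

kvm_hosts = filter_hosts(hosts, "kvm", column="Type")
print([h["Name"] for h in kvm_hosts])
```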

16.5 Compute Instances

This Operations Console page allows you to monitor your Compute instances.

16.5.1 Search and Sort

The search bar allows you to filter the alarm definitions you want to view. Type and Status are examples of alarm criteria that can be specified. You can also filter by typing text, as in a keyword search.

The checkbox allows you to select (or deselect) a group of alarm definitions to delete:

  • Select Visible allows you to delete the selected alarm definitions from the table.

  • Select All allows you to delete all the alarms from the table.

  • Clear Selection allows you to clear all current selections in the table.

You can display the alarm definitions in grid, list, or table view by selecting the corresponding icon next to the Sort By control.

The Sort By control contains a drop-down list of ways by which you can sort the compute instances. Alternatively, you can sort using the column headers in the table.

  • Sort by Name displays the compute instances by the names assigned to them when they were created.

  • Sort by State displays the compute instances by their current state.

  • Sort by Status displays the compute instances by their current status.

  • Sort by Host displays the compute instances by their host.

  • Sort by Image displays the compute instances by the image being used.

  • Sort by IP Address displays the compute instances by their IP address.
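The Sort By options above amount to ordering the same list of instances by a different field. The following Python sketch illustrates this under stated assumptions: the `sort_instances` helper and the sample data are hypothetical, and comparing IP addresses numerically rather than as text is a design choice of the sketch, not a documented behavior of the console.

```python
import ipaddress
from typing import Dict, List

Instance = Dict[str, str]


def sort_instances(instances: List[Instance], field: str,
                   descending: bool = False) -> List[Instance]:
    """Return a new list of instances ordered by the given field."""
    def sort_key(instance: Instance):
        value = instance[field]
        if field == "IP Address":
            # Compare addresses numerically so 10.0.0.9 sorts before 10.0.0.10.
            return int(ipaddress.ip_address(value))
        return value.lower()

    return sorted(instances, key=sort_key, reverse=descending)


# Hypothetical sample instances.
instances = [
    {"Name": "web-2", "State": "active", "IP Address": "10.0.0.10"},
    {"Name": "db-1", "State": "stopped", "IP Address": "10.0.0.9"},
    {"Name": "web-1", "State": "active", "IP Address": "10.0.0.2"},
]

by_ip = [i["Name"] for i in sort_instances(instances, "IP Address")]
by_name_desc = [i["Name"] for i in sort_instances(instances, "Name", descending=True)]
print(by_ip, by_name_desc)
```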

16.6 Compute Summary

The Compute Summary page in the Compute section gives you access to inventory, capacity, and alarm summaries.

16.6.1 Inventory Summary

The Inventory Summary section provides an overview of compute alarms by status. These alarms are grouped by control plane. There is also information on resource usage for each compute host. Here you can also see alarms triggered on individual compute hosts.

Figure 16.2: Compute Summary

16.6.2 Capacity Summary

Capacity Summary offers an overview of the utilization of physical resources and allocation of virtual resources among compute nodes. Here you will also find a breakdown of CPU, memory, and storage usage across all compute resources in the cloud.

16.6.3 Compute Summary

The Compute Summary shows an overview of new alarms as well as a list of all alarms that can be filtered and sorted. For more information on filtering alarms, see Section 16.3, “Alarm Explorer”.

16.6.4 Appliances

This page displays details of an appliance.

Search and Sort

  • The search bar allows you to filter the appliances you want to view. Role and Status are examples of criteria that can be specified. You can also filter by selecting Any Column and typing text, as in a keyword search.

  • You can sort using the column headers in the table.

Actions

Click the Action icon (three dots) to view details of an appliance.

16.6.5 Block Storage Summary

This page displays the alarms that have triggered within the indicated timeframe.

Search and Sort

  • The search bar allows you to filter the alarms you want to view. State and Service are examples of criteria that can be specified. You can also filter by typing text, as in a keyword search.

  • You can sort alarm entries using the column headers in the table.

New Alarms: Block Storage

The New Alarms section shows you the alarms that have triggered within the selected timeframe. You can select the timeframe using the Configure control, with options ranging from Last Minute to Last 30 Days. This section refreshes every 60 seconds.

The new alarms will be separated into the following categories:

  • Critical - Open alarms, identified by red indicator.

  • Warning - Open alarms, identified by yellow indicator.

  • Unknown - Open alarms, identified by gray indicator. Unknown will be the status of an alarm that has stopped receiving a metric. This can be caused by the following conditions:

    • An alarm exists for a service or component that is not installed in the environment.

    • An alarm exists for a virtual machine or node that previously existed but has been removed without the corresponding alarms being removed.

    • There is a gap between the last reported metric and the next metric.

  • Open - Complete list of open alarms.

  • Total - Complete list of alarms, which may include Acknowledged and Resolved alarms.

16.7 Logging

This page displays the link to the Logging Interface, known as Kibana.

Important: Accessing Kibana

The Kibana logging interface only runs on the management network. You need to have access to that network to be able to use Kibana.

16.7.1 View Logging Interface

To access the logging interface, click the Launch Logging Interface button, which opens the interface in a new window.

For more details about the logging interface, see Section 13.2, “Centralized Logging Service”.

16.8 My Dashboard

My Dashboard allows you to customize the dashboard by mixing and matching graphs and alarm cards. Since different operators may be interested in different metrics and alarms, the configuration for this page is tied to the login account used to access Operations Console. Charts available here are based on metrics collected by the monasca monitoring component.

16.9 Networking Alarm Summary

This page displays the alarms for the Networking (neutron), DNS, Firewall, and Load Balancing services. By default, alarms are sorted by State.

16.9.1 Filter and Sort

The filter bar allows you to filter the alarms by the available criteria, including Dimension, State, and Service. The dimension filter accepts key/value pairs, while the State filter provides a selection of valid values.

You can sort alarm entries using the column headers in the table.
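To illustrate how a Dimension filter made of key/value pairs narrows the alarm list, consider the following Python sketch. It assumes the pairs are written as `key=value` and separated by commas, and that an alarm matches only if it carries every requested pair. Both the input format and the helpers `parse_dimension_filter` and `filter_alarms` are hypothetical, and the sample alarms are invented for the example.

```python
from typing import Dict, List


def parse_dimension_filter(text: str) -> Dict[str, str]:
    """Parse 'key=value,key=value' into a dictionary (assumed input format)."""
    pairs: Dict[str, str] = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        key, separator, value = item.partition("=")
        if not separator or not key.strip():
            raise ValueError(f"expected key=value, got {item!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def filter_alarms(alarms: List[dict], dimension_filter: str) -> List[dict]:
    """Keep alarms whose dimensions contain every requested key/value pair."""
    wanted = parse_dimension_filter(dimension_filter)
    return [
        alarm for alarm in alarms
        if all(alarm["dimensions"].get(k) == v for k, v in wanted.items())
    ]


# Hypothetical sample alarms.
alarms = [
    {"name": "Port down", "dimensions": {"hostname": "net-01", "service": "networking"}},
    {"name": "DNS latency", "dimensions": {"hostname": "net-02", "service": "dns"}},
]

selected = filter_alarms(alarms, "service=networking, hostname=net-01")
print([alarm["name"] for alarm in selected])
```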

16.9.2 Alarm Table

You can select one or multiple alarms using the check box next to each entry.

The State column displays a graphical indicator that shows the state of each alarm:

  • Green indicator: OK. Good operating state.

  • Yellow indicator: Warning. Low severity, not requiring immediate action.

  • Red indicator: Alarm. Severity varies, but the alarm must be addressed.

  • Gray square (or gray indicator): Undetermined.

The Alarm column identifies the alarm by its name.

The Last Check column displays the date and time of the most recent occurrence of the alarm.

The Dimension column shows the components to check in order to clear the alarm.

The last column, depicted by three dots, opens an Actions menu that gives you access to the following options:

  • View Details opens a separate window with the information from the table view and the alarm history.

  • View Alarm Definition allows you to view and edit the selected alarm definition.

  • Delete is used to delete the currently selected alarm entry.

16.10 Central Dashboard

This page displays a high level overview of all cloud resources and their alarm status.

16.10.1 Central Dashboard

[Figure: Central Dashboard]

16.10.2 New Alarms

The New Alarms section shows you the alarms that have triggered within the selected timeframe. You can select the timeframe using the View control, with options ranging from Last Minute to Last 30 Days. This section refreshes every 60 seconds.

The new alarms will be separated into the following categories:

  • Critical - Open alarms, identified by red indicator.

  • Warning - Open alarms, identified by yellow indicator.

  • Unknown - Open alarms, identified by gray indicator. Unknown will be the status of an alarm that has stopped receiving a metric. This can be caused by the following conditions:

    • An alarm exists for a service or component that is not installed in the environment.

    • An alarm exists for a virtual machine or node that previously existed but has been removed without the corresponding alarms being removed.

    • There is a gap between the last reported metric and the next metric.

  • Open - Complete list of open alarms.

  • Total - Complete list of alarms, which may include Acknowledged and Resolved alarms.
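The way the categories above relate to each other can be sketched in a few lines of Python. The sketch assumes that each alarm record carries an indicator color, a lifecycle value (Open, Acknowledged, or Resolved), and the time it triggered. These field names, the `summarize_new_alarms` helper, and the sample data are hypothetical and only illustrate the counting logic, not the console's implementation.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Indicator colors from the list above, mapped to their categories.
CATEGORY_BY_INDICATOR = {"red": "Critical", "yellow": "Warning", "gray": "Unknown"}


def summarize_new_alarms(alarms: List[dict], now: datetime,
                         window: timedelta) -> Dict[str, int]:
    """Count alarms per category for the selected timeframe."""
    counts = {"Critical": 0, "Warning": 0, "Unknown": 0, "Open": 0, "Total": 0}
    for alarm in alarms:
        if now - alarm["triggered_at"] > window:
            continue  # triggered before the selected timeframe
        counts["Total"] += 1  # Total also covers Acknowledged and Resolved
        if alarm["lifecycle"] == "Open":
            counts["Open"] += 1
            category = CATEGORY_BY_INDICATOR.get(alarm["indicator"])
            if category:
                counts[category] += 1
    return counts


# Hypothetical sample data, evaluated for a one-hour timeframe.
now = datetime(2024, 1, 1, 12, 0)
alarms = [
    {"indicator": "red", "lifecycle": "Open", "triggered_at": datetime(2024, 1, 1, 11, 50)},
    {"indicator": "yellow", "lifecycle": "Open", "triggered_at": datetime(2024, 1, 1, 11, 30)},
    {"indicator": "gray", "lifecycle": "Open", "triggered_at": datetime(2024, 1, 1, 9, 0)},
    {"indicator": "red", "lifecycle": "Resolved", "triggered_at": datetime(2024, 1, 1, 11, 55)},
]

summary = summarize_new_alarms(alarms, now, timedelta(hours=1))
print(summary)
```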

16.10.3 Alarm Summary

Each service or group of services has a dedicated card displaying related alarms.

  • Critical - Open alarms, identified by red indicator.

  • Warning - Open alarms, identified by yellow indicator.

  • Unknown - Open alarms, identified by gray indicator. Unknown will be the status of an alarm that has stopped receiving a metric. This can be caused by the following conditions:

    • An alarm exists for a service or component that is not installed in the environment.

    • An alarm exists for a virtual machine or node that previously existed but has been removed without the corresponding alarms being removed.

    • There is a gap between the last reported metric and the next metric.

  • Open - Complete list of open alarms.

  • Total - Complete list of alarms, which may include Acknowledged and Resolved alarms.