Applies to SUSE Linux Enterprise High Availability Extension 11 SP4

5 Configuring and Managing Cluster Resources (Web Interface)

In addition to the crm command line tool and the Pacemaker GUI, the High Availability Extension also comes with the HA Web Konsole (Hawk), a Web-based user interface for management tasks. It allows you to monitor and administer your Linux cluster from non-Linux machines as well. Furthermore, it is the ideal solution in case your system only provides a minimal graphical user interface.

This chapter introduces Hawk and covers basic tasks for configuring and managing cluster resources: modifying global cluster options, creating basic and advanced types of resources (groups and clones), configuring constraints, specifying failover nodes and failback nodes, configuring resource monitoring, starting, cleaning up or removing resources, and migrating resources manually. For detailed analysis of the cluster status, Hawk generates a cluster report (hb_report). You can view the cluster history or explore potential failure scenarios with the simulator.

5.1 Hawk—Overview

Hawk's Web interface lets you monitor and administer your Linux cluster from non-Linux machines, and it is well suited to systems that provide only a minimal graphical user interface.

5.1.1 Starting Hawk and Logging In

The Web interface is included in the hawk package. It must be installed on all cluster nodes you want to connect to with Hawk. On the machine from which you want to access a cluster node using Hawk, you only need a (graphical) Web browser with JavaScript and cookies enabled to establish the connection.

To use Hawk, the respective Web service must be started on the node that you want to connect to via the Web interface.

If you have set up your cluster with the scripts from the sleha-bootstrap package, the Hawk service is already started. In that case, skip Procedure 5.1, “Starting Hawk Services” and proceed with Procedure 5.2, “Logging In to the Hawk Web Interface”.

Procedure 5.1: Starting Hawk Services
  1. On the node you want to connect to, open a shell and log in as root.

  2. Check the status of the service by entering

    root # rchawk status
  3. If the service is not running, start it with

    root # rchawk start

    If you want Hawk to start automatically at boot time, execute the following command:

    root # chkconfig hawk on
Note: User Authentication

Hawk users must be members of the haclient group. The installation creates a Linux user named hacluster, who is added to the haclient group. When using the ha-cluster-init script for setup, a default password is set for the hacluster user.

Before starting Hawk, set or change the password for the hacluster user. Alternatively, create a new user which is a member of the haclient group.

Do this on every node you will connect to with Hawk.

Procedure 5.2: Logging In to the Hawk Web Interface

The Hawk Web interface uses the HTTPS protocol and port 7630.

Note: Accessing Hawk Via a Virtual IP

To access Hawk even in case the cluster node you usually connect to is down, a virtual IP address (IPaddr or IPaddr2) can be configured for Hawk as a cluster resource. It does not need any special configuration.
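For reference, such a virtual IP resource can also be defined from the crm shell. The resource ID and the IP address below are placeholders; adapt them to your network:

```
# Virtual IP under which Hawk stays reachable (example address)
crm configure primitive hawk-ip ocf:heartbeat:IPaddr2 \
    params ip="192.168.1.100" cidr_netmask="24" \
    op monitor interval="10s"
```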

  1. On any machine, start a Web browser and make sure that JavaScript and cookies are enabled.

  2. As the URL, enter the IP address or host name of any cluster node running the Hawk Web service. Alternatively, enter any virtual IP address that the cluster operator may have configured as a resource:

    https://HOSTNAME_OR_IP_ADDRESS:7630/
    Note: Certificate Warning

    If a certificate warning appears when you try to access the URL for the first time, a self-signed certificate is in use. Self-signed certificates are not considered trustworthy by default.

    Ask your cluster operator for the certificate details to verify the certificate.

    To proceed anyway, you can add an exception in the browser to bypass the warning.

    For information on how to replace the self-signed certificate with a certificate signed by an official Certificate Authority, refer to Replacing the Self-Signed Certificate.

  3. On the Hawk login screen, enter the Username and Password of the hacluster user (or of any other user that is a member of the haclient group).

  4. Click Log In.

    The Cluster Status screen appears, displaying the status of your cluster nodes and resources. The information shown is similar to the output of crm status in the crm shell.

5.1.2 Main Screen: Cluster Status

After logging in, Hawk displays the Cluster Status screen. It shows a summary with the most important global cluster parameters and the status of your cluster nodes and resources. The following color code is used for status display of nodes and resources:

Hawk Color Code
  • Green: OK. For example, the resource is running or the node is online.

  • Red: Bad, unclean. For example, the resource has failed or the node was not shut down cleanly.

  • Yellow: In transition. For example, the node is currently being shut down or a resource is currently being started or stopped. If you click a pending resource to view its details, Hawk also displays the state to which the resource is currently changing (Starting, Stopping, Moving, Promoting, or Demoting).

  • Gray: Not running, but the cluster expects it to be running. For example, nodes that the administrator has stopped or put into standby mode. Also nodes that are offline are displayed in gray (if they have been shut down cleanly).

In addition to the color code, Hawk also displays icons for the state of nodes, resources, tickets and for error messages in all views of the Cluster Status screen.

If a resource has failed, an error message with the details is shown in red at the top of the screen. To analyze the causes for the failure, click the error message. This automatically takes you to Hawk's History Explorer and triggers the collection of data for a time span of 20 minutes (10 minutes before and 10 minutes after the failure occurred). For more details, refer to Procedure 5.27, “Viewing Transitions with the History Explorer”.

Figure 5.1: Hawk—Cluster Status (Summary View)

The Cluster Status screen refreshes itself in near real-time. Choose between the following views, which you can access with the three icons in the upper right corner:

Hawk Cluster Status Views
Summary View

Shows the most important global cluster parameters and the status of your cluster nodes and resources at the same time. If your setup includes Geo clusters (multi-site clusters), the summary view also shows tickets. To view details about all elements belonging to a certain category (tickets, nodes, or resources), click the category title, which is marked as a link. Otherwise click the individual elements for details.

Tree View

Presents an expandable view of the most important global cluster parameters and the status of your cluster nodes and resources. If your setup includes Geo clusters (multi-site clusters), the tree view also shows tickets. Click the arrows to expand or collapse the elements belonging to the respective category. In contrast to the Summary View this view not only shows the IDs and status of resources but also the type (for example, primitive, clone, or group).

Table View

This view is especially useful for larger clusters, because it shows in a concise way which resources are currently running on which node. Inactive nodes or resources are also displayed.

The top-level row of the main screen shows the user name with which you are logged in. It also allows you to Log Out of the Web interface and to access additional Tools from the wrench icon next to the user name.

To perform basic operator tasks on nodes and resources (like starting or stopping resources, bringing nodes online, or viewing details), click the wrench icon next to the node or resource to open a context menu. For any child resource of a clone, group, or multi-state resource on any of the status screens, the context menu includes a Parent item that lets you start, stop, or otherwise operate on the top-level clone or group to which that primitive belongs.

For more complex tasks like configuring resources, constraints, or global cluster options, use the navigation bar on the left hand side. From there, you can access screens such as Cluster Status, Resources, Constraints, Cluster Properties, and the Setup Wizard.

Note: Available Functions in Hawk

By default, users logged in as root or hacluster have full read-write access to all cluster configuration tasks. However, Access Control Lists can be used to define fine-grained access permissions.

If ACLs are enabled in the CRM, the available functions in Hawk depend on the user role and access permissions assigned to you. In addition, the following functions in Hawk can only be executed by the user hacluster:

  • Generating an hb_report.

  • Using the History Explorer.

  • Viewing recent events for nodes or resources.

5.2 Configuring Global Cluster Options

Global cluster options control how the cluster behaves when confronted with certain situations. They are grouped into sets and can be viewed and modified with cluster management tools like Pacemaker GUI, Hawk, and crm shell. The predefined values can be kept in most cases. However, to make key functions of your cluster work correctly, you need to adjust the following parameters after basic cluster setup:

Procedure 5.3: Modifying Global Cluster Options
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Cluster Properties to view the global cluster options and their current values. Hawk displays the most important parameters with regard to CRM Configuration, Resource Defaults, and Operation Defaults.

    Figure 5.3: Hawk—Cluster Configuration
  3. Depending on your cluster requirements, adjust the CRM Configuration:

    1. Set no-quorum-policy to the appropriate value.

    2. If you need to disable fencing for any reason, deselect stonith-enabled.

      Important: No Support Without STONITH

      A cluster without STONITH is not supported.

    3. To remove a property from the CRM configuration, click the minus icon next to the property. If a property is deleted, the cluster will behave as if that property had the default value. For details of the default values, refer to Section 4.2.6, “Resource Options (Meta Attributes)”.

    4. To add a new property for the CRM configuration, choose one from the drop-down box and click the plus icon.

  4. If you need to change Resource Defaults or Operation Defaults, proceed as follows:

    1. To change the value of defaults that are already displayed, edit the value in the respective input field.

    2. To add a new resource default or operation default, choose one from the empty drop-down list, click the plus icon and enter a value. If there are default values defined, Hawk proposes them automatically.

    3. To remove a resource or operation default, click the minus icon next to the parameter. If no values are specified for Resource Defaults and Operation Defaults, the cluster uses the default values that are documented in Section 4.2.6, “Resource Options (Meta Attributes)” and Section 4.2.8, “Resource Operations”.

  5. Confirm your changes.
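The same global options can be adjusted from the crm shell. The following sketch is illustrative only; the values shown are examples, not recommendations, and must be chosen to match your cluster:

```
# Example values only — choose settings appropriate for your cluster
crm configure property no-quorum-policy="stop"
crm configure rsc_defaults resource-stickiness="100"
crm configure op_defaults timeout="60s"
```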

5.3 Configuring Cluster Resources

As a cluster administrator, you need to create cluster resources for every resource or application you run on servers in your cluster. Cluster resources can include Web sites, mail servers, databases, file systems, virtual machines, and any other server-based applications or services you want to make available to users at all times.

For an overview of the resource types you can create, refer to Section 4.2.3, “Types of Resources”. Apart from the basic specification of a resource (ID, class, provider, and type), you can add or modify the following parameters during or after creation of a resource:

  • Instance attributes (parameters) determine which instance of a service the resource controls. For more information, refer to Section 4.2.7, “Instance Attributes (Parameters)”.

    When creating a resource, Hawk automatically shows any required parameters. Edit them to get a valid resource configuration.

  • Meta attributes tell the CRM how to treat a specific resource. For more information, refer to Section 4.2.6, “Resource Options (Meta Attributes)”.

    When creating a resource, Hawk automatically lists the important meta attributes for that resource (for example, the target-role attribute that defines the initial state of a resource. By default, it is set to Stopped, so the resource will not start immediately).

  • Operations are needed for resource monitoring. For more information, refer to Section 4.2.8, “Resource Operations”.

    When creating a resource, Hawk displays the most important resource operations (monitor, start, and stop).

5.3.1 Configuring Resources with the Setup Wizard

The High Availability Extension comes with a predefined set of templates for some frequently used cluster scenarios, for example, setting up a highly available NFS server. Find the predefined templates in the hawk-templates package. You can also define your own wizard templates. For detailed information, refer to https://github.com/ClusterLabs/hawk/blob/sle-11-sp4/doc/wizard.txt.

Procedure 5.4: Using the Setup Wizard
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Setup Wizard. The Cluster Setup Wizard lists the available resource templates. If you click an entry, Hawk displays a short help text about the template.

  3. Select the resource set you want to configure and click Next.

  4. Follow the instructions on the screen. If you need information about an option, click it to display a short help text in Hawk.

Figure 5.4: Hawk—Setup Wizard

5.3.2 Creating Simple Cluster Resources

To create the most basic type of resource, proceed as follows:

Procedure 5.5: Adding Primitive Resources
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Resources. The Resources screen shows categories for all types of resources. It lists any resources that are already defined.

  3. Select the Primitive category and click the plus icon.

  4. Specify the resource:

    1. Enter a unique Resource ID.

    2. From the Class list, select the resource agent class you want to use for the resource: lsb, ocf, service, or stonith. For more information, see Section 4.2.2, “Supported Resource Agent Classes”.

    3. If you selected ocf as class, specify the Provider of your OCF resource agent. The OCF specification allows multiple vendors to supply the same resource agent.

    4. From the Type list, select the resource agent you want to use (for example, IPaddr or Filesystem). A short description for this resource agent is displayed.

      The selection you get in the Type list depends on the Class (and for OCF resources also on the Provider) you have chosen.

  5. Hawk automatically shows any required parameters for the resource plus an empty drop-down list that you can use to specify an additional parameter.

    To define Parameters (instance attributes) for the resource:

    1. Enter values for each required parameter. A short help text is displayed as soon as you click the text box next to a parameter.

    2. To completely remove a parameter, click the minus icon next to the parameter.

    3. To add another parameter, click the empty drop-down list, select a parameter and enter a value for it.

  6. Hawk automatically shows the most important resource Operations and proposes default values. If you do not modify any settings here, Hawk will add the proposed operations and their default values as soon as you confirm your changes.

    For details on how to modify, add or remove operations, refer to Procedure 5.15, “Adding or Modifying Monitor Operations”.

  7. Hawk automatically lists the most important meta attributes for the resource, for example target-role.

    To modify or add Meta Attributes:

    1. To set a (different) value for an attribute, select one from the drop-down box next to the attribute or edit the value in the input field.

    2. To completely remove a meta attribute, click the minus icon next to it.

    3. To add another meta attribute, click the empty drop-down box and select an attribute. The default value for the attribute is displayed. If needed, change it as described above.

  8. Click Create Resource to finish the configuration. A message at the top of the screen shows if the resource was successfully created or not.

Figure 5.5: Hawk—Primitive Resource
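For comparison, a primitive like the one created in this procedure can be defined from the crm shell. The resource ID, IP address, and parameter values below are placeholders:

```
# Basic primitive with parameters, operations, and a meta attribute
crm configure primitive my-ip ocf:heartbeat:IPaddr2 \
    params ip="192.168.1.101" cidr_netmask="24" \
    op monitor interval="10s" timeout="20s" \
    meta target-role="Stopped"
```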

5.3.3 Creating STONITH Resources

Important: No Support Without STONITH

A cluster without STONITH is not supported.

By default, the global cluster option stonith-enabled is set to true: If no STONITH resources have been defined, the cluster will refuse to start any resources. Configure one or more STONITH resources to complete the STONITH setup. While they are configured similar to other resources, the behavior of STONITH resources is different in some respects. For details refer to Section 9.3, “STONITH Resources and Configuration”.

Procedure 5.6: Adding a STONITH Resource
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Resources. The Resources screen shows categories for all types of resources and lists all defined resources.

  3. Select the Primitive category and click the plus icon.

  4. Specify the resource:

    1. Enter a unique Resource ID.

    2. From the Class list, select the resource agent class stonith.

    3. From the Type list, select the STONITH plug-in for controlling your STONITH device. A short description for this plug-in is displayed.

  5. Hawk automatically shows the required Parameters for the resource. Enter values for each parameter.

  6. Hawk displays the most important resource Operations and proposes default values. If you do not modify any settings here, Hawk will add the proposed operations and their default values as soon as you confirm.

  7. Adopt the default Meta Attributes settings if there is no reason to change them.

  8. Confirm your changes to create the STONITH resource.

To complete your fencing configuration, add constraints, use clones or both. For more details, refer to Chapter 9, Fencing and STONITH.
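As a sketch, a STONITH resource created by this procedure corresponds to a crm shell definition like the following. SBD is used here only as an example plug-in; the resource ID and device path are placeholders for your actual fencing device:

```
# Example STONITH resource using the SBD plug-in (device path is a placeholder)
crm configure primitive my-stonith stonith:external/sbd \
    params sbd_device="/dev/disk/by-id/my-sbd-device" \
    op monitor interval="3600s"
```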

5.3.4 Using Resource Templates

If you want to create lots of resources with similar configurations, defining a resource template is the easiest way. After being defined, it can be referenced in primitives or in certain types of constraints. For detailed information about function and use of resource templates, refer to Section 4.4.3, “Resource Templates and Constraints”.

Procedure 5.7: Creating Resource Templates
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Resources. The Resources screen shows categories for all types of resources plus a Template category.

  3. Select the Template category and click the plus icon.

  4. Enter a Template ID.

  5. Specify the resource template as you would specify a primitive. Follow Procedure 5.5: Adding Primitive Resources, starting with Step 4.b.

  6. Click Create Resource to finish the configuration. A message at the top of the screen shows if the resource template was successfully created.

Figure 5.6: Hawk—Resource Template
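In crm shell terms, defining a template and referencing it in a primitive looks like the following sketch; the IDs and the configfile path are placeholders:

```
# Template holding the shared configuration
crm configure rsc_template web-template ocf:heartbeat:apache \
    params configfile="/etc/apache2/httpd.conf" \
    op monitor interval="30s"
# Primitive that inherits everything from the template
crm configure primitive web1 @web-template
```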
Procedure 5.8: Referencing Resource Templates
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. To reference the newly created resource template in a primitive, follow these steps:

    1. In the left navigation bar, select Resources. The Resources screen shows categories for all types of resources. It lists all defined resources.

    2. Select the Primitive category and click the plus icon.

    3. Enter a unique Resource ID.

    4. Activate Use Template and, from the drop-down list, select the template to reference.

    5. If needed, specify further Parameters, Operations, or Meta Attributes as described in Procedure 5.5, “Adding Primitive Resources”.

  3. To reference the newly created resource template in colocational or order constraints, proceed as described in Procedure 5.10, “Adding or Modifying Colocational or Order Constraints”.

5.3.5 Configuring Resource Constraints

After you have configured all resources, specify how the cluster should handle them correctly. Resource constraints let you specify on which cluster nodes resources can run, in which order resources will be loaded, and what other resources a specific resource depends on.

For an overview of available types of constraints, refer to Section 4.4.1, “Types of Constraints”. When defining constraints, you also need to specify scores. For more information on scores and their implications in the cluster, see Section 4.4.2, “Scores and Infinity”.

Learn how to create the different types of constraints in the following procedures.

Procedure 5.9: Adding or Modifying Location Constraints

For location constraints, specify a constraint ID, resource, score and node:

  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Constraints. The Constraints screen shows categories for all types of constraints. It lists all defined constraints.

  3. To add a new Location constraint, click the plus icon in the respective category.

    To modify an existing constraint, click the wrench icon next to the constraint and select Edit Constraint.

  4. Enter a unique Constraint ID. When modifying existing constraints, the ID is already defined.

  5. Select the Resource for which to define the constraint. The list shows the IDs of all resources that have been configured for the cluster.

  6. Set the Score for the constraint. Positive values indicate the resource can run on the Node you specify in the next step. Negative values mean it should not run on that node. Setting the score to INFINITY forces the resource to run on the node. Setting it to -INFINITY means the resource must not run on the node.

  7. Select the Node for the constraint.

  8. Click Create Constraint to finish the configuration. A message at the top of the screen shows if the constraint was successfully created.

Figure 5.7: Hawk—Location Constraint
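Such location constraints map to crm shell statements like the following; the constraint IDs, resource ID, and node names are placeholders:

```
# Prefer node alice with score 100
crm configure location loc-web-on-alice web-server 100: alice
# Never run on node bob
crm configure location loc-web-never-bob web-server -inf: bob
```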
Procedure 5.10: Adding or Modifying Colocational or Order Constraints

For both types of constraints specify a constraint ID and a score, then add resources to a dependency chain:

  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Constraints. The Constraints screen shows categories for all types of constraints and lists all defined constraints.

  3. To add a new Colocation or Order constraint, click the plus icon in the respective category.

    To modify an existing constraint, click the wrench icon next to the constraint and select Edit Constraint.

  4. Enter a unique Constraint ID. When modifying existing constraints, the ID is already defined.

  5. Define a Score.

    For colocation constraints, the score determines the location relationship between the resources. Positive values indicate the resources should run on the same node. Negative values indicate the resources should not run on the same node. Setting the score to INFINITY forces the resources to run on the same node. Setting it to -INFINITY means the resources must not run on the same node. The score will be combined with other factors to decide where to put the resource.

    For order constraints, the constraint is mandatory if the score is greater than zero, otherwise it is only a suggestion. The default value is INFINITY.

  6. For order constraints, you can usually keep the option Symmetrical enabled. This specifies that resources are stopped in reverse order.

  7. To define the resources for the constraint, follow these steps:

    1. Select a resource from the list Add resource to constraint. The list shows the IDs of all resources and all resource templates configured for the cluster.

    2. To add the selected resource, click the plus icon next to the list. A new list appears beneath. Select the next resource from the list. As both colocation and order constraints define a dependency between resources, you need at least two resources.

    3. Select one of the remaining resources from the list Add resource to constraint. Click the plus icon to add the resource.

      Now you have two resources in a dependency chain.

      If you have defined an order constraint, the topmost resource will start first, then the second, and so on. Usually the resources will be stopped in reverse order.

      However, if you have defined a colocation constraint, the arrow icons between the resources reflect their dependency, but not their start order. As the topmost resource depends on the next resource and so on, the cluster will first decide where to put the last resource, then place the depending ones based on that decision. If the constraint cannot be satisfied, the cluster may decide not to allow the dependent resource to run.

    4. Add as many resources as needed for your colocation or order constraint.

    5. If you want to swap the order of two resources, click the double arrow at the right hand side of the resources to swap the resources in the dependency chain.

  8. If needed, specify further parameters for each resource, like the role (Master, Slave, Started, or Stopped).

  9. Click Create Constraint to finish the configuration. A message at the top of the screen shows if the constraint was successfully created.

Figure 5.8: Hawk—Colocation Constraint
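The equivalent crm shell statements look like the following sketch; all IDs are placeholders:

```
# web-server is placed where web-ip runs (the first resource depends on the second)
crm configure colocation col-web-with-ip inf: web-server web-ip
# web-ip starts before web-server; they stop in reverse order
crm configure order ord-ip-before-web inf: web-ip web-server
```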

As an alternative format for defining colocation or ordering constraints, you can use resource sets. They have the same ordering semantics as groups.

Procedure 5.11: Using Resource Sets for Colocation or Order Constraints
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. Define colocation or order constraints as described in Procedure 5.10, “Adding or Modifying Colocational or Order Constraints”.

  3. When you have added the resources to the dependency chain, you can put them into a resource set by clicking the chain icon at the right hand side. A resource set is visualized by a frame around the resources belonging to a set.

  4. You can also add multiple resources to a resource set or create multiple resource sets.

  5. To extract a resource from a resource set, click the scissors icon above the respective resource.

    The resource will be removed from the set and put back into the dependency chain at its original place.

  6. Confirm your changes to finish the constraint configuration.
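In the crm shell, resource sets within an order constraint are written with parentheses; resources inside the parentheses have no ordering among themselves. The IDs below are placeholders:

```
# base-rsc starts first, then app1 and app2 in parallel, then top-rsc
crm configure order ord-base-then-apps inf: base-rsc ( app1 app2 ) top-rsc
```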

For more information on configuring constraints and detailed background information about the basic concepts of ordering and colocation, refer to the documentation available at http://www.clusterlabs.org/doc/:

  • Pacemaker Explained, chapter Resource Constraints

  • Colocation Explained

  • Ordering Explained

Procedure 5.12: Removing Constraints
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Constraints. The Constraints screen shows categories for all types of constraints and lists all defined constraints.

  3. Click the wrench icon next to a constraint and select Remove Constraint.

5.3.6 Specifying Resource Failover Nodes

A resource will be automatically restarted if it fails. If that cannot be achieved on the current node, or it fails N times on the current node, it will try to fail over to another node. You can define a number of failures for resources (a migration-threshold), after which they will migrate to a new node. If you have more than two nodes in your cluster, the node to which a particular resource fails over is chosen by the High Availability software.

You can specify a specific node to which a resource will fail over by proceeding as follows:

  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. Configure a location constraint for the resource as described in Procedure 5.9, “Adding or Modifying Location Constraints”.

  3. Add the migration-threshold meta attribute to the resource as described in Procedure 5.5: Adding Primitive Resources, Step 7 and enter a Value for the migration-threshold. The value should be positive and less than INFINITY.

  4. If you want to automatically expire the failcount for a resource, add the failure-timeout meta attribute to the resource as described in Procedure 5.5: Adding Primitive Resources, Step 7 and enter a Value for the failure-timeout.

  5. If you want to specify additional failover nodes with preferences for a resource, create additional location constraints.

The process flow regarding migration thresholds and failcounts is demonstrated in Example 4.6, “Migration Threshold—Process Flow”.

Instead of letting the failcount for a resource expire automatically, you can also clean up failcounts for a resource manually at any time. Refer to Section 5.4.2, “Cleaning Up Resources” for details.
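As a sketch of the resulting configuration, the failover-related meta attributes and a preferred failover node can be expressed in the crm shell as follows; the IDs, node name, and values are placeholders:

```
# Migrate after 3 failures; forget failures after 120 seconds
crm configure primitive web-server ocf:heartbeat:apache \
    op monitor interval="30s" \
    meta migration-threshold="3" failure-timeout="120s"
# Preferred failover node (lower score than the primary location constraint)
crm configure location loc-web-fallback web-server 50: node2
```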

5.3.7 Specifying Resource Failback Nodes (Resource Stickiness)

A resource may fail back to its original node when that node is back online and in the cluster. To prevent this or to specify a different node for the resource to fail back to, change the stickiness value of the resource. You can either specify the resource stickiness when creating it or afterwards.

For the implications of different resource stickiness values, refer to Section 4.4.5, “Failback Nodes”.

Procedure 5.13: Specifying Resource Stickiness
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. Add the resource-stickiness meta attribute to the resource as described in Procedure 5.5: Adding Primitive Resources, Step 7.

  3. Specify a value between -INFINITY and INFINITY for the resource-stickiness.
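In the crm shell, the same meta attribute is set on the resource definition; the IDs and values below are placeholders:

```
# Stickiness of 1000 makes the resource strongly prefer its current node
crm configure primitive web-server ocf:heartbeat:apache \
    params configfile="/etc/apache2/httpd.conf" \
    meta resource-stickiness="1000"
```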

5.3.8 Configuring Placement of Resources Based on Load Impact

Not all resources are equal. Some, such as Xen guests, require that the node hosting them meets their capacity requirements. If resources are placed so that their combined needs exceed the provided capacity, the performance of the resources diminishes or they fail.

To take this into account, the High Availability Extension allows you to specify the following parameters:

  1. The capacity a certain node provides.

  2. The capacity a certain resource requires.

  3. An overall strategy for placement of resources.

Utilization attributes are used to configure both the resource's requirements and the capacity a node provides. The High Availability Extension now also provides means to detect and configure both node capacity and resource requirements automatically. For more details and a configuration example, refer to Section 4.4.6, “Placing Resources Based on Their Load Impact”.

To display a node's capacity values (defined via utilization attributes) and the capacity currently consumed by resources running on the node, switch to the Cluster Status screen in Hawk. Select the node you are interested in, click the wrench icon next to the node and select Show Details.

Hawk—Viewing a Node's Capacity Values
Figure 5.9: Hawk—Viewing a Node's Capacity Values

After you have configured the capacities your nodes provide and the capacities your resources require, you need to set the placement strategy in the global cluster options. Otherwise the capacity configurations have no effect. Several strategies are available to schedule the load: for example, you can concentrate it on as few nodes as possible, or balance it evenly over all available nodes. For more information, refer to Section 4.4.6, “Placing Resources Based on Their Load Impact”.

Procedure 5.14: Setting the Placement Strategy
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Cluster Properties to view the global cluster options and their current values.

  3. From the Add new property drop-down list, choose placement-strategy.

  4. Depending on your requirements, set Placement Strategy to the appropriate value.

  5. Click the plus icon to add the new cluster property including its value.

  6. Confirm your changes.
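The same global cluster option can be set from the crm shell; balanced is one possible value (default, utilization, and minimal are the others):

```shell
crm configure property placement-strategy=balanced
```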

5.3.9 Configuring Resource Monitoring

The High Availability Extension can not only detect a node failure, but also when an individual resource on a node has failed. If you want to ensure that a resource is running, configure resource monitoring for it. For resource monitoring, specify a timeout and/or start delay value, and an interval. The interval tells the CRM how often it should check the resource status. You can also set particular parameters, such as Timeout for start or stop operations.

Procedure 5.15: Adding or Modifying Monitor Operations
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Resources. The Resources screen shows categories for all types of resources and lists all defined resources.

  3. Select the resource to modify, click the wrench icon next to it and select Edit Resource. The resource definition is displayed. Hawk automatically shows the most important resource operations (monitor, start, stop) and proposes default values.

  4. To change the values for an operation:

    1. Click the pen icon next to the operation.

    2. In the dialog that opens, specify the following values:

      • Enter a timeout value in seconds. After the specified timeout period, the operation will be treated as failed. The PE will decide what to do or execute what you specified in the On Fail field of the monitor operation.

      • For monitoring operations, define the monitoring interval in seconds.

      If needed, use the empty drop-down box at the bottom of the monitor dialog to add more parameters, like On Fail (what to do if this action fails?) or Requires (what conditions need to be fulfilled before this action occurs?).

    3. Confirm your changes to close the dialog and to return to the Edit Resource screen.

  5. To completely remove an operation, click the minus icon next to it.

  6. To add another operation, click the empty drop-down box and select an operation. A default value for the operation is displayed. If needed, change it by clicking the pen icon.

  7. Click Apply Changes to finish the configuration. A message at the top of the screen shows if the resource was successfully updated or not.
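For comparison, an equivalent monitor configuration can be expressed in the crm shell. A sketch assuming an Apache resource; the resource name, file path, and timing values are examples:

```shell
crm configure primitive apache1 ocf:heartbeat:apache \
    params configfile="/etc/apache2/httpd.conf" \
    op monitor interval=10s timeout=20s on-fail=restart \
    op start timeout=40s \
    op stop timeout=60s
```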

For the processes which take place if the resource monitor detects a failure, refer to Section 4.3, “Resource Monitoring”.

To view resource failures, switch to the Cluster Status screen in Hawk and select the resource you are interested in. Click the wrench icon next to the resource and select Show Details.

5.3.10 Configuring a Cluster Resource Group

Some cluster resources depend on other components or resources and require that each component or resource starts in a specific order and runs on the same server. To simplify this configuration we support the concept of groups.

For an example of a resource group and more information about groups and their properties, refer to Section 4.2.5.1, “Groups”.

Note
Note: Empty Groups

Groups must contain at least one resource, otherwise the configuration is not valid. In Hawk, primitives cannot be created or modified while creating a group. Before adding a group, create primitives and configure them as desired. For details, refer to Procedure 5.5, “Adding Primitive Resources”.

Procedure 5.16: Adding a Resource Group
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Resources. The Resources screen shows categories for all types of resources and lists all defined resources.

  3. Select the Group category and click the plus icon.

  4. Enter a unique Group ID.

  5. To define the group members, select one or multiple entries in the list of Available Primitives and click the < icon to add them to the Group Children list. Any new group members are added to the bottom of the list. To define the order of the group members, you currently need to add and remove them in the order you desire.

  6. If needed, modify or add Meta Attributes as described in Procedure 5.5, “Adding Primitive Resources”, Step 7.

  7. Click Create Group to finish the configuration. A message at the top of the screen shows if the group was successfully created.

Hawk—Resource Group
Figure 5.10: Hawk—Resource Group
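A group like the one created above corresponds to the following crm shell sketch (resource names are examples; group members start in the listed order and stop in the reverse order):

```shell
crm configure group web-group ipaddr1 filesystem1 apache1
```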

5.3.11 Configuring a Clone Resource

If you want certain resources to run simultaneously on multiple nodes in your cluster, configure these resources as clones. For example, cloning makes sense for resources like STONITH and cluster file systems like OCFS2. You can clone any resource, provided cloning is supported by the resource's Resource Agent. Clone resources may even be configured differently depending on which nodes they are running on.

For an overview of the available types of resource clones, refer to Section 4.2.5.2, “Clones”.

Note
Note: Sub-resources for Clones

Clones can either contain a primitive or a group as sub-resources. In Hawk, sub-resources cannot be created or modified while creating a clone. Before adding a clone, create sub-resources and configure them as desired. For details, refer to Procedure 5.5, “Adding Primitive Resources” or Procedure 5.16, “Adding a Resource Group”.

Procedure 5.17: Adding or Modifying Clones
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Resources. The Resources screen shows categories for all types of resources and lists all defined resources.

  3. Select the Clone category and click the plus icon.

  4. Enter a unique Clone ID.

  5. From the Child Resource list, select the primitive or group to use as a sub-resource for the clone.

  6. If needed, modify or add Meta Attributes as described in Procedure 5.5, “Adding Primitive Resources”, Step 7.

  7. Click Create Clone to finish the configuration. A message at the top of the screen shows if the clone was successfully created.

Hawk—Clone Resource
Figure 5.11: Hawk—Clone Resource
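The equivalent crm shell configuration is a one-liner. A sketch with example names; interleave is a commonly used clone meta attribute:

```shell
crm configure clone cl-ocfs2 base-ocfs2 \
    meta interleave=true
```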

5.4 Managing Cluster Resources

In addition to configuring your cluster resources, Hawk allows you to manage existing resources from the Cluster Status screen. For a general overview of the screen, its different views and the color code used for status information, refer to Section 5.1.2, “Main Screen: Cluster Status”.

Basic resource operations can be executed from any cluster status view. Both Tree View and Table View let you access the individual resources directly. However, in the Summary View you need to click the links in the resources category first to display the resource details. The detailed view also shows any attributes set for that resource. For primitive resources (regular primitives, children of groups, clones, or multi-state resources), the following information will be shown additionally:

  • the resource's failcount

  • the last failure time stamp (if the failcount is > 0)

  • operation history and timings (call id, operation, last run time stamp, execution time, queue time, return code and last change time stamp)

Viewing a Resource's Details
Figure 5.12: Viewing a Resource's Details

5.4.1 Starting Resources

Before you start a cluster resource, make sure it is set up correctly. For example, if you want to use an Apache server as a cluster resource, set up the Apache server first. Complete the Apache configuration before starting the respective resource in your cluster.

Note
Note: Do Not Touch Services Managed by the Cluster

When managing a resource via the High Availability Extension, the same resource must not be started or stopped otherwise (outside of the cluster, for example manually or on boot or reboot). The High Availability Extension software is responsible for all service start or stop actions.

However, if you want to check if the service is configured properly, start it manually, but make sure that it is stopped again before the High Availability Extension takes over.

For interventions in resources that are currently managed by the cluster, set the resource to maintenance mode first as described in Procedure 5.23, “Applying Maintenance Mode to Resources”.

When creating a resource with Hawk, you can set its initial state with the target-role meta attribute. If you set its value to stopped, the resource does not start automatically after being created.

Procedure 5.18: Starting A New Resource
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Cluster Status.

  3. In one of the individual resource views, click the wrench icon next to the resource and select Start. To continue, confirm the message that appears. As soon as the resource has started, Hawk changes the resource's color to green and shows on which node it is running.
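The same operation is available from the crm shell (the resource name is an example):

```shell
crm resource start apache1
crm resource status apache1   # verify that the resource is running
```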

5.4.2 Cleaning Up Resources

A resource will be automatically restarted if it fails, but each failure increases the resource's failcount.

If a migration-threshold has been set for the resource, the node will no longer run the resource when the number of failures reaches the migration threshold.

A resource's failcount can either be reset automatically (by setting a failure-timeout option for the resource) or you can reset it manually as described below.
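Both options are ordinary meta attributes and can also be set from the crm shell. A sketch with example names and values:

```shell
# Move the resource away after 3 failures ...
crm resource meta apache1 set migration-threshold 3
# ... and reset its failcount automatically after 120 seconds
crm resource meta apache1 set failure-timeout 120
```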

Procedure 5.19: Cleaning Up A Resource
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Cluster Status.

  3. In one of the individual resource views, click the wrench icon next to the failed resource and select Clean Up. To continue, confirm the message that appears.

    This executes the commands crm_resource -C and crm_failcount -D for the specified resource on the specified node.

For more information, see the man pages of crm_resource and crm_failcount.
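From the command line, a cleanup for an example resource on an example node might look like this (check the man pages mentioned above for the exact options of your version):

```shell
# crm shell wrapper
crm resource cleanup apache1 node1

# or the low-level tool
crm_resource -C -r apache1 -N node1
```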

5.4.3 Removing Cluster Resources

If you need to remove a resource from the cluster, follow the procedure below to avoid configuration errors:

Procedure 5.20: Removing a Cluster Resource
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Cluster Status.

  3. Clean up the resource on all nodes as described in Procedure 5.19, “Cleaning Up A Resource”.

  4. In one of the individual resource views, click the wrench icon next to the resource and select Stop. To continue, confirm the message that appears.

  5. If the resource is stopped, click the wrench icon next to it and select Delete Resource.

5.4.4 Migrating Cluster Resources

As mentioned in Section 5.3.6, “Specifying Resource Failover Nodes”, the cluster will fail over (migrate) resources automatically in case of software or hardware failures—according to certain parameters you can define (for example, migration threshold or resource stickiness). Apart from that, you can manually migrate a resource to another node in the cluster. Or you decide to move it away from the current node and leave the decision about where to put it to the cluster.

Procedure 5.21: Manually Migrating a Resource
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Cluster Status.

  3. In one of the individual resource views, click the wrench icon next to the resource and select Move.

  4. In the new window, select the node to which to move the resource.

    This creates a location constraint with an INFINITY score for the destination node.

  5. Alternatively, select to move the resource Away from current node.

    This creates a location constraint with a -INFINITY score for the current node.

  6. Click OK to confirm the migration.

To allow a resource to move back again, proceed as follows:

Procedure 5.22: Clearing a Migration Constraint
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Cluster Status.

  3. In one of the individual resource views, click the wrench icon next to the resource and select Drop Relocation Rule. To continue, confirm the message that appears.

    This uses the crm_resource -U command. The resource can move back to its original location or it may stay where it is (depending on resource stickiness).

For more information, see the crm_resource man page or Pacemaker Explained, available from http://www.clusterlabs.org/doc/. Refer to section Resource Migration.
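The crm shell offers matching commands for both procedures. A sketch with example resource and node names:

```shell
crm resource migrate apache1 node2   # create the location constraint
crm resource unmigrate apache1       # drop the relocation rule again
```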

5.4.5 Using Maintenance Mode

Every now and then, you need to perform testing or maintenance tasks on individual cluster components or the whole cluster—be it changing the cluster configuration, updating software packages for individual nodes, or upgrading the cluster to a higher product version.

To accommodate this, the High Availability Extension provides maintenance options on several levels: for individual resources, for single nodes, and for the whole cluster.

Warning
Warning: Risk of Data Loss

If you need to execute any testing or maintenance tasks while services are running under cluster control, make sure to follow this outline:

  1. Before you start, set the individual resource, node or the whole cluster to maintenance mode. This helps to avoid unwanted side effects like resources not starting in an orderly fashion, the risk of unsynchronized CIBs across the cluster nodes or data loss.

  2. Execute your maintenance task or tests.

  3. After you have finished, remove the maintenance mode to start normal cluster operation.

For more details on what happens to the resources and the cluster while in maintenance mode, see Section 4.7, “Maintenance Mode”.

Procedure 5.23: Applying Maintenance Mode to Resources
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Resources. Select the resource you want to put in maintenance mode or unmanaged mode, click the wrench icon next to the resource and select Edit Resource.

  3. Open the Meta Attributes category.

  4. From the empty drop-down list, select the maintenance attribute and click the plus icon to add it.

  5. Activate the check box next to maintenance to set the maintenance attribute to yes.

  6. Confirm your changes.

  7. After you have finished the maintenance task for that resource, deactivate the check box next to the maintenance attribute for that resource.

    From this point on, the resource will be managed by the High Availability Extension software again.
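Since maintenance is a regular meta attribute, the same can be achieved from the crm shell (the resource name is an example):

```shell
crm resource meta apache1 set maintenance true    # start of maintenance
crm resource meta apache1 set maintenance false   # resource is managed again
```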

Procedure 5.24: Applying Maintenance Mode to Nodes

Sometimes it is necessary to put single nodes into maintenance mode.

  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Cluster Status.

  3. In one of the individual nodes' views, click the wrench icon next to the node and select Maintenance.

    This will add the following instance attribute to the node: maintenance="true". The resources previously running on the maintenance-mode node will become unmanaged. No new resources will be allocated to the node until it leaves the maintenance mode.

  4. To deactivate the maintenance mode, click the wrench icon next to the node and select Ready.
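The crm shell offers shortcuts for the same node states; a sketch assuming a node named node1 (command availability depends on your crmsh version):

```shell
crm node maintenance node1   # equivalent of selecting Maintenance
crm node ready node1         # equivalent of selecting Ready
```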

Procedure 5.25: Applying Maintenance Mode to the Cluster

For setting or unsetting the maintenance mode for the whole cluster, proceed as follows:

  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Cluster Configuration.

  3. In the CRM Configuration group, select the maintenance-mode attribute from the empty drop-down box and click the plus icon to add it.

  4. To set maintenance-mode=true, activate the check box next to maintenance-mode and confirm your changes.

  5. After you have finished the maintenance task for the whole cluster, deactivate the check box next to the maintenance-mode attribute.

    From this point on, High Availability Extension will take over cluster management again.
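On the command line, the same is done by setting the maintenance-mode cluster property:

```shell
crm configure property maintenance-mode=true
# ... perform the maintenance tasks ...
crm configure property maintenance-mode=false
```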

5.4.6 Viewing the Cluster History

Hawk provides the following possibilities to view past events on the cluster (on different levels and in varying detail).

Procedure 5.26: Viewing Recent Events of Nodes or Resources
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select Cluster Status.

  3. In the Tree View or Table View, click the wrench icon next to the resource or node you are interested in and select View Recent Events.

    The dialog that opens shows the events of the last hour.

Procedure 5.27: Viewing Transitions with the History Explorer

The History Explorer provides transition information for a time frame that you can define. It also lists its previous runs and allows you to Delete reports that you no longer need. The History Explorer uses the information provided by hb_report.

  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. In the left navigation bar, select History Explorer.

  3. By default, the period to explore is set to the last 24 hours. To modify this, set another Start Time and End Time.

  4. Click Display to start collecting transition data.

Hawk—History Report
Figure 5.13: Hawk—History Report

The following information is displayed:

History Explorer Results
Time

The time line of all past transitions in the cluster.

PE Input/Node

The pe-input* file for each transition and the node on which it was generated. For each transition, the cluster saves a copy of the state which is provided to the policy engine as input. The path to this archive is logged. The pe-input* files are only generated on the Designated Coordinator (DC), but as the DC can change, there may be pe-input* files from several nodes. The files show what the Policy Engine (PE) planned to do.

Details/Full Log

Opens a pop-up window with snippets of logging data that belong to that particular transition. Different amounts of details are available: Clicking Details displays the output of crm history transition peinput (including the resource agents' log messages). Full Log also includes details from the pengine, crmd, and lrmd and is equivalent to crm history transition log peinput.
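Both commands can also be run directly on a cluster node; the pe-input file name below is an example:

```shell
crm history transition pe-input-123       # as shown by the Details link
crm history transition log pe-input-123   # as shown by the Full Log link
```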

Graph/XML

A graph and an XML representation of each transition. If you choose to show the Graph, the PE is reinvoked (using the pe-input* files), and generates a graphical visualization of the transition. Alternatively, you can view the XML representation of the graph.

Hawk History Report—Transition Graph
Figure 5.14: Hawk History Report—Transition Graph
Diff

If two or more pe-inputs are listed, a Diff link will appear to the right of each pair of pe-inputs. Clicking it displays the difference of configuration and status.

5.4.7 Exploring Potential Failure Scenarios

Hawk provides a Simulator that allows you to explore failure scenarios before they happen. After switching to the simulator mode, you can change the status of nodes, add or edit resources and constraints, change the cluster configuration, or execute multiple resource operations to see how the cluster would behave should these events occur. As long as the simulator mode is activated, a control dialog will be displayed in the bottom right hand corner of the Cluster Status screen. The simulator will collect the changes from all screens and will add them to its internal queue of events. The simulation run with the queued events will not be executed unless it is manually triggered in the control dialog. After the simulation run, you can view and analyze the details of what would have happened (log snippets, transition graph, and CIB states).

Procedure 5.28: Switching to Simulator Mode
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. Activate the simulator mode by clicking the wrench icon in the top-level row (next to the user name), and by selecting Simulator.

    Hawk's background changes color to indicate the simulator is active. A simulator control dialog is displayed in the bottom right hand corner of the Cluster Status screen. Its title Simulator (initial state) indicates that no simulator run has occurred yet.

  3. Fill the simulator's event queue:

    1. To simulate status change of a node: Click +Node in the simulator control dialog. Select the Node you want to manipulate and select its target State. Confirm your changes to add them to the queue of events listed in the controller dialog.

    2. To simulate a resource operation: Click +Op in the simulator control dialog. Select the Resource to manipulate and the Operation to simulate. If necessary, define an Interval. Select the Node on which to run the operation and the targeted Result. Confirm your changes to add them to the queue of events listed in the controller dialog.

  4. Repeat the previous steps for any other node status changes or resource operations you want to simulate.

    Hawk—Simulator with Injected Events
    Figure 5.15: Hawk—Simulator with Injected Events
  5. To inject other changes that you want to simulate:

    1. Switch to one or more of the following Hawk screens: Cluster Status, Setup Wizard, Cluster Configuration, Resources, or Constraints.

      Note
      Note: History Explorer and Simulator Mode

      Clicking the History Explorer tab will deactivate simulator mode.

    2. Add or modify parameters on the screens as desired.

      The simulator will collect the changes from all screens and will add them to its internal queue of events.

    3. To return to the simulator control dialog, switch to the Cluster Status screen or click the wrench icon in the top-level row and click Simulator again.

  6. If you want to remove an event listed in Injected State, select the respective entry and click the minus icon beneath the list.

  7. Start the simulation run by clicking Run in the simulator control dialog. The Cluster Status screen displays the simulated events. For example, if you marked a node as unclean, it will now be shown offline, and all its resources will be stopped. The simulator control dialog changes to Simulator (final state).

    Hawk—Simulator in Final State
    Figure 5.16: Hawk—Simulator in Final State
  8. To view more detailed information about the simulation run:

    1. Click the Details link in the simulator dialog to see log snippets of what occurred.

    2. Click the Graph link to show the transition graph.

    3. Click CIB (in) to display the initial CIB state. To see what the CIB would look like after the transition, click CIB (out).

  9. To start from scratch with a new simulation, use the Reset button.

  10. To exit the simulation mode, close the simulator control dialog. The Cluster Status screen switches back to its normal color and displays the current cluster state.

5.4.8 Generating a Cluster Report

For analysis and diagnosis of problems occurring on the cluster, Hawk can generate a cluster report that collects information from all nodes in the cluster.

Procedure 5.29: Generating an hb_report
  1. Start a Web browser and log in to the cluster as described in Section 5.1.1, “Starting Hawk and Logging In”.

  2. Click the wrench icon next to the user name in the top-level row, and select Generate hb_report.

  3. By default, the period to examine is the last hour. To modify this, set another Start Time and End Time.

  4. Click Generate.

  5. After the report has been created, download the *.tar.bz2 file by clicking the respective link.

For more information about the log files that tools like hb_report and crm_report cover, refer to How can I create a report with an analysis of all my cluster nodes?
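The report can also be generated on the command line; the start time and destination base name below are examples:

```shell
hb_report -f "2015-07-20 10:00" /tmp/hb_report-example
```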

5.5 Monitoring Multiple Clusters

You can use Hawk as a single point of administration for monitoring multiple clusters. Hawk's Cluster Dashboard allows you to view a summary of multiple clusters, with each summary listing the number of nodes, resources, tickets (if you use Geo clusters), and their state. The summary also shows whether any failures have appeared in the respective cluster.

The cluster information displayed in the Cluster Dashboard is stored in a persistent cookie. This means you need to decide which Hawk instance you want to view the Cluster Dashboard on, and always use that one. The machine you are running Hawk on does not even need to be part of any cluster for that purpose—it can be a separate, unrelated system.

Procedure 5.30: Monitoring Multiple Clusters with Hawk
Prerequisites
  • All clusters to be monitored from Hawk's Cluster Dashboard must be running SUSE Linux Enterprise High Availability Extension 11 SP4. It is not possible to monitor clusters that are running earlier versions of SUSE Linux Enterprise High Availability Extension.

  • If you did not replace the self-signed certificate for Hawk on every cluster node with your own certificate (or a certificate signed by an official Certificate Authority), log in to Hawk on every node in every cluster at least once. Verify the certificate (and add an exception in the browser to bypass the warning).

  • If you are using Mozilla Firefox, you must change its preferences to Accept third-party cookies. Otherwise cookies from monitored clusters will not be set, thus preventing login to the clusters you are trying to monitor.

  1. Start the Hawk Web service on a machine you want to use for monitoring multiple clusters.

  2. Start a Web browser and as URL enter the IP address or host name of the machine that runs Hawk:

    https://IPaddress:7630/
  3. On the Hawk login screen, click the Dashboard link in the right upper corner.

    The Add Cluster dialog appears.

  4. Enter a custom Cluster Name with which to identify the cluster in the Cluster Dashboard.

  5. Enter the Host Name of one of the cluster nodes and confirm your changes.

    The Cluster Dashboard opens and shows a summary of the cluster you have added.

  6. To add more clusters to the dashboard, click the plus icon and enter the details for the next cluster.

    Hawk—Cluster Dashboard
    Figure 5.17: Hawk—Cluster Dashboard
  7. To remove a cluster from the dashboard, click the x icon next to the cluster's summary.

  8. To view more details about a cluster, click somewhere in the cluster's box on the dashboard.

    This opens a new browser window or new browser tab. If you are not currently logged in to the cluster, this takes you to the Hawk login screen. After having logged in, Hawk shows the Cluster Status of that cluster in the summary view. From here, you can administer the cluster with Hawk as usual.

  9. As the Cluster Dashboard stays open in a separate browser window or tab, you can easily switch between the dashboard and the administration of individual clusters in Hawk.

Any status changes for nodes or resources are reflected almost immediately within the Cluster Dashboard.

5.6 Hawk for Geo Clusters

For more details on Hawk features that relate to geographically dispersed clusters (Geo clusters), see the Quick Start Geo Clustering for SUSE Linux Enterprise High Availability Extension.

5.7 Troubleshooting

Hawk Log Files

Find the Hawk log files in /srv/www/hawk/log. Check these files in case you cannot access Hawk.

If you have trouble starting or stopping a resource with Hawk, check the Pacemaker log messages. By default, Pacemaker logs to /var/log/messages.

Authentication Fails

If you cannot log in to Hawk with a new user that is a member of the haclient group (or if you experience delays until Hawk accepts logins from this user), stop the nscd daemon with rcnscd stop and try again.

Replacing the Self-Signed Certificate

To avoid the warning about the self-signed certificate on first Hawk start-up, replace the automatically created certificate with your own certificate or a certificate that was signed by an official Certificate Authority (CA).

The certificate is stored in /etc/lighttpd/certs/hawk-combined.pem and contains both key and certificate.

Change the permissions to make the file only accessible by root:

root # chown root.root /etc/lighttpd/certs/hawk-combined.pem
root # chmod 600 /etc/lighttpd/certs/hawk-combined.pem

After you have created or received your new key and certificate, combine them by executing the following command:

root # cat keyfile certificatefile > /etc/lighttpd/certs/hawk-combined.pem
Login to Hawk Fails After Using History Explorer/hb_report

Depending on the period of time you defined in the History Explorer or hb_report and the events that took place in the cluster during this time, Hawk might collect an extensive amount of information. It is stored in log files in the /tmp directory. This might consume the remaining free disk space on your node. In case Hawk should not respond after using the History Explorer or hb_report, check the hard disk of your cluster node and remove the respective log files.

Cluster Dashboard: Unable to connect to host

If adding clusters to Hawk's dashboard fails, check the prerequisites listed in Procedure 5.30, “Monitoring Multiple Clusters with Hawk”.

Cluster Dashboard: Node Not Accessible

The Cluster Dashboard only polls one node in each cluster for status. If the node being polled goes down, the dashboard will cycle to poll another node. In that case, Hawk briefly displays a warning message about that node being inaccessible. The message will disappear after Hawk has found another node to contact.