Applies to SUSE Linux Enterprise High Availability 15 SP3

8 Managing Geo clusters

Before booth can manage a ticket within the Geo cluster, you must first grant it to a site manually, either with the booth command line client or with Hawk2.

8.1 Managing tickets from command line

Use the booth command line tool to grant, list, or revoke tickets as described in Section 8.1.1, “Overview of booth commands”.

Warning: crm_ticket and crm site ticket

If the booth service is not running for any reason, you can also manage tickets fully manually with crm_ticket or crm site ticket. Both commands are only available on cluster nodes. Use them with great care, as they cannot verify whether the same ticket is already granted elsewhere. For more information, read the man pages.

As long as booth is up and running, use only the booth client for manual intervention.

8.1.1 Overview of booth commands

The booth commands can be run on any machine in the cluster, not only on those running boothd. They try to find the local cluster by looking at the booth configuration file and the locally defined IP addresses. If you do not specify a site to connect to (using the -s option), booth always connects to the local site.

Listing all tickets
# booth list
ticket: ticketA, leader: none
ticket: ticketB, leader: 10.2.12.101, expires: 2014-08-13 10:28:57

If you do not specify a certain site with -s, the information about the tickets will be requested from the local booth instance.
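
For scripts, the leader of a single ticket can be extracted from this output. The following is a minimal sketch, assuming the output keeps the comma-separated "key: value" layout shown in the example above. The function name ticket_leader is hypothetical and not part of booth.

```shell
#!/bin/sh
# ticket_leader TICKET (hypothetical helper, not part of booth)
# Reads text in the "booth list" format from stdin and prints the leader
# recorded for TICKET. Exits non-zero if the ticket is not listed.
ticket_leader() {
    awk -v t="$1" '
        {
            name = ""; leader = ""
            sub(/^[ \t]+/, "")               # tolerate leading whitespace
            n = split($0, field, ", ")       # "ticket: X", "leader: Y", ...
            for (i = 1; i <= n; i++) {
                if (index(field[i], "ticket: ") == 1) name = substr(field[i], 9)
                if (index(field[i], "leader: ") == 1) leader = substr(field[i], 9)
            }
            if (name == t) { print leader; found = 1 }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

With the sample output above, running booth list | ticket_leader ticketB would print 10.2.12.101, and the same call for ticketA would print none.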

Granting a ticket to a site
# booth grant -s 192.168.201.100 ticketA
booth[27891]: 2014/08/13_10:21:23 info: grant request sent, waiting for the result ...
booth[27891]: 2014/08/13_10:21:23 info: grant succeeded!

In this case, ticketA will be granted to the site 192.168.201.100. Without the -s option, booth would automatically connect to the current site (the site you are running the booth client on) and would request the grant operation.

Before granting a ticket, the command executes a sanity check. If the same ticket is already granted to another site, you are warned about that and are prompted to revoke the ticket from the current site first.

Revoking a ticket from a site
# booth revoke ticketA
booth[27900]: 2014/08/13_10:21:23 info: revoke succeeded!

Booth checks to which site the ticket is currently granted and requests the revoke operation for ticketA. The revoke operation will be executed immediately.

The grant operation and, under certain circumstances, the revoke operation may take a while to return a definite outcome. The client waits for the result up to the ticket's timeout value before it gives up. If the -w option was used, the client waits indefinitely instead. Find the exact status in the log files or with the crm_ticket -L command.

Forcing a grant operation
# booth grant -F ticketA

The result of this command depends on whether you use automatic or manual tickets.

  • Automatic tickets. As long as booth can make sure a ticket is granted to one site, you cannot grant the same ticket to another site, not even by using the -F option. However, in a split brain situation, booth might not be able to check whether an automatic ticket is granted somewhere else. In that case, the Geo cluster administrator can override the automatic process and manually grant the ticket to the site that is still up and running. In this situation, the -F option tells booth not to wait for a response from other, unreachable sites (thereby ignoring the parameters expire and acquire-after, if defined for this ticket). Instead, booth immediately grants the ticket to the specified site.

  • Manual tickets. When using manual tickets, booth grant -F makes booth grant the ticket immediately to the specified site.

Warning: Potential loss of data

Before using booth grant -F, make sure that no other site that is online owns the same ticket. If the same ticket is granted to multiple sites, resources depending on the ticket might start on several sites in parallel. This results in a concurrency violation and potential data corruption.

As Geo cluster administrator, you need to resolve a conflict between tickets once the other site is reachable again.

The following sections show examples of managing tickets in different scenarios.

8.1.2 Manually moving an automatic ticket

Assuming that you want to manually move ticketA from site amsterdam (with the virtual IP 192.168.201.100) to site berlin (with the virtual IP 192.168.202.100), proceed as follows:

  1. Log in to amsterdam.

  2. Set ticketA to standby:

    # crm_ticket -t ticketA -s

  3. Wait for any resources that depend on ticketA to be stopped or demoted cleanly.

  4. Revoke ticketA from site amsterdam:

    # booth revoke -s 192.168.201.100 ticketA

  5. After the ticket has been revoked from its original site, grant ticketA to the site berlin:

    # booth grant -s 192.168.202.100 ticketA

    This enables the resources which depend on this ticket to start on site berlin.

  6. Remove the standby mode for ticketA on site amsterdam:

    # crm_ticket -t ticketA -a

    In case berlin fails, resources depending on ticketA will automatically fail over to site amsterdam.
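
The steps above can be sketched as a small shell function. This is a minimal sketch under stated assumptions, not a supported tool: it is meant to run on a cluster node of the site that currently holds the ticket, it uses only the commands and options shown in the procedure, and the names run, move_ticket and DRY_RUN are hypothetical. Because waiting for resources to stop (step 3) has no single command in this procedure, the sketch pauses for operator confirmation instead.

```shell
#!/bin/sh
# Sketch: move a ticket from the local site to another site.
# DRY_RUN=1 (hypothetical variable) prints the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# move_ticket TICKET FROM_IP TO_IP (hypothetical helper)
move_ticket() {
    ticket=$1
    from_ip=$2
    to_ip=$3

    # Step 2: set the ticket to standby on the local site.
    run crm_ticket -t "$ticket" -s || return 1

    # Step 3: wait until dependent resources are stopped or demoted.
    # The operator confirms this manually; skipped in dry-run mode.
    if [ "${DRY_RUN:-0}" != "1" ]; then
        printf 'Press Enter once resources depending on %s have stopped: ' "$ticket"
        read -r reply
    fi

    # Step 4: revoke the ticket from the original site.
    run booth revoke -s "$from_ip" "$ticket" || return 1

    # Step 5: grant the ticket to the target site.
    run booth grant -s "$to_ip" "$ticket" || return 1

    # Step 6: remove the standby mode on the local site.
    run crm_ticket -t "$ticket" -a
}
```

For a rehearsal, set DRY_RUN=1 and call move_ticket ticketA 192.168.201.100 192.168.202.100. The function then only prints the four commands in the order given in the procedure.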

8.1.3 Moving a manual ticket

Assuming that you want to move the manual ticket ticket-nfs from site amsterdam (with the virtual IP 192.168.201.100) to site berlin (with the virtual IP 192.168.202.100), proceed as follows:

  1. Log in to amsterdam.

  2. Set ticket-nfs to standby:

    # crm_ticket -t ticket-nfs -s

  3. Wait for any resources that depend on ticket-nfs to be stopped or demoted cleanly.

  4. Revoke ticket-nfs from site amsterdam:

    # booth revoke -s 192.168.201.100 ticket-nfs

  5. After the ticket has been revoked from its original site, grant ticket-nfs to the site berlin:

    # booth grant -s 192.168.202.100 ticket-nfs

    This enables the resources which depend on this ticket to start on site berlin.

  6. If you want to move the resources back to site amsterdam at any point in time, remove the standby mode for ticket-nfs on site amsterdam:

    # crm_ticket -t ticket-nfs -a

8.1.4 Failing over a manual ticket

Let us assume that the (manually managed) ticket ticket-nfs has been granted to site amsterdam (with the virtual IP 192.168.201.100). This site cannot be reached at the moment. Site berlin (with the virtual IP 192.168.202.100) is still available.

  1. Try to contact a local administrator on site amsterdam and check if the site is down.

    • If yes, proceed with Step 2.

    • If amsterdam cannot be reached because of a connectivity problem, but the nodes are still running, ask the local cluster administrator to put ticket-nfs into standby mode on site amsterdam:

      # crm_ticket -t ticket-nfs -s

      This will relinquish the resources which depend on ticket-nfs. Now the ticket can safely be granted to the other site.

  2. Log in to berlin.

  3. Grant ticket-nfs to site berlin using the -F option:

    # booth grant -F ticket-nfs

    You will see a warning that the same ticket might be granted to another site, but the command will be executed.

  4. Check the result with:

    # booth list

    It should show berlin as ticket owner for ticket-nfs now. All resources that depend on this ticket will be started on berlin.

  5. Before bringing amsterdam back into the Geo cluster, make sure to revoke ticket-nfs on amsterdam:

    # booth revoke -s 192.168.201.100 ticket-nfs
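
Because a grant operation may take a while to complete, the check in step 4 can be repeated until booth list reports the expected owner. The following is a minimal sketch, assuming booth list prints lines of the form "ticket: NAME, leader: IP" as in Section 8.1.1. The function name wait_for_owner and the LIST_CMD variable are hypothetical; LIST_CMD exists only so the check can be rehearsed without a running booth.

```shell
#!/bin/sh
# wait_for_owner TICKET IP [TRIES] [DELAY_SECONDS]  (hypothetical helper)
# Succeeds as soon as the list output names IP as leader of TICKET.
# LIST_CMD (hypothetical variable) defaults to "booth list".
wait_for_owner() {
    ticket=$1
    ip=$2
    tries=${3:-10}
    delay=${4:-3}

    while [ "$tries" -gt 0 ]; do
        # Match the line prefix literally, so dots in the IP are not
        # treated as wildcards and 10.0.0.1 does not match 10.0.0.10.
        if ${LIST_CMD:-booth list} | awk -v want="ticket: $ticket, leader: $ip" '
            index($0, want) == 1 {
                rest = substr($0, length(want) + 1)
                if (rest == "" || substr(rest, 1, 1) == ",") ok = 1
            }
            END { exit (ok ? 0 : 1) }
        '; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

On berlin, a call such as wait_for_owner ticket-nfs 192.168.202.100 would return successfully once berlin is listed as the ticket owner, and fail after the given number of attempts otherwise.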

8.2 Managing tickets with Hawk2

Tickets can be viewed in both the Dashboard and the Status view. Hawk2 displays the following ticket statuses:

  • Granted: Tickets that are granted to the current site.

  • Elsewhere: Tickets that are granted to another site.

  • Revoked: Tickets that have been revoked. Additionally, Hawk2 also displays tickets as revoked if they are referenced in a ticket dependency, but have not been granted to any site yet.

Note: Granting tickets to current site and revoking tickets

Though you can view tickets for all sites with Hawk2, any grant or revoke operations triggered by Hawk2 only apply to the current site (that you are currently connected to with Hawk2). To grant a ticket to another site of your Geo cluster, start Hawk2 on one of the cluster nodes belonging to the respective site.

You can only grant tickets that are not already given to any site.

Procedure 8.1: Viewing, granting and revoking tickets with Hawk2
  1. Start a Web browser and log in to Hawk2.

  2. In the left navigation bar, select Monitoring › Status.

    Along with information about cluster nodes and resources, Hawk2 also displays a Tickets category. It lists the ticket status, the ticket name and when the ticket was last granted. From the Granted column you can manage the tickets.

  3. To show further information about the ticket, along with information about the cluster sites and arbitrators, click the Details icon next to the ticket.

    Figure 8.1: Hawk2—ticket details
  4. To revoke a granted ticket from the current site or to grant a ticket to the current site, click the switch in the Granted column next to the ticket. On clicking, it shows the available action. Confirm your choice when Hawk2 prompts for a confirmation.

    If the ticket cannot be granted or revoked for any reason, Hawk2 shows an error message. If the ticket has been successfully granted or revoked, Hawk2 will update the ticket Status.

Procedure 8.2: Simulating granting and revoking tickets

Hawk2's Batch Mode allows you to explore failure scenarios before they happen. To explore whether your resources that depend on a certain ticket behave as expected, you can also test the impact of granting or revoking tickets.

  1. Start a Web browser and log in to Hawk2.

  2. From the top-level row, select Batch Mode.

  3. In the batch mode bar, click Show to open the Batch Mode window.

  4. To simulate a status change of a ticket:

    1. Click Inject › Ticket Event.

    2. Select the Ticket you want to manipulate and select the Action you want to simulate.

    3. Confirm your changes. Your event is added to the queue of events listed in the Batch Mode dialog. Any event listed here is simulated immediately and is reflected on the Status screen.

    4. Close the Batch Mode dialog and review the simulated changes.

  5. To leave the batch mode, either Apply or Discard the simulated changes.

Figure 8.2: Hawk2 simulator—tickets

For more information about Hawk2's Batch Mode (and which other scenarios can be explored with it), refer to Section 5.4.7, “Using the batch mode”.