8 Managing Geo clusters
Before booth can manage a ticket within the Geo cluster, you must first grant it to a site manually, either with the booth command line client or with Hawk2.
8.1 Managing tickets from command line
Use the booth command line tool to grant, list, or revoke tickets as described in Section 8.1.1, “Overview of booth commands”.
crm_ticket and crm site ticket
If the booth service is not running for any reason, you can also manage tickets fully manually with crm_ticket or crm site ticket. Both commands are only available on cluster nodes. Use them with great care, as they cannot verify whether the same ticket is already granted elsewhere. For more information, read the man pages.
As long as booth is up and running, use only the booth command for manual intervention.
8.1.1 Overview of booth commands
The booth commands can be run on any machine in the cluster, not only on those running boothd. The booth commands try to find the “local” cluster by looking at the booth configuration file and the locally defined IP addresses. If you do not specify a site for booth to connect to (using the -s option), it always connects to the local site.
- Listing all tickets
  # booth list
  ticket: ticketA, leader: none
  ticket: ticketB, leader: 10.2.12.101, expires: 2014-08-13 10:28:57
  If you do not specify a certain site with -s, the information about the tickets will be requested from the local booth instance.
- Granting a ticket to a site
  # booth grant -s 192.168.201.100 ticketA
  booth[27891]: 2014/08/13_10:21:23 info: grant request sent, waiting for the result ...
  booth[27891]: 2014/08/13_10:21:23 info: grant succeeded!
  In this case, ticketA will be granted to the site 192.168.201.100. Without the -s option, booth would automatically connect to the current site (the site you are running the booth client on) and would request the grant operation.
  Before granting a ticket, the command executes a sanity check. If the same ticket is already granted to another site, you are warned about that and are prompted to revoke the ticket from the current site first.
- Revoking a ticket from a site
  # booth revoke ticketA
  booth[27900]: 2014/08/13_10:21:23 info: revoke succeeded!
  Booth checks to which site the ticket is currently granted and requests the revoke operation for ticketA. The revoke operation will be executed immediately.
  The grant and, under certain circumstances, revoke operations may take a while to return a definite outcome. The client waits for the result up to the ticket's timeout value before it gives up waiting. If the -w option was used, the client will wait indefinitely instead. Find the exact status in the log files or with the crm_ticket -L command.
- Forcing a grant operation
  # booth grant -F ticketA
  The result of this command depends on whether you use automatic or manual tickets.
  Automatic tickets. As long as booth can make sure a ticket is granted to one site, you cannot grant the same ticket to another site, not even by using the -F option. However, in case of a split brain situation, booth might not be able to check if an automatic ticket is granted somewhere else. In that case, the Geo cluster administrator can override the automatic process and manually grant the ticket to the site that is still up and running. In this situation, the -F option tells booth not to wait for a response from other, unreachable sites (so ignoring the parameters expire and acquire-after, if defined for this ticket). Instead, booth will immediately grant the ticket to the specified site.
  Manual tickets. When using manual tickets, booth grant -F makes booth grant the ticket immediately to the specified site.
  Warning: Potential loss of data
  Before using booth grant -F, make sure that no other site (which is online) owns the same ticket. If the same ticket is granted to multiple sites, resources depending on the ticket might start on several sites in parallel. This results in concurrency violation and potential data corruption.
  As Geo cluster administrator, you need to resolve a conflict between tickets when the other site is reachable again.
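Before forcing a grant, it can help to check in a script which site booth currently reports as the leader of a ticket. The following is a minimal sketch, assuming the output keeps the "ticket: NAME, leader: VALUE" layout shown in the booth list example above. The function name ticket_leader is hypothetical and not part of booth.

```shell
#!/bin/sh
# Hypothetical helper (not part of booth): print the leader of one ticket.
# Reads `booth list`-style output on stdin and assumes the layout
# "ticket: NAME, leader: VALUE[, expires: ...]" from the example above.
ticket_leader() {
    awk -v want="$1" -F', *' '
        {
            name = ""; leader = ""
            for (i = 1; i <= NF; i++) {
                field = $i
                sub(/^[ \t]+/, "", field)          # tolerate leading blanks
                if (field ~ /^ticket: /) { name = substr(field, 9) }
                if (field ~ /^leader: /) { leader = substr(field, 9) }
            }
            if (name == want) { print leader; found = 1 }
        }
        # Exit status 1 signals that the ticket was not listed at all.
        END { exit(found ? 0 : 1) }
    '
}
```

A possible use is `booth list | ticket_leader ticketB`, which for the example output above prints 10.2.12.101. A result of "none" means booth does not report a leader for the ticket; it does not prove that no unreachable site still holds it.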
In the following sections, find some examples for managing tickets in different scenarios.
8.1.2 Manually moving an automatic ticket
Assuming that you want to manually move ticketA from site amsterdam (with the virtual IP 192.168.201.100) to site berlin (with the virtual IP 192.168.202.100), proceed as follows:
1. Log in to amsterdam.
2. Set ticketA to standby:
   # crm_ticket -t ticketA -s
3. Wait for any resources that depend on ticketA to be stopped or demoted cleanly.
4. Revoke ticketA from site amsterdam:
   # booth revoke -s 192.168.201.100 ticketA
5. After the ticket has been revoked from its original site, grant ticketA to the site berlin:
   # booth grant -s 192.168.202.100 ticketA
   This enables the resources which depend on this ticket to start on site berlin.
6. Remove the standby mode for ticketA on site amsterdam:
   # crm_ticket -t ticketA -a
   In case berlin fails, resources depending on ticketA will automatically fail over to site amsterdam.
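The sequence above can be captured in a small wrapper so that the order of the commands is not mixed up. This is only a sketch under stated assumptions: it is meant to run as root on a node of the site that currently holds the ticket, and the names move_ticket, run, and DRY_RUN are hypothetical. Only the crm_ticket and booth invocations are taken from the procedure. The wait for dependent resources stays a manual checkpoint, because the script cannot judge whether they stopped cleanly.

```shell
#!/bin/sh
# Sketch only: replays the manual move of a ticket between two sites.
# move_ticket, run and DRY_RUN are hypothetical names, not booth features.

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: move_ticket TICKET OLD_SITE_IP NEW_SITE_IP
move_ticket() {
    ticket=$1; old_site=$2; new_site=$3

    # Put the ticket into standby on the current site.
    run crm_ticket -t "$ticket" -s || return 1

    # Manual checkpoint: confirm that dependent resources have been
    # stopped or demoted before the ticket is revoked.
    if [ "${DRY_RUN:-0}" != "1" ]; then
        printf 'Press Enter once resources depending on %s are stopped: ' "$ticket"
        read -r answer
    fi

    # Revoke from the old site, then grant to the new one.
    run booth revoke -s "$old_site" "$ticket" || return 1
    run booth grant -s "$new_site" "$ticket" || return 1

    # Leave standby mode on the old site again.
    run crm_ticket -t "$ticket" -a
}
```

Running `DRY_RUN=1; move_ticket ticketA 192.168.201.100 192.168.202.100` prints the four commands in order without touching the cluster, which is a cheap way to review them first.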
8.1.3 Moving a manual ticket
Assuming that you want to move the manual ticket ticket-nfs from site amsterdam (with the virtual IP 192.168.201.100) to site berlin (with the virtual IP 192.168.202.100), proceed as follows:
1. Log in to amsterdam.
2. Set ticket-nfs to standby:
   # crm_ticket -t ticket-nfs -s
3. Wait for any resources that depend on ticket-nfs to be stopped or demoted cleanly.
4. Revoke ticket-nfs from site amsterdam:
   # booth revoke -s 192.168.201.100 ticket-nfs
5. After the ticket has been revoked from its original site, grant ticket-nfs to the site berlin:
   # booth grant -s 192.168.202.100 ticket-nfs
   This enables the resources which depend on this ticket to start on site berlin.
6. If you want to move the resources back to site amsterdam at any point in time, remove the standby mode for ticket-nfs on site amsterdam:
   # crm_ticket -t ticket-nfs -a
8.1.4 Failing over a manual ticket
Let us assume that the (manually managed) ticket ticket-nfs has been granted to site amsterdam (with the virtual IP 192.168.201.100). This site cannot be reached at the moment. Site berlin (with the virtual IP 192.168.202.100) is still available.
1. Try to contact a local administrator on site amsterdam and check if the site is down.
   a. If yes, proceed with Step 2.
   b. If amsterdam cannot be reached because of a connectivity problem, but the nodes are still running, ask the local cluster administrator to put ticket-nfs into standby mode on site amsterdam:
      # crm_ticket -t ticket-nfs -s
      This will relinquish the resources which depend on ticket-nfs. Now the ticket can safely be granted to the other site.
2. Log in to berlin.
3. Grant ticket-nfs to site berlin using the -F option:
   # booth grant -F ticket-nfs
   You will see a warning that the same ticket might be granted to another site, but the command will be executed.
4. Check the result with:
   # booth list
   It should show berlin as ticket owner for ticket-nfs now. All resources that depend on this ticket will be started on berlin.
5. Before trying to bring back amsterdam into the Geo cluster again, make sure to revoke ticket-nfs on amsterdam:
   # booth revoke -s 192.168.201.100 ticket-nfs
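The check with booth list after the forced grant can also be repeated automatically until the new owner shows up. The following is a rough sketch, assuming the "ticket: NAME, leader: VALUE" output layout from Section 8.1.1 and assuming the leader is reported as the site's IP address, as in that example. The names wait_for_leader and BOOTH_LIST_CMD are hypothetical.

```shell
#!/bin/sh
# Sketch only: poll until `booth list` reports the expected leader.
# wait_for_leader and BOOTH_LIST_CMD are hypothetical names.

# Usage: wait_for_leader TICKET EXPECTED_LEADER [TRIES] [DELAY_SECONDS]
wait_for_leader() {
    ticket=$1; expected=$2; tries=${3:-10}; delay=${4:-2}
    while [ "$tries" -gt 0 ]; do
        # BOOTH_LIST_CMD lets you substitute canned output for testing;
        # by default the real `booth list` is called.
        if ${BOOTH_LIST_CMD:-booth list} 2>/dev/null |
            awk -v t="$ticket" -v l="$expected" -F', *' '
                $1 == ("ticket: " t) && $2 == ("leader: " l) { ok = 1 }
                END { exit(ok ? 0 : 1) }
            '; then
            return 0
        fi
        tries=$((tries - 1))
        # Pause between attempts, but not after the last one.
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}
```

For the scenario above, `wait_for_leader ticket-nfs 192.168.202.100` returns 0 as soon as berlin is listed as the owner and 1 if that does not happen within the default ten attempts.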
8.2 Managing tickets with Hawk2
Tickets can be viewed in two Hawk2 views. Hawk2 displays the status of each ticket.
Though you can view tickets for all sites with Hawk2, any grant or revoke operations triggered by Hawk2 only apply to the current site (the one you are currently connected to with Hawk2). To grant a ticket to another site of your Geo cluster, start Hawk2 on one of the cluster nodes belonging to the respective site.
You can only grant tickets that are not already given to any site.
1. Start a Web browser and log in to Hawk2.
2. In the left navigation bar, select the view that lists the cluster nodes and resources.
   Along with information about cluster nodes and resources, Hawk2 also displays a category for tickets. It lists the ticket status, the ticket name, and when the ticket was last granted. From one of its columns you can manage the tickets.
3. To show further information about the ticket, along with information about the cluster sites and arbitrators, click the icon next to the ticket.
   Figure 8.1: Hawk2 ticket details
4. To revoke a granted ticket from the current site or to grant a ticket to the current site, click the switch in the column next to the ticket. On clicking, it shows the available action. Confirm your choice when Hawk2 prompts for a confirmation.
   If the ticket cannot be granted or revoked for any reason, Hawk2 shows an error message. If the ticket has been successfully granted or revoked, Hawk2 will update the ticket status.
Hawk2's batch mode allows you to explore failure scenarios before they happen. To explore whether your resources that depend on a certain ticket behave as expected, you can also test the impact of granting or revoking tickets.
1. Start a Web browser and log in to Hawk2.
2. From the top-level row, select the batch mode.
3. In the batch mode bar, open the batch mode window.
4. To simulate a status change of a ticket:
   a. Select the option for adding a ticket event.
   b. Select the ticket you want to manipulate and select the status change you want to simulate.
   c. Confirm your changes. Your event is added to the queue of events listed in the batch mode dialog. Any event listed here is simulated immediately and is reflected on the screen.
   d. Close the batch mode dialog and review the simulated changes.
5. To leave the batch mode, either apply or discard the simulated changes.
For more information about Hawk2's batch mode (and which other scenarios can be explored with it), refer to Section 7.9, “Using the batch mode”.