Applies to SUSE Cloud Application Platform 1.5.2

7 SUSE Cloud Application Platform High Availability

7.1 Configuring Cloud Application Platform for High Availability

There are two ways to make your SUSE Cloud Application Platform deployment highly available. The simplest method is to set the config.HA parameter in your deployment configuration file to true. The second method is to create custom configuration files with your own sizing values.

7.1.1 Prerequisites

  • A running deployment of SUSE Cloud Application Platform in which the database roles, mysql, are at single availability for both uaa and scf. Starting with single availability mysql roles and then transitioning to high availability mysql roles allows other resources that depend on the database to come up properly. Initial deployments of SUSE Cloud Application Platform should not run the mysql roles in high availability mode, as illustrated below.
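
As an illustration, the following sizing values keep the database at single availability for an initial deployment. This is a minimal sketch only; single availability is already the behavior you get without any sizing overrides, so pinning the counts explicitly is optional.

sizing:
  mysql:
    count: 1
  mysql_proxy:
    count: 1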

7.1.2 Finding Default and Allowable Sizing Values

The sizing: section in the Helm values.yaml files for each chart describes which roles can be scaled, and the scaling options for each role. You may use helm inspect to read the sizing: section in the Helm charts:

tux > helm inspect suse/uaa | less +/sizing:
tux > helm inspect suse/cf | less +/sizing:

Another way is to use Perl to extract the information for each role from the sizing: section. The following example is for the uaa namespace.

tux > helm inspect values suse/uaa | \
perl -ne '/^sizing/..0 and do { print $.,":",$_ if /^ [a-z]/ || /high avail|scale|count/ }'
199:    # The mysql instance group can scale between 1 and 7 instances.
200:    # For high availability it needs at least 3 instances.
201:    count: ~
226:    # The mysql_proxy instance group can scale between 1 and 5 instances.
227:    # For high availability it needs at least 2 instances.
228:    count: ~
247:    # The secret_generation instance group cannot be scaled.
248:    count: ~
276:  #   for managing user accounts and for registering OAuth2 clients, as well as
282:    # The uaa instance group can scale between 1 and 65535 instances.
283:    # For high availability it needs at least 2 instances.
284:    count: ~

The default values.yaml files are also included in this guide at Section A.2, “Complete suse/uaa values.yaml File” and Section A.3, “Complete suse/scf values.yaml File”.

7.1.3 Simple High Availability Configuration

Important
Important: Always install to a fresh namespace

If you are not creating a fresh SUSE Cloud Application Platform installation, but have deleted a previous deployment and are starting over, you must create new namespaces. Do not re-use your old namespaces. The helm delete command does not remove generated secrets from the scf and uaa namespaces as it is not aware of them. These leftover secrets may cause deployment failures. See Section 31.4, “Deleting and Rebuilding a Deployment” for more information.

The simplest way to make your SUSE Cloud Application Platform deployment highly available is to set config.HA to true. This changes the size of all roles to the minimum required for a highly available deployment. In your deployment configuration file, scf-config-values.yaml, include the following.

config:
  # Flag to activate high-availability mode
  HA: true

Use helm install to deploy or helm upgrade to apply the change to an existing deployment.

tux > helm install suse/uaa \
--name susecf-uaa \
--namespace uaa \
--values scf-config-values.yaml
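
To apply the change to an existing deployment, use helm upgrade instead. A minimal sketch, assuming the release name susecf-uaa used in this guide (the susecf-scf release is upgraded the same way):

tux > helm upgrade susecf-uaa suse/uaa \
--values scf-config-values.yaml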

When config.HA is set to true, instance groups can still be given sizing values below the minimum required for High Availability mode. To do so, set the config.HA_strict flag to false and specify the count for the given instance group in the sizing: section. For example, the following configuration enables High Availability mode while using only one instance of the mysql instance group.

config:
  # Flag to activate high-availability mode
  HA: true
  HA_strict: false
sizing:
  mysql:
    count: 1

7.1.4 Example Custom High Availability Configurations

The following two example High Availability configuration files are for the uaa and scf namespaces. The example values are not meant to be copied, as these depend on your particular deployment and requirements.

Note
Note: Embedded uaa in scf

The User Account and Authentication (uaa) Server is included as an optional feature of the scf Helm chart. This simplifies the Cloud Application Platform deployment process, as a separate installation or upgrade of uaa is no longer a prerequisite to installing or upgrading scf.

It is important to note that:

  • This feature should only be used when uaa is not shared with other projects.

  • You cannot migrate from an existing external uaa to an embedded one. In this situation, enabling this feature during an upgrade will result in a single admin account.

To enable this feature, add the following to your scf-config-values.yaml.

enable:
  uaa: true

When deploying or upgrading scf, run helm install or helm upgrade as usual (see the example after this list) and note that:

  • Installing or upgrading uaa separately, using helm install suse/uaa ... or helm upgrade, is no longer required.

  • It is no longer necessary to set the UAA_CA_CERT parameter. Previously, this parameter was passed the CA_CERT variable, which was assigned the CA certificate of uaa.
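
For example, with the embedded uaa enabled in scf-config-values.yaml, a single command deploys both. This is a sketch using the release and namespace names from this guide:

tux > helm install suse/cf \
--name susecf-scf \
--namespace scf \
--values scf-config-values.yaml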

The first example is for the uaa namespace, uaa-sizing.yaml. The values specified are the minimum required for a High Availability deployment, equivalent to setting config.HA to true.

sizing:
  mysql:
    count: 3
  mysql_proxy:
    count: 2
  uaa:
    count: 2

The second example is for scf, scf-sizing.yaml. The values specified are the minimum required for a High Availability deployment, equivalent to setting config.HA to true.

sizing:
  adapter:
    count: 2
  api_group:
    count: 2
  cc_clock:
    count: 2
  cc_uploader:
    count: 2
  cc_worker:
    count: 2
  cf_usb_group:
    count: 2
  diego_api:
    count: 2
  diego_brain:
    count: 2
  diego_cell:
    count: 3
  diego_ssh:
    count: 2
  doppler:
    count: 2
  locket:
    count: 2
  log_api:
    count: 2
  log_cache_scheduler:
    count: 2
  loggregator_agent:
    count: 2
  mysql:
    count: 3
  mysql_proxy:
    count: 2
  nats:
    count: 2
  nfs_broker:
    count: 2
  router:
    count: 2
  routing_api:
    count: 2
  syslog_scheduler:
    count: 2
  tcp_router:
    count: 2

When using custom sizing configurations, take note that the mysql instance group, for both uaa and scf, must have an odd number of instances.

Important
Important: Always install to a fresh namespace

If you are not creating a fresh SUSE Cloud Application Platform installation, but have deleted a previous deployment and are starting over, you must create new namespaces. Do not re-use your old namespaces. The helm delete command does not remove generated secrets from the scf and uaa namespaces as it is not aware of them. These leftover secrets may cause deployment failures. See Section 31.4, “Deleting and Rebuilding a Deployment” for more information.

After creating your configuration files, follow the steps in Section 5.5, “Deployment Configuration” until you get to Section 5.7.1, “Deploy uaa”. Then deploy uaa with this command:

tux > helm install suse/uaa \
--name susecf-uaa \
--namespace uaa \
--values scf-config-values.yaml \
--values uaa-sizing.yaml

Wait until you have a successful uaa deployment before going to the next steps, which you can monitor with the watch command:

tux > watch --color 'kubectl get pods --namespace uaa'

When uaa is successfully deployed, the following is observed:

  • For the secret-generation pod, the STATUS is Completed and the READY column is at 0/1.

  • All other pods have a Running STATUS and a READY value of n/n.

Press Ctrl-C to exit the watch command.

When the uaa deployment completes, deploy SCF with these commands:

Note
Note: Setting UAA_CA_CERT

Starting with SUSE Cloud Application Platform 1.5.2, you no longer need to set UAA_CA_CERT when using an external UAA with a certificate signed by a well known Certificate Authority (CA). It is only needed when you use an external UAA with either a certificate generated by the secret-generator or a self-signed certificate.

If you need to set UAA_CA_CERT:

  1. Obtain your UAA secret and certificate:

    tux > SECRET=$(kubectl get pods --namespace uaa \
    --output jsonpath='{.items[?(.metadata.name=="uaa-0")].spec.containers[?(.name=="uaa")].env[?(.name=="INTERNAL_CA_CERT")].valueFrom.secretKeyRef.name}')
    
    tux > CA_CERT="$(kubectl get secret $SECRET --namespace uaa \
    --output jsonpath="{.data['internal-ca-cert']}" | base64 --decode -)"
  2. Then pass --set "secrets.UAA_CA_CERT=${CA_CERT}" as part of your helm command.
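
For example, the scf deployment command shown below would then include the flag. This is a sketch, assuming CA_CERT was exported as described above:

tux > helm install suse/cf \
--name susecf-scf \
--namespace scf \
--values scf-config-values.yaml \
--values scf-sizing.yaml \
--set "secrets.UAA_CA_CERT=${CA_CERT}"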

tux > helm install suse/cf \
--name susecf-scf \
--namespace scf \
--values scf-config-values.yaml \
--values scf-sizing.yaml
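
As with uaa, you can monitor the progress of the scf deployment with the watch command:

tux > watch --color 'kubectl get pods --namespace scf'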

7.2 Availability Zones

Availability Zones (AZ) are logical arrangements of compute nodes within a region that provide isolation from each other. A deployment that is distributed across multiple AZs can use this separation to increase resiliency against downtime in the event a given zone experiences issues.

Refer to the following for platform-specific information about availability zones:

7.2.1 Availability Zone Information Handling

In Cloud Application Platform, availability zone handling is done using the AZ_LABEL_NAME Helm chart value. By default, AZ_LABEL_NAME is set to failure-domain.beta.kubernetes.io/zone, which is the predefined Kubernetes label for availability zones. On most public cloud providers, nodes will already have this label set and availability zone support will work without further configuration. For on-premise installations, it is recommended that nodes are labeled with the same label.
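
If your nodes use a different zone label, AZ_LABEL_NAME can be overridden in your deployment configuration file. The following is a minimal sketch that assumes the value is set in the env: section of scf-config-values.yaml; verify the exact location against your chart's values:

env:
  # Hypothetical override; the default is the standard Kubernetes zone label.
  AZ_LABEL_NAME: "failure-domain.beta.kubernetes.io/zone"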

Run the following to see the labels on your nodes.

tux > kubectl get nodes --show-labels

To label a node, use the kubectl label nodes command. For example:

tux > kubectl label nodes cap-worker-1 failure-domain.beta.kubernetes.io/zone=zone-1

To see which node and availability zone a given diego-cell pod is assigned to, refer to the following example:

tux > kubectl logs diego-cell-0 --namespace scf | grep ^AZ

For more information on the failure-domain.beta.kubernetes.io/zone label, see https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#failure-domainbetakubernetesiozone.

Note that due to a bug in Cloud Application Platform 1.4 and earlier, this label did not work with AZ_LABEL_NAME.

Important
Important: Performance with Availability Zones

For the best performance, all availability zones should have a similar number of nodes, because app instances are distributed evenly across zones so that each zone runs about the same number of instances.
