SUSE OpenStack Cloud 9

Operations Guide CLM

Abstract

At the time of the SUSE OpenStack Cloud 9 release, this guide contains information pertaining to the operation, administration, and user functions of SUSE OpenStack Cloud. The audience is the admin-level operator of the cloud.

Publication Date: 08/08/2022
1 Operations Overview
1.1 What is a cloud operator?
1.2 Tools provided to operate your cloud
1.3 Daily tasks
1.4 Weekly or monthly tasks
1.5 Semi-annual tasks
1.6 Troubleshooting
1.7 Common Questions
2 Tutorials
2.1 SUSE OpenStack Cloud Quickstart Guide
2.2 Installing the Command-Line Clients
2.3 Cloud Admin Actions with the Command Line
2.4 Log Management and Integration
2.5 Integrating Your Logs with Splunk
2.6 Integrating SUSE OpenStack Cloud with an LDAP System
3 Cloud Lifecycle Manager Admin UI User Guide
3.1 Accessing the Admin UI
3.2 Admin UI Pages
3.3 Topology
3.4 Server Management
3.5 Server Replacement
4 Third-Party Integrations
4.1 Splunk Integration
4.2 Operations Bridge Integration
4.3 Monitoring Third-Party Components With Monasca
5 Managing Identity
5.1 The Identity Service
5.2 Supported Upstream Keystone Features
5.3 Understanding Domains, Projects, Users, Groups, and Roles
5.4 Identity Service Token Validation Example
5.5 Configuring the Identity Service
5.6 Retrieving the Admin Password
5.7 Changing Service Passwords
5.8 Reconfiguring the Identity service
5.9 Integrating LDAP with the Identity Service
5.10 keystone-to-keystone Federation
5.11 Configuring Web Single Sign-On
5.12 Identity Service Notes and Limitations
6 Managing Compute
6.1 Managing Compute Hosts using Aggregates and Scheduler Filters
6.2 Using Flavor Metadata to Specify CPU Model
6.3 Forcing CPU and RAM Overcommit Settings
6.4 Enabling the Nova Resize and Migrate Features
6.5 Enabling ESX Compute Instance(s) Resize Feature
6.6 GPU passthrough
6.7 Configuring the Image Service
7 Managing ESX
7.1 Networking for ESXi Hypervisor (OVSvApp)
7.2 Validating the neutron Installation
7.3 Removing a Cluster from the Compute Resource Pool
7.4 Removing an ESXi Host from a Cluster
7.5 Configuring Debug Logging
7.6 Making Scale Configuration Changes
7.7 Monitoring vCenter Clusters
7.8 Monitoring Integration with OVSvApp Appliance
8 Managing Block Storage
8.1 Managing Block Storage using Cinder
9 Managing Object Storage
9.1 Running the swift Dispersion Report
9.2 Gathering Swift Data
9.3 Gathering Swift Monitoring Metrics
9.4 Using the swift Command-line Client (CLI)
9.5 Managing swift Rings
9.6 Configuring your swift System to Allow Container Sync
10 Managing Networking
10.1 SUSE OpenStack Cloud Firewall
10.2 Using VPN as a Service (VPNaaS)
10.3 DNS Service Overview
10.4 Networking Service Overview
10.5 Creating a Highly Available Router
11 Managing the Dashboard
11.1 Configuring the Dashboard Service
11.2 Changing the Dashboard Timeout Value
11.3 Creating a Load Balancer with the Dashboard
12 Managing Orchestration
12.1 Configuring the Orchestration Service
12.2 Autoscaling using the Orchestration Service
12.3 Orchestration Service support for LBaaS v2
13 Managing Monitoring, Logging, and Usage Reporting
13.1 Monitoring
13.2 Centralized Logging Service
13.3 Metering Service (ceilometer) Overview
14 Managing Container as a Service (Magnum)
14.1 Deploying a Kubernetes Cluster on Fedora Atomic
14.2 Deploying a Kubernetes Cluster on CoreOS
14.3 Deploying a Docker Swarm Cluster on Fedora Atomic
14.4 Deploying an Apache Mesos Cluster on Ubuntu
14.5 Creating a Magnum Cluster with the Dashboard
15 System Maintenance
15.1 Planned System Maintenance
15.2 Unplanned System Maintenance
15.3 Cloud Lifecycle Manager Maintenance Update Procedure
15.4 Upgrading Cloud Lifecycle Manager 8 to Cloud Lifecycle Manager 9
15.5 Cloud Lifecycle Manager Program Temporary Fix (PTF) Deployment
15.6 Periodic OpenStack Maintenance Tasks
16 Operations Console
16.1 Using the Operations Console
16.2 Alarm Definition
16.3 Alarm Explorer
16.4 Compute Hosts
16.5 Compute Instances
16.6 Compute Summary
16.7 Logging
16.8 My Dashboard
16.9 Networking Alarm Summary
16.10 Central Dashboard
17 Backup and Restore
17.1 Manual Backup Overview
17.2 Enabling Backups to a Remote Server
17.3 Manual Backup and Restore Procedures
17.4 Full Disaster Recovery Test
18 Troubleshooting Issues
18.1 General Troubleshooting
18.2 Control Plane Troubleshooting
18.3 Troubleshooting Compute service
18.4 Network Service Troubleshooting
18.5 Troubleshooting the Image (glance) Service
18.6 Storage Troubleshooting
18.7 Monitoring, Logging, and Usage Reporting Troubleshooting
18.8 Orchestration Troubleshooting
18.9 Troubleshooting Tools
List of Figures
3.1 Cloud Lifecycle Manager Admin UI Login Page
3.2 Cloud Lifecycle Manager Admin UI Service Information
3.3 Cloud Lifecycle Manager Admin UI SUSE Cloud Package
3.4 Cloud Lifecycle Manager Admin UI SUSE Service Configuration
3.5 Cloud Lifecycle Manager Admin UI SUSE Service Configuration Editor
3.6 Cloud Lifecycle Manager Admin UI SUSE Service Configuration Update
3.7 Cloud Lifecycle Manager Admin UI SUSE Service Model
3.8 Cloud Lifecycle Manager Admin UI SUSE Service Model Editor
3.9 Cloud Lifecycle Manager Admin UI SUSE Service Model Confirmation
3.10 Cloud Lifecycle Manager Admin UI SUSE Service Model Update
3.11 Cloud Lifecycle Manager Admin UI Services Per Role
3.12 Cloud Lifecycle Manager Admin UI Server Summary
3.13 Server Details (1/2)
3.14 Server Details (2/2)
3.15 Control Plane Topology
3.16 Control Plane Topology - Availability Zones
3.17 Regions Topology
3.18 Services Topology
3.19 Service Details Topology
3.20 Networks Topology
3.21 Network Groups Topology
3.22 Server Groups Topology
3.23 Roles Topology
3.24 Add Server Overview
3.25 Manually Add Server
3.26 Manually Add Server
3.27 Add Server Settings options
3.28 Select Servers to Provision OS
3.29 Confirm Provision OS
3.30 OS Install Progress
3.31 OS Install Summary
3.32 Confirm Deploy Servers
3.33 Validate Server Changes
3.34 Prepare Servers
3.35 Deploy Servers
3.36 Deploy Summary
3.37 Activate Server
3.38 Activate Server Progress
3.39 Deactivate Server
3.40 Deactivate Server Confirmation
3.41 Deactivate Server Progress
3.42 Select Migration Target
3.43 Deactivate Migration Progress
3.44 Delete Server
3.45 Delete Server Confirmation
3.46 Unreachable Delete Confirmation
3.47 Delete Server Progress
3.48 Replace Server Menu
3.49 Replace Controller Form
3.50 Replace Controller Progress
3.51 Replace Compute Menu
3.52 Unreachable Compute Node Warning
3.53 Replace Compute Form
3.54 Install SLES on New Compute
3.55 Prepare Compute Server
3.56 Deploy New Compute Server
3.57 Host Aggregate Removal Warning
3.58 Migrate Instances from Existing Compute Server
3.59 Disable Existing Compute Server
3.60 Existing Server Shutdown Check
3.61 Existing Server Delete
3.62 Compute Replacement Summary
5.1 Keystone Authentication Flow
16.1 Compute Hosts
16.2 Compute Summary
List of Examples
5.1 k2kclient.py

Copyright © 2006–2022 SUSE LLC and contributors. All rights reserved.

Except where otherwise noted, this document is licensed under the Creative Commons Attribution 3.0 License: https://creativecommons.org/licenses/by/3.0/legalcode.

For SUSE trademarks, see https://www.suse.com/company/legal/. All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™, etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks.

All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors nor the translators shall be held liable for possible errors or the consequences thereof.

1 Operations Overview

A high-level overview of the processes related to operating a SUSE OpenStack Cloud 9 cloud.

1.1 What is a cloud operator?

When we talk about a cloud operator, it is important to understand the scope of the tasks and responsibilities we are referring to. SUSE OpenStack Cloud defines a cloud operator as the person or group of people who administer the cloud infrastructure, which includes:

  • Monitoring the cloud infrastructure, resolving issues as they arise.

  • Managing hardware resources, adding/removing hardware due to capacity needs.

  • Repairing and, if needed, recovering from any hardware issues.

  • Performing domain administration tasks, which involves creating and managing projects, users, and groups as well as setting and managing resource quotas.

1.2 Tools provided to operate your cloud

SUSE OpenStack Cloud provides the following tools which are available to operate your cloud:

Operations Console

Often referred to as the Ops Console, this console presents data about your cloud infrastructure in a web-based graphical user interface (GUI) so that you can verify your cloud is operating correctly. By logging on to the console, SUSE OpenStack Cloud administrators can manage data in the following ways:

  • Triage alarm notifications in the central dashboard

  • Monitor the environment by addressing high-priority alarms first

  • Manage compute nodes and easily use a form to create a new host

  • Refine the monitoring environment by creating new alarms to specify a combination of metrics, services, and hosts that match the triggers unique to an environment

  • Plan for future storage by tracking capacity over time to predict with some degree of reliability the amount of additional storage needed

Dashboard

Often referred to as horizon or the horizon dashboard, this console lets you manage resources at the domain and project level in a web-based graphical user interface (GUI). The following are some of the typical operational tasks that you may perform using the dashboard:

  • Creating and managing projects, users, and groups within your domain.

  • Assigning roles to users and groups to manage access to resources.

  • Setting and updating resource quotas for the projects.

For more details, see the following page: Section 5.3, “Understanding Domains, Projects, Users, Groups, and Roles”

Command-line interface (CLI)

The OpenStack community has created a unified client, called the openstackclient (OSC), which combines the available commands in the various service-specific clients into one tool. Some service-specific commands do not have OSC equivalents.

You will find processes defined in our documentation that use these command-line tools. We have also outlined a list of common cloud administration tasks that you can perform with them.
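
As a minimal illustration, after sourcing admin credentials (for example the ~/keystone.osrc file described later in this guide), a single openstack command replaces each of the older service-specific clients. The commands below are a sketch and assume the clients are installed on the node where you run them:

source ~/keystone.osrc
openstack server list     # replaces "nova list"
openstack network list    # replaces "neutron net-list"
openstack volume list     # replaces "cinder list"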

There are references throughout the SUSE OpenStack Cloud documentation to the HPE Smart Storage Administrator (HPE SSA) CLI. HPE-specific binaries that are not based on open source are distributed directly from and supported by HPE. To download and install the SSACLI utility, please refer to: https://support.hpe.com/hpsc/swd/public/detail?swItemId=MTX_3d16386b418a443388c18da82f

1.3 Daily tasks

  • Ensure your cloud is running correctly: SUSE OpenStack Cloud is deployed as a set of highly available services to minimize the impact of failures. That said, hardware and software systems can fail. Detecting failures early enables you to address issues before they affect the broader system. SUSE OpenStack Cloud provides a monitoring solution, based on OpenStack's monasca, which provides monitoring and metrics for all OpenStack components and much of the underlying system, including service status, performance metrics, and compute node and virtual machine status. Failures are exposed via the Operations Console and/or alarm notifications (a quick command-line alarm check is sketched after this list). When more detailed diagnostics are required, you can use the centralized logging system based on the Elasticsearch, Logstash, and Kibana (ELK) stack, which lets you search service logs for detailed information on behavior and errors.

  • Perform critical maintenance: To ensure your OpenStack installation is running correctly, provides the right access and functionality, and is secure, you should make ongoing adjustments to the environment. Examples of daily maintenance tasks include:

    • Add/remove projects and users. The frequency of this task depends on your policy.

    • Apply security patches (if released).

    • Run daily backups.
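
As referenced above, a quick command-line alarm check is a useful part of the daily routine. The following minimal sketch assumes the monasca command-line client is installed on the Cloud Lifecycle Manager and that you have sourced credentials with monitoring (monasca-user) access:

# List only the alarms that are currently firing
monasca alarm-list --state ALARM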

1.4 Weekly or monthly tasks

  • Do regular capacity planning: Your initial deployment will likely reflect your known near- to mid-term scale requirements, but at some point your needs will outgrow your initial deployment's capacity. You can expand SUSE OpenStack Cloud in a variety of ways, such as by adding compute and storage capacity.

To manage your cloud's capacity, begin by determining the load on the existing system. OpenStack is a set of relatively independent components and services, so there are multiple subsystems that can affect capacity. These include control plane nodes, compute nodes, object storage nodes, block storage nodes, and an image management system. At the most basic level, you should look at the CPU, RAM, I/O load, and disk space used relative to the amounts available. For compute nodes, you can also evaluate the allocation of resources to hosted virtual machines. This information can be viewed in the Operations Console. You can pull historical information from the monitoring service (OpenStack's monasca) by using its client or API. OpenStack also provides some ability to manage hosted resource utilization through per-project quotas. You can track this usage over time to get your growth trend so that you can project when you will need to add capacity.
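
As a hedged example of pulling such historical data from the command line rather than the Operations Console, the monasca client can list collected metrics and report measurements over a time window. The metric name, hostname dimension, and start time below are illustrative assumptions:

# List the metrics reported for one compute node
monasca metric-list --dimensions hostname=ardana-cp1-comp0001-mgmt
# Show CPU idle measurements for that node since an example start time (UTC)
monasca measurement-list cpu.idle_perc 2022-01-01T00:00:00Z --dimensions hostname=ardana-cp1-comp0001-mgmt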

1.5 Semi-annual tasks

  • Perform upgrades: OpenStack releases new versions on a six-month cycle. In general, SUSE OpenStack Cloud will release new major versions annually with minor versions and maintenance updates more often. Each new release consists of both new functionality and services, as well as bug fixes for existing functionality.

Note

If you are planning to upgrade, this is also an excellent time to evaluate your existing capabilities, especially in terms of capacity (see Capacity Planning above).

1.6 Troubleshooting

As part of managing your cloud, you should be ready to troubleshoot issues, as needed. The following are some common troubleshooting scenarios and solutions:

How do I determine if my cloud is operating correctly now?: SUSE OpenStack Cloud provides a monitoring solution based on OpenStack’s monasca service. This service provides monitoring and metrics for all OpenStack components, as well as much of the underlying system. By default, SUSE OpenStack Cloud comes with a set of alarms that provide coverage of the primary systems. In addition, you can define alarms based on threshold values for any metrics defined in the system. You can view alarm information in the Operations Console. You can also receive or deliver this information to others by configuring email or other mechanisms. Alarms provide information about whether a component failed and is affecting the system, and also what condition triggered the alarm.
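
For example, the following sketch defines a threshold-based alarm with the monasca client. The alarm name, metric, and threshold are assumptions chosen only for illustration:

# Create an alarm definition that fires when average user CPU exceeds 90 percent
monasca alarm-definition-create high-cpu-usage "avg(cpu.user_perc) > 90"
# Review the alarm definitions currently in place
monasca alarm-definition-list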

How do I troubleshoot and resolve performance issues for my cloud?: There are a variety of factors that can affect the performance of a cloud system, such as the following:

  • Health of the control plane

  • Health of the hosting compute node and virtualization layer

  • Resource allocation on the compute node

If your cloud users are experiencing performance issues on your cloud, use the following approach:

  1. View the compute summary page on the Operations Console to determine if any alarms have been triggered.

  2. Determine the hosting node of the virtual machine that is having issues (a command-line approach is sketched after this list).

  3. On the compute hosts page, view the status and resource utilization of the compute node to determine if it has errors or is over-allocated.

  4. On the compute instances page you can view the status of the VM along with its metrics.
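
The following is a rough command-line alternative to steps 2 through 4. It is a sketch that assumes admin credentials, a placeholder instance name (MY_INSTANCE), and an example compute node hostname:

source ~/keystone.osrc
# Find the compute node hosting the instance (an admin-only field) and its status
openstack server show MY_INSTANCE -c "OS-EXT-SRV-ATTR:hypervisor_hostname" -c status
# Review capacity and usage on the compute nodes
openstack hypervisor list
openstack hypervisor show ardana-cp1-comp0001-mgmt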

How do I troubleshoot and resolve availability issues for my cloud?: If your cloud users are experiencing availability issues, determine what your users are experiencing that indicates to them the cloud is down. For example, can they not access the Dashboard service (horizon) console or APIs, indicating a problem with the control plane? Or are they having trouble accessing resources? Console/API issues indicate a problem with the control plane; use the Operations Console to view the status of services to see if there is an issue. If the issue is with accessing a virtual machine, also search the consolidated logs available in the ELK stack for errors related to the virtual machine and its supporting networking.

1.7 Common Questions

What skills do my cloud administrators need?

Your administrators should be experienced Linux admins. They should have experience in application management, as well as experience with Ansible. Experience with Bash shell scripting and Python programming is a plus.

In addition, you will need skilled network engineering staff to administer the cloud network environment.

2 Tutorials

This section contains tutorials for common tasks for your SUSE OpenStack Cloud 9 cloud.

2.1 SUSE OpenStack Cloud Quickstart Guide

2.1.1 Introduction

This document provides simplified instructions for installing and setting up a SUSE OpenStack Cloud. Use this quickstart guide to build testing, demonstration, and lab-type environments rather than production installations. When you complete this quickstart process, you will have a fully functioning SUSE OpenStack Cloud demo environment.

Note

These simplified instructions are intended for testing or demonstration. Instructions for production installations are in the Book “Deployment Guide using Cloud Lifecycle Manager”.

2.1.2 Overview of components

The following are short descriptions of the components that SUSE OpenStack Cloud employs when installing and deploying your cloud.

Ansible.  Ansible is a powerful configuration management tool used by SUSE OpenStack Cloud to manage nearly all aspects of your cloud infrastructure. Most commands in this quickstart guide execute Ansible scripts, known as playbooks. You will run playbooks that install packages, edit configuration files, manage network settings, and take care of the general administration tasks required to get your cloud up and running.

Get more information on Ansible at https://www.ansible.com/.

Cobbler.  Cobbler is another third-party tool used by SUSE OpenStack Cloud to deploy operating systems across the physical servers that make up your cloud. Find more info at http://cobbler.github.io/.

Git.  Git is the version control system used to manage the configuration files that define your cloud. Any changes made to your cloud configuration files must be committed to the locally hosted git repository to take effect. Read more information on Git at https://git-scm.com/.

2.1.3 Preparation

Successfully deploying a SUSE OpenStack Cloud environment is a large endeavor, but it is not complicated. For a successful deployment, you must put a number of components in place before rolling out your cloud. Most importantly, a basic SUSE OpenStack Cloud requires the proper network infrastructure. Because SUSE OpenStack Cloud segregates the network traffic of many of its elements, if the necessary networks, routes, and firewall access rules are not in place, communication required for a successful deployment will not occur.

2.1.4 Getting Started

When your network infrastructure is in place, go ahead and set up the Cloud Lifecycle Manager. This is the server that will orchestrate the deployment of the rest of your cloud. It is also the server you will run most of your deployment and management commands on.

Set up the Cloud Lifecycle Manager

  1. Download the installation media

    Obtain a copy of the SUSE OpenStack Cloud installation media, and make sure that it is accessible by the server that you are installing it on. Your method of doing this may vary. For instance, some may choose to load the installation ISO on a USB drive and physically attach it to the server, while others may run the IPMI Remote Console and attach the ISO to a virtual disc drive.

  2. Install the operating system

    1. Boot your server, using the installation media as the boot source.

    2. Choose "install" from the list of options and choose your preferred keyboard layout, location, language, and other settings.

    3. Set the address, netmask, and gateway for the primary network interface.

    4. Create a root user account.

    Proceed with the OS installation. After the installation is complete and the server has rebooted into the new OS, log in with the user account you created.

  3. Configure the new server

    1. SSH to your new server, and set a valid DNS nameserver in the /etc/resolv.conf file.

    2. Set the environment variable LC_ALL:

      export LC_ALL=C

    You now have a server running SUSE Linux Enterprise Server (SLES). The next step is to configure this machine as a Cloud Lifecycle Manager.

  4. Configure the Cloud Lifecycle Manager

    The installation media you used to install the OS on the server also has the files that will configure your cloud. You need to mount this installation media on your new server in order to use these files.

    1. Using the URL that you obtained the SUSE OpenStack Cloud installation media from, run wget to download the ISO file to your server:

      wget INSTALLATION_ISO_URL
    2. Now mount the ISO in the /media/cdrom/ directory:

      sudo mount INSTALLATION_ISO /media/cdrom/
    3. Unpack the tar file found in the /media/cdrom/ardana/ directory where you just mounted the ISO:

      tar xvf /media/cdrom/ardana/ardana-x.x.x-x.tar
    4. Now you will install and configure all the components needed to turn this server into a Cloud Lifecycle Manager. Run the ardana-init.bash script from the uncompressed tar file:

      ~/ardana-x.x.x/ardana-init.bash

      The ardana-init.bash script prompts you to enter an optional SSH passphrase. This passphrase protects the RSA key used to SSH to the other cloud nodes. This is an optional passphrase, and you can skip it by pressing Enter at the prompt.

      The ardana-init.bash script automatically installs and configures everything needed to set up this server as the lifecycle manager for your cloud.

      When the script has finished running, you can proceed to the next step, editing your input files.

  5. Edit your input files

    Your SUSE OpenStack Cloud input files are where you define your cloud infrastructure and how it runs. The input files define options such as which servers are included in your cloud, the type of disks the servers use, and their network configuration. The input files also define which services your cloud will provide and use, the network architecture, and the storage backends for your cloud.

    There are several example configurations, which you can find on your Cloud Lifecycle Manager in the ~/openstack/examples/ directory.

    1. The simplest way to set up your cloud is to copy the contents of one of these example configurations to your ~/openstack/my_cloud/definition/ directory. You can then edit the copied files and define your cloud.

      cp -r ~/openstack/examples/CHOSEN_EXAMPLE/* ~/openstack/my_cloud/definition/
    2. Edit the files in your ~/openstack/my_cloud/definition/ directory to define your cloud.

  6. Commit your changes

    When you finish editing the necessary input files, stage them, and then commit the changes to the local Git repository:

    cd ~/openstack/ardana/ansible
    git add -A
    git commit -m "My commit message"
  7. Image your servers

    Now that you have finished editing your input files, you can deploy the configuration to the servers that will comprise your cloud.

    1. Image the servers. You will install the SLES operating system across all the servers in your cloud, using Ansible playbooks to trigger the process.

    2. The following playbook confirms that your servers are accessible over their IPMI ports, which is a prerequisite for the imaging process:

      ansible-playbook -i hosts/localhost bm-power-status.yml
    3. Now validate that your cloud configuration files have proper YAML syntax by running the config-processor-run.yml playbook:

      ansible-playbook -i hosts/localhost config-processor-run.yml

      If you receive an error when running the preceding playbook, one or more of your configuration files has an issue. Refer to the output of the Ansible playbook, and look for clues in the Ansible log file, found at ~/.ansible/ansible.log.

    4. The next step is to prepare your imaging system, Cobbler, to deploy operating systems to all your cloud nodes:

      ansible-playbook -i hosts/localhost cobbler-deploy.yml
    5. Now you can image your cloud nodes. You will use an Ansible playbook to trigger Cobbler to deploy operating systems to all the nodes you specified in your input files:

      ansible-playbook -i hosts/localhost bm-reimage.yml

      The bm-reimage.yml playbook performs the following operations:

      1. Powers down the servers.

      2. Sets the servers to boot from a network interface.

      3. Powers on the servers and performs a PXE OS installation.

      4. Waits for the servers to power themselves down as part of a successful OS installation. This can take some time.

      5. Sets the servers to boot from their local hard disks and powers on the servers.

      6. Waits for the SSH service to start on the servers and verifies that they have the expected host-key signature.

  8. Deploy your cloud

    Now that your servers are running the SLES operating system, it is time to configure them for the roles they will play in your new cloud.

    1. Prepare the Cloud Lifecycle Manager to deploy your cloud configuration to all the nodes:

      ansible-playbook -i hosts/localhost ready-deployment.yml

      NOTE: The preceding playbook creates a new directory, ~/scratch/ansible/next/ardana/ansible/, from which you will run many of the following commands.

    2. (Optional) If you are reusing servers or disks to run your cloud, you can wipe the disks of your newly imaged servers by running the wipe_disks.yml playbook:

      cd ~/scratch/ansible/next/ardana/ansible/
      ansible-playbook -i hosts/verb_hosts wipe_disks.yml

      The wipe_disks.yml playbook removes any existing data from the drives on your new servers. This can be helpful if you are reusing servers or disks. This action will not affect the OS partitions on the servers.

      Note

      The wipe_disks.yml playbook is only meant to be run on systems immediately after running bm-reimage.yml. If used for any other case, it may not wipe all of the expected partitions. For example, if site.yml fails, you cannot start fresh by running wipe_disks.yml. You must re-image the node with bm-reimage.yml first and then run wipe_disks.yml.

    3. Now it is time to deploy your cloud. Do this by running the site.yml playbook, which pushes the configuration you defined in the input files out to all the servers that will host your cloud.

      cd ~/scratch/ansible/next/ardana/ansible/
      ansible-playbook -i hosts/verb_hosts site.yml

      The site.yml playbook installs packages, starts services, configures network interface settings, sets iptables firewall rules, and more. Upon successful completion of this playbook, your SUSE OpenStack Cloud will be in place and in a running state. This playbook can take up to six hours to complete.

  9. SSH to your nodes

    Now that you have successfully run site.yml, your cloud will be up and running. You can verify connectivity to your nodes by connecting to each one by using SSH. You can find the IP addresses of your nodes by viewing the /etc/hosts file.

    For security reasons, you can only SSH to your nodes from the Cloud Lifecycle Manager. SSH connections from any machine other than the Cloud Lifecycle Manager will be refused by the nodes.

    From the Cloud Lifecycle Manager, SSH to your nodes:

    ssh <management IP address of node>

    Also note that SSH is limited to your cloud's management network. Each node has an address on the management network, and you can find this address by reading the /etc/hosts or server_info.yml file.

2.2 Installing the Command-Line Clients

During the installation, by default, the suite of OpenStack command-line tools is installed on the Cloud Lifecycle Manager and the control plane in your environment. You can learn more about these in the OpenStack documentation here: OpenStackClient.

If you wish to install the command-line interfaces on other nodes in your environment, there are two methods you can use to do so, which we describe below.

2.2.1 Installing the CLI tools using the input model

During the initial install phase of your cloud you can edit your input model to request that the command-line clients be installed on any of the node clusters in your environment. To do so, follow these steps:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit your control_plane.yml file. Full path:

    ~/openstack/my_cloud/definition/data/control_plane.yml
  3. In this file you will see a list of service-components to be installed on each of your clusters. These clusters will be divided per role, with your controller node cluster likely coming at the beginning. Here you will see a list of each of the clients that can be installed. These include:

    keystone-client
    glance-client
    cinder-client
    nova-client
    neutron-client
    swift-client
    heat-client
    openstack-client
    monasca-client
    barbican-client
    designate-client
  4. For each client you want to install, specify the name under the service-components section for the cluster you want to install it on.

    So, for example, if you would like to install the nova and neutron clients on your Compute node cluster, you can do so by adding the nova-client and neutron-client services, like this:

          resources:
            - name: compute
              resource-prefix: comp
              server-role: COMPUTE-ROLE
              allocation-policy: any
              min-count: 0
              service-components:
                - ntp-client
                - nova-compute
                - nova-compute-kvm
                - neutron-l3-agent
                - neutron-metadata-agent
                - neutron-openvswitch-agent
                - nova-client
                - neutron-client
    Note

    This example uses the entry-scale-kvm sample file. Your model may be different, so use this as a guide, but do not copy and paste the contents of this example into your input model.

  5. Commit your configuration to the local git repo, as follows:

    cd ~/openstack/ardana/ansible
    git add -A
    git commit -m "My config or other commit message"
  6. Continue with the rest of your installation.

2.2.2 Installing the CLI tools using Ansible

At any point after your initial installation you can install the command-line clients on any of the nodes in your environment. To do so, follow these steps:

  1. Log in to the Cloud Lifecycle Manager.

  2. Obtain the hostname for the nodes you want to install the clients on by looking in your hosts file:

    cat /etc/hosts
  3. Install the clients using this playbook, specifying your hostnames using commas:

    cd ~/scratch/ansible/next/ardana/ansible
    ansible-playbook -i hosts/verb_hosts -e "install_package=<client_name>" client-deploy.yml -e "install_hosts=<hostname>"

    So, for example, if you would like to install the novaclient on two of your Compute nodes with hostnames ardana-cp1-comp0001-mgmt and ardana-cp1-comp0002-mgmt, you can use this syntax:

    cd ~/scratch/ansible/next/ardana/ansible
    ansible-playbook -i hosts/verb_hosts -e "install_package=novaclient" client-deploy.yml -e "install_hosts=ardana-cp1-comp0001-mgmt,ardana-cp1-comp0002-mgmt"
  4. Once the playbook completes successfully, you should be able to SSH to those nodes and, using the proper credentials, authenticate and use the command-line interfaces you have installed.

2.3 Cloud Admin Actions with the Command Line

Cloud admins can use the command-line tools to perform domain admin tasks such as user and project administration.

2.3.1 Creating Additional Cloud Admins

You can create additional Cloud Admins to help with the administration of your cloud.

keystone identity service query and administration tasks can be performed using the OpenStack command-line utility, which is installed on the Cloud Lifecycle Manager as part of the cloud deployment.

Note

keystone administration tasks should be performed by an admin user with a token scoped to the default domain via the keystone v3 identity API. These settings are preconfigured in the file ~/keystone.osrc. By default, keystone.osrc is configured with the admin endpoint of keystone. If the admin endpoint is not accessible from your network, change OS_AUTH_URL to point to the public endpoint.
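
For example, a minimal sketch of switching to the public endpoint; the URL below is a placeholder for your environment's public keystone endpoint (typically served on port 5000):

# In ~/keystone.osrc, replace the admin endpoint with your public endpoint
export OS_AUTH_URL=https://PUBLIC_KEYSTONE_ENDPOINT:5000/v3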

2.3.2 Command Line Examples

For a full list of OpenStackClient commands, see OpenStackClient Command List.

Sourcing the keystone Administration Credentials

You can set the environment variables needed for identity administration by sourcing the keystone.osrc file created by the lifecycle manager:

source ~/keystone.osrc

List users in the default domain

These users are created by the Cloud Lifecycle Manager in the MySQL back end:

openstack user list

Example output:

$ openstack user list
+----------------------------------+------------------+
| ID                               | Name             |
+----------------------------------+------------------+
| 155b68eda9634725a1d32c5025b91919 | heat             |
| 303375d5e44d48f298685db7e6a4efce | octavia          |
| 40099e245a394e7f8bb2aa91243168ee | logging          |
| 452596adbf4d49a28cb3768d20a56e38 | admin            |
| 76971c3ad2274820ad5347d46d7560ec | designate        |
| 7b2dc0b5bb8e4ffb92fc338f3fa02bf3 | hlm_backup       |
| 86d345c960e34c9189519548fe13a594 | barbican         |
| 8e7027ab438c4920b5853d52f1e08a22 | nova_monasca     |
| 9c57dfff57e2400190ab04955e7d82a0 | barbican_service |
| a3f99bcc71b242a1bf79dbc9024eec77 | nova             |
| aeeb56fc4c4f40e0a6a938761f7b154a | glance-check     |
| af1ef292a8bb46d9a1167db4da48ac65 | cinder           |
| af3000158c6d4d3d9257462c9cc68dda | demo             |
| b41a7d0cb1264d949614dc66f6449870 | swift            |
| b78a2b17336b43368fb15fea5ed089e9 | cinderinternal   |
| bae1718dee2d47e6a75cd6196fb940bd | monasca          |
| d4b9b32f660943668c9f5963f1ff43f9 | ceilometer       |
| d7bef811fb7e4d8282f19fb3ee5089e9 | swift-monitor    |
| e22bbb2be91342fd9afa20baad4cd490 | neutron          |
| ec0ad2418a644e6b995d8af3eb5ff195 | glance           |
| ef16c37ec7a648338eaf53c029d6e904 | swift-dispersion |
| ef1a6daccb6f4694a27a1c41cc5e7a31 | glance-swift     |
| fed3a599b0864f5b80420c9e387b4901 | monasca-agent    |
+----------------------------------+------------------+

List domains created by the installation process:

openstack domain list

Example output:

$ openstack domain list
+----------------------------------+---------+---------+----------------------------------------------------------------------+
| ID                               | Name    | Enabled | Description                                                          |
+----------------------------------+---------+---------+----------------------------------------------------------------------+
| 6740dbf7465a4108a36d6476fc967dbd | heat    | True    | Owns users and projects created by heat                              |
| default                          | Default | True    | Owns users and tenants (i.e. projects) available on Identity API v2. |
+----------------------------------+---------+---------+----------------------------------------------------------------------+

List the roles:

openstack role list

Example output:

$ openstack role list
+----------------------------------+---------------------------+
| ID                               | Name                      |
+----------------------------------+---------------------------+
| 0be3da26cd3f4cd38d490b4f1a8b0c03 | designate_admin           |
| 13ce16e4e714473285824df8188ee7c0 | monasca-agent             |
| 160f25204add485890bc95a6065b9954 | key-manager:service-admin |
| 27755430b38c411c9ef07f1b78b5ebd7 | monitor                   |
| 2b8eb0a261344fbb8b6b3d5934745fe1 | key-manager:observer      |
| 345f1ec5ab3b4206a7bffdeb5318bd32 | admin                     |
| 49ba3b42696841cea5da8398d0a5d68e | nova_admin                |
| 5129400d4f934d4fbfc2c3dd608b41d9 | ResellerAdmin             |
| 60bc2c44f8c7460a9786232a444b56a5 | neutron_admin             |
| 654bf409c3c94aab8f929e9e82048612 | cinder_admin              |
| 854e542baa144240bfc761cdb5fe0c07 | monitoring-delegate       |
| 8946dbdfa3d346b2aa36fa5941b43643 | key-manager:auditor       |
| 901453d9a4934610ad0d56434d9276b4 | key-manager:admin         |
| 9bc90d1121544e60a39adbfe624a46bc | monasca-user              |
| 9fe2a84a3e7443ae868d1009d6ab4521 | service                   |
| 9fe2ff9ee4384b1894a90878d3e92bab | member                    |
| a24d4e0a5de14bffbe166bfd68b36e6a | swiftoperator             |
| ae088fcbf579425580ee4593bfa680e5 | heat_stack_user           |
| bfba56b2562942e5a2e09b7ed939f01b | keystoneAdmin             |
| c05f54cf4bb34c7cb3a4b2b46c2a448b | glance_admin              |
| fe010be5c57240db8f559e0114a380c1 | key-manager:creator       |
+----------------------------------+---------------------------+

List admin user role assignment within default domain:

openstack role assignment list --user admin --domain default

Example output:

# This indicates that the admin user is assigned the admin role within the default domain
ardana >  openstack role assignment list --user admin --domain default
+----------------------------------+----------------------------------+-------+---------+---------+
| Role                             | User                             | Group | Project | Domain  |
+----------------------------------+----------------------------------+-------+---------+---------+
| b398322103504546a070d607d02618ad | fed1c038d9e64392890b6b44c38f5bbb |       |         | default |
+----------------------------------+----------------------------------+-------+---------+---------+

Create a new user in default domain:

openstack user create --domain default --password-prompt --email <email_address> --description <description> --enable <username>

Example output showing the creation of a user named testuser with email address test@example.com and a description of Test User:

ardana >  openstack user create --domain default --password-prompt --email test@example.com --description "Test User" --enable testuser
User Password:
Repeat User Password:
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| description | Test User                        |
| domain_id   | default                          |
| email       | test@example.com                 |
| enabled     | True                             |
| id          | 8aad69acacf0457e9690abf8c557754b |
| name        | testuser                         |
+-------------+----------------------------------+

Assign admin role for testuser within the default domain:

openstack role add admin --user <username> --domain default
openstack role assignment list --user <username> --domain default

Example output:

# Just for demonstration purposes - do not do this in a production environment!
ardana >  openstack role add admin --user testuser --domain default
ardana >  openstack role assignment list --user testuser --domain default
+----------------------------------+----------------------------------+-------+---------+---------+
| Role                             | User                             | Group | Project | Domain  |
+----------------------------------+----------------------------------+-------+---------+---------+
| b398322103504546a070d607d02618ad | 8aad69acacf0457e9690abf8c557754b |       |         | default |
+----------------------------------+----------------------------------+-------+---------+---------+

2.3.3 Assigning the default service admin roles

The following examples illustrate how you can assign each of the new service admin roles to a user.

Assigning the glance_admin role

A user must have the role of admin in order to assign the glance_admin role. To assign the role, you will set the environment variables needed for the identity service administrator.

  1. First, source the identity service credentials:

    source ~/keystone.osrc
  2. You can add the glance_admin role to a user on a project with this command:

    openstack role add --user <username> --project <project_name> glance_admin

    Example, showing a user named testuser being granted the glance_admin role in the test_project project:

    openstack role add --user testuser --project test_project glance_admin
  3. You can confirm the role assignment by listing out the roles:

    openstack role assignment list --user <username>

    Example output:

    ardana >  openstack role assignment list --user testuser
    +----------------------------------+----------------------------------+-------+----------------------------------+--------+-----------+
    | Role                             | User                             | Group | Project                          | Domain | Inherited |
    +----------------------------------+----------------------------------+-------+----------------------------------+--------+-----------+
    | 46ba80078bc64853b051c964db918816 | 8bcfe10101964e0c8ebc4de391f3e345 |       | 0ebbf7640d7948d2a17ac08bbbf0ca5b |        | False     |
    +----------------------------------+----------------------------------+-------+----------------------------------+--------+-----------+
  4. Note that only the role ID is displayed. To get the role name, execute the following:

    openstack role show <role_id>

    Example output:

    ardana >  openstack role show 46ba80078bc64853b051c964db918816
    +-------+----------------------------------+
    | Field | Value                            |
    +-------+----------------------------------+
    | id    | 46ba80078bc64853b051c964db918816 |
    | name  | glance_admin                     |
    +-------+----------------------------------+
  5. To demonstrate that the user has glance admin privileges, authenticate with those user credentials and then upload and publish an image. Only a user with an admin role or glance_admin can publish an image.

    1. The easiest way to do this will be to make a copy of the service.osrc file and edit it with your user credentials. You can do that with this command:

      cp ~/service.osrc ~/user.osrc
    2. Using your preferred editor, edit the user.osrc file and replace the values for the following entries to match your user credentials:

      export OS_USERNAME=<username>
      export OS_PASSWORD=<password>
    3. You will also need to edit the following lines for your environment:

      ## Change these values from 'unset' to 'export'
      export OS_PROJECT_NAME=<project_name>
      export OS_PROJECT_DOMAIN_NAME=Default

      Here is an example of the edited file:

      unset OS_DOMAIN_NAME
      export OS_IDENTITY_API_VERSION=3
      export OS_AUTH_VERSION=3
      export OS_PROJECT_NAME=test_project
      export OS_PROJECT_DOMAIN_NAME=Default
      export OS_USERNAME=testuser
      export OS_USER_DOMAIN_NAME=Default
      export OS_PASSWORD=testuser
      export OS_AUTH_URL=http://192.168.245.9:35357/v3
      export OS_ENDPOINT_TYPE=internalURL
      # OpenstackClient uses OS_INTERFACE instead of OS_ENDPOINT
      export OS_INTERFACE=internal
      export OS_CACERT=/etc/ssl/certs/ca-certificates.crt
  6. Source the environment variables for your user:

    source ~/user.osrc
  7. Upload an image and publish it:

    openstack image create --name "upload me" --visibility public --container-format bare --disk-format qcow2 --file uploadme.txt

    Example output:

    +------------------+--------------------------------------+
    | Property         | Value                                |
    +------------------+--------------------------------------+
    | checksum         | dd75c3b840a16570088ef12f6415dd15     |
    | container_format | bare                                 |
    | created_at       | 2016-01-06T23:31:27Z                 |
    | disk_format      | qcow2                                |
    | id               | cf1490f4-1eb1-477c-92e8-15ebbe91da03 |
    | min_disk         | 0                                    |
    | min_ram          | 0                                    |
    | name             | upload me                            |
    | owner            | bd24897932074780a20b780c4dde34c7     |
    | protected        | False                                |
    | size             | 10                                   |
    | status           | active                               |
    | tags             | []                                   |
    | updated_at       | 2016-01-06T23:31:31Z                 |
    | virtual_size     | None                                 |
    | visibility       | public                               |
    +------------------+--------------------------------------+
    Note

    You can use the command openstack help image create to get the full syntax for this command.

Assigning the nova_admin role

A user must have the role of admin in order to assign the nova_admin role. To assign the role, you will set the environment variables needed for the identity service administrator.

  1. First, source the identity service credentials:

    source ~/keystone.osrc
  2. You can add the nova_admin role to a user on a project with this command:

    openstack role add --user <username> --project <project_name> nova_admin

    Example, showing a user named testuser being granted the nova_admin role in the test_project project:

    openstack role add --user testuser --project test_project nova_admin
  3. You can confirm the role assignment by listing out the roles:

    openstack role assignment list --user <username>

    Example output:

    ardana >  openstack role assignment list --user testuser
    +----------------------------------+----------------------------------+-------+----------------------------------+--------+-----------+
    | Role                             | User                             | Group | Project                          | Domain | Inherited |
    +----------------------------------+----------------------------------+-------+----------------------------------+--------+-----------+
    | 8cdb02bab38347f3b65753099f3ab73c | 8bcfe10101964e0c8ebc4de391f3e345 |       | 0ebbf7640d7948d2a17ac08bbbf0ca5b |        | False     |
    +----------------------------------+----------------------------------+-------+----------------------------------+--------+-----------+
  4. Note that only the role ID is displayed. To get the role name, execute the following:

    openstack role show <role_id>

    Example output:

    ardana >  openstack role show 8cdb02bab38347f3b65753099f3ab73c
    +-------+----------------------------------+
    | Field | Value                            |
    +-------+----------------------------------+
    | id    | 8cdb02bab38347f3b65753099f3ab73c |
    | name  | nova_admin                       |
    +-------+----------------------------------+
  5. To demonstrate that the user has nova admin privileges, authenticate with those user credentials and then list virtual machines across projects. Only a user with an admin role or nova_admin can view and manage virtual machines in other projects.

    1. The easiest way to do this will be to make a copy of the service.osrc file and edit it with your user credentials. You can do that with this command:

      cp ~/service.osrc ~/user.osrc
    2. Using your preferred editor, edit the user.osrc file and replace the values for the following entries to match your user credentials:

      export OS_USERNAME=<username>
      export OS_PASSWORD=<password>
    3. You will also need to edit the following lines for your environment:

      ## Change these values from 'unset' to 'export'
      export OS_PROJECT_NAME=<project_name>
      export OS_PROJECT_DOMAIN_NAME=Default

      Here is an example of the edited file:

      unset OS_DOMAIN_NAME
      export OS_IDENTITY_API_VERSION=3
      export OS_AUTH_VERSION=3
      export OS_PROJECT_NAME=test_project
      export OS_PROJECT_DOMAIN_NAME=Default
      export OS_USERNAME=testuser
      export OS_USER_DOMAIN_NAME=Default
      export OS_PASSWORD=testuser
      export OS_AUTH_URL=http://192.168.245.9:35357/v3
      export OS_ENDPOINT_TYPE=internalURL
      # OpenstackClient uses OS_INTERFACE instead of OS_ENDPOINT
      export OS_INTERFACE=internal
      export OS_CACERT=/etc/ssl/certs/ca-certificates.crt
  6. Source the environment variables for your user:

    source ~/user.osrc
  7. List all of the virtual machines in the project specified in user.osrc:

    openstack server list

    Example output showing no virtual machines, because there are no virtual machines created on the project specified in the user.osrc file:

    +--------------------------------------+-------------------------------------------------------+--------+-----------------------------------------------------------------+
    | ID                                   | Name                                                  | Status | Networks                                                        |
    +--------------------------------------+-------------------------------------------------------+--------+-----------------------------------------------------------------+
    +--------------------------------------+-------------------------------------------------------+--------+-----------------------------------------------------------------+
  8. For this demonstration, we do have a virtual machine associated with a different project, and because your user has nova_admin permissions, you can view those virtual machines using a slightly different command:

    openstack server list --all-projects

    Example output, now showing a virtual machine:

    ardana >  openstack server list --all-projects
    +--------------------------------------+-------------------------------------------------------+--------+-----------------------------------------------------------------+
    | ID                                   | Name                                                  | Status | Networks                                                        |
    +--------------------------------------+-------------------------------------------------------+--------+-----------------------------------------------------------------+
    | da4f46e2-4432-411b-82f7-71ab546f91f3 | testvml                                               | ACTIVE |                                                                 |
    +--------------------------------------+-------------------------------------------------------+--------+-----------------------------------------------------------------+
  9. You can also now delete virtual machines in other projects by using the --all-projects switch:

    openstack server delete --all-projects <instance_id>

    Example, showing us deleting the instance in the previous step:

    openstack server delete --all-projects da4f46e2-4432-411b-82f7-71ab546f91f3
  10. You can get a full list of available commands by using this:

    openstack -h

You can perform the same steps as above for the neutron and cinder service admin roles:

neutron_admin
cinder_admin

2.3.4 Customize policy.json on the Cloud Lifecycle Manager

One way to deploy policy.json for a service is to go to each of the target nodes and make changes there. This is no longer necessary: the process has been streamlined, and policy.json files can be edited on the Cloud Lifecycle Manager and then deployed to the nodes. Exercise caution when modifying policy.json files. It is best to validate the changes in a non-production environment before rolling policy.json changes into production. Do not make policy.json changes without a way to validate the desired policy behavior. Updated policy.json files can be deployed using the appropriate <service_name>-reconfigure.yml playbook.
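
As a hedged sketch of that workflow for the nova service in a test environment, following the approach described in Section 2.3.5 (the grep pattern is illustrative; locate the exact file for your service first):

# Locate the nova policy file under the scratch directory
find ~/scratch/ -name policy.json | grep -i nova
# After editing the file, deploy it with the service's reconfigure playbook
cd ~/scratch/ansible/next/ardana/ansible
ansible-playbook -i hosts/verb_hosts nova-reconfigure.yml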

2.3.5 Roles

Service roles represent the functionality used to implement the OpenStack role-based access control (RBAC) model, which is used to manage access to each OpenStack service. Roles are named and assigned per user or group for each project by the identity service. Role definition and policy enforcement are defined outside of the identity service, independently by each OpenStack service.

The token generated by the identity service for each user authentication contains the role(s) assigned to that user for a particular project. When a user attempts to access a specific OpenStack service, the role is parsed by the service, compared to the service-specific policy file, and then granted the resource access defined for that role by the service policy file.
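
For example, you can inspect which roles a user holds on a project, and therefore which rules in that service's policy file will apply to tokens scoped to that project. This minimal sketch reuses the placeholder user and project names from the earlier examples; the --names flag resolves IDs to readable names:

source ~/keystone.osrc
# Show role assignments with human-readable names instead of IDs
openstack role assignment list --user testuser --project test_project --names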

Each service has its own service policy file with the /etc/[SERVICE_CODENAME]/policy.json file name format where [SERVICE_CODENAME] represents a specific OpenStack service name. For example, the OpenStack nova service would have a policy file called /etc/nova/policy.json.

Service policy files can be modified and deployed to control nodes from the Cloud Lifecycle Manager. Administrators are advised to validate policy changes in a non-production environment before checking the changes into the site branch of the local git repository and rolling them into production. Do not make changes to policy files without having a way to validate them.

The policy files are located in the following site branch directory on the Cloud Lifecycle Manager:

~/openstack/ardana/ansible/roles/

For test and validation, policy files can be modified in a non-production environment from the ~/scratch/ directory. For a specific policy file, run a search for policy.json. To deploy policy changes for a service, run the service-specific reconfiguration playbook (for example, nova-reconfigure.yml). For a complete list of reconfiguration playbooks, change directories to ~/scratch/ansible/next/ardana/ansible and run this command:

ls -l | grep reconfigure
Note

Comments added to any *.j2 files (including templates) must follow proper comment syntax. Otherwise you may see errors when running the config-processor or any of the service playbooks.

2.4 Log Management and Integration

2.4.1 Overview

SUSE OpenStack Cloud uses the ELK (Elasticsearch, Logstash, Kibana) stack for log management across the entire cloud infrastructure. This configuration facilitates simple administration as well as integration with third-party tools. This tutorial covers how to forward your logs to a third-party tool or service, and how to access and search the Elasticsearch log stores through API endpoints.

2.4.2 The ELK stack

The ELK logging stack consists of the Elasticsearch, Logstash, and Kibana elements.

  • Elasticsearch.  Elasticsearch is the storage and indexing component of the ELK stack. It stores and indexes the data received from Logstash. Indexing makes your log data searchable by tools designed for querying and analyzing massive sets of data. You can query the Elasticsearch datasets from the built-in Kibana console, a third-party data analysis tool, or through the Elasticsearch API (covered later).

  • Logstash.  Logstash reads the log data from the services running on your servers, and then aggregates and ships that data to a storage location. By default, Logstash sends the data to the Elasticsearch indexes, but it can also be configured to send data to other storage and indexing tools such as Splunk.

  • Kibana.  Kibana provides a simple and easy-to-use method for searching, analyzing, and visualizing the log data stored in the Elasticsearch indexes. You can customize the Kibana console to provide graphs, charts, and other visualizations of your log data.

2.4.3 Using the Elasticsearch API

You can query the Elasticsearch indexes through various language-specific APIs, as well as directly over the IP address and port that Elasticsearch exposes on your implementation. By default, Elasticsearch listens on localhost, port 9200. You can run queries directly from a terminal using curl. For example:

ardana > curl -XGET 'http://localhost:9200/_search?q=tag:yourSearchTag'

The preceding command searches all indexes for all data with the "yourSearchTag" tag.

You can also use the Elasticsearch API from outside the logging node. This method connects over the Kibana VIP address, port 5601, using basic http authentication. For example, you can use the following command to perform the same search as the preceding search:

curl -u kibana:<password> kibana_vip:5601/_search?q=tag:yourSearchTag

You can further refine your search to a specific index of data, in this case the "elasticsearch" index:

ardana > curl -XGET 'http://localhost:9200/elasticsearch/_search?q=tag:yourSearchTag'

The search API is RESTful, so responses are provided in JSON format. Here's a sample (though empty) response:

{
    "took":13,
    "timed_out":false,
    "_shards":{
        "total":45,
        "successful":45,
        "failed":0
    },
    "hits":{
        "total":0,
        "max_score":null,
        "hits":[]
    }
}
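
The search endpoint also accepts standard query parameters such as size (the number of hits to return) and pretty (human-readable formatting). For example, the following request limits the previous index-specific search to five hits:

ardana > curl -XGET 'http://localhost:9200/elasticsearch/_search?q=tag:yourSearchTag&size=5&pretty'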

2.4.4 For More Information Edit source

You can find more detailed Elasticsearch API documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html.

Review the Elasticsearch Python API documentation at http://elasticsearch-py.readthedocs.io/en/master/api.html.

Read the Elasticsearch Java API documentation at https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/index.html.

2.4.5 Forwarding your logs Edit source

You can configure Logstash to ship your logs to an outside storage and indexing system, such as Splunk. Setting up this configuration is as simple as editing a few configuration files, and then running the Ansible playbooks that implement the changes. Here are the steps.

  1. Begin by logging in to the Cloud Lifecycle Manager.

  2. Verify that the logging system is up and running:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts logging-status.yml

    When the preceding playbook completes without error, proceed to the next step.

  3. Edit the Logstash configuration file, found at the following location:

    ~/openstack/ardana/ansible/roles/logging-server/templates/logstash.conf.j2

    Near the end of the Logstash configuration file, you will find a section for configuring Logstash output destinations. The following example demonstrates the changes necessary to forward your logs to an outside server: the added tcp block sets up a TCP connection to the destination server's IP address over port 5514.

    # Logstash outputs
        output {
          # Configure Elasticsearch output
          # http://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html
          elasticsearch {
            index => "%{[@metadata][es_index]}"
            hosts => ["{{ elasticsearch_http_host }}:{{ elasticsearch_http_port }}"]
            flush_size => {{ logstash_flush_size }}
            idle_flush_time => 5
            workers => {{ logstash_threads }}
          }
            # Forward Logs to Splunk on TCP port 5514 which matches the one specified in Splunk Web UI.
          tcp {
            mode => "client"
            host => "<Enter Destination listener IP address>"
            port => 5514
          }
        }

    Logstash can forward log data to multiple destinations, so there is no need to remove or alter the Elasticsearch section in the preceding file. However, if you choose to stop forwarding your log data to Elasticsearch, you can do so by removing the related section in this file, and then continue with the following steps.

  4. Commit your changes to the local git repository:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "Your commit message"
  5. Run the configuration processor to check the status of all configuration files:

    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  6. Run the ready-deployment playbook:

    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Implement the changes to the Logstash configuration file:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts kronos-server-configure.yml

Configuring the receiving service will vary from product to product. Consult the documentation for your particular product for instructions on how to set it up to receive log files from Logstash.
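
Before reconfiguring Logstash, it can be useful to confirm that the destination listener is reachable from a logging node. A minimal check, assuming netcat is installed and using the example port 5514 (replace the IP address with that of your listener):

ardana > nc -vz DESTINATION_LISTENER_IP 5514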

2.5 Integrating Your Logs with Splunk Edit source

2.5.1 Integrating with Splunk Edit source

The SUSE OpenStack Cloud 9 logging solution provides a flexible and extensible framework to centralize the collection and processing of logs from all nodes in your cloud. The logs are shipped to a highly available and fault-tolerant cluster where they are transformed and stored for better searching and reporting. The SUSE OpenStack Cloud 9 logging solution uses the ELK stack (Elasticsearch, Logstash and Kibana) as a production-grade implementation and can support other storage and indexing technologies.

You can configure Logstash, the service that aggregates and forwards the logs to a searchable index, to send the logs to a third-party target, such as Splunk.

For how to integrate the SUSE OpenStack Cloud 9 centralized logging solution with Splunk, including the steps to set up and forward logs, please refer to Section 4.1, “Splunk Integration”.

2.6 Integrating SUSE OpenStack Cloud with an LDAP System Edit source

You can configure your SUSE OpenStack Cloud cloud to work with an outside user authentication source such as Active Directory or OpenLDAP. keystone, the SUSE OpenStack Cloud identity service, functions as the first stop for any user authorization/authentication requests. keystone can also function as a proxy for user account authentication, passing along authentication and authorization requests to any LDAP-enabled system that has been configured as an outside source. This type of integration lets you use an existing user-management system such as Active Directory and its powerful group-based organization features as a source for permissions in SUSE OpenStack Cloud.

Upon successful completion of this tutorial, your cloud will refer user authentication requests to an outside LDAP-enabled directory system, such as Microsoft Active Directory or OpenLDAP.

2.6.1 Configure your LDAP source Edit source

To configure your SUSE OpenStack Cloud cloud to use an outside user-management source, perform the following steps:

  1. Make sure that the LDAP-enabled system you plan to integrate with is up and running and accessible over the necessary ports from your cloud management network.

  2. Edit the /var/lib/ardana/openstack/my_cloud/config/keystone/keystone.conf.j2 file and set the following options:

    domain_specific_drivers_enabled = True
    domain_configurations_from_database = False
  3. Create a YAML file in the /var/lib/ardana/openstack/my_cloud/config/keystone/ directory that defines your LDAP connection. You can make a copy of the sample keystone-LDAP configuration file, and then edit that file with the details of your LDAP connection.

    The following example copies the keystone_configure_ldap_sample.yml file and names the new file keystone_configure_ldap_my.yml:

    ardana > cp /var/lib/ardana/openstack/my_cloud/config/keystone/keystone_configure_ldap_sample.yml \
      /var/lib/ardana/openstack/my_cloud/config/keystone/keystone_configure_ldap_my.yml
  4. Edit the new file to define the connection to your LDAP source. This guide does not provide comprehensive information on all aspects of the keystone_configure_ldap.yml file. Find a complete list of keystone/LDAP configuration file options at: https://github.com/openstack/keystone/tree/stable/rocky/etc

    The following file illustrates an example keystone configuration that is customized for an Active Directory connection.

    keystone_domainldap_conf:
    
        # CA certificates file content.
        # Certificates are stored in Base64 PEM format. This may be entire LDAP server
        # certificate (in case of self-signed certificates), certificate of authority
        # which issued LDAP server certificate, or a full certificate chain (Root CA
        # certificate, intermediate CA certificate(s), issuer certificate).
        #
        cert_settings:
          cacert: |
            -----BEGIN CERTIFICATE-----
    
            certificate appears here
    
            -----END CERTIFICATE-----
    
        # A domain will be created in MariaDB with this name, and associated with ldap back end.
        # Installer will also generate a config file named /etc/keystone/domains/keystone.<domain_name>.conf
        #
        domain_settings:
          name: ad
          description: Dedicated domain for ad users
    
        conf_settings:
          identity:
             driver: ldap
    
    
          # For a full list and description of ldap configuration options, please refer to
          # http://docs.openstack.org/liberty/config-reference/content/keystone-configuration-file.html.
          #
          # Please note:
          #  1. LDAP configuration is read-only. Configuration which performs write operations (i.e. creates users, groups, etc)
          #     is not supported at the moment.
          #  2. LDAP is only supported for identity operations (reading users and groups from LDAP). Assignment
          #     operations with LDAP (i.e. managing roles, projects) are not supported.
          #  3. LDAP is configured as non-default domain. Configuring LDAP as a default domain is not supported.
          #
    
          ldap:
            url: ldap://YOUR_COMPANY_AD_URL
            suffix: YOUR_COMPANY_DC
            query_scope: sub
            user_tree_dn: CN=Users,YOUR_COMPANY_DC
            user : CN=admin,CN=Users,YOUR_COMPANY_DC
            password: REDACTED
            user_objectclass: user
            user_id_attribute: cn
            user_name_attribute: cn
            group_tree_dn: CN=Users,YOUR_COMPANY_DC
            group_objectclass: group
            group_id_attribute: cn
            group_name_attribute: cn
            use_pool: True
            user_enabled_attribute: userAccountControl
            user_enabled_mask: 2
            user_enabled_default: 512
            use_tls: True
            tls_req_cert: demand
            # if you are configuring multiple LDAP domains, and LDAP server certificates are issued
            # by different authorities, make sure that you place certs for all the LDAP backend domains in the
            # cacert parameter as seen in this sample yml file so that all the certs are combined in a single CA file
            # and every LDAP domain configuration points to the combined CA file.
            # Note:
            # 1. Please be advised that every time a new ldap domain is configured, the single CA file gets overwritten
            # and hence ensure that you place certs for all the LDAP backend domains in the cacert parameter.
            # 2. There is a known issue on one cert per CA file per domain when the system processes
            # concurrent requests to multiple LDAP domains. Using the single CA file with all certs combined
            # shall get the system working properly.
    
            tls_cacertfile: /etc/keystone/ssl/certs/all_ldapdomains_ca.pem
  5. Add your new file to the local Git repository and commit the changes.

    ardana > cd ~/openstack
    ardana > git checkout site
    ardana > git add -A
    ardana > git commit -m "Adding LDAP server integration config"
  6. Run the configuration processor and deployment preparation playbooks to validate the YAML files and prepare the environment for configuration.

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Run the keystone reconfiguration playbook to implement your changes, passing the newly created YAML file as an argument to the -e@FILE_PATH parameter:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml \
      -e@/var/lib/ardana/openstack/my_cloud/config/keystone/keystone_configure_ldap_my.yml

    To integrate your SUSE OpenStack Cloud cloud with multiple domains, repeat these steps starting from Step 3 for each domain.
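
After the keystone reconfiguration completes, you can confirm that the new domain is registered and that users are being read from the LDAP back end. A minimal check, assuming the sample domain name ad used above and that admin credentials have already been sourced in your shell (the credentials file depends on your installation):

ardana > openstack domain list
ardana > openstack user list --domain ad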

3 Cloud Lifecycle Manager Admin UI User Guide Edit source

The Cloud Lifecycle Manager Admin UI is a web-based GUI for viewing and managing the configuration of an installed cloud. After successfully deploying the cloud with the Install UI, the final screen displays a link to the CLM Admin UI. (For example, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 21 “Installing with the Install UI”, Section 21.5 “Running the Install UI”, Cloud Deployment Successful). Usually the URL associated with this link is https://DEPLOYER_MGMT_NET_IP:9085, although it may be different depending on the cloud configuration and the installed version of SUSE OpenStack Cloud.

3.1 Accessing the Admin UI Edit source

In a browser, go to https://DEPLOYER_MGMT_NET_IP:9085.

The DEPLOYER_MGMT_NET_IP:PORT_NUMBER is not necessarily the same for all installations, and can be displayed with the following command:

ardana > openstack endpoint list --service ardana --interface admin -c URL

Accessing the Cloud Lifecycle Manager Admin UI requires access to the MANAGEMENT network that was configured when the Cloud was deployed. Depending on the network setup, it may be necessary to use an SSH tunnel similar to what is recommended in Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 21 “Installing with the Install UI”, Section 21.5 “Running the Install UI”. The Admin UI requires keystone and HAProxy to be running and accessible. If keystone or HAProxy is not running, cloud reconfiguration is limited to the command line.
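
If the MANAGEMENT network is not directly routable from your workstation, a generic SSH local port forward through a host that can reach it is one option; the user and jump host below are placeholders for your environment:

ssh -L 9085:DEPLOYER_MGMT_NET_IP:9085 USER@JUMP_HOST

The Admin UI is then available at https://localhost:9085 for as long as the tunnel is open.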

Logging in requires a keystone user. If the user is not an admin on the default domain and one or more projects, the Cloud Lifecycle Manager Admin UI will not display information about the Cloud and may present errors.

Cloud Lifecycle Manager Admin UI Login Page
Figure 3.1: Cloud Lifecycle Manager Admin UI Login Page

3.2 Admin UI Pages Edit source

3.2.1 Services Edit source

The Services pages display information about the OpenStack and other services that have been deployed as part of the cloud. The Service Information page lists the services registered with keystone and the endpoints associated with those services. The information is equivalent to the output of the openstack endpoint list command.
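
The same information can be retrieved from the command line; for example, restricting the output to a few columns (the column names below assume the standard OpenStack client output):

ardana > openstack endpoint list -c "Service Name" -c Interface -c Region -c URL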

The Service Information table contains the following information, based on how the service is registered with keystone:

Name

The name of the service; this may be an OpenStack code name

Description

Service description; for some services this is a repeat of the name

Endpoints

Services typically have 1 or more endpoints that are accessible to make API calls. The most common configuration is for a service to have Admin, Public, and Internal endpoints, with each intended for access by consumers corresponding to the type of endpoint.

Region

Service endpoints are part of a region. In multi-region clouds, some services will have endpoints in multiple regions.

Cloud Lifecycle Manager Admin UI Service Information
Figure 3.2: Cloud Lifecycle Manager Admin UI Service Information

3.2.2 Packages Edit source

The Packages tab displays packages that are part of the SUSE OpenStack Cloud product.

The SUSE Cloud Packages table contains the following:

Name

The name of the SUSE Cloud package

Version

The version of the package which is installed in the Cloud

Cloud Lifecycle Manager Admin UI SUSE Cloud Package
Figure 3.3: Cloud Lifecycle Manager Admin UI SUSE Cloud Package
Note

Packages with the venv- prefix denote the version of the specific OpenStack package that is deployed. The release name can be determined from the OpenStack Releases page.
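
The installed venv- package versions can also be queried from the command line on the Cloud Lifecycle Manager; this assumes the packages are installed as RPMs, which may vary with your installation:

ardana > rpm -qa 'venv-*' | sort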

3.2.3 Configuration Edit source

The Configuration tab displays services that are deployed in the cloud and the configuration files associated with those services. Services may be reconfigured by editing the .j2 files listed and clicking the Update button.

This page also provides the ability to set up SUSE Enterprise Storage Integration after initial deployment.

Cloud Lifecycle Manager Admin UI SUSE Service Configuration
Figure 3.4: Cloud Lifecycle Manager Admin UI SUSE Service Configuration

Clicking one of the listed configuration files opens the file editor where changes can be made. Asterisks identify files that have been edited but have not had their updates applied to the cloud.

Cloud Lifecycle Manager Admin UI SUSE Service Configuration Editor
Figure 3.5: Cloud Lifecycle Manager Admin UI SUSE Service Configuration Editor

After editing the service configuration, click the Update button to begin deploying configuration changes to the cloud. The status of those changes will be streamed to the UI.

Configure SUSE Enterprise Storage After Initial Deployment

A link to the settings.yml file is available under the ses selection on the Configuration tab.

To set up SUSE Enterprise Storage Integration:

  1. Click on the link to edit the settings.yml file.

  2. Uncomment the ses_config_path parameter, specify the location on the deployer host containing the ses_config.yml file, and save the settings.yml file.

  3. If the ses_config.yml file does not yet exist in that location on the deployer host, a new link will appear for uploading a file from your local workstation.

  4. When ses_config.yml is present on the deployer host, it will appear in the ses section of the Configuration tab and can be edited directly there.

Note

If the cloud is configured using self-signed certificates, the streaming status updates (including the log) may be interrupted and require a reload of the CLM Admin UI. See Book “Security Guide”, Chapter 8 “Transport Layer Security (TLS) Overview”, Section 8.2 “TLS Configuration” for details on using signed certificates.

Cloud Lifecycle Manager Admin UI SUSE Service Configuration Update
Figure 3.6: Cloud Lifecycle Manager Admin UI SUSE Service Configuration Update

3.2.4 Model Edit source

The Model tab displays input models that are deployed in the cloud and the associated model files. The model files listed can be modified.

Cloud Lifecycle Manager Admin UI SUSE Service Model
Figure 3.7: Cloud Lifecycle Manager Admin UI SUSE Service Model

Clicking one of the listed model files opens the file editor where changes can be made. Asterisks identify files that have been edited but have not had their updates applied to the cloud.

Cloud Lifecycle Manager Admin UI SUSE Service Model Editor
Figure 3.8: Cloud Lifecycle Manager Admin UI SUSE Service Model Editor

After editing the model file, click the Validate button to validate changes. If validation is successful, Update is enabled. Click the Update button to deploy the changes to the cloud. Before starting deployment, a confirmation dialog offers the choice of running only the config-processor-run.yml and ready-deployment.yml playbooks or running a full deployment, and indicates the risk of updating the deployed cloud.
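
The config-processor-run.yml and ready-deployment.yml playbooks named in the confirmation dialog are the same ones that can be run manually from the Cloud Lifecycle Manager, for example:

ardana > cd ~/openstack/ardana/ansible
ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
ardana > ansible-playbook -i hosts/localhost ready-deployment.yml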

Cloud Lifecycle Manager Admin UI SUSE Service Model Confirmation
Figure 3.9: Cloud Lifecycle Manager Admin UI SUSE Service Model Confirmation

Click Update to start deployment. The status of the changes will be streamed to the UI.

Note

If the cloud is configured using self-signed certificates, the streaming status updates (including the log) may be interrupted. The CLM Admin UI must be reloaded. See Book “Security Guide”, Chapter 8 “Transport Layer Security (TLS) Overview”, Section 8.2 “TLS Configuration” for details on using signed certificates.

Cloud Lifecycle Manager Admin UI SUSE Service Model Update
Figure 3.10: Cloud Lifecycle Manager Admin UI SUSE Service Model Update

3.2.5 Roles Edit source

The Services Per Role tab displays the list of all roles that have been defined in the Cloud Lifecycle Manager input model, the list of servers assigned to each role, and the services installed on those servers.

The Services Per Role table contains the following:

Role

The name of the role in the data model. In the included data model templates, these names are descriptive, such as MTRMON-ROLE for a metering and monitoring server. There is no strict constraint on role names and they may have been altered at install time.

Servers

The model IDs for the servers that have been assigned this role. This does not necessarily correspond to any DNS or other naming labels a host has, unless the host ID was set that way during install.

Services

A list of OpenStack and other Cloud related services that comprise this role. Servers that have been assigned this role will have these services installed and enabled.

Cloud Lifecycle Manager Admin UI Services Per Role
Figure 3.11: Cloud Lifecycle Manager Admin UI Services Per Role

3.2.6 Servers Edit source

The Servers pages contain information about the hardware that comprises the cloud, including the configuration of the servers, and the ability to add new compute nodes to the cloud.

The Servers table contains the following information:

ID

This is the ID of the server in the data model. This does not necessarily correspond to any DNS or other naming labels a host has, unless the host ID was set that way during install.

IP Address

The management network IP address of the server

Server Group

The server group which this server is assigned to

NIC Mapping

The NIC mapping that describes the PCI slot addresses for the server's ethernet adapters

Mac Address

The hardware address of the server's primary physical ethernet adapter

Cloud Lifecycle Manager Admin UI Server Summary
Figure 3.12: Cloud Lifecycle Manager Admin UI Server Summary

3.2.7 Admin UI Server Details Edit source

Server details can be viewed by clicking the menu at the right side of each row in the Servers table. The server details dialog contains the information from the Servers table plus the following additional fields:

IPMI IP Address

The IPMI network address; this may be empty if the server was provisioned prior to being added to the Cloud

IPMI Username

The username that was specified for IPMI access

IPMI Password

This is obscured in the read-only dialog, but is editable when adding a new server

Network Interfaces

The network interfaces configured on the server

Filesystem Utilization

Filesystem usage (percentage of filesystem in use). Only available if monasca is in use

Server Details (1/2)
Figure 3.13: Server Details (1/2)
Server Details (2/2)
Figure 3.14: Server Details (2/2)

3.3 Topology Edit source

The topology section of the Cloud Lifecycle Manager Admin UI displays an overview of how the Cloud is configured. Each section of the topology represents some facet of the Cloud configuration and provides a visual layout of the way components are associated with each other. Many of the components in the topology are linked to each other, and can be navigated between by clicking on any component that appears as a hyperlink.

3.3.1 Control Planes Edit source

The Control Planes tab displays control planes and availability zones within the Cloud.

Each control plane is shown as a table of clusters, resources, and load balancers (represented by vertical columns in the table).

Control Plane

A set of servers dedicated to running the infrastructure of the Cloud. Many Cloud configurations will have only a single control plane.

Clusters

A set of one or more servers hosting a particular set of services, tied to the role that has been assigned to that server. Clusters are generally differentiated from Resources in that they are fixed size groups of servers that do not grow as the Cloud grows.

Resources

Servers hosting the scalable parts of the Cloud, such as Compute Hosts that host VMs, or swift servers for object storage. These will vary in number with the size and scale of the Cloud and can generally be increased after the initial Cloud deployment.

Load Balancers

Servers that distribute API calls across servers hosting the called services.

Control Plane Topology
Figure 3.15: Control Plane Topology
Availability Zones

Listed beneath the running services, each row groups together the hosts in a particular availability zone for a particular cluster or resource type (the rows are availability zones, the columns are clusters/resources).

Control Plane Topology - Availability Zones
Figure 3.16: Control Plane Topology - Availability Zones

3.3.2 Regions Edit source

Displays the distribution of control plane services across regions. Clouds that have only a single region will list all services in the same cell.

Control Planes

The group of services that run the Cloud infrastructure

Region

Each region will be represented by a column with the region name as the column header. The list of services that are running in that region will be in that column, with each row corresponding to a particular control plane.

Regions Topology
Figure 3.17: Regions Topology

3.3.3 Services Edit source

A list of services running in the Cloud, organized by the type (class) of service. Each service is then listed along with the control planes that the service is part of, the other services that each particular service consumes (requires), and the endpoints of the service, if the service exposes an API.

Class

A category of like services, such as "security" or "operations". Multiple services may belong to the same category.

Description

A short description of the service, typically sourced from the service itself

Service

The name of the service. For OpenStack services, this is the project codename, such as nova for virtual machine provisioning. Clicking a service will navigate to the section of this page with details for that particular service.

Services Topology
Figure 3.18: Services Topology

The detail data about a service provides additional insight into the service, such as which other services are required to run it and which network protocols can be used to access it.

Components

Each service is made up of one or more components, which are listed separately here. The components of a service may represent pieces of the service that run on different hosts, provide distinct functionality, or modularize business logic.

Control Planes

A service may be running in multiple control planes. Each control plane that a service is running in will be listed here.

Consumes

Other services required for this service to operate correctly.

Endpoints

How a service can be accessed, typically a REST API, though other network protocols may be listed here. Services that do not expose an API or have any sort of external access will not list any entries here.

Service Details Topology
Figure 3.19: Service Details Topology

3.3.4 Networks Edit source

Lists the networks and network groups that comprise the Cloud. Each network group is represented by a row in the table, with columns identifying which networks are used at the intersection of the group (row) and cluster/resource (column).

Group

The network group

Clusters

A set of one or more servers hosting a particular set of services, tied to the role that has been assigned to that server. Clusters are generally differentiated from Resources in that they are fixed size groups of servers that do not grow as the Cloud grows.

Resources

Servers hosting the scalable parts of the Cloud, such as Compute Hosts that host VMs, or swift servers for object storage. These will vary in number with the size and scale of the Cloud and can generally be increased after the initial Cloud deployment.

Cells in the middle of the table represent the network that is running on the resource/cluster represented by that column and is part of the network group identified in the leftmost column of the same row.

Networks Topology
Figure 3.20: Networks Topology

Each network group is listed along with the servers and interfaces that comprise the network group.

Network Group

The elements that make up the network group, whose name is listed above the table

Networks

Networks that are part of the specified network group

Address

IP address of the corresponding server

Server

Server name of the server that is part of this network. Clicking on a server will load the server topology details.

Interface Model

The particular combination of hardware address and bonding that tie this server to the specified network group. Clicking on an Interface Model will load the corresponding section of the Roles page.

Network Groups Topology
Figure 3.21: Network Groups Topology

3.3.5 Servers Edit source

A hierarchical display of the tree of Server Groups. Groups will be represented by a heading with their name, starting with the first row which contains the Cloud-wide server group (often called CLOUD). Within each Server Group, the Network Groups, Networks, Servers, and Server Roles are broken down. Note that server groups can be nested, producing a tree-like structure of groups.

Network Groups

The network groups that are part of this server group.

Networks

The network that is part of the server group and corresponds to the network group in the same row.

Server Roles

The model-defined role that was applied to the server, made up of a combination of services and network/storage configurations unique to that role within the Cloud

Servers

The servers that have the role defined in their row and are part of the network group represented by the column the server is in.

Server Groups Topology
Figure 3.22: Server Groups Topology

3.3.6 Roles Edit source

The list of server roles that define the server configurations for the Cloud. Each server role consists of several configurations. In this topology the focus is on the Disk Models and Network Interface Models that are applied to the servers with that role.

Server Role

The name of the role, as it is defined in the model

Disk Model

The name of the disk model

Volume Group

Name of the volume group

Mount

Name of the volume being mounted on the server

Size

The size of the volume as a percentage of physical disk space

FS Type

Filesystem type

Options

Optional flags applied when mounting the volume

PVol(s)

The physical address to the storage used for this volume group

Interface Model

The name of the interface model

Network Group

The name of the network group. Clicking on a Network Group will load the details of that group on the Networks page.

Interface/Options

Includes logical network name, such as hed1, hed2, and bond information grouping the logical network name together. The Cloud software will map these to physical devices.

Roles Topology
Figure 3.23: Roles Topology

3.4 Server Management Edit source

3.4.1 Adding Servers Edit source

The Add Server page in the Cloud Lifecycle Manager Admin UI allows for adding additional Compute Nodes to the Cloud.

Add Server Overview
Figure 3.24: Add Server Overview

3.4.1.1 Available Servers Edit source

Servers that can be added to the Cloud are shown on the left side of the Add Server screen. Additional servers can be included in this list three different ways:

  1. Discover servers via SUSE Manager or HPE OneView (for details on adding servers via autodiscovery, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 21 “Installing with the Install UI”, Section 21.4 “Optional: Importing Certificates for SUSE Manager and HPE OneView” and Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 21 “Installing with the Install UI”, Section 21.5 “Running the Install UI”)

  2. Manually add servers individually by clicking Manual Entry and filling out the form with the server information (instructions below)

  3. Create a CSV file of the servers to be added (see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 21 “Installing with the Install UI”, Section 21.3 “Optional: Creating a CSV File to Import Server Data”)

Manually adding a server requires the following fields:

ID

A unique name for the server

IP Address

The IP address that the server has, or will have, in the Cloud

Server Group

Which server group the server will belong to. The IP address must be compatible with the selected Server Group. If the required Server Group is not present, it can be created

NIC Mapping

The NIC to PCI address mapping for the server being added to the Cloud. If the required NIC mapping is not present, it can be created

Role

Which compute role to add the server to. If this is set, the server will be immediately assigned that role on the right side of the page. If it is not set, the server will be added to the left side panel of available servers

Some additional fields must be set if the server is not already provisioned with an OS, or if a new OS install is desired for the server. These fields are not required if an OpenStack Cloud compatible OS is already installed:

MAC Address

The MAC address of the IPMI network card of the server

IPMI IP Address

The IPMI network address (IP address) of the server

IPMI Username

Username to log in to IPMI on the server

IPMI Password

Password to log in to IPMI on the server

Manually Add Server
Figure 3.25: Manually Add Server

Servers in the available list can be dragged to the desired role on the right. Only Compute-related roles will be displayed.

Manually Add Server
Figure 3.26: Manually Add Server

3.4.1.2 Add Server Settings Edit source

There are several settings that apply across all Compute Nodes being added to the Cloud. Beneath the list of nodes, users will find options to control whether existing nodes can be modified, whether the new nodes should have their data disks wiped, and whether to activate the new Compute Nodes as part of the update process.

Safe Mode

Prevents modification of existing Compute Nodes. Can be unchecked to allow modifications. Modifying existing Compute Nodes has the potential to disrupt the continuous operation of the Cloud and should be done with caution.

Wipe Data Disks

The data disks on the new server will not be wiped by default, but users can specify to wipe clean the data disks as part of the process of adding the Compute Node(s) to the Cloud.

Activate

Activates the added Compute Node(s) during the process of adding them to the Cloud. Activation adds a Compute Node to the pool of nodes that the nova-scheduler uses when instantiating VMs.

Add Server Settings options
Figure 3.27: Add Server Settings options

3.4.1.3 Install OS Edit source

Servers that have been assigned a role but not yet deployed can have SLES installed as part of the Cloud deployment. This step is necessary for servers that are not provisioned with an OS.

On the Install OS page, the Available Servers list will be populated with servers that have been assigned to a role but not yet deployed to the Cloud. From here, select which servers to install an OS onto and use the arrow controls to move them to the Selected Servers box on the right. After all servers that require an OS have been added to the Selected Servers list, click Next.

Select Servers to Provision OS
Figure 3.28: Select Servers to Provision OS

The UI will prompt for confirmation that the OS should be installed, because provisioning an OS will replace any existing operating system on the server.

Confirm Provision OS
Figure 3.29: Confirm Provision OS

When the OS install begins, progress of the install will be displayed on screen.

OS Install Progress
Figure 3.30: OS Install Progress

After OS provisioning is complete, a summary of the provisioned servers will be displayed. Clicking Close will return the user to the role selection page where deployment can continue.

OS Install Summary
Figure 3.31: OS Install Summary

3.4.1.4 Deploy New Servers Edit source

When all newly added servers have an OS provisioned, either via the Install OS process detailed above or having previously been provisioned outside of the Cloud Lifecycle Manager Admin UI, deployment can begin.

The Deploy button will be enabled when one or more new servers have been assigned roles. Clicking Deploy prompts for confirmation before beginning the deployment process.

Confirm Deploy Servers
Figure 3.32: Confirm Deploy Servers

The deployment process will begin by running the Configuration Processor in basic validation mode to check the values input for the servers being added. This will check IP addresses, server groups, and NIC mappings for syntax or format errors.

Validate Server Changes
Figure 3.33: Validate Server Changes

After validation is successful, the servers will be prepared for deployment. The preparation consists of running the full Configuration Processor and two additional playbooks to ready servers for deployment.

Prepare Servers
Figure 3.34: Prepare Servers

After the servers have been prepared, deployment can begin. This process will generate a new hosts file, run the site.yml playbook, and update monasca (if monasca is deployed).
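
For reference, the underlying deployment step corresponds to running the site.yml playbook from the Cloud Lifecycle Manager; a sketch only, assuming the new node's name in the generated hosts file is NEW_SERVER and that the run is limited to it (the Admin UI additionally regenerates the hosts file and updates monasca):

ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts site.yml --limit NEW_SERVER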

Deploy Servers
Figure 3.35: Deploy Servers

When deployment is completed, a summary page will be displayed. Clicking Close will return to the Add Server page.

Deploy Summary
Figure 3.36: Deploy Summary

3.4.2 Activating Servers Edit source

The Server Summary page in the Cloud Lifecycle Manager Admin UI allows for activating Compute Nodes in the Cloud. Compute Nodes may be activated when they are added to the Cloud. An activated compute node is available for the nova-scheduler to use for hosting new VMs that are created. Only servers that are not currently activated will have the activation menu option available.

Activate Server
Figure 3.37: Activate Server

Once activation is triggered, the progress of activating the node and adding it to the nova-scheduler is displayed.

Activate Server Progress
Figure 3.38: Activate Server Progress

3.4.3 Deactivating Servers Edit source

The Server Summary page in the Cloud Lifecycle Manager Admin UI allows for deactivating Compute Nodes in the Cloud. Deactivating a Compute Node removes it from the pool of servers that the nova-scheduler will put VMs on. When a Compute Node is deactivated, the UI attempts to migrate any currently running VMs from that server to an active node.

Deactivate Server
Figure 3.39: Deactivate Server

The deactivation process requires confirmation before proceeding.

Deactivate Server Confirmation
Figure 3.40: Deactivate Server Confirmation

Once deactivation is triggered, the progress of deactivating the node and removing it from the nova-scheduler is displayed.

Deactivate Server Progress
Figure 3.41: Deactivate Server Progress

If a Compute Node selected for deactivation has VMs running on it, a prompt will appear to select where to migrate the running VMs.

Select Migration Target
Figure 3.42: Select Migration Target

A summary of the VMs being migrated will be displayed, along with the progress migrating them from the deactivated Compute Node to the target host. Once the migration attempt is complete, click 'Done' to continue the deactivation process.

Deactivate Migration Progress
Figure 3.43: Deactivate Migration Progress

3.4.4 Deleting Servers Edit source

The Server Summary page in the Cloud Lifecycle Manager Admin UI allows for deleting Compute Nodes from the Cloud. Deleting a Compute Node removes it from the cloud. Only Compute Nodes that are deactivated can be deleted.

Delete Server
Figure 3.44: Delete Server

The deletion process requires confirmation before proceeding.

Delete Server Confirmation
Figure 3.45: Delete Server Confirmation

If the Compute Node is not reachable (SSH from the deployer is not possible), a warning will appear, requesting confirmation that the node is shut down or otherwise removed from the environment. Reachable Compute Nodes will be shut down as part of the deletion process.

Unreachable Delete Confirmation
Figure 3.46: Unreachable Delete Confirmation

The progress of deleting the Compute Node will be displayed, including a streaming log with additional details of the running playbooks.

Delete Server Progress
Figure 3.47: Delete Server Progress

3.5 Server Replacement Edit source

The process of replacing a server is initiated from the Server Summary (see Section 3.2.6, “Servers”). Replacing a server will remove the existing server from the Cloud configuration and install the new server in its place. The rest of this process varies slightly depending on the type of server being replaced.

3.5.1 Control Plane Servers Edit source

Servers that are part of the Control Plane (generally those that are not hosting Compute VMs or ephemeral storage) are replaced "in-place". This means the replacement server has the same IP Address and is expected to have the same NIC Mapping and Server Group as the server being replaced.

To replace a Control Plane server, click the menu to the right of the server listing on the Summary tab of the Section 3.2.6, “Servers” page. From the menu options, select Replace.

Replace Server Menu
Figure 3.48: Replace Server Menu

Selecting Replace will open a dialog box that includes information about the server being replaced, as well as a form for inputting the required information for the new server.

Replace Controller Form
Figure 3.49: Replace Controller Form

The IPMI information for the new server is required to perform the replacement process.

MAC Address

The hardware address of the server's primary physical ethernet adapter

IPMI IP Address

The network address for IPMI access to the new server

IPMI Username

The username credential for IPMI access to the new server

IPMI Password

The password associated with the IPMI Username on the new server

To use a server that has already been discovered, check the box for Use available servers and select an existing server from the Available Servers dropdown. This will automatically populate the server information fields above with the information previously entered/discovered for the specified server.

If SLES is not already installed, or to reinstall SLES on the new server, check the box for Install OS. The username will be pre-populated with the username from the Cloud install. Installing the OS requires specifying the password that was used for deploying the cloud so that the replacement process can access the host after the OS is installed.

The data disks on the new server will not be wiped by default, but users can specify to wipe clean the data disks as part of the replacement process.

Once the new server information is set, click the Replace button in the lower right to begin replacement. A list of the replacement process steps will be displayed, and there will be a link at the bottom of the list to show the log file as the changes are made.

Replace Controller Progress
Figure 3.50: Replace Controller Progress

When all of the steps are complete, click Close to return to the Servers page.

3.5.2 Compute Servers Edit source

When servers that host VMs are replaced, the following actions happen:

  1. a new server is added

  2. existing instances are migrated from the existing server to the new server

  3. the existing server is deleted from the model

The new server will not have the same IP Address and may have a different NIC Mapping and Server Group than the server being replaced.

To replace a Compute server, click the menu to the right of the server listing on the Summary tab of the Section 3.2.6, “Servers” page. From the menu options, select Replace.

Replace Compute Menu
Figure 3.51: Replace Compute Menu

Selecting Replace will open a dialog box that includes information about the server being replaced, and a form for inputting the required information for the new server.

If the IP address of the server being replaced cannot be reached by the deployer, a warning will appear to verify that the replacement should continue.

Unreachable Compute Node Warning
Figure 3.52: Unreachable Compute Node Warning
Replace Compute Form
Figure 3.53: Replace Compute Form

Replacing a Compute server involves adding the new server and then performing migration. This requires some new information:

  • an unused IP address

  • a new ID

  • selections for Server Group and NIC Mapping, which do not need to match the original server.

ID

This is the ID of the server in the data model. This does not necessarily correspond to any DNS or other naming labels of a host, unless the host ID was set that way during install.

IP Address

The management network IP address of the server

Server Group

The server group which this server is assigned to. If the required Server Group does not exist, it can be created

NIC Mapping

The NIC mapping that describes the PCI slot addresses for the server's ethernet adapters. If the required NIC mapping does not exist, it can be created

The IPMI information for the new server is also required to perform the replacement process.

Mac Address

The hardware address of the server's primary physical ethernet adapter

IPMI IP Address

The network address for IPMI access to the new server

IPMI Username

The username credential for IPMI access to the new server

IPMI Password

The password associated with the IPMI Username

To use a server that has already been discovered, check the box for Use available servers and select an existing server from the Available Servers dropdown. This will automatically populate the server information fields above with the information previously entered/discovered for the specified server.

If SLES is not already installed, or to reinstall SLES on the new server, check the box for Install OS. The username will be pre-populated with the username from the Cloud install. Installing the OS requires specifying the password that was used for deploying the cloud so that the replacement process can access the host after the OS is installed.

The data disks on the new server will not be wiped by default, but a wipe of the data disks can be specified as part of the replacement process.

When the new server information is set, click the Replace button in the lower right to begin replacement. The configuration processor will be run to validate that the entered information is compatible with the configuration of the Cloud.

When validation has completed, the Compute replacement takes place in several distinct steps, and each will have its own page with a list of process steps displayed. A link at the bottom of the list can show the log file as the changes are made.

  1. Install SLES if that option was selected

    Install SLES on New Compute
    Figure 3.54: Install SLES on New Compute
  2. Commit the changes to the data model and run the configuration processor

    Prepare Compute Server
    Figure 3.55: Prepare Compute Server
  3. Deploy the new server, install services on it, update monasca (if installed), activate the server with nova so that it can host VMs.

    Deploy New Compute Server
    Figure 3.56: Deploy New Compute Server
  4. Disable the existing server. If the existing server is unreachable, there may be warnings about disabling services on that server.

    Host Aggregate Removal Warning
    Figure 3.57: Host Aggregate Removal Warning

    If the existing server is reachable, instances on that server will be migrated to the new server.

    Migrate Instances from Existing Compute Server
    Figure 3.58: Migrate Instances from Existing Compute Server

    If the existing server is not reachable, the migration step will be skipped.

    Disable Existing Compute Server
    Figure 3.59: Disable Existing Compute Server
  5. Remove the existing server from the model and update the cloud configuration. If the server is not reachable, the user is asked to verify that the server is shut down. If the server is reachable, the cloud services running on it will be stopped and the server will be shut down as part of the removal from the Cloud.

    Existing Server Shutdown Check
    Figure 3.60: Existing Server Shutdown Check

    Upon verification that the unreachable host is shut down, it will be removed from the data model.

    Existing Server Delete
    Figure 3.61: Existing Server Delete

    After the model has been updated, a summary of the changes will appear. Click Close to return to the server summary screen.

    Compute Replacement Summary
    Figure 3.62: Compute Replacement Summary

4 Third-Party Integrations Edit source

4.1 Splunk Integration Edit source

This documentation demonstrates the possible integration between the SUSE OpenStack Cloud 9 centralized logging solution and Splunk including the steps to set up and forward logs.

The SUSE OpenStack Cloud 9 logging solution provides a flexible and extensible framework to centralize the collection and processing of logs from all of the nodes in a cloud. The logs are shipped to a highly available and fault tolerant cluster where they are transformed and stored for better searching and reporting. The SUSE OpenStack Cloud 9 logging solution uses the ELK stack (Elasticsearch, Logstash and Kibana) as a production grade implementation and can support other storage and indexing technologies. The Logstash pipeline can be configured to forward the logs to an alternative target if you wish.

4.1.1 What is Splunk? Edit source

Splunk is software for searching, monitoring, and analyzing machine-generated big data, via a web-style interface. Splunk captures, indexes and correlates real-time data in a searchable repository from which it can generate graphs, reports, alerts, dashboards and visualizations. It is commercial software (unlike Elasticsearch) and more details about Splunk can be found at https://www.splunk.com.

4.1.2 Configuring Splunk to receive log messages from SUSE OpenStack Cloud 9 Edit source

This documentation assumes that you already have Splunk set up and running. For help with installing and setting up Splunk, refer to Splunk Tutorial.

There are different ways in which a log message (or "event" in Splunk's terminology) can be sent to Splunk. These steps will set up a TCP port where Splunk will listen for messages.

  1. On the Splunk web UI, click on the Settings menu in the upper right-hand corner.

  2. In the Data section of the Settings menu, click Data Inputs.

  3. Choose the TCP option.

  4. Click the New button to add an input.

  5. In the Port field, enter the port number you want to use.

    Note

    If you are on a less secure network and want to restrict connections to this port, use the Only accept connection from field to restrict the traffic to a specific IP address.

  6. Click the Next button.

  7. Specify the Source Type by clicking on the Select button and choosing linux_messages_syslog from the list.

  8. Click the Review button.

  9. Review the configuration and click the Submit button.

  10. A success message will be displayed.

4.1.3 Forwarding log messages from SUSE OpenStack Cloud 9 Centralized Logging to Splunk Edit source

When you have Splunk set up and configured to receive log messages, you can configure SUSE OpenStack Cloud 9 to forward the logs to Splunk.

  1. Log in to the Cloud Lifecycle Manager.

  2. Check the status of the logging service:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts logging-status.yml

    If everything is up and running, continue to the next step.

  3. Edit the logstash config file at the location below:

    ~/openstack/ardana/ansible/roles/logging-server/templates/logstash.conf.j2

    At the bottom of the file is a section for the Logstash outputs. Add the details of your Splunk environment there.

    Below is an example showing the placement of the added tcp block:

    # Logstash outputs
    #------------------------------------------------------------------------------
    output {
      # Configure Elasticsearch output
      # http://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html
      elasticsearch {
        index => "%{[@metadata][es_index]}"
        hosts => ["{{ elasticsearch_http_host }}:{{ elasticsearch_http_port }}"]
        flush_size => {{ logstash_flush_size }}
        idle_flush_time => 5
        workers => {{ logstash_threads }}
      }
       # Forward Logs to Splunk on the TCP port that matches the one specified in Splunk Web UI.
     tcp {
       mode => "client"
       host => "<Enter Splunk listener IP address>"
       port => TCP_PORT_NUMBER
     }
    }
    Note

    If you are not planning on using the Splunk UI to parse your centralized logs, there is no need to forward your logs to Elasticsearch. In this situation, comment out the lines in the Logstash outputs pertaining to Elasticsearch. However, you can continue to forward your centralized logs to multiple locations.

  4. Commit your changes to git:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "Logstash configuration change for Splunk integration"
  5. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  6. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Complete this change with a reconfigure of the logging environment:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts kronos-reconfigure.yml
  8. In your Splunk UI, confirm that the logs have begun to forward.

4.1.4 Searching for log messages from the Splunk dashboard Edit source

To verify that your integration worked and to search the log messages that have been forwarded, navigate back to your Splunk dashboard. In the search field, use this string:

source="tcp:TCP_PORT_NUMBER"
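
For example, if the listener was configured on TCP port 5514 with the linux_messages_syslog source type as described above, the search would look similar to:

source="tcp:5514" sourcetype=linux_messages_syslog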

Find information on using the Splunk search tool at http://docs.splunk.com/Documentation/Splunk/6.4.3/SearchTutorial/WelcometotheSearchTutorial.

4.2 Operations Bridge Integration Edit source

The SUSE OpenStack Cloud 9 monitoring solution (monasca) can easily be integrated with your existing monitoring tools. Integrating SUSE OpenStack Cloud 9 monasca with Operations Bridge using the Operations Bridge Connector simplifies monitoring and managing events and topology information.

The integration provides the following functionality:

  • Forwarding of SUSE OpenStack Cloud monasca alerts and topology to Operations Bridge for event correlation

  • Customization of forwarded events and topology

For more information about this connector please see https://software.microfocus.com/en-us/products/operations-bridge-suite/overview.

4.3 Monitoring Third-Party Components With Monasca Edit source

4.3.1 monasca Monitoring Integration Overview Edit source

monasca, the SUSE OpenStack Cloud 9 monitoring service, collects information about your cloud's systems, and allows you to create alarm definitions based on these measurements. monasca-agent is the component that collects metrics and forwards them to the monasca-api for further processing, such as metric storage and alarm thresholding.

With a small amount of configuration, you can use the detection and check plugins that are provided with your cloud to monitor integrated third-party components. In addition, you can write custom plugins and integrate them with the existing monitoring service.

Find instructions for customizing existing plugins to monitor third-party components in Section 4.3.4, “Configuring Check Plugins”.

Find instructions for installing and configuring new custom plugins in Section 4.3.3, “Writing Custom Plugins”.

You can also use existing alarm definitions, as well as create new alarm definitions that relate to a custom plugin or metric. Instructions for defining new alarm definitions are in Section 4.3.6, “Configuring Alarm Definitions”.

You can use the Operations Console and monasca CLI to list all of the alarms, alarm-definitions, and metrics that exist on your cloud.

4.3.2 monasca Agent Edit source

The monasca agent (monasca-agent) collects information about your cloud using the installed plugins. The plugins are written in Python, and determine the monitoring metrics for your system, as well as the interval for collection. The default collection interval is 30 seconds, and we strongly recommend not changing this default value.

The following two types of custom plugins can be added to your cloud.

  • Detection Plugin. Determines whether the monasca-agent has the ability to monitor the specified component or service on a host. If successful, this type of plugin configures an associated check plugin by creating a YAML configuration file.

  • Check Plugin. Specifies the metrics to be monitored, using the configuration file created by the detection plugin.

monasca-agent is installed on every server in your cloud, and provides plugins that monitor the following.

  • System metrics relating to CPU, memory, disks, host availability, etc.

  • Process health metrics (process, http_check)

  • SUSE OpenStack Cloud 9-specific component metrics, such as apache, rabbitmq, kafka, cassandra, etc.

monasca is pre-configured with default check plugins and associated detection plugins. The default plugins can be reconfigured to monitor third-party components, and often only require small adjustments to adapt them to this purpose. Find a list of the default plugins here: https://github.com/openstack/monasca-agent/blob/master/docs/Plugins.md#detection-plugins

Often, a single check plugin is used to monitor multiple services. For example, many services use the http_check.py plugin to detect the up/down status of a service endpoint, and the process.py check plugin, which provides process monitoring metrics, is frequently used as the basis for a custom process detection plugin.

More information about the monasca agent can be found in the upstream monasca-agent repository: https://github.com/openstack/monasca-agent

4.3.3 Writing Custom Plugins Edit source

When the pre-built monasca plugins do not meet your monitoring needs, you can write custom plugins to monitor your cloud. After you have written a plugin, you must install and configure it.

When your needs dictate a very specific custom monitoring check, you must provide both a detection and check plugin.

The steps involved in configuring a custom plugin include running a detection plugin and passing any necessary parameters to it, so that the resulting check configuration file is created with all necessary data.

When using an existing check plugin to monitor a third-party component, a custom detection plugin is needed only if there is not an associated default detection plugin.

Check plugin configuration files

Each plugin needs a corresponding YAML configuration file with the same stem name as the plugin check file. For example, the plugin file http_check.py (in /usr/lib/python2.7/site-packages/monasca_agent/collector/checks_d/) should have a corresponding configuration file, http_check.yaml (in /etc/monasca/agent/conf.d/http_check.yaml). The stem name http_check must be the same for both files.

Permissions for the YAML configuration file must be read+write for the monasca-agent user (the user that must also own the file) and read for the monasca group; access must be restricted to this user and group. The following example shows correct permission settings for the file http_check.yaml.

ardana > ls -alt /etc/monasca/agent/conf.d/http_check.yaml
-rw-r----- 1 monasca-agent monasca 10590 Jul 26 05:44 http_check.yaml
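
If you manage the configuration file with Ansible, a file task can enforce this ownership and mode. The following is a minimal sketch, not taken from the product playbooks; the file name is only an example.

- name: Ensure check configuration file ownership and permissions
  become: yes
  file:
    path: /etc/monasca/agent/conf.d/http_check.yaml
    owner: monasca-agent
    group: monasca
    mode: 0640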

A check plugin YAML configuration file has the following structure.

init_config:
    key1: value1
    key2: value2

instances:
    - name: john_smith
      username: john_smith
      password: 123456
    - name: jane_smith
      username: jane_smith
      password: 789012

In the above file structure, the init_config section allows you to specify any number of global key:value pairs. Each pair will be available on every run of the check that relates to the YAML configuration file.

The instances section allows you to list the instances that the related check will be run on. The check will be run once on each instance listed in the instances section. Ensure that each instance listed in the instances section has a unique name.
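
For example, a minimal http_check.yaml built on this structure might look like the following sketch. The endpoint URLs and instance names are hypothetical and are only meant to illustrate the init_config and instances sections; the instance keys (name, url, timeout, match_pattern, collect_response_time) are the same ones used by the http_check examples later in this section.

init_config: null

instances:
    - name: keystone-internal
      url: http://192.168.245.5:5000/v3
      timeout: 10
      collect_response_time: true
    - name: my-service-endpoint
      url: http://192.168.245.6:8080/healthcheck
      match_pattern: '.*OK.*'
      timeout: 10

Each instance listed results in one run of the http_check check per collection interval.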

Custom detection plugins

Detection plugins should be written to perform checks that ensure that a component can be monitored on a host. Any arguments needed by the associated check plugin are passed into the detection plugin at setup (configuration) time. The detection plugin will write to the associated check configuration file.

When a detection plugin is successfully run in the configuration step, it will write to the check configuration YAML file. The configuration file for the check is written to the following directory.

/etc/monasca/agent/conf.d/

Writing a process detection plugin using the ServicePlugin class

The monasca-agent provides a ServicePlugin class that makes process detection monitoring easy.

Process check

The process check plugin generates metrics based on the process status of the specified process names. By default, it generates a process.pid_count metric for the specified dimensions, along with a set of detailed process metrics.

The ServicePlugin class allows you to specify a list of process names to detect, and uses psutil to check whether each process exists on the host. It then adds the process names to the process.yml configuration file, if they are not already present.
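
For illustration, an entry written by a ServicePlugin-based detection plugin to /etc/monasca/agent/conf.d/process.yaml typically looks similar to the following sketch; the exact fields may vary between monasca-agent releases, and the values shown here are illustrative.

init_config: null

instances:
    - name: monasca-transform
      detailed: true
      exact_match: false
      search_string:
          - monasca-transform
      dimensions:
          service: monasca-transform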

The following is an example of a process.py check ServicePlugin.

import logging

import monasca_setup.detection

log = logging.getLogger(__name__)


class MonascaTransformDetect(monasca_setup.detection.ServicePlugin):
    """Detect monasca Transform daemons and set up configuration to monitor them."""
    def __init__(self, template_dir, overwrite=False, args=None):
        log.info("      Watching the monasca transform processes.")
        service_params = {
            'args': {},
            'template_dir': template_dir,
            'overwrite': overwrite,
            'service_name': 'monasca-transform',
            'process_names': ['monasca-transform', 'pyspark',
                              'transform/lib/driver']
        }
        super(MonascaTransformDetect, self).__init__(service_params)

Writing a Custom Detection Plugin using Plugin or ArgsPlugin classes

A custom detection plugin class should derive from either the Plugin or ArgsPlugin classes provided in the /usr/lib/python2.7/site-packages/monasca_setup/detection directory.

If the plugin parses command line arguments, derive from the ArgsPlugin class, which itself derives from the Plugin class. ArgsPlugin provides a method to check for required arguments, and a method to return the instance that will be written to the configuration file, with the dimensions parsed from the command line included.

If the ArgsPlugin methods do not seem to apply, then derive directly from the Plugin class.

When deriving from these classes, the following methods should be implemented.

  • _detect - sets self.available = True when the component to monitor is detected on the host.

  • build_config - writes the instance information to the configuration and returns the configuration.

  • dependencies_installed (a default implementation is provided in ArgsPlugin, but not in Plugin) - returns True when the required Python libraries are installed.

The following is an example custom detection plugin.

import ast
import logging

import monasca_setup.agent_config
import monasca_setup.detection

log = logging.getLogger(__name__)


class HttpCheck(monasca_setup.detection.ArgsPlugin):
    """Setup an http_check according to the passed in args.
       Despite being a detection plugin this plugin does no detection and will be a noop without   arguments.
       Expects space separated arguments, the required argument is url. Optional parameters include:
       disable_ssl_validation and match_pattern.
    """

    def _detect(self):
        """Run detection, set self.available True if the service is detected.
        """
        self.available = self._check_required_args(['url'])

    def build_config(self):
        """Build the config as a Plugins object and return.
        """
        config = monasca_setup.agent_config.Plugins()
        # No support for setting headers at this time
        instance = self._build_instance(['url', 'timeout', 'username', 'password',
                                         'match_pattern', 'disable_ssl_validation',
                                         'name', 'use_keystone', 'collect_response_time'])

        # Normalize any boolean parameters
        for param in ['use_keystone', 'collect_response_time']:
            if param in self.args:
                instance[param] = ast.literal_eval(self.args[param].capitalize())
        # Set some defaults
        if 'collect_response_time' not in instance:
            instance['collect_response_time'] = True
        if 'name' not in instance:
            instance['name'] = self.args['url']

        config['http_check'] = {'init_config': None, 'instances': [instance]}

        return config

Installing a detection plugin in the OpenStack version delivered with SUSE OpenStack Cloud

Install a plugin by copying it to the plugin directory (/usr/lib/python2.7/site-packages/monasca_agent/collector/checks_d/).

The plugin should have file permissions of read+write for the root user (the user that should also own the file) and read for the root group and all other users.

The following is an example of correct file permissions for the http_check.py file.

-rw-r--r-- 1 root root 1769 Sep 19 20:14 http_check.py

Detection plugins should be placed in the following directory.

/usr/lib/monasca/agent/custom_detect.d/

The detection plugin directory name should be accessed using the monasca_agent_detection_plugin_dir Ansible variable. This variable is defined in the roles/monasca-agent/vars/main.yml file.

monasca_agent_detection_plugin_dir: /usr/lib/monasca/agent/custom_detect.d/

Example: Add an Ansible monasca_configure task to install the plugin. (The monasca_configure task can be added to any service playbook.) In this example, it is added to ~/openstack/ardana/ansible/roles/_CEI-CMN/tasks/monasca_configure.yml.

---
- name: _CEI-CMN | monasca_configure |
    Copy ceilometer Custom plugin
  become: yes
  copy:
    src: ardanaceilometer_mon_plugin.py
    dest: "{{ monasca_agent_detection_plugin_dir }}"
    owner: root
    group: root
    mode: 0440

Custom check plugins

Custom check plugins generate metrics. Take scalability into consideration on systems with hundreds of servers, as a large number of metrics can degrade performance by increasing disk I/O, RAM, and CPU usage.

You may want to tune your configuration parameters so that less important metrics are not monitored as frequently. When a check plugin is configured (that is, when it has an associated YAML configuration file), the agent attempts to run it.

Checks should be able to run within the 30-second metric collection window. If your check runs a command, provide a timeout to prevent the check from running longer than the default 30-second window. You can use monasca_agent.common.util.timeout_command to set a timeout in your custom check plugin Python code.

Find a description of how to write custom check plugins at https://github.com/openstack/monasca-agent/blob/master/docs/Customizations.md#creating-a-custom-check-plugin

Custom checks derive from the AgentCheck class located in the monasca_agent/collector/checks/check.py file. A check method is required.

Metrics should contain dimensions that make each item that you are monitoring unique (such as service, component, hostname). The hostname dimension is defined by default within the AgentCheck class, so every metric has this dimension.

A custom check will do the following.

  • Read the configuration instance passed into the check method.

  • Set dimensions that will be included in the metric.

  • Create the metric with gauge, rate, or counter types.

Metric Types:

  • gauge: Instantaneous reading of a particular value (for example, mem.free_mb).

  • rate: Measurement over a time period. The following equation can be used to define rate.

    rate=delta_v/float(delta_t)
  • counter: A count of events, with increment and decrement methods (for example, zookeeper.timeouts)

The following is an example component check named SimpleCassandraExample.

import monasca_agent.collector.checks as checks
from monasca_agent.common.util import timeout_command

CASSANDRA_VERSION_QUERY = "SELECT version();"


class SimpleCassandraExample(checks.AgentCheck):

    def __init__(self, name, init_config, agent_config):
        super(SimpleCassandraExample, self).__init__(name, init_config, agent_config)

    @staticmethod
    def _get_config(instance):
        user = instance.get('user')
        password = instance.get('password')
        service = instance.get('service')
        timeout = int(instance.get('timeout'))

        return user, password, service, timeout

    def check(self, instance):
        user, password, service, timeout = self._get_config(instance)

        dimensions = self._set_dimensions({'component': 'cassandra', 'service': service}, instance)

        results, connection_status = self._query_database(user, password, timeout, CASSANDRA_VERSION_QUERY)

        if connection_status != 0:
            self.gauge('cassandra.connection_status', 1, dimensions=dimensions)
        else:
            # successful connection status
            self.gauge('cassandra.connection_status', 0, dimensions=dimensions)

    def _query_database(self, user, password, timeout, query):
        stdout, stderr, return_code = timeout_command(["/opt/cassandra/bin/vsql", "-U", user, "-w", password, "-A", "-R",
                                                       "|", "-t", "-F", ",", "-x"], timeout, command_input=query)
        if return_code == 0:
            # remove trailing newline
            stdout = stdout.rstrip()
            return stdout, 0
        else:
            self.log.error("Error querying cassandra with return code of {0} and error {1}".format(return_code, stderr))
            return stderr, 1

Installing check plugin

The check plugin needs to have the same file permissions as the detection plugin. File permissions must be read+write for the root user (the user that should own the file), and read for the root group and all other users.

Check plugins should be placed in the following directory.

/usr/lib/monasca/agent/custom_checks.d/

The check plugin directory should be accessed using the monasca_agent_check_plugin_dir Ansible variable. This variable is defined in the roles/monasca-agent/vars/main.yml file.

monasca_agent_check_plugin_dir: /usr/lib/monasca/agent/custom_checks.d/
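
A custom check plugin can be installed with an Ansible copy task analogous to the detection plugin example earlier in this section. The following is a sketch only, assuming the permissions described above; the source file name is illustrative.

- name: Copy custom check plugin
  become: yes
  copy:
    src: my_custom_check.py
    dest: "{{ monasca_agent_check_plugin_dir }}"
    owner: root
    group: root
    mode: 0644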

4.3.4 Configuring Check Plugins Edit source

Manually configure a plugin when unit testing by using the monasca-setup script installed with the monasca-agent.

Find a good explanation of configuring plugins here: https://github.com/openstack/monasca-agent/blob/master/docs/Agent.md#configuring

SSH to a node that has both the monasca-agent installed as well as the component you wish to monitor.

The following is an example command that configures a plugin that has no parameters (uses the detection plugin class name).

root # /usr/bin/monasca-setup -d ARDANACeilometer

The following is an example command that configures the apache plugin and includes related parameters.

root # /usr/bin/monasca-setup -d apache -a 'url=http://192.168.245.3:9095/server-status?auto'

If the configuration has changed, monasca-setup restarts the monasca-agent on the host so that the new configuration is loaded.

After the plugin is configured, verify that the configuration file contains your changes (see the Verify that your check plugin is configured section below).

Use the monasca CLI to check whether your metric exists (see the Verify that metrics exist section below).

Using Ansible modules to configure plugins in SUSE OpenStack Cloud 9

The monasca_agent_plugin module is installed as part of the monasca-agent role.

The following Ansible example configures the process.py check plugin via the ceilometer detection plugin, passing in only the name of the detection class.

- name: _CEI-CMN | monasca_configure |
    Run monasca agent Cloud Lifecycle Manager specific ceilometer detection plugin
  become: yes
  monasca_agent_plugin:
    name: "ARDANACeilometer"

If a password or other sensitive data is passed to the detection plugin, set the no_log option to True. If no_log is not set to True, the data passed to the plugin is logged to syslog.

The following Ansible example configures the Cassandra plugin and passes in related arguments.

- name: Run monasca Agent detection plugin for Cassandra
  monasca_agent_plugin:
    name: "Cassandra"
    args: "directory_names={{ FND_CDB.vars.cassandra_data_dir }},{{ FND_CDB.vars.cassandra_commit_log_dir }} process_username={{ FND_CDB.vars.cassandra_user }}"
  when: database_type == 'cassandra'
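
The Cassandra example above passes only directory names and a process user name. If a detection plugin were given a password or other sensitive argument, the task should also set no_log, as in the following sketch; the detection class name, argument, and variable are hypothetical.

- name: Run monasca Agent detection plugin with sensitive arguments
  become: yes
  monasca_agent_plugin:
    name: "MyServiceDetect"                            # hypothetical detection class
    args: "admin_password={{ my_service_password }}"   # hypothetical argument and variable
  no_log: True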

The following Ansible example configures a check of the keystone endpoint using the http_check.py detection plugin. The name is the class name, httpcheck, of the http_check.py detection plugin.

- name: keystone-monitor | local_monitor |
    Setup active check on keystone internal endpoint locally
  become: yes
  monasca_agent_plugin:
    name: "httpcheck"
    args: "use_keystone=False \
           url=http://{{ keystone_internal_listen_ip }}:{{
               keystone_internal_port }}/v3 \
           dimensions=service:identity-service,\
                       component:keystone-api,\
                       api_endpoint:internal,\
                       monitored_host_type:instance"
  tags:
    - keystone
    - keystone_monitor

Verify that your check plugin is configured

All check configuration files are located in the following directory. You can see the plugins that are running by looking at the plugin configuration directory.

/etc/monasca/agent/conf.d/

When the monasca-agent starts up, all of the check plugins that have a matching configuration file in the /etc/monasca/agent/conf.d/ directory will be loaded.

If there are errors running the check plugin they will be written to the following error log file.

/var/log/monasca/agent/collector.log

You can change the monasca-agent log level by modifying the log_level option in the /etc/monasca/agent/agent.yaml configuration file, and then restarting the monasca-agent, using the following command.

root # service openstack-monasca-agent restart

You can debug a check plugin by running monasca-collector with the check option. The following is an example of the monasca-collector command.

tux > sudo /usr/bin/monasca-collector check CHECK_NAME

Verify that metrics exist

Begin by logging in to your deployer or controller node.

Run the following set of commands, including the monasca metric-list command. If the metric exists, it will be displayed in the output.

ardana > source ~/service.osrc
ardana > monasca metric-list --name METRIC_NAME

4.3.5 Metric Performance Considerations Edit source

Collecting metrics on your virtual machines can greatly affect performance. SUSE OpenStack Cloud 9 supports 200 compute nodes, with up to 40 VMs each. If your environment is managing the maximum number of VMs, adding a single metric for all VMs is the equivalent of adding 8,000 metrics.

Because of the potential impact that new metrics have on system performance, consider adding only new metrics that are useful for alarm-definition, capacity planning, or debugging process failure.

4.3.6 Configuring Alarm Definitions Edit source

The monasca-api-spec, found at https://github.com/openstack/monasca-api/blob/master/docs/monasca-api-spec.md, provides an explanation of alarm definitions and alarms. You can find more information on alarm definition expressions at the following page: https://github.com/openstack/monasca-api/blob/master/docs/monasca-api-spec.md#alarm-definition-expressions.

When an alarm definition is defined, the monasca-threshold engine will generate an alarm for each unique instance of the match_by metric dimensions found in the metric. This allows a single alarm definition to dynamically handle the addition of new hosts.

There are default alarm definitions configured for all "process check" (process.py check) and "HTTP Status" (http_check.py check) metrics in the monasca-default-alarms role. The monasca-default-alarms role is installed as part of the monasca deployment phase of your cloud's deployment. You do not need to create alarm definitions for these existing checks.

Third parties should create an alarm definition when they wish to alarm on a custom plugin metric. The alarm definition should only be defined once. Setting a notification method for the alarm definition is recommended but not required.

The following Ansible modules used for alarm definitions are installed as part of the monasca-alarm-definition role. This process takes place during the monasca set up phase of your cloud's deployment.

  • monasca_alarm_definition

  • monasca_notification_method

The following examples, found in the ~/openstack/ardana/ansible/roles/monasca-default-alarms directory, illustrate how monasca sets up the default alarm definitions.

monasca Notification Methods

The monasca-api-spec provides details about creating a notification method: https://github.com/openstack/monasca-api/blob/master/docs/monasca-api-spec.md#create-notification-method

The following are supported notification types.

  • EMAIL

  • WEBHOOK

  • PAGERDUTY

The keystone_admin_tenant project is used so that the alarms will show up on the Operations Console UI.

The following file snippet shows variables from the ~/openstack/ardana/ansible/roles/monasca-default-alarms/defaults/main.yml file.

---
notification_address: root@localhost
notification_name: 'Default Email'
notification_type: EMAIL

monasca_keystone_url: "{{ KEY_API.advertises.vips.private[0].url }}/v3"
monasca_api_url: "{{ MON_AGN.consumes_MON_API.vips.private[0].url }}/v2.0"
monasca_keystone_user: "{{ MON_API.consumes_KEY_API.vars.keystone_monasca_user }}"
monasca_keystone_password: "{{ MON_API.consumes_KEY_API.vars.keystone_monasca_password | quote }}"
monasca_keystone_project: "{{ KEY_API.vars.keystone_admin_tenant }}"

monasca_client_retries: 3
monasca_client_retry_delay: 2
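
For example, to have default notifications delivered to a webhook receiver rather than by email, these variables could be overridden as in the following sketch; the address is hypothetical, and WEBHOOK is one of the supported notification types listed above.

---
notification_address: https://ops.example.com/monasca-alerts
notification_name: 'Default Webhook'
notification_type: WEBHOOK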

You can specify a single default notification method in the ~/openstack/ardana/ansible/roles/monasca-default-alarms/tasks/main.yml file. You can also add or modify the notification type and related details using the Operations Console UI or monasca CLI.

The following is a code snippet from the ~/openstack/ardana/ansible/roles/monasca-default-alarms/tasks/main.yml file.

---
- name: monasca-default-alarms | main | Setup default notification method
  monasca_notification_method:
    name: "{{ notification_name }}"
    type: "{{ notification_type }}"
    address: "{{ notification_address }}"
    keystone_url: "{{ monasca_keystone_url }}"
    keystone_user: "{{ monasca_keystone_user }}"
    keystone_password: "{{ monasca_keystone_password }}"
    keystone_project: "{{ monasca_keystone_project }}"
    monasca_api_url: "{{ monasca_api_url }}"
  no_log: True
  tags:
    - system_alarms
    - monasca_alarms
    - openstack_alarms
  register: default_notification_result
  until: not default_notification_result | failed
  retries: "{{ monasca_client_retries }}"
  delay: "{{ monasca_client_retry_delay }}"

monasca Alarm Definition

In the alarm definition "expression" field, you can specify the metric name and threshold. The "match_by" field is used to create a new alarm for every unique combination of the match_by metric dimensions.

Find more details on alarm definitions in the monasca API documentation: https://github.com/openstack/monasca-api/blob/master/docs/monasca-api-spec.md#alarm-definitions-and-alarms

The following is a code snippet from the ~/openstack/ardana/ansible/roles/monasca-default-alarms/tasks/main.yml file.

- name: monasca-default-alarms | main | Create Alarm Definitions
  monasca_alarm_definition:
    name: "{{ item.name }}"
    description: "{{ item.description | default('') }}"
    expression: "{{ item.expression }}"
    keystone_token: "{{ default_notification_result.keystone_token }}"
    match_by: "{{ item.match_by | default(['hostname']) }}"
    monasca_api_url: "{{ default_notification_result.monasca_api_url }}"
    severity: "{{ item.severity | default('LOW') }}"
    alarm_actions:
      - "{{ default_notification_result.notification_method_id }}"
    ok_actions:
      - "{{ default_notification_result.notification_method_id }}"
    undetermined_actions:
      - "{{ default_notification_result.notification_method_id }}"
  register: monasca_system_alarms_result
  until: not monasca_system_alarms_result | failed
  retries: "{{ monasca_client_retries }}"
  delay: "{{ monasca_client_retry_delay }}"
  with_flattened:
    - monasca_alarm_definitions_system
    - monasca_alarm_definitions_monasca
    - monasca_alarm_definitions_openstack
    - monasca_alarm_definitions_misc_services
  when: monasca_create_definitions

In the following example from the ~/openstack/ardana/ansible/roles/monasca-default-alarms/vars/main.yml Ansible variables file, the alarm definition named Process Check sets the match_by variable with the following parameters.

  • process_name

  • hostname

monasca_alarm_definitions_system:
  - name: "Host Status"
    description: "Alarms when the specified host is down or not reachable"
    severity: "HIGH"
    expression: "host_alive_status > 0"
    match_by:
      - "target_host"
      - "hostname"
  - name: "HTTP Status"
    description: >
      "Alarms when the specified HTTP endpoint is down or not reachable"
    severity: "HIGH"
    expression: "http_status > 0"
    match_by:
      - "service"
      - "component"
      - "hostname"
      - "url"
  - name: "CPU Usage"
    description: "Alarms when CPU usage is high"
    expression: "avg(cpu.idle_perc) < 10 times 3"
  - name: "High CPU IOWait"
    description: "Alarms when CPU IOWait is high, possible slow disk issue"
    expression: "avg(cpu.wait_perc) > 40 times 3"
    match_by:
      - "hostname"
  - name: "Disk Inode Usage"
    description: "Alarms when disk inode usage is high"
    expression: "disk.inode_used_perc > 90"
    match_by:
      - "hostname"
      - "device"
    severity: "HIGH"
  - name: "Disk Usage"
    description: "Alarms when disk usage is high"
    expression: "disk.space_used_perc > 90"
    match_by:
      - "hostname"
      - "device"
    severity: "HIGH"
  - name: "Memory Usage"
    description: "Alarms when memory usage is high"
    severity: "HIGH"
    expression: "avg(mem.usable_perc) < 10 times 3"
  - name: "Network Errors"
    description: >
      "Alarms when either incoming or outgoing network errors are high"
    severity: "MEDIUM"
    expression: "net.in_errors_sec > 5 or net.out_errors_sec > 5"
  - name: "Process Check"
    description: "Alarms when the specified process is not running"
    severity: "HIGH"
    expression: "process.pid_count < 1"
    match_by:
      - "process_name"
      - "hostname"
  - name: "Crash Dump Count"
    description: "Alarms when a crash directory is found"
    severity: "MEDIUM"
    expression: "crash.dump_count > 0"
    match_by:
      - "hostname"

The preceding configuration would result in the creation of an alarm for each unique metric that matched the following criteria.

process.pid_count + process_name + hostname
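
An alarm definition for a custom plugin metric follows the same pattern. The following sketch shows what an entry for a hypothetical custom metric named myservice.connection_status might look like; the metric, variable, and alarm names are illustrative only.

monasca_alarm_definitions_custom:
  - name: "MyService Connection Status"
    description: "Alarms when the myservice endpoint cannot be reached"
    severity: "HIGH"
    expression: "myservice.connection_status > 0"
    match_by:
      - "hostname"

For such a variable to take effect, it would also need to be included in the with_flattened list of the Create Alarm Definitions task shown earlier.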

Check that the alarms exist

Begin by using the following commands, including monasca alarm-definition-list, to check that the alarm definition exists.

ardana > source ~/service.osrc
ardana > monasca alarm-definition-list --name ALARM_DEFINITION_NAME

Then use either of the following commands to check that the alarm has been generated. A status of "OK" indicates a healthy alarm.

ardana > monasca alarm-list --metric-name METRIC_NAME

Or

ardana > monasca alarm-list --alarm-definition-id ID_FROM_ALARM-DEFINITION-LIST
Note

To see CLI options use the monasca help command.

Alarm state upgrade considerations

If the name of a monitoring metric changes or is no longer being sent, existing alarms will show the alarm state as UNDETERMINED. You can update an alarm definition as long as you do not change the metric name or dimension name values in the expression or match_by fields. If you find that you need to alter either of these values, you must delete the old alarm definitions and create new definitions with the updated values.

If a metric is never sent, but has a related alarm definition, then no alarms would exist. If you find that metrics are never sent, then you should remove the related alarm definitions.

When removing an alarm definition, the Ansible module monasca_alarm_definition supports the state absent.

The following file snippet shows an example of how to remove an alarm definition by setting the state to absent.

- name: monasca-pre-upgrade | Remove alarm definitions
   monasca_alarm_definition:
     name: "{{ item.name }}"
     state: "absent"
     keystone_url: "{{ monasca_keystone_url }}"
     keystone_user: "{{ monasca_keystone_user }}"
     keystone_password: "{{ monasca_keystone_password }}"
     keystone_project: "{{ monasca_keystone_project }}"
     monasca_api_url: "{{ monasca_api_url }}"
   with_items:
     - { name: "Kafka Consumer Lag" }

An alarm exists in the OK state when the monasca threshold engine has seen at least one metric associated with the alarm definition and has not exceeded the alarm definition threshold.

4.3.7 OpenStack Integration of Custom Plugins into monasca-agent (if applicable) Edit source

monasca-agent is an OpenStack open-source project; monasca can also monitor non-OpenStack services. Third parties should install custom plugins into their SUSE OpenStack Cloud 9 system using the steps outlined in Section 4.3.3, “Writing Custom Plugins”. If the OpenStack community determines that a custom plugin is of general benefit, it may be added to openstack/monasca-agent so that it is installed with the monasca-agent. During the review process for openstack/monasca-agent there is no guarantee that code will be approved or merged by a deadline; open-source contributors are expected to help with code reviews in order to get their code accepted. Once changes are approved and integrated into openstack/monasca-agent, and that version of the monasca-agent is integrated with SUSE OpenStack Cloud 9, the third party can remove the custom plugin installation steps, since the plugin will be installed in the default monasca-agent venv.

Find the open source repository for the monasca-agent here: https://github.com/openstack/monasca-agent

5 Managing Identity Edit source

The Identity service provides the structure for user authentication to your cloud.

5.1 The Identity Service Edit source

This topic explains the purpose and mechanisms of the identity service.

The SUSE OpenStack Cloud Identity service, based on the OpenStack keystone API, is responsible for providing UserID authentication and access authorization to enable organizations to achieve their access security and compliance objectives and successfully deploy OpenStack. In short, the Identity service is the gateway to the rest of the OpenStack services.

5.1.1 Which version of the Identity service should you use? Edit source

Use Identity API version 3.0, as previous versions no longer exist as endpoints for Identity API queries.

Similarly, when performing queries, you must use the OpenStack CLI (the openstack command), and not the keystone CLI (keystone) as the latter is only compatible with API versions prior to 3.0.

5.1.2 Authentication Edit source

The authentication function provides the initial login function to OpenStack. keystone supports multiple sources of authentication, including a native or built-in authentication system. The keystone native system can be used for all user management functions for proof of concept deployments or small deployments not requiring integration with a corporate authentication system, but it lacks some of the advanced functions usually found in user management systems such as forcing password changes. The focus of the keystone native authentication system is to be the source of authentication for OpenStack-specific users required for the operation of the various OpenStack services. These users are stored by keystone in a default domain; the addition of these IDs to an external authentication system is not required.

keystone is more commonly integrated with external authentication systems such as OpenLDAP or Microsoft Active Directory. These systems are usually centrally deployed by organizations to serve as the single source of user management and authentication for all in-house deployed applications and systems requiring user authentication. In addition to LDAP and Microsoft Active Directory, support for integration with Security Assertion Markup Language (SAML)-based identity providers from companies such as Ping, CA, IBM, Oracle, and others is also nearly "production-ready".

keystone also provides architectural support via the underlying Apache deployment for other types of authentication systems such as Multi-Factor Authentication. These types of systems typically require driver support and integration from the respective provider vendors.

Note

While support for Identity Providers and Multi-factor authentication is available in keystone, it has not yet been certified by the SUSE OpenStack Cloud engineering team and is an experimental feature in SUSE OpenStack Cloud.

LDAP-compatible directories such as OpenLDAP and Microsoft Active Directory are recommended alternatives to using the keystone local authentication. Both methods are widely used by organizations and are integrated with a variety of other enterprise applications. These directories act as the single source of user information within an organization. keystone can be configured to authenticate against an LDAP-compatible directory on a per-domain basis.

Domains, as explained in Section 5.3, “Understanding Domains, Projects, Users, Groups, and Roles”, can be configured so that based on the user ID, an incoming user is automatically mapped to a specific domain. This domain can then be configured to authenticate against a specific LDAP directory. The user credentials provided by the user to keystone are passed along to the designated LDAP source for authentication. This communication can be optionally configured to be secure via SSL encryption. No special LDAP administrative access is required, and only read-only access is needed for this configuration. keystone will not add any LDAP information. All user additions, deletions, and modifications are performed by the application's front end in the LDAP directories. After a user has been successfully authenticated, they are then assigned to the groups, roles, and projects defined by the keystone domain or project administrators. This information is stored within the keystone service database.

Another form of external authentication provided by the keystone service is via integration with SAML-based Identity Providers (IdP) such as Ping Identity, IBM Tivoli, and Microsoft Active Directory Federation Server. A SAML-based identity provider provides authentication that is often called "single sign-on". The IdP server is configured to authenticate against identity sources such as Active Directory and provides a single authentication API against multiple types of downstream identity sources. This means that an organization could have multiple identity storage sources but a single authentication source. In addition, if a user has logged into one such source during a defined session time frame, they do not need to re-authenticate within the defined session. Instead, the IdP will automatically validate the user to requesting applications and services.

A SAML-based IdP authentication source is configured with keystone on a per-domain basis similar to the manner in which native LDAP directories are configured. Extra mapping rules are required in the configuration that define which keystone group an incoming UID is automatically assigned to. This means that groups need to be defined in keystone first, but it also removes the requirement that a domain or project admin assign user roles and project membership on a per-user basis. Instead, groups are used to define project membership and roles and incoming users are automatically mapped to keystone groups based on their upstream group membership. This provides a very consistent role-based access control (RBAC) model based on the upstream identity source. The configuration of this option is fairly straightforward. IdP vendors such as Ping and IBM are contributing to the maintenance of this function and have also produced their own integration documentation. Microsoft Active Directory Federation Services (ADFS) is used for functional testing and future documentation.

In addition to SAML-based IdPs, keystone also supports external authentication with a third-party IdP using the OpenID Connect protocol, by leveraging the capabilities provided by the Apache2 mod_auth_openidc module. The configuration of OpenID Connect is similar to SAML.

The third keystone-supported authentication source is known as Multi-Factor Authentication (MFA). MFA typically requires an external source of authentication beyond a login name and password, and can include options such as SMS text, a temporal token generator, a fingerprint scanner, etc. Each of these types of MFA are usually specific to a particular MFA vendor. The keystone architecture supports an MFA-based authentication system, but this has not yet been certified or documented for SUSE OpenStack Cloud.

5.1.3 Authorization Edit source

The second major function provided by the keystone service is access authorization that determines what resources and actions are available based on the UserID, the role of the user, and the projects that a user is provided access to. All of this information is created, managed, and stored by keystone. These functions are applied via the horizon web interface, the OpenStack command-line interface, or the direct keystone API.

keystone provides support for organizing users via three entities including:

Domains

Domains provide the highest level of organization. Domains are intended to be used as high-level containers for multiple projects. A domain can represent different tenants, companies or organizations for an OpenStack cloud deployed for public cloud deployments or represent major business units, functions, or any other type of top-level organization unit in an OpenStack private cloud deployment. Each domain has at least one Domain Admin assigned to it. This Domain Admin can then create multiple projects within the domain and assign the project admin role to specific project owners. Each domain created in an OpenStack deployment is unique and the projects assigned to a domain cannot exist in another domain.

Projects

Projects are entities within a domain that represent groups of users, each user role within that project, and how many underlying infrastructure resources can be consumed by members of the project.

Groups

Groups are an optional function and provide the means of assigning project roles to multiple users at once.

keystone also provides the means to create and assign roles to groups of users or individual users. The role names are created and user assignments are made within keystone. The actual function of a role is currently defined per OpenStack service via scripts. When a user requests access to an OpenStack service, the access token contains information about the user's assigned project membership and role for that project. This role is then matched to the service-specific script, and the user is allowed to perform the functions within that service defined by the role mapping.

5.2 Supported Upstream Keystone Features Edit source

5.2.1 OpenStack upstream features that are enabled by default in SUSE OpenStack Cloud 9 Edit source

The following supported keystone features are enabled by default in the SUSE OpenStack Cloud 9 release.

Name | User/Admin | Note: API support only. No CLI/UI support
Implied Roles | Admin | https://blueprints.launchpad.net/keystone/+spec/implied-roles
Domain-Specific Roles | Admin | https://blueprints.launchpad.net/keystone/+spec/domain-specific-roles
Fernet Token Provider | User and Admin | https://docs.openstack.org/keystone/rocky/admin/identity-fernet-token-faq.html

Implied roles

To allow for the practice of hierarchical permissions in user roles, this feature enables roles to be linked in such a way that they function as a hierarchy with role inheritance.

When a user is assigned a superior role, the user will also be assigned all roles implied by any subordinate roles. The hierarchy of the assigned roles will be expanded when issuing the user a token.

Domain-specific roles

This feature extends the principle of implied roles to include a set of roles that are specific to a domain. At the time a token is issued, the domain-specific roles are not included in the token, however, the roles that they map to are.

Fernet token provider

Provides tokens in the Fernet format. This feature is automatically configured and is enabled by default. Fernet tokens are preferred and used by default instead of the older UUID token format.

5.2.2 OpenStack upstream features that are disabled by default in SUSE OpenStack Cloud 9 Edit source

The following is a list of features which are fully supported in the SUSE OpenStack Cloud 9 release, but are disabled by default. Customers can run a playbook to enable the features.

Name | User/Admin | Reason Disabled
Support multiple LDAP backends via per-domain configuration | Admin | Needs explicit configuration.
WebSSO | User and Admin | Needs explicit configuration.
keystone-to-keystone (K2K) federation | User and Admin | Needs explicit configuration.
Domain-specific config in SQL | Admin | Domain-specific configuration options can be stored in SQL instead of configuration files, using the new REST APIs.

Multiple LDAP backends for each domain

This feature allows identity backends to be configured on a domain-by-domain basis. Domains will be capable of having their own exclusive LDAP service (or multiple services). A single LDAP service can also serve multiple domains, with each domain in a separate subtree.

To implement this feature, individual domains will require domain-specific configuration files. Domains that do not implement this feature will continue to share a common backend driver.

WebSSO

This feature enables the keystone service to provide federated identity services through a token-based single sign-on page. This feature is disabled by default, as it requires explicit configuration.

keystone-to-keystone (K2K) federation

This feature enables separate keystone instances to federate identities among the instances, offering inter-cloud authorization. This feature is disabled by default, as it requires explicit configuration.

Domain-specific config in SQL

Using the new REST APIs, domain-specific configuration options can be stored in a SQL database instead of in configuration files.

5.2.3 OpenStack upstream features that have been specifically disabled in SUSE OpenStack Cloud 9 Edit source

The following is a list of extensions which are disabled by default in SUSE OpenStack Cloud 9, according to keystone policy.

Target Release | Name | User/Admin | Reason Disabled
TBD | Endpoint Filtering | Admin | This extension was implemented to facilitate service activation. However, due to lack of enforcement at the service side, this feature is currently only half effective.
TBD | Endpoint Policy | Admin | This extension was intended to facilitate policy (policy.json) management and enforcement. It is currently not usable due to the lack of the middleware needed to utilize the policy files stored in keystone.
TBD | OAuth 1.0a | User and Admin | Complexity in workflow and lack of adoption. Its alternative, keystone Trust, is enabled by default; heat uses keystone Trust.
TBD | Revocation Events | Admin | For PKI tokens only, and PKI tokens are disabled by default due to usability concerns.
TBD | OS CERT | Admin | For PKI tokens only, and PKI tokens are disabled by default due to usability concerns.
TBD | PKI Token | Admin | PKI tokens are disabled by default due to usability concerns.
TBD | Driver level caching | Admin | Driver level caching is disabled by default due to complexity in setup.
TBD | Tokenless Authz | Admin | Tokenless authorization with X.509 SSL client certificates.
TBD | TOTP Authentication | User | Not fully mature; has not been battle-tested.
TBD | is_admin_project | Admin | No integration with the services.

5.3 Understanding Domains, Projects, Users, Groups, and Roles Edit source

The identity service uses these concepts for authentication within your cloud and these are descriptions of each of them.

The SUSE OpenStack Cloud 9 identity service uses OpenStack keystone and the concepts of domains, projects, users, groups, and roles to manage authentication. This page describes how these work together.

5.3.1 Domains, Projects, Users, Groups, and Roles Edit source

Most large business organizations use an identity system such as Microsoft Active Directory to store and manage their internal user information. A variety of applications such as HR systems are, in turn, used to manage the data inside of Active Directory. These same organizations often deploy a separate user management system for external users such as contractors, partners, and customers. Multiple authentication systems are then deployed to support multiple types of users.

An LDAP-compatible directory such as Active Directory provides a top-level organization or domain component. In this example, the organization is called Acme. The domain component (DC) is defined as acme.com. Underneath the top level domain component are entities referred to as organizational units (OU). Organizational units are typically designed to reflect the entity structure of the organization. For example, this particular schema has 3 different organizational units for the Marketing, IT, and Contractors units or departments of the Acme organization. Users (and other types of entities like printers) are then defined appropriately underneath each organizational entity. The keystone domain entity can be used to match the LDAP OU entity; each LDAP OU can have a corresponding keystone domain created. In this example, both the Marketing and IT domains represent internal employees of Acme and use the same authentication source. The Contractors domain contains all external people associated with Acme. UserIDs associated with the Contractor domain are maintained in a separate user directory and thus have a different authentication source assigned to the corresponding keystone-defined Contractors domain.

A public cloud deployment usually supports multiple, separate organizations. keystone domains can be created to provide a domain per organization with each domain configured to the underlying organization's authentication source. For example, the ABC company would have a keystone domain created called "abc". All users authenticating to the "abc" domain would be authenticated against the authentication system provided by the ABC organization; in this case ldap://ad.abc.com

5.3.2 Domains Edit source

A domain is a top-level container targeted at defining major organizational entities.

  • Domains can be used in a multi-tenant OpenStack deployment to segregate projects and users from different companies in a public cloud deployment or different organizational units in a private cloud setting.

  • Domains provide the means to identify multiple authentication sources.

  • Each domain is unique within an OpenStack implementation.

  • Multiple projects can be assigned to a domain but each project can only belong to a single domain.

  • Each domain has an assigned "admin".

  • Each project has an assigned "admin".

  • Domains are created by the "admin" service account and domain admins are assigned by the "admin" user.

  • The "admin" UserID (UID) is created during the keystone installation, has the "admin" role assigned to it, and is defined as the "Cloud Admin". This UID is created using the "magic" or "secret" admin token found in the default 'keystone.conf' file installed during SUSE OpenStack Cloud keystone installation after the keystone service has been installed. This secret token should be removed after installation and the "admin" password changed.

  • The "default" domain is created automatically during the SUSE OpenStack Cloud keystone installation.

  • The "default" domain contains all OpenStack service accounts that are installed during the SUSE OpenStack Cloud keystone installation process.

  • No users but the OpenStack service accounts should be assigned to the "default" domain.

  • Domain admins can be any UserID inside or outside of the domain.

5.3.3 Domain Administrator Edit source

A UID is a domain administrator for a given domain if that UID has a domain-scoped token scoped for the given domain. This means that the UID has the "admin" role assigned to it for the selected domain.

  • The Cloud Admin UID assigns the domain administrator role for a domain to a selected UID.

  • A domain administrator can create and delete local users who have authenticated against keystone. These users will be assigned to the domain belonging to the domain administrator who creates the UserID.

  • A domain administrator can only create users and projects within their assigned domains.

  • A domain administrator can assign the "admin" role of their domains to another UID or revoke it; each UID with the "admin" role for a specified domain will be a co-administrator for that domain.

  • A UID can be assigned to be the domain admin of multiple domains.

  • A domain administrator can assign non-admin roles to any users and groups within their assigned domain, including projects owned by their assigned domain.

  • A domain admin UID can belong to projects within their administered domains.

  • Each domain can have a different authentication source.

  • The domain field is used during the initial login to define the source of authentication.

  • The "List Users" function can only be executed by a UID with the domain admin role.

  • A domain administrator can assign a UID from outside of their domain the "domain admin" role, but it is assumed that the domain admin would know the specific UID and would not need to list users from an external domain.

  • A domain administrator can assign a UID from outside of their domain the "project admin" role for a specific project within their domain, but it is assumed that the domain admin would know the specific UID and would not need to list users from an external domain.

  • Any user that needs the ability to create a user in a project should be granted the "admin" role for the domain where the user and the project reside.

  • In order for the horizon Compute › Images panel to properly fill the "Owner" column, any user that is granted the admin role on a project must also be granted the "member" or "admin" role in the domain.

5.3.4 Projects Edit source

The domain administrator creates projects within their assigned domain and assigns the project admin role for each project to a selected UID. A UID is a project administrator for a given project if that UID has a project-scoped token scoped for the given project. There can be multiple projects per domain. The project admin sets the project quota settings, adds and deletes users and groups to and from the project, and defines the user and group roles for the assigned project. Users can belong to multiple projects and have different roles on each project. Users are assigned to a specific domain and a default project. Roles are assigned per project.

5.3.5 Users and Groups Edit source

Each user belongs to one domain only. Domain assignments are defined either by the domain configuration files or by a domain administrator when creating a new, local (user authenticated against keystone) user. There is no current method for "moving" a user from one domain to another. A user can belong to multiple projects within a domain with a different role assignment per project. A group is a collection of users. Users can be assigned to groups either by the project admin or automatically via mappings if an external authentication source is defined for the assigned domain. Groups can be assigned to multiple projects within a domain and have different roles assigned to the group per project. A group can be assigned the "admin" role for a domain or project. All members of the group will be an "admin" for the selected domain or project.

5.3.6 Roles Edit source

Service roles represent the functionality used to implement the OpenStack role-based access control (RBAC) model used to manage access to each OpenStack service. Roles are named and assigned per user or group for each project by the identity service. Role definition and policy enforcement are defined outside of the identity service independently by each OpenStack service. The token generated by the identity service for each user authentication contains the role assigned to that user for a particular project. When a user attempts to access a specific OpenStack service, the role is parsed by the service, compared to the service-specific policy file, and then granted the resource access defined for that role by the service policy file.

Each service has its own service policy file with the /etc/[SERVICE_CODENAME]/policy.json file name format, where [SERVICE_CODENAME] represents a specific OpenStack service name. For example, the OpenStack nova service has a policy file called /etc/nova/policy.json. Service policy files can be modified and deployed to control nodes from the Cloud Lifecycle Manager. Administrators are advised to validate policy changes before committing them to the site branch of the local git repository and rolling them into production. Do not make changes to policy files without having a way to validate them.

The policy files are located at the following site branch locations on the Cloud Lifecycle Manager.

~/openstack/ardana/ansible/roles/GLA-API/templates/policy.json.j2
~/openstack/ardana/ansible/roles/ironic-common/files/policy.json
~/openstack/ardana/ansible/roles/KEYMGR-API/templates/policy.json
~/openstack/ardana/ansible/roles/heat-common/files/policy.json
~/openstack/ardana/ansible/roles/CND-API/templates/policy.json
~/openstack/ardana/ansible/roles/nova-common/files/policy.json
~/openstack/ardana/ansible/roles/CEI-API/templates/policy.json.j2
~/openstack/ardana/ansible/roles/neutron-common/templates/policy.json.j2

For test and validation, policy files can be modified in a non-production environment from the ~/scratch/ directory. For a specific policy file, run a search for policy.json. To deploy policy changes for a service, run the service specific reconfiguration playbook (for example, nova-reconfigure.yml). For a complete list of reconfiguration playbooks, change directories to ~/scratch/ansible/next/ardana/ansible and run this command:

ardana > ls | grep reconfigure

A read-only role named project_observer is explicitly created in SUSE OpenStack Cloud 9. Any user who is granted this role can use list_project.

5.4 Identity Service Token Validation Example Edit source

The following diagram illustrates the flow of typical Identity service (keystone) requests/responses between SUSE OpenStack Cloud services and the Identity service. It shows how keystone issues and validates tokens to ensure the identity of the caller of each service.

  1. horizon sends an HTTP authentication request with the user credentials to keystone.

  2. keystone validates the credentials and replies with a token.

  3. horizon sends a POST request, with the token, to nova to start provisioning a virtual machine.

  4. nova sends the token to keystone for validation.

  5. keystone validates the token.

  6. nova forwards a request for an image to glance with the attached token.

  7. glance sends the token to keystone for validation.

  8. keystone validates the token.

  9. glance provides image-related information to nova.

  10. nova sends a request for networks to neutron with the token.

  11. neutron sends the token to keystone for validation.

  12. keystone validates the token.

  13. neutron provides network-related information to nova.

  14. nova reports the status of the virtual machine provisioning request.

5.5 Configuring the Identity Service Edit source

5.5.1 What is the Identity service? Edit source

The SUSE OpenStack Cloud Identity service, based on the OpenStack keystone API, provides UserID authentication and access authorization to help organizations achieve their access security and compliance objectives and successfully deploy OpenStack. In short, the Identity service is the gateway to the rest of the OpenStack services.

The identity service is installed automatically by the Cloud Lifecycle Manager (just after MySQL and RabbitMQ). When your cloud is up and running, you can customize keystone in a number of ways, including integrating with LDAP servers. This topic describes the default configuration. See Section 5.8, “Reconfiguring the Identity service” for changes you can implement. Also see Section 5.9, “Integrating LDAP with the Identity Service” for information on integrating with an LDAP provider.

5.5.2 Which version of the Identity service should you use? Edit source

You should use Identity API version 3.0. Identity API v2.0 has been deprecated, and many features such as LDAP integration and fine-grained access control do not work with v2.0. The following are a few questions you may have regarding versions.

Why does the keystone identity catalog still show version 2.0?

Tempest tests still use the v2.0 API. They are in the process of migrating to v3.0. We will remove the v2.0 version once tempest has migrated the tests. The Identity catalog has version 2.0 just to support tempest migration.

Will the keystone identity v3.0 API work if the identity catalog has only the v2.0 endpoint?

Identity v3.0 does not rely on the content of the catalog. It will continue to work regardless of the version of the API in the catalog.

Which CLI client should you use?

Use the OpenStack CLI rather than the keystone CLI, because the keystone CLI is deprecated and does not support the v3.0 API; only the OpenStack CLI supports v3.0.

5.5.3 Authentication Edit source

The authentication function provides the initial login function to OpenStack. keystone supports multiple sources of authentication, including a native or built-in authentication system. You can use the keystone native system for all user management functions for proof-of-concept deployments or small deployments not requiring integration with a corporate authentication system, but it lacks some of the advanced functions usually found in user management systems such as forcing password changes. The focus of the keystone native authentication system is to be the source of authentication for OpenStack-specific users required to operate various OpenStack services. These users are stored by keystone in a default domain; the addition of these IDs to an external authentication system is not required.

keystone is more commonly integrated with external authentication systems such as OpenLDAP or Microsoft Active Directory. These systems are usually centrally deployed by organizations to serve as the single source of user management and authentication for all in-house deployed applications and systems requiring user authentication. In addition to LDAP and Microsoft Active Directory, support for integration with Security Assertion Markup Language (SAML)-based identity providers from companies such as Ping, CA, IBM, Oracle, and others is also nearly "production-ready."

keystone also provides architectural support through the underlying Apache deployment for other types of authentication systems, such as multi-factor authentication. These types of systems typically require driver support and integration from the respective providers.

Note
Note

While support for Identity providers and multi-factor authentication is available in keystone, it has not yet been certified by the SUSE OpenStack Cloud engineering team and is an experimental feature in SUSE OpenStack Cloud.

LDAP-compatible directories such as OpenLDAP and Microsoft Active Directory are recommended alternatives to using keystone local authentication. Both methods are widely used by organizations and are integrated with a variety of other enterprise applications. These directories act as the single source of user information within an organization. You can configure keystone to authenticate against an LDAP-compatible directory on a per-domain basis.

Domains, as explained in Section 5.3, “Understanding Domains, Projects, Users, Groups, and Roles”, can be configured so that, based on the user ID, an incoming user is automatically mapped to a specific domain. You can then configure this domain to authenticate against a specific LDAP directory. User credentials provided by the user to keystone are passed along to the designated LDAP source for authentication. You can optionally configure this communication to be secure through SSL encryption. No special LDAP administrative access is required, and only read-only access is needed for this configuration. keystone will not add any LDAP information. All user additions, deletions, and modifications are performed by the application's front end in the LDAP directories. After a user has been successfully authenticated, that user is then assigned to the groups, roles, and projects defined by the keystone domain or project administrators. This information is stored in the keystone service database.

Another form of external authentication provided by the keystone service is through integration with SAML-based identity providers (IdP) such as Ping Identity, IBM Tivoli, and Microsoft Active Directory Federation Server. A SAML-based identity provider provides authentication that is often called "single sign-on." The IdP server is configured to authenticate against identity sources such as Active Directory and provides a single authentication API against multiple types of downstream identity sources. This means that an organization could have multiple identity storage sources but a single authentication source. In addition, if a user has logged into one such source during a defined session time frame, that user does not need to reauthenticate within the defined session. Instead, the IdP automatically validates the user to requesting applications and services.

A SAML-based IdP authentication source is configured with keystone on a per-domain basis similar to the manner in which native LDAP directories are configured. Extra mapping rules are required in the configuration that define which keystone group an incoming UID is automatically assigned to. This means that groups need to be defined in keystone first, but it also removes the requirement that a domain or project administrator assign user roles and project membership on a per-user basis. Instead, groups are used to define project membership and roles and incoming users are automatically mapped to keystone groups based on their upstream group membership. This strategy provides a consistent role-based access control (RBAC) model based on the upstream identity source. The configuration of this option is fairly straightforward. IdP vendors such as Ping and IBM are contributing to the maintenance of this function and have also produced their own integration documentation. HPE is using the Microsoft Active Directory Federation Services (AD FS) for functional testing and future documentation.
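
To illustrate the group-mapping idea, a keystone federation mapping rule has roughly the following shape. This is a schematic sketch only; the remote attribute names, the matched group value, and the keystone group ID are placeholders that depend entirely on your IdP configuration and on groups you have already created in keystone:

[
    {
        "local": [
            {"user": {"name": "{0}"}},
            {"group": {"id": "KEYSTONE_GROUP_ID"}}
        ],
        "remote": [
            {"type": "REMOTE_USER"},
            {"type": "IDP_GROUP_ATTRIBUTE", "any_one_of": ["cloud-operators"]}
        ]
    }
]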

The third keystone-supported authentication source is known as multi-factor authentication (MFA). MFA typically requires an external source of authentication beyond a login name and password, and can include options such as SMS text, a temporal token generator, or a fingerprint scanner. Each of these types of MFAs are usually specific to a particular MFA vendor. The keystone architecture supports an MFA-based authentication system, but this has not yet been certified or documented for SUSE OpenStack Cloud.

5.5.4 Authorization Edit source

Another major function provided by the keystone service is access authorization that determines which resources and actions are available based on the UserID, the role of the user, and the projects that a user is provided access to. All of this information is created, managed, and stored by keystone. These functions are applied through the horizon web interface, the OpenStack command-line interface, or the direct keystone API.

keystone provides support for organizing users by using three entities:

Domains

Domains provide the highest level of organization. Domains are intended to be used as high-level containers for multiple projects. A domain can represent different tenants, companies, or organizations in a public OpenStack cloud deployment, or it can represent major business units, functions, or any other type of top-level organizational unit in an OpenStack private cloud deployment. Each domain has at least one Domain Admin assigned to it. This Domain Admin can then create multiple projects within the domain and assign the project administrator role to specific project owners. Each domain created in an OpenStack deployment is unique, and the projects assigned to a domain cannot exist in another domain.

Projects

Projects are entities within a domain that represent groups of users, each user role within that project, and how many underlying infrastructure resources can be consumed by members of the project.

Groups

Groups are an optional function and provide the means of assigning project roles to multiple users at once.

keystone also makes it possible to create and assign roles to groups of users or individual users. Role names are created and user assignments are made within keystone. The actual function of a role is currently defined by each OpenStack service through its policy file. When users request access to an OpenStack service, their access tokens contain information about their assigned project membership and role for that project. This role is then matched against the service-specific policy file, and users are allowed to perform the functions within that service defined by the role mapping.

5.5.5 Default settings Edit source

Identity service configuration settings

The identity service configuration options are described in the OpenStack documentation at keystone Configuration Options on the OpenStack site.

Default domain and service accounts

The "default" domain is automatically created during the installation to contain the various required OpenStack service accounts, including the following:

admin
barbican
barbican_service
ceilometer
cinder
cinderinternal
demo
designate
glance
glance-check
glance-swift
heat
logging
logging_api
logging_beaver
logging_monitor
magnum
manila
manilainternal
monasca
monasca-agent
monasca_read_only
neutron
nova
nova_monasca
octavia
placement
swift
swift-demo
swift-dispersion
swift-monitor

These are required accounts used by the underlying OpenStack services. They should not be removed or reassigned to a different domain, and the "default" domain should be used only for these service accounts.
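
You can confirm which service accounts exist in the "default" domain with the OpenStack CLI, using the admin credentials from ~/service.osrc (see Section 5.6); the exact list depends on the services deployed in your cloud:

ardana > source ~/service.osrc
ardana > openstack user list --domain default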

5.5.6 Preinstalled roles Edit source

The following are the preinstalled roles. Additional roles can be created by users that have the "admin" role. Roles are defined on a per-service basis (more information is available at Manage projects, users, and roles on the OpenStack website).

admin

The "superuser" role. Provides full access to all SUSE OpenStack Cloud services across all domains and projects. This role should be given only to a cloud administrator.

member

A general role that enables a user to access resources within an assigned project, including creating, modifying, and deleting compute, storage, and network resources.

You can find additional information on these roles in each service policy stored in the /etc/PROJECT/policy.json files where PROJECT is a placeholder for an OpenStack service. For example, the Compute (nova) service roles are stored in the /etc/nova/policy.json file. Each service policy file defines the specific API functions available to a role label.
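
To list the roles currently defined in your cloud, use the OpenStack CLI with the admin credentials from ~/service.osrc (see Section 5.6):

ardana > openstack role list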

5.6 Retrieving the Admin Password Edit source

The admin password is used to access the dashboard and the Operations Console, and to authenticate when using the command-line tools and the API.

In a default SUSE OpenStack Cloud 9 installation, a randomly generated password is created for the Admin user. The following steps show you how to retrieve this password.

5.6.1 Retrieving the Admin Password Edit source

You can retrieve the randomly generated Admin password by using this command on the Cloud Lifecycle Manager:

ardana > cat ~/service.osrc

In this example output, the value for OS_PASSWORD is the Admin password:

ardana > cat ~/service.osrc
unset OS_DOMAIN_NAME
export OS_IDENTITY_API_VERSION=3
export OS_AUTH_VERSION=3
export OS_PROJECT_NAME=admin
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USERNAME=admin
export OS_USER_DOMAIN_NAME=Default
export OS_PASSWORD=SlWSfwxuJY0
export OS_AUTH_URL=https://10.13.111.145:5000/v3
export OS_ENDPOINT_TYPE=internalURL
# OpenstackClient uses OS_INTERFACE instead of OS_ENDPOINT
export OS_INTERFACE=internal
export OS_CACERT=/etc/ssl/certs/ca-certificates.crt
export OS_COMPUTE_API_VERSION=2
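
To verify that these credentials work, source the file and issue a simple read-only request, for example:

ardana > source ~/service.osrc
ardana > openstack project list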

5.7 Changing Service Passwords Edit source

SUSE OpenStack Cloud provides a process for changing the default service passwords, including your admin user password, which you may want to do for security or other purposes.

You can easily change the inter-service passwords used for authenticating communications between services in your SUSE OpenStack Cloud deployment, promoting better compliance with your organization’s security policies. The inter-service passwords that can be changed include (but are not limited to) keystone, MariaDB, RabbitMQ, the Cloud Lifecycle Manager cluster, monasca, and barbican.

The general process for changing the passwords is to:

  • Indicate to the configuration processor which password(s) you want to change, and optionally include the value of that password

  • Run the configuration processor to generate the new passwords (you do not need to run git add before this)

  • Run ready-deployment

  • Check your password name(s) against the tables included below to see which high-level credentials-change playbook(s) you need to run

  • Run the appropriate high-level credentials-change playbook(s)

5.7.1 Password Strength Edit source

Encryption passwords supplied to the configuration processor for use with Ansible Vault and for encrypting the configuration processor’s persistent state must have a minimum length of 12 characters and a maximum of 128 characters. Passwords must contain characters from each of the following three categories:

  • Uppercase characters (A-Z)

  • Lowercase characters (a-z)

  • Base 10 digits (0-9)

Service passwords that are automatically generated by the configuration processor are chosen from the 62 alphanumeric characters (26 uppercase letters, 26 lowercase letters, and 10 digits), with no preference given to any character or set of characters. The minimum and maximum lengths are determined by the specific requirements of individual services.

Important
Important

Currently, you cannot use special characters in Ansible Vault passwords, service passwords, or the vCenter configuration.
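
As a convenience, one way to generate a candidate value containing only letters and digits is sketched below; verify that the result actually contains at least one uppercase letter, one lowercase letter, and one digit before using it:

ardana > openssl rand -base64 32 | tr -dc 'A-Za-z0-9' | head -c 16; echo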

5.7.2 Telling the configuration processor which password(s) you want to change Edit source

In SUSE OpenStack Cloud 9, the configuration processor produces metadata about each of the passwords (and other variables) that it generates in the file ~/openstack/my_cloud/info/private_data_metadata_ccp.yml. A snippet of this file follows:

5.7.3 private_data_metadata_ccp.yml Edit source

metadata_proxy_shared_secret:
  metadata:
  - clusters:
    - cluster1
    component: nova-metadata
    consuming-cp: ccp
    cp: ccp
  version: '2.0'
mysql_admin_password:
  metadata:
  - clusters:
    - cluster1
    component: ceilometer
    consumes: mysql
    consuming-cp: ccp
    cp: ccp
  - clusters:
    - cluster1
    component: heat
    consumes: mysql
    consuming-cp: ccp
    cp: ccp
  - clusters:
    - cluster1
    component: keystone
    consumes: mysql
    consuming-cp: ccp
    cp: ccp
  - clusters:
    - cluster1
    - compute
    component: nova
    consumes: mysql
    consuming-cp: ccp
    cp: ccp
  - clusters:
    - cluster1
    component: cinder
    consumes: mysql
    consuming-cp: ccp
    cp: ccp
  - clusters:
    - cluster1
    component: glance
    consumes: mysql
    consuming-cp: ccp
    cp: ccp
  - clusters:
    - cluster1
    - compute
    component: neutron
    consumes: mysql
    consuming-cp: ccp
    cp: ccp
  - clusters:
    - cluster1
    component: horizon
    consumes: mysql
    consuming-cp: ccp
    cp: ccp
  version: '2.0'
mysql_barbican_password:
  metadata:
  - clusters:
    - cluster1
    component: barbican
    consumes: mysql
    consuming-cp: ccp
    cp: ccp
  version: '2.0'

For each variable, there is a metadata entry for each pair of services that use the variable, including a list of the clusters on which the consuming service component (defined as "component:" in private_data_metadata_ccp.yml above) runs.

Note above that the variable mysql_admin_password is used by a number of service components, and the service that is consumed in each case is mysql, which in this context refers to the MariaDB instance that is part of the product.

5.7.4 Steps to change a password Edit source

First, make sure that you have a copy of private_data_metadata_ccp.yml. If you do not, generate one by running the configuration processor:

ardana > cd ~/openstack/ardana/ansible
ardana > ansible-playbook -i hosts/localhost config-processor-run.yml

Make a copy of the private_data_metadata_ccp.yml file and place it into the ~/openstack/change_credentials directory:

ardana > cp ~/openstack/my_cloud/info/private_data_metadata_ccp.yml \
 ~/openstack/change_credentials/

Edit the copied file in ~/openstack/change_credentials leaving only those passwords you intend to change. All entries in this template file should be deleted except for those passwords.

Important
Important

If you leave other passwords in that file that you do not want to change, they will be regenerated and will no longer match those in use, which could disrupt operations.

Note
Note

You must change passwords in batches, by the categories listed below.

For example, the snippet below would result in the configuration processor generating new random values for keystone_backup_password, keystone_ceilometer_password, and keystone_cinder_password:

keystone_backup_password:
  metadata:
  - clusters:
    - cluster0
    - cluster1
    - compute
    consumes: keystone-api
    consuming-cp: ccp
    cp: ccp
  version: '2.0'
keystone_ceilometer_password:
  metadata:
  - clusters:
    - cluster1
    component: ceilometer-common
    consumes: keystone-api
    consuming-cp: ccp
    cp: ccp
  version: '2.0'
keystone_cinder_password:
  metadata:
  - clusters:
    - cluster1
    component: cinder-api
    consumes: keystone-api
    consuming-cp: ccp
    cp: ccp
  version: '2.0'

5.7.5 Specifying password value Edit source

Optionally, you can specify a value for the password by including a "value:" key and value at the same level as metadata:

keystone_backup_password:
  value: 'new_password'
  metadata:
  - clusters:
    - cluster0
    - cluster1
    - compute
    consumes: keystone-api
    consuming-cp: ccp
    cp: ccp
  version: '2.0'

Note that you can have multiple files in openstack/change_credentials. The configuration processor will only read files that end in .yml or .yaml.

Note
Note

If you have specified a password value in your credential change file, you may want to encrypt it using ansible-vault. If you decide to encrypt with ansible-vault, make sure that you use the encryption key you have already used when running the configuration processor.

To encrypt a file using ansible-vault, execute the following commands, replacing CREDENTIAL_FILE with the name of your credential change file (ending in .yml or .yaml):

ardana > cd ~/openstack/change_credentials
ardana > ansible-vault encrypt CREDENTIAL_FILE

Be sure to provide the encryption key when prompted. Note that if you have specified the wrong ansible-vault password, the configuration processor will error out with a message like the following:

################################################## Reading Persistent State ##################################################

################################################################################
# The configuration processor failed.
# PersistentStateCreds: User-supplied creds file test1.yml was not parsed properly
################################################################################

5.7.6 Running the configuration processor to change passwords Edit source

The directory openstack/change_credentials is not managed by git, so to rerun the configuration processor to generate new passwords and prepare for the next deployment just enter the following commands:

ardana > cd ~/openstack/ardana/ansible
ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
Note
Note

The files that you placed in ~/openstack/change_credentials should be removed once you have run the configuration processor because the old password values and new password values will be stored in the configuration processor's persistent state.

Note that if you see output like the following after running the configuration processor:

################################################################################
# The configuration processor completed with warnings.
# PersistentStateCreds: User-supplied password name 'blah' is not valid
################################################################################

this tells you that the password name you have supplied, 'blah,' does not exist. A failure to correctly parse the credentials change file will result in the configuration processor erroring out with a message like the following:

################################################## Reading Persistent State ##################################################

################################################################################
# The configuration processor failed.
# PersistentStateCreds: User-supplied creds file test1.yml was not parsed properly
################################################################################

Once you have run the configuration processor to change passwords, an information file ~/openstack/my_cloud/info/password_change.yml similar to the private_data_metadata_ccp.yml is written to tell you which passwords have been changed, including metadata but not including the values.
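
You can inspect this file on the Cloud Lifecycle Manager to confirm which credentials were regenerated:

ardana > cat ~/openstack/my_cloud/info/password_change.yml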

5.7.7 Password change playbooks and tables Edit source

Once you have completed the steps above to change password values and prepared for the deployment that will actually switch over to the new passwords, you need to run some high-level playbooks. The passwords that can be changed are grouped into six categories. The tables below list the password names that belong in each category. The categories are:

keystone

Playbook: ardana-keystone-credentials-change.yml

RabbitMQ

Playbook: ardana-rabbitmq-credentials-change.yml

MariaDB

Playbook: ardana-reconfigure.yml

Cluster

Playbook: ardana-cluster-credentials-change.yml

monasca

Playbook: monasca-reconfigure-credentials-change.yml

Other

Playbook: ardana-other-credentials-change.yml

It is recommended that you change passwords in batches; in other words, run through a complete password change process for each batch of passwords, preferably in the above order. Once you have followed the process indicated above to change password(s), check the names against the tables below to see which password change playbook(s) you should run.

Changing identity service credentials

The following table lists identity service credentials you can change.

keystone credentials
Password name
barbican_admin_password
barbican_service_password
keystone_admin_pwd
keystone_ceilometer_password
keystone_cinder_password
keystone_cinderinternal_password
keystone_demo_pwd
keystone_designate_password
keystone_glance_password
keystone_glance_swift_password
keystone_heat_password
keystone_magnum_password
keystone_monasca_agent_password
keystone_monasca_password
keystone_neutron_password
keystone_nova_password
keystone_octavia_password
keystone_swift_dispersion_password
keystone_swift_monitor_password
keystone_swift_password
nova_monasca_password

The playbook to run to change keystone credentials is ardana-keystone-credentials-change.yml. Execute the following commands to make the changes:

ardana > cd ~/scratch/ansible/next/ardana/ansible/
ardana > ansible-playbook -i hosts/verb_hosts ardana-keystone-credentials-change.yml

Changing RabbitMQ credentials

The following table lists the RabbitMQ credentials you can change.

RabbitMQ credentials
Password name
rmq_barbican_password
rmq_ceilometer_password
rmq_cinder_password
rmq_designate_password
rmq_keystone_password
rmq_magnum_password
rmq_monasca_monitor_password
rmq_nova_password
rmq_octavia_password
rmq_service_password

The playbook to run to change RabbitMQ credentials is ardana-rabbitmq-credentials-change.yml. Execute the following commands to make the changes:

ardana > cd ~/scratch/ansible/next/ardana/ansible/
ardana > ansible-playbook -i hosts/verb_hosts ardana-rabbitmq-credentials-change.yml

Changing MariaDB credentials

The following table lists the MariaDB credentials you can change.

MariaDB credentials
Password name
mysql_admin_password
mysql_barbican_password
mysql_clustercheck_pwd
mysql_designate_password
mysql_magnum_password
mysql_monasca_api_password
mysql_monasca_notifier_password
mysql_monasca_thresh_password
mysql_octavia_password
mysql_powerdns_password
mysql_root_pwd
mysql_sst_password
ops_mon_mdb_password
mysql_monasca_transform_password
mysql_nova_api_password
password

The playbook to run to change MariaDB credentials is ardana-reconfigure.yml. To make the changes, execute the following commands:

ardana > cd ~/scratch/ansible/next/ardana/ansible/
ardana > ansible-playbook -i hosts/verb_hosts ardana-reconfigure.yml

Changing cluster credentials

The following table lists the cluster credentials you can change.

cluster credentials
Password name
haproxy_stats_password
keepalive_vrrp_password

The playbook to run to change cluster credentials is ardana-cluster-credentials-change.yml. To make changes, execute the following commands:

ardana > cd ~/scratch/ansible/next/ardana/ansible/
ardana > ansible-playbook -i hosts/verb_hosts ardana-cluster-credentials-change.yml

Changing monasca credentials

The following table lists the monasca credentials you can change.

monasca credentials
Password name
cassandra_monasca_api_password
cassandra_monasca_persister_password

The playbook to run to change monasca credentials is monasca-reconfigure-credentials-change.yml. To make the changes, execute the following commands:

ardana > cd ~/scratch/ansible/next/ardana/ansible/
ardana > ansible-playbook -i hosts/verb_hosts monasca-reconfigure-credentials-change.yml

Changing other credentials

The following table lists the other credentials you can change.

Other credentials
Password name
logging_beaver_password
logging_api_password
logging_monitor_password
logging_kibana_password

The playbook to run to change these credentials is ardana-other-credentials-change.yml. To make the changes, execute the following commands:

ardana > cd ~/scratch/ansible/next/ardana/ansible/
ardana > ansible-playbook -i hosts/verb_hosts ardana-other-credentials-change.yml

5.7.8 Changing RADOS Gateway Credential Edit source

To change the keystone credentials of the RADOS Gateway, follow the preceding steps documented in Section 5.7, “Changing Service Passwords”, modifying the keystone_rgw_password entry in the private_data_metadata_ccp.yml file as described in Section 5.7.4, “Steps to change a password” or Section 5.7.5, “Specifying password value”.
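
For example, the credential change file placed in ~/openstack/change_credentials would contain only an entry of the following shape. Copy the actual keystone_rgw_password entry from your generated private_data_metadata_ccp.yml rather than retyping it; the component and cluster values below are illustrative placeholders:

keystone_rgw_password:
  metadata:
  - clusters:
    - cluster1
    component: rados-gateway   # illustrative; use the values from your metadata file
    consumes: keystone-api
    consuming-cp: ccp
    cp: ccp
  version: '2.0'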

5.7.9 Immutable variables Edit source

The values of certain variables are immutable, which means that once they have been generated by the configuration processor they cannot be changed. These variables are:

  • barbican_master_kek_db_plugin

  • swift_hash_path_suffix

  • swift_hash_path_prefix

  • mysql_cluster_name

  • heartbeat_key

  • erlang_cookie

The configuration processor will not re-generate the values of the above passwords, nor will it allow you to specify a value for them. In addition to the above variables, the following are immutable in SUSE OpenStack Cloud 9:

  • All ssh keys generated by the configuration processor

  • All UUIDs generated by the configuration processor

  • metadata_proxy_shared_secret

  • horizon_secret_key

  • ceilometer_metering_secret

5.8 Reconfiguring the Identity service Edit source

5.8.1 Updating the keystone Identity Service Edit source

This topic explains configuration options for the Identity service.

SUSE OpenStack Cloud lets you perform updates on the following parts of the Identity service configuration:

5.8.2 Updating the Main Identity service Configuration File Edit source

  1. The main keystone Identity service configuration file (/etc/keystone/keystone.conf), located on each control plane server, is generated from the following template file located on a Cloud Lifecycle Manager: ~/openstack/my_cloud/config/keystone/keystone.conf.j2

    Modify this template file as appropriate. See keystone Liberty documentation for full descriptions of all settings. This is a Jinja2 template, which expects certain template variables to be set. Do not change values inside double curly braces: {{ }}.

    Note
    Note

    SUSE OpenStack Cloud 9 has the following token expiration setting, which differs from the upstream value 3600:

    [token]
    expiration = 14400
  2. After you modify the template, commit the change to the local git repository, and rerun the configuration processor / deployment area preparation playbooks (as suggested in Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”):

    ardana > cd ~/openstack
    ardana > git checkout site
    ardana > git add my_cloud/config/keystone/keystone.conf.j2
    ardana > git commit -m "Adjusting some parameters in keystone.conf"
    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  3. Run the reconfiguration playbook in the deployment area:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml
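
After the playbook completes, a quick sanity check is to confirm that keystone still answers requests, for example by listing the identity endpoints with the admin credentials:

ardana > source ~/service.osrc
ardana > openstack endpoint list --service identity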

5.8.3 Enabling Identity Service Features Edit source

To enable or disable keystone features, do the following:

  1. Adjust respective parameters in ~/openstack/my_cloud/config/keystone/keystone_deploy_config.yml

  2. Commit the change into local git repository, and rerun the configuration processor/deployment area preparation playbooks (as suggested in Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”):

    ardana > cd ~/openstack
    ardana > git checkout site
    ardana > git add my_cloud/config/keystone/keystone_deploy_config.yml
    ardana > git commit -m "Adjusting some WSGI or logging parameters for keystone"
    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  3. Run the reconfiguration playbook in the deployment area:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml

5.8.4 Fernet Tokens Edit source

SUSE OpenStack Cloud 9 supports Fernet tokens by default. The benefit of using Fernet tokens is that tokens are not persisted in a database, which is helpful if you want to deploy the keystone Identity service as one master and multiple slaves; only roles, projects, and other details are replicated from master to slaves. The token table is not replicated.

Note
Note

Tempest does not work with Fernet tokens in SUSE OpenStack Cloud 9. If Fernet tokens are enabled, do not run token tests in Tempest.

Note
Note

During reconfiguration when switching to a Fernet token provider or during Fernet key rotation, you may see a warning in keystone.log stating [fernet_tokens] key_repository is world readable: /etc/keystone/fernet-keys/. This is expected. You can safely ignore this message. For other keystone operations, this warning is not displayed. Directory permissions are set to 600 (read/write by owner only), not world readable.

Fernet token-signing key rotation is handled by a cron job configured on one of the controllers. The controller with the Fernet token-signing key rotation cron job is also known as the Fernet Master node. By default, the Fernet token-signing key is rotated once every 24 hours. The Fernet token-signing keys are distributed from the Fernet Master node to the rest of the controllers at each rotation, so the keys remain consistent across all controllers at all times.
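
To inspect the key repository on a controller, list the directory mentioned in the note above (run this on a controller node; root privileges are needed because the directory is readable only by its owner):

ardana > sudo ls -l /etc/keystone/fernet-keys/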

When enabling the Fernet token provider for the first time, specific steps are needed to set up the necessary mechanisms for Fernet token-signing key distribution.

  1. Set keystone_configure_fernet to True in ~/openstack/my_cloud/config/keystone/keystone_deploy_config.yml.

  2. Run the following commands to commit your change in Git and enable Fernet:

    ardana > git add my_cloud/config/keystone/keystone_deploy_config.yml
    ardana > git commit -m "enable Fernet token provider"
    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts keystone-deploy.yml

When the Fernet token provider is enabled, a Fernet Master alarm definition is also created in monasca to monitor the Fernet Master node. If the Fernet Master node is offline or unreachable, a CRITICAL alarm is raised so that the Cloud Admin can take corrective action. If the Fernet Master node is offline for a prolonged period of time, Fernet token-signing key rotation is not performed, which may introduce security risks to the cloud. The Cloud Admin must take immediate action to restore the Fernet Master node.

5.9 Integrating LDAP with the Identity Service Edit source

5.9.1 Integrating with an external LDAP server Edit source

The keystone identity service provides two primary functions: user authentication and access authorization. The user authentication function validates a user's identity. keystone has a very basic user management system that can be used to create and manage user login and password credentials but this system is intended only for proof of concept deployments due to the very limited password control functions. The internal identity service user management system is also commonly used to store and authenticate OpenStack-specific service account information.

The recommended source of authentication is an external user management system such as an LDAP directory service. The identity service can be configured to connect to and use external systems as the source of user authentication. The identity service domain construct is used to define different authentication sources based on domain membership. For example, a cloud deployment could consist of as few as two domains:

  • The default domain that is pre-configured for the service account users that are authenticated directly against the identity service internal user management system

  • A customer-defined domain that contains all user projects and membership definitions. This domain can then be configured to use an external LDAP directory such as Microsoft Active Directory as the authentication source.

SUSE OpenStack Cloud can support multiple domains for deployments that support multiple tenants. Multiple domains can be created with each domain configured to either the same or different external authentication sources. This deployment model is known as a "per-domain" model.

There are currently two ways to configure "per-domain" authentication sources:

  • File store – each domain configuration is created and stored in separate text files. This is the older and current default method for defining domain configurations.

  • Database store – each domain configuration can be created using either the identity service manager utility (recommended) or a Domain Admin API (from OpenStack.org), and the results are stored in the identity service MariaDB database. The database store is a new method introduced in the OpenStack Kilo release and now available in SUSE OpenStack Cloud.

Instructions for initially creating per-domain configuration files and then migrating to the database store method via the identity service manager utility are provided below.

5.9.2 Set up domain-specific driver configuration - file store Edit source

To update configuration to a specific LDAP domain:

  1. Ensure that the following configuration options are in the main configuration file template: ~/openstack/my_cloud/config/keystone/keystone.conf.j2

    [identity]
    domain_specific_drivers_enabled = True
    domain_configurations_from_database = False
  2. Create a YAML file that contains the definition of the LDAP server connection. The sample file below is already provided as part of the Cloud Lifecycle Manager in the Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”. It is available on the Cloud Lifecycle Manager in the following file:

    ~/openstack/my_cloud/config/keystone/keystone_configure_ldap_sample.yml

    Save a copy of this file with a new name, for example:

    ~/openstack/my_cloud/config/keystone/keystone_configure_ldap_my.yml
    Note
    Note

    Please refer to the LDAP section of the keystone configuration example for OpenStack for the full option list and description.

    Below are samples of YAML configurations for identity service LDAP certificate settings, optimized for Microsoft Active Directory server.

    Sample YAML configuration keystone_configure_ldap_my.yml

    ---
    keystone_domainldap_conf:
    
        # CA certificates file content.
        # Certificates are stored in Base64 PEM format. This may be entire LDAP server
        # certificate (in case of self-signed certificates), certificate of authority
        # which issued LDAP server certificate, or a full certificate chain (Root CA
        # certificate, intermediate CA certificate(s), issuer certificate).
        #
        cert_settings:
          cacert: |
            -----BEGIN CERTIFICATE-----
    
            certificate appears here
    
            -----END CERTIFICATE-----
    
        # A domain will be created in MariaDB with this name, and associated with ldap back end.
        # Installer will also generate a config file named /etc/keystone/domains/keystone.<domain_name>.conf
        #
        domain_settings:
          name: ad
          description: Dedicated domain for ad users
    
        conf_settings:
          identity:
             driver: ldap
    
    
          # For a full list and description of ldap configuration options, please refer to
          # https://github.com/openstack/keystone/blob/master/etc/keystone.conf.sample or
          # http://docs.openstack.org/liberty/config-reference/content/keystone-configuration-file.html.
          #
          # Please note:
          #  1. LDAP configuration is read-only. Configuration which performs write operations (i.e. creates users, groups, etc)
          #     is not supported at the moment.
          #  2. LDAP is only supported for identity operations (reading users and groups from LDAP). Assignment
          #     operations with LDAP (i.e. managing roles, projects) are not supported.
          #  3. LDAP is configured as non-default domain. Configuring LDAP as a default domain is not supported.
          #
          ldap:
            url: ldap://ad.hpe.net
            suffix: DC=hpe,DC=net
            query_scope: sub
            user_tree_dn: CN=Users,DC=hpe,DC=net
            user: CN=admin,CN=Users,DC=hpe,DC=net
            password: REDACTED
            user_objectclass: user
            user_id_attribute: cn
            user_name_attribute: cn
            group_tree_dn: CN=Users,DC=hpe,DC=net
            group_objectclass: group
            group_id_attribute: cn
            group_name_attribute: cn
            use_pool: True
            user_enabled_attribute: userAccountControl
            user_enabled_mask: 2
            user_enabled_default: 512
            use_tls: True
            tls_req_cert: demand
            # if you are configuring multiple LDAP domains, and LDAP server certificates are issued
            # by different authorities, make sure that you place certs for all the LDAP backend domains in the
            # cacert parameter as seen in this sample yml file so that all the certs are combined in a single CA file
            # and every LDAP domain configuration points to the combined CA file.
            # Note:
            # 1. Please be advised that every time a new ldap domain is configured, the single CA file gets overwritten
            # and hence ensure that you place certs for all the LDAP backend domains in the cacert parameter.
            # 2. There is a known issue on one cert per CA file per domain when the system processes
            # concurrent requests to multiple LDAP domains. Using the single CA file with all certs combined
            # shall get the system working properly*.
    
            tls_cacertfile: /etc/keystone/ssl/certs/all_ldapdomains_ca.pem
    
            # The issue is in the underlying SSL library. Upstream is not investing in python-ldap package anymore.
            # It is also not python3 compliant.
    keystone_domain_MSAD_conf:
    
        # CA certificates file content.
        # Certificates are stored in Base64 PEM format. This may be entire LDAP server
        # certificate (in case of self-signed certificates), certificate of authority
        # which issued LDAP server certificate, or a full certificate chain (Root CA
        # certificate, intermediate CA certificate(s), issuer certificate).
        #
        cert_settings:
          cacert: |
            -----BEGIN CERTIFICATE-----
    
            certificate appears here
    
            -----END CERTIFICATE-----
    
        # A domain will be created in MariaDB with this name, and associated with ldap back end.
        # Installer will also generate a config file named /etc/keystone/domains/keystone.<domain_name>.conf
        #
        domain_settings:
          name: msad
          description: Dedicated domain for msad users

        conf_settings:
          identity:
            driver: ldap
    
        # For a full list and description of ldap configuration options, please refer to
        # https://github.com/openstack/keystone/blob/master/etc/keystone.conf.sample or
        # http://docs.openstack.org/liberty/config-reference/content/keystone-configuration-file.html.
        #
        # Please note:
        #  1. LDAP configuration is read-only. Configuration which performs write operations (i.e. creates users, groups, etc)
        #     is not supported at the moment.
        #  2. LDAP is only supported for identity operations (reading users and groups from LDAP). Assignment
        #     operations with LDAP (i.e. managing roles, projects) are not supported.
        #  3. LDAP is configured as non-default domain. Configuring LDAP as a default domain is not supported.
        #
        ldap:
          # If the url parameter is set to ldap then typically use_tls should be set to True. If
          # url is set to ldaps, then use_tls should be set to False
          url: ldaps://10.16.22.5
          use_tls: False
          query_scope: sub
          user_tree_dn: DC=l3,DC=local
          # this is the user and password for the account that has access to the AD server
          user: administrator@l3.local
          password: OpenStack123
          user_objectclass: user
          # For a default Active Directory schema this is where to find the user name, openldap uses a different value
          user_id_attribute: userPrincipalName
          user_name_attribute: sAMAccountName
          group_tree_dn: DC=l3,DC=local
          group_objectclass: group
          group_id_attribute: cn
          group_name_attribute: cn
          # An upstream defect requires use_pool to be set false
          use_pool: False
          user_enabled_attribute: userAccountControl
          user_enabled_mask: 2
          user_enabled_default: 512
          tls_req_cert: allow
          # Referals may contain urls that can't be resolved and will cause timeouts, ignore them
          chase_referrals: False
          # if you are configuring multiple LDAP domains, and LDAP server certificates are issued
          # by different authorities, make sure that you place certs for all the LDAP backend domains in the
          # cacert parameter as seen in this sample yml file so that all the certs are combined in a single CA file
          # and every LDAP domain configuration points to the combined CA file.
          # Note:
          # 1. Please be advised that every time a new ldap domain is configured, the single CA file gets overwritten
          # and hence ensure that you place certs for all the LDAP backend domains in the cacert parameter.
          # 2. There is a known issue on one cert per CA file per domain when the system processes
          # concurrent requests to multiple LDAP domains. Using the single CA file with all certs combined
          # shall get the system working properly.
    
          tls_cacertfile: /etc/keystone/ssl/certs/all_ldapdomains_ca.pem
  3. As suggested in Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”, commit the new file to the local git repository, and rerun the configuration processor and ready deployment playbooks:

    ardana > cd ~/openstack
    ardana > git checkout site
    ardana > git add my_cloud/config/keystone/keystone_configure_ldap_my.yml
    ardana > git commit -m "Adding LDAP server integration config"
    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  4. Run the reconfiguration playbook in a deployment area, passing the YAML file created in the previous step as a command-line option:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml -e@~/openstack/my_cloud/config/keystone/keystone_configure_ldap_my.yml
  5. Follow these same steps for each LDAP domain with which you are integrating the identity service, creating a YAML file for each and running the reconfigure playbook once for each additional domain.

  6. Ensure that a new domain was created for LDAP (Microsoft AD in this example) and set environment variables for admin level access

    ardana > source keystone.osrc

    Get a list of domains

    ardana > openstack domain list

    As output here:

    +----------------------------------+---------+---------+----------------------------------------------------------------------+
    | ID                               | Name    | Enabled | Description                                                          |
    +----------------------------------+---------+---------+----------------------------------------------------------------------+
    | 6740dbf7465a4108a36d6476fc967dbd | heat    | True    | Owns users and projects created by heat                              |
    | default                          | Default | True    | Owns users and tenants (i.e. projects) available on Identity API v2. |
    | b2aac984a52e49259a2bbf74b7c4108b | ad      | True    | Dedicated domain for users managed by Microsoft AD server            |
    +----------------------------------+---------+---------+----------------------------------------------------------------------+
    Note
    Note

    LDAP domain is read-only. This means that you cannot create new user or group records in it.

  7. Once the LDAP user is granted the appropriate role, you can authenticate within the specified domain. Set environment variables for admin-level access

    ardana > source keystone.osrc

    Get user record within the ad (Active Directory) domain

    ardana > openstack user show testuser1 --domain ad

    Note the output:

    +-----------+------------------------------------------------------------------+
    | Field     | Value                                                            |
    +-----------+------------------------------------------------------------------+
    | domain_id | 143af847018c4dc7bd35390402395886                                 |
    | id        | e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 |
    | name      | testuser1                                                        |
    +-----------+------------------------------------------------------------------+

    Now, get list of LDAP groups:

    ardana > openstack group list --domain ad

    Here you see testgroup1 and testgroup2:

    +------------------------------------------------------------------+------------+
    | ID                                                               | Name       |
    +------------------------------------------------------------------+------------+
    | 03976b0ea6f54a8e4c0032e8f756ad581f26915c7e77500c8d4aaf0e83afcdc6 | testgroup1 |
    | 7ba52ee1c5829d9837d740c08dffa07ad118ea1db2d70e0dc7fa7853e0b79fcf | testgroup2 |
    +------------------------------------------------------------------+------------+

    Create a new role. Note that the role is not bound to the domain.

    ardana > openstack role create testrole1

    Testrole1 has been created:

    +-------+----------------------------------+
    | Field | Value                            |
    +-------+----------------------------------+
    | id    | 02251585319d459ab847409dea527dee |
    | name  | testrole1                        |
    +-------+----------------------------------+

    Grant the user a role within the domain by executing the code below. Note that due to a current OpenStack CLI limitation, you must use the user ID rather than the user name when working with a non-default domain.

    ardana > openstack role add testrole1 --user e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 --domain ad

    Verify that the role was successfully granted, as shown here:

    ardana > openstack role assignment list --user e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 --domain ad
    +----------------------------------+------------------------------------------------------------------+-------+---------+----------------------------------+
    | Role                             | User                                                             | Group | Project | Domain                           |
    +----------------------------------+------------------------------------------------------------------+-------+---------+----------------------------------+
    | 02251585319d459ab847409dea527dee | e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 |       |         | 143af847018c4dc7bd35390402395886 |
    +----------------------------------+------------------------------------------------------------------+-------+---------+----------------------------------+

    Authenticate (get a domain-scoped token) as a new user with a new role. The --os-* command-line parameters specified below override the respective OS_* environment variables set by the keystone.osrc script to provide admin access. To ensure that the command below is executed in a clean environment, you may want to log out from the node and log in again.

    ardana > openstack --os-identity-api-version 3 \
                --os-username testuser1 \
                --os-password testuser1_password \
                --os-auth-url http://10.0.0.6:35357/v3 \
                --os-domain-name ad \
                --os-user-domain-name ad \
                token issue

    Here is the result:

    +-----------+------------------------------------------------------------------+
    | Field     | Value                                                            |
    +-----------+------------------------------------------------------------------+
    | domain_id | 143af847018c4dc7bd35390402395886                                 |
    | expires   | 2015-09-09T21:36:15.306561Z                                      |
    | id        | 6f8f9f1a932a4d01b7ad9ab061eb0917                                 |
    | user_id   | e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 |
    +-----------+------------------------------------------------------------------+
  8. Users can also have a project within the domain and get a project-scoped token. To accomplish this, set environment variables for admin level access:

    ardana > source keystone.osrc

    Then create a new project within the domain:

    ardana > openstack project create testproject1 --domain ad

    The result shows that it has been created:

    +-------------+----------------------------------+
    | Field       | Value                            |
    +-------------+----------------------------------+
    | description |                                  |
    | domain_id   | 143af847018c4dc7bd35390402395886 |
    | enabled     | True                             |
    | id          | d065394842d34abd87167ab12759f107 |
    | name        | testproject1                     |
    +-------------+----------------------------------+

    Grant the user a role within the project, re-using the role created in the previous example. Note that due to a current OpenStack CLI limitation, you must use the user ID rather than the user name when working with a non-default domain.

    ardana > openstack role add testrole1 --user e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 --project testproject1

    Verify that the role was successfully granted by generating a list:

    ardana > openstack role assignment list --user e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 --project testproject1

    The output shows the result:

    +----------------------------------+------------------------------------------------------------------+-------+----------------------------------+--------+
    | Role                             | User                                                             | Group | Project                          | Domain |
    +----------------------------------+------------------------------------------------------------------+-------+----------------------------------+--------+
    | 02251585319d459ab847409dea527dee | e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 |       | d065394842d34abd87167ab12759f107 |        |
    +----------------------------------+------------------------------------------------------------------+-------+----------------------------------+--------+

    Authenticate (get a project-scoped token) as the new user with a new role. The --os-* command-line parameters specified below override their respective OS_* environment variables set by keystone.osrc to provide admin access. To ensure that the command below is executed in a clean environment, you may want to log out from the node and log in again. Note that both the --os-project-domain-name and --os-user-domain-name parameters are needed to verify that both the user and the project are not in the default domain.

    ardana > openstack --os-identity-api-version 3 \
                --os-username testuser1 \
                --os-password testuser1_password \
                --os-auth-url http://10.0.0.6:35357/v3 \
                --os-project-name testproject1 \
                --os-project-domain-name ad \
                --os-user-domain-name ad \
                token issue

    Below is the result:

    +------------+------------------------------------------------------------------+
    | Field      | Value                                                            |
    +------------+------------------------------------------------------------------+
    | expires    | 2015-09-09T21:50:49.945893Z                                      |
    | id         | 328e18486f69441fb13f4842423f52d1                                 |
    | project_id | d065394842d34abd87167ab12759f107                                 |
    | user_id    | e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 |
    +------------+------------------------------------------------------------------+

5.9.3 Set up or switch to domain-specific driver configuration using a database store Edit source

To make the switch, execute the steps below. Remember, you must have already set up the configuration for a file store as explained in Section 5.9.2, “Set up domain-specific driver configuration - file store”, and it must be working properly.

  1. Ensure that the following configuration options are set in the main configuration file, ~/openstack/my_cloud/config/keystone/keystone.conf.j2:

    [identity]
    domain_specific_drivers_enabled = True
    domain_configurations_from_database = True
    
    [domain_config]
    driver = sql
  2. Once the template is modified, commit the change to the local git repository, and rerun the configuration processor / deployment area preparation playbooks (as suggested at Using Git for Configuration Management):

    ardana > cd ~/openstack
    ardana > git checkout site
    ardana > git add -A

    Verify that the files have been added using git status:

    ardana > git status

    Then commit the changes:

    ardana > git commit -m "Use Domain-Specific Driver Configuration - Database Store: more description here..."

    Next, run the configuration processor and ready deployment playbooks:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  3. Run the reconfiguration playbook in a deployment area:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml
  4. Upload the domain-specific config files to the database if they have not been loaded. If they have already been loaded and you want to switch back to database store mode, then skip this upload step and move on to step 5.

    1. Go to one of the controller nodes where keystone is deployed.

    2. Verify that the domain-specific driver configuration files are located under the configured directory (by default, /etc/keystone/domains) and follow the naming format keystone.<domain name>.conf. Then use the keystone manager utility to load the domain-specific configuration files into the database. There are two options for uploading the files:

      1. Option 1: Upload all configuration files to the SQL database:

        ardana > keystone-manage domain_config_upload --all
      2. Option 2: Upload individual domain-specific configuration files by specifying the domain name one by one:

        ardana > keystone-manage domain_config_upload --domain-name <domain name>

        Here is an example:

        keystone-manage domain_config_upload --domain-name ad

        Note that the keystone manager utility will not upload a domain-specific driver configuration file a second time for the same domain. To manage domain-specific driver configurations in the database store, refer to OpenStack Identity API - Domain Configuration.
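
        To confirm what is stored, you can query the domain configuration API directly. The sketch below reuses the admin credentials from keystone.osrc and the endpoint and domain ID shown in the earlier examples; substitute the values from your own deployment:

        ardana > source ~/keystone.osrc
        ardana > TOKEN=$(openstack token issue -f value -c id)
        ardana > curl -s -H "X-Auth-Token: $TOKEN" \
           http://10.0.0.6:35357/v3/domains/143af847018c4dc7bd35390402395886/config

        The response should contain the [identity] and [ldap] options stored in the database for that domain.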

  5. Verify that the switched domain driver configuration for LDAP (Microsoft AD in this example) in the database store works properly. First, set the environment variables for admin-level access:

    ardana > source ~/keystone.osrc

    Get a list of domain users:

    ardana > openstack user list --domain ad

    Note the three users returned:

    +------------------------------------------------------------------+------------+
    | ID                                                               | Name       |
    +------------------------------------------------------------------+------------+
    | e7dbec51ecaf07906bd743debcb49157a0e8af557b860a7c1dadd454bdab03fe | testuser1  |
    | 8a09630fde3180c685e0cd663427e8638151b534a8a7ccebfcf244751d6f09bd | testuser2  |
    | ea463d778dadcefdcfd5b532ee122a70dce7e790786678961420ae007560f35e | testuser3  |
    +------------------------------------------------------------------+------------+

    Get user records within the ad domain:

    ardana > openstack user show testuser1 --domain ad

    Here testuser1 is returned:

    +-----------+------------------------------------------------------------------+
    | Field     | Value                                                            |
    +-----------+------------------------------------------------------------------+
    | domain_id | 143af847018c4dc7bd35390402395886                                 |
    | id        | e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 |
    | name      | testuser1                                                        |
    +-----------+------------------------------------------------------------------+

    Get a list of LDAP groups:

    ardana > openstack group list --domain ad

    Note that testgroup1 and testgroup2 are returned:

    +------------------------------------------------------------------+------------+
    | ID                                                               | Name       |
    +------------------------------------------------------------------+------------+
    | 03976b0ea6f54a8e4c0032e8f756ad581f26915c7e77500c8d4aaf0e83afcdc6 | testgroup1 |
    | 7ba52ee1c5829d9837d740c08dffa07ad118ea1db2d70e0dc7fa7853e0b79fcf | testgroup2 |
    +------------------------------------------------------------------+------------+
    Note

    The LDAP domain is read-only. This means that you cannot create new user or group records in it.
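
    For example, attempting to create a user in the LDAP-backed domain is expected to be rejected by the identity service (testuser4 below is just an illustrative name):

    ardana > openstack user create --domain ad testuser4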

5.9.4 Switching domain-specific driver configuration from a database store to a file store Edit source

Following is the procedure to switch a domain-specific driver configuration from a database store to a file store. It is assumed that:

  • The domain-specific driver configuration with a database store has been set up and is working properly.

  • Domain-specific driver configuration files with the format: keystone.<domain name>.conf have already been located and verified in the specific directory (by default, /etc/keystone/domains/) on all of the controller nodes.

  1. Ensure that the following configuration options are set in the main configuration file template in ~/openstack/my_cloud/config/keystone/keystone.conf.j2:

    [identity]
    domain_specific_drivers_enabled = True
    domain_configurations_from_database = False
    
    [domain_config]
    # driver = sql
  2. Once the template is modified, commit the change to the local git repository, and rerun the configuration processor / deployment area preparation playbooks (as suggested at Using Git for Configuration Management):

    ardana > cd ~/openstack
    ardana > git checkout site
    ardana > git add -A

    Verify that the files have been added using git status, then commit the changes:

    ardana > git status
    ardana > git commit -m "Domain-Specific Driver Configuration - Switch From Database Store to File Store: more description here..."

    Then run the configuration processor and ready deployment playbooks:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  3. Run the reconfiguration playbook in the deployment area:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml
  4. Verify that the switched domain driver configuration for LDAP (Microsoft AD in this example) using the file store works properly. First, set the environment variables for admin-level access:

    ardana > source ~/keystone.osrc

    Get a list of domain users:

    ardana > openstack user list --domain ad

    Here you see the three users:

    +------------------------------------------------------------------+------------+
    | ID                                                               | Name       |
    +------------------------------------------------------------------+------------+
    | e7dbec51ecaf07906bd743debcb49157a0e8af557b860a7c1dadd454bdab03fe | testuser1  |
    | 8a09630fde3180c685e0cd663427e8638151b534a8a7ccebfcf244751d6f09bd | testuser2  |
    | ea463d778dadcefdcfd5b532ee122a70dce7e790786678961420ae007560f35e | testuser3  |
    +------------------------------------------------------------------+------------+

    Get user records within the ad domain:

    ardana > openstack user show testuser1 --domain ad

    Here is the result:

    +-----------+------------------------------------------------------------------+
    | Field     | Value                                                            |
    +-----------+------------------------------------------------------------------+
    | domain_id | 143af847018c4dc7bd35390402395886                                 |
    | id        | e6d8c90abdc4510621271b73cc4dda8bc6009f263e421d8735d5f850f002f607 |
    | name      | testuser1                                                        |
    +-----------+------------------------------------------------------------------+

    Get a list of LDAP groups:

    ardana > openstack group list --domain ad

    Here are the groups returned:

    +------------------------------------------------------------------+------------+
    | ID                                                               | Name       |
    +------------------------------------------------------------------+------------+
    | 03976b0ea6f54a8e4c0032e8f756ad581f26915c7e77500c8d4aaf0e83afcdc6 | testgroup1 |
    | 7ba52ee1c5829d9837d740c08dffa07ad118ea1db2d70e0dc7fa7853e0b79fcf | testgroup2 |
    +------------------------------------------------------------------+------------+

    Note

    The LDAP domain is read-only. This means that you cannot create new user or group records in it.

5.9.5 Update LDAP CA certificates Edit source

LDAP CA certificates may expire or otherwise stop working. Follow the steps below to update the LDAP CA certificates on the identity service side.

  1. Locate the file keystone_configure_ldap_certs_sample.yml:

    ~/openstack/my_cloud/config/keystone/keystone_configure_ldap_certs_sample.yml
  2. Save a copy of this file with a new name, for example:

    ~/openstack/my_cloud/config/keystone/keystone_configure_ldap_certs_all.yml
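
    For example, the copy can be made on the Cloud Lifecycle Manager as follows, using the file names from this example:

    ardana > cd ~/openstack/my_cloud/config/keystone
    ardana > cp keystone_configure_ldap_certs_sample.yml keystone_configure_ldap_certs_all.yml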
  3. Edit the new file and specify the correct single file path name for the LDAP CA certificates; this path must match the one defined in tls_cacertfile of the domain-specific configuration. Then populate or update the file with the LDAP CA certificates for all LDAP domains.
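
    Before pasting the PEM blocks into the YAML file, you can check the validity period of each certificate with openssl (the file names below are placeholders for your per-domain CA certificates):

    ardana > openssl x509 -noout -enddate -in ad_ca.pem
    ardana > openssl x509 -noout -enddate -in openldap_ca.pem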

  4. As suggested in Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”, add the new file to the local git repository:

    ardana > cd ~/openstack
    ardana > git checkout site
    ardana > git add -A

    Verify that the files have been added using git status and commit the file:

    ardana > git status
    ardana > git commit -m "Update LDAP CA certificates: more description here..."

    Then run the configuration processor and ready deployment playbooks:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  5. Run the reconfiguration playbook in the deployment area:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml -e@~/openstack/my_cloud/config/keystone/keystone_configure_ldap_certs_all.yml

5.9.6 Limitations Edit source

SUSE OpenStack Cloud 9 domain-specific configuration:

  • No Global User Listing: Once domain-specific driver configuration is enabled, listing all users and listing all groups are not supported operations. Those calls require a specific domain filter and a domain-scoped token for the target domain.

  • You cannot have both a file store and a database store for domain-specific driver configuration in a single identity service instance. Once a database store is enabled within the identity service instance, any file store will be ignored, and vice versa.

  • The identity service allows a list limit configuration to globally set the maximum number of entities that will be returned in an identity collection per request but it does not support per-domain list limit setting at this time.

  • Each time a new domain is configured with LDAP integration the single CA file gets overwritten. Ensure that you place certs for all the LDAP back-end domains in the cacert parameter. Detailed CA file inclusion instructions are provided in the comments of the sample YAML configuration file keystone_configure_ldap_my.yml (Section 5.9.2, “Set up domain-specific driver configuration - file store”).

  • LDAP is only supported for identity operations (reading users and groups from LDAP).

  • keystone assignment operations based on LDAP records, such as managing or assigning roles and projects, are not currently supported.

  • The SUSE OpenStack Cloud 'default' domain is pre-configured to store service account users and is authenticated locally against the identity service. Domains configured for external LDAP integration are non-default domains.

  • When using the current OpenStackClient CLI you must use the user ID rather than the user name when working with a non-default domain.

  • Each LDAP connection with the identity service is for read-only operations. Configurations that require identity service write operations (to create users, groups, etc.) are not currently supported.

SUSE OpenStack Cloud 9 API-based domain-specific configuration management:

  • No GUI dashboard is available for domain-specific driver configuration management.

  • The API-based domain-specific configuration does not validate the type of each option.

  • The API-based domain-specific configuration does not validate the supported values of each option.

  • The API-based domain configuration method does not provide retrieval of the default values of domain-specific configuration options.

  • Status: Domain-specific driver configuration database store is a non-core feature for SUSE OpenStack Cloud 9.

Note

When integrating with an external identity provider, cloud security is dependent upon the security of that identity provider. You should examine the security of the identity provider, in particular the SAML 2.0 token generation process, and decide what security properties you need to ensure adequate security of your cloud deployment. More information about SAML can be found at https://www.owasp.org/index.php/SAML_Security_Cheat_Sheet.

5.10 keystone-to-keystone Federation Edit source

This topic explains how you can use one instance of keystone as an identity provider and one as a service provider.

5.10.1 What Is Keystone-to-Keystone Federation? Edit source

Identity federation lets you configure SUSE OpenStack Cloud using existing identity management systems such as an LDAP directory as the source of user access authentication. The keystone-to-keystone federation (K2K) function extends this concept for accessing resources in multiple, separate SUSE OpenStack Cloud clouds. You can configure each cloud to trust the authentication credentials of other clouds to provide the ability for users to authenticate with their home cloud and to access authorized resources in another cloud without having to reauthenticate with the remote cloud. This function is sometimes referred to as "single sign-on" or SSO.

The SUSE OpenStack Cloud cloud that provides the initial user authentication is called the identity provider (IdP). The identity provider cloud can support domain-based authentication against external authentication sources including LDAP-based directories such as Microsoft Active Directory. The identity provider creates the user attributes, known as assertions, which are used to automatically authenticate users with other SUSE OpenStack Cloud clouds.

A SUSE OpenStack Cloud cloud that provides resources is called a service provider (SP). A service provider cloud accepts user authentication assertions from the identity provider and provides access to project resources based on the mapping file settings developed for each service provider cloud. The following are characteristics of a service provider:

  • Each service provider cloud has a unique set of projects, groups, and group role assignments that are created and managed locally.

  • The mapping file consists of a set of rules that define user group membership.

  • The mapping file enables incoming users to be automatically assigned to a specific group. Project membership and access are defined by group membership.

  • Project quotas are defined locally by each service provider cloud.

keystone-to-keystone federation is supported and enabled in SUSE OpenStack Cloud 9 using configuration parameters in specific Ansible files. Instructions are provided to define and enable the required configurations.

Support for keystone-to-keystone federation is provided at the API level; you must implement it in your own client code by calling the supported APIs. python-keystoneclient provides the APIs needed to access the K2K functionality.

Example 5.1: k2kclient.py

The following k2kclient.py file is an example, and the request diagram Figure 5.1, “Keystone Authentication Flow” explains the flow of client requests.

import json
import os
import requests

import xml.dom.minidom

from keystoneclient.auth.identity import v3
from keystoneclient import session

class K2KClient(object):

    def __init__(self):
        # IdP auth URL
        self.auth_url = "http://192.168.245.9:35357/v3/"
        self.project_name = "admin"
        self.project_domain_name = "Default"
        self.username = "admin"
        self.password = "vvaQIZ1S"
        self.user_domain_name = "Default"
        self.session = requests.Session()
        self.verify = False
        # identity provider Id
        self.idp_id = "z420_idp"
        # service provider Id
        self.sp_id = "z620_sp"
        #self.sp_ecp_url = "https://16.103.149.44:8443/Shibboleth.sso/SAML2/ECP"
        #self.sp_auth_url = "https://16.103.149.44:8443/v3"

    def v3_authenticate(self):
        auth = v3.Password(auth_url=self.auth_url,
                           username=self.username,
                           password=self.password,
                           user_domain_name=self.user_domain_name,
                           project_name=self.project_name,
                           project_domain_name=self.project_domain_name)

        self.auth_session = session.Session(session=requests.session(),
                                       auth=auth, verify=self.verify)
        auth_ref = self.auth_session.auth.get_auth_ref(self.auth_session)
        self.token = self.auth_session.auth.get_token(self.auth_session)

    def _generate_token_json(self):
        return {
            "auth": {
                "identity": {
                    "methods": [
                        "token"
                    ],
                    "token": {
                        "id": self.token
                    }
                },
                "scope": {
                    "service_provider": {
                        "id": self.sp_id
                    }
                }
            }
        }

    def get_saml2_ecp_assertion(self):
        token = json.dumps(self._generate_token_json())
        url = self.auth_url + 'auth/OS-FEDERATION/saml2/ecp'
        r = self.session.post(url=url,
                              data=token,
                              verify=self.verify)
        if not r.ok:
            raise Exception("Something went wrong, %s" % r.__dict__)
        self.ecp_assertion = r.text

    def _get_sp_url(self):
        url = self.auth_url + 'OS-FEDERATION/service_providers/' + self.sp_id
        r = self.auth_session.get(
           url=url,
           verify=self.verify)
        if not r.ok:
            raise Exception("Something went wrong, %s" % r.__dict__)

        sp = json.loads(r.text)[u'service_provider']
        self.sp_ecp_url = sp[u'sp_url']
        self.sp_auth_url = sp[u'auth_url']

    def _handle_http_302_ecp_redirect(self, response, method, **kwargs):
        location = self.sp_auth_url + '/OS-FEDERATION/identity_providers/' + self.idp_id + '/protocols/saml2/auth'
        return self.auth_session.request(location, method, authenticated=False, **kwargs)

    def exchange_assertion(self):
        """Send assertion to a keystone SP and get token."""
        self._get_sp_url()
        print("SP ECP Url:%s" % self.sp_ecp_url)
        print("SP Auth Url:%s" % self.sp_auth_url)
        #self.sp_ecp_url = 'https://16.103.149.44:8443/Shibboleth.sso/SAML2/ECP'
        r = self.auth_session.post(
            self.sp_ecp_url,
            headers={'Content-Type': 'application/vnd.paos+xml'},
            data=self.ecp_assertion,
            authenticated=False, redirect=False)
        r = self._handle_http_302_ecp_redirect(r, 'GET',
            headers={'Content-Type': 'application/vnd.paos+xml'})
        self.fed_token_id = r.headers['X-Subject-Token']
        self.fed_token = r.text

if __name__ == "__main__":
    client = K2KClient()
    client.v3_authenticate()
    client.get_saml2_ecp_assertion()
    client.exchange_assertion()
    print('Unscoped token_id: %s' % client.fed_token_id)
    print('Unscoped token body:\n%s' % client.fed_token)

5.10.2 Setting Up a keystone Provider Edit source

To set up keystone as a service provider, follow these steps.

  1. Create a config file called k2k.yml with the following parameters and place it in any directory on your Cloud Lifecycle Manager, such as /tmp.

    keystone_trusted_idp: k2k
    keystone_sp_conf:
      shib_sso_idp_entity_id: <protocol>://<idp_host>:<port>/v3/OS-FEDERATION/saml2/idp
      shib_sso_application_entity_id: http://service_provider_uri_entityId
      target_domain:
        name: domain1
        description: my domain
      target_project:
        name: project1
        description: my project
      target_group:
        name: group1
        description: my group
      role:
        name: service
      idp_metadata_file: /tmp/idp_metadata.xml
      identity_provider:
        id: my_idp_id
        description: This is the identity service provider.
      mapping:
        id: mapping1
        rules_file: /tmp/k2k_sp_mapping.json
      protocol:
        id: saml2
      attribute_map:
        -
          name: name1
          id: id1

    The following are descriptions of each of the attributes.

    keystone_trusted_idp

    A flag to indicate whether this configuration is used for keystone-to-keystone or WebSSO. The value can be either k2k or adfs.

    keystone_sp_conf

    The parent section containing the service provider configuration options below.

    shib_sso_idp_entity_id

    The identity provider URI used as an entity Id to identify the IdP. You should use the following value: <protocol>://<idp_host>:<port>/v3/OS-FEDERATION/saml2/idp.

    shib_sso_application_entity_id

    The service provider URI used as an entity Id. It can be any URI here for keystone-to-keystone.

    target_domain

    A domain where the group will be created.

    name

    Any domain name. If it does not exist, it will be created or updated.

    description

    Any description.

    target_project

    A project scope of the group.

    name

    Any project name. If it does not exist, it will be created or updated.

    description

    Any description.

    target_group

    A group that will be created in target_domain.

    name

    Any group name. If it does not exist, it will be created or updated.

    description

    Any description.

    role

    A role that will be assigned on target_project. This role impacts the IdP user's scoped token permission on the service provider side.

    name

    Must be an existing role.

    idp_metadata_file

    A reference to the IdP metadata file that validates the SAML2 assertion.

    identity_provider

    A supported IdP.

    id

    Any Id. If it does not exist, it will be created or updated. This Id needs to be shared with the client so that the right mapping will be selected.

    description

    Any description.

    mapping

    A mapping in JSON format that maps a federated user to a corresponding group.

    id

    Any Id. If it does not exist, it will be created or updated.

    rules_file

    A reference to the file that has the mapping in JSON.

    protocol

    The supported federation protocol.

    id

    Security Assertion Markup Language 2.0 (SAML2) is the only supported protocol for K2K.

    attribute_map

    A shibboleth mapping that defines additional attributes to map the attributes from the SAML2 assertion to the K2K mapping that the service provider understands. K2K does not require any additional attribute mapping.

    name

    An attribute name from the SAML2 assertion.

    id

    An Id that the preceding name will be mapped to.
  2. Create a metadata file that is referenced from k2k.yml, such as /tmp/idp_metadata.xml. The content of the metadata file comes from the identity provider and can be found in /etc/keystone/idp_metadata.xml.

    1. Create a mapping file that is referenced in k2k.yml, shown previously, for example /tmp/k2k_sp_mapping.json (see the rules_file entry in the preceding k2k.yml example). The following is an example of the mapping file.

      [
        {
          "local": [
            {
              "user": {
                "name": "{0}"
              }
            },
            {
              "group": {
                 "name": "group1",
                 "domain":{
                   "name": "domain1"
                 }
              }
            }
          ],
          "remote":[{
            "type": "openstack_user"
          },
          {
            "type": "Shib-Identity-Provider",
            "any_one_of":[
               "https://idp_host:5000/v3/OS-FEDERATION/saml2/idp"
            ]
           }
          ]
         }
      ]

      You can find more information on how the K2K mapping works at http://docs.openstack.org.

  3. Go to ~/scratch/ansible/next/ardana/ansible and run the following playbook to enable the service provider:

    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml -e@/tmp/k2k.yml
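
    After the playbook completes, you can check that the federation resources defined in k2k.yml were created on the service provider. This is a minimal check using admin credentials; the names shown are the ones from the example configuration above:

    ardana > source ~/keystone.osrc
    ardana > openstack identity provider list
    ardana > openstack mapping list
    ardana > openstack group list --domain domain1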

Setting Up an Identity Provider

To set up keystone as an identity provider, follow these steps:

  1. Create a config file k2k.yml with the following parameters and place it in any directory on your Cloud Lifecycle Manager, such as /tmp. Note that the certificate and key here are excerpted for space.

    keystone_k2k_idp_conf:
        service_provider:
              -
                id: my_sp_id
                description: This is service provider.
                sp_url: https://sp_host:5000
                auth_url: https://sp_host:5000/v3
        signer_cert: -----BEGIN CERTIFICATE-----
    MIIDmDCCAoACCQDS+ZDoUfr
        cIzANBgkqhkiG9w0BAQsFADCBjDELMAkGA1UEBhMC\ nVVMxEzARBgNVB
        AgMCkNhbGlmb3JuaWExEjAQBgNVBAcMCVN1bm55dmFsZTEMMAoG\
       
                ...
        nOpKEvhlMsl5I/tle
    -----END CERTIFICATE-----
        signer_key: -----BEGIN RSA PRIVATE KEY-----
    MIIEowIBAAKCAQEA1gRiHiwSO6L5PrtroHi/f17DQBOpJ1KMnS9FOHS
                
                ...

    The following are descriptions of each of the attributes under keystone_k2k_idp_conf.

    service_provider

    One or more service providers can be defined. If it does not exist, it will be created or updated.

    id

    Any Id. If it does not exist, it will be created or updated. This Id needs to be shared with the client so that it knows where the service provider is.

    description

    Any description.

    sp_url

    Service provider base URL.

    auth_url

    Service provider auth URL.

    signer_cert

    Content of self-signed certificate that is embedded in the metadata file. We recommend setting the validity for a longer period of time, such as 3650 days (10 years).

    signer_key

    A private key that has a key size of 2048 bits.

  2. Create a private key and a self-signed certificate. The command-line tool, openssl, is required to generate the keys and certificates. If the system does not have it, you must install it.

    1. Create a private key of size 2048.

      ardana > openssl genrsa -out myidp.key 2048
    2. Generate a certificate request named myidp.csr. When prompted for the Common Name, enter the server's hostname.

      ardana > openssl req -new -key myidp.key -out myidp.csr
    3. Generate a self-signed certificate named myidp.cer.

      ardana > openssl x509 -req -days 3650 -in myidp.csr -signkey myidp.key -out myidp.cer
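
      Optionally, you can confirm the subject and validity period of the generated certificate before embedding it in k2k.yml:

      ardana > openssl x509 -in myidp.cer -noout -subject -dates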
  3. Go to ~/scratch/ansible/next/ardana/ansible and run the following playbook to enable the service provider in keystone:

    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml -e@/tmp/k2k.yml
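
    After the playbook completes, you can verify that the service provider was registered on the identity provider side (a minimal check using admin credentials):

    ardana > source ~/keystone.osrc
    ardana > openstack service provider list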

5.10.3 Test It Out Edit source

You can use the script listed earlier, k2kclient.py (Example 5.1, “k2kclient.py”), as an example for the end-to-end flows. To run k2kclient.py, follow these steps:

  1. A few parameters must be changed in the beginning of k2kclient.py. For example, enter your specific URL, project name, and user name, as follows:

    # IdP auth URL
    self.auth_url = "http://idp_host:5000/v3/"
    self.project_name = "my_project_name"
    self.project_domain_name = "my_project_domain_name"
    self.username = "test"
    self.password = "mypass"
    self.user_domain_name = "my_domain"
    # identity provider Id that is defined in the SP config
    self.idp_id = "my_idp_id"
    # service provider Id that is defined in the IdP config
    self.sp_id = "my_sp_id"
  2. Install python-keystoneclient along with its dependencies.

  3. Run the k2kclient.py script. An unscoped token will be returned from the service provider.

At this point, the domain or project scope of the unscoped token can be discovered by sending the following requests:

ardana > curl -k -X GET -H "X-Auth-Token: unscoped token" \
 https://<sp_public_endpoint>:5000/v3/OS-FEDERATION/domains
ardana > curl -k -X GET -H "X-Auth-Token: unscoped token" \
 https://<sp_public_endpoint>:5000/v3/OS-FEDERATION/projects
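
Once you have chosen a project from the list, the unscoped token can be exchanged for a project-scoped token by sending a standard keystone v3 authentication request with the token method. This is a sketch; replace the placeholders with values returned above. The scoped token is returned in the X-Subject-Token response header:

ardana > curl -k -X POST https://<sp_public_endpoint>:5000/v3/auth/tokens \
 -H "Content-Type: application/json" \
 -d '{"auth": {"identity": {"methods": ["token"], "token": {"id": "<unscoped token>"}},
      "scope": {"project": {"id": "<project id>"}}}}'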

5.10.4 Inside keystone-to-keystone Federation Edit source

K2K federation places a lot of responsibility with the user. The complexity is apparent from the following diagram.

  1. Users must first authenticate to their home or local cloud, or local identity provider keystone instance to obtain a scoped token.

  2. Users must discover which service providers (or remote clouds) are available to them by querying their local cloud.

  3. For a given remote cloud, users must discover which resources are available to them by querying the remote cloud for the projects they can scope to.

  4. To talk to the remote cloud, users must first exchange, with the local cloud, their locally scoped token for a SAML2 assertion to present to the remote cloud.

  5. Users then present the SAML2 assertion to the remote cloud. The remote cloud applies its mapping for the incoming SAML2 assertion to map each user to a local ephemeral persona (such as groups) and issues an unscoped token.

  6. Users query the remote cloud for the list of projects they have access to.

  7. Users then rescope their token to a given project.

  8. Users now have access to the resources owned by the project.

The following diagram illustrates the flow of authentication requests.

Figure 5.1: Keystone Authentication Flow

5.10.5 Additional Testing Scenarios Edit source

The following tests assume one identity provider and one service provider.

Test Case 1: Any federated user in the identity provider maps to a single designated group in the service provider

  1. On the identity provider side:

    hostname=myidp.com
    username=user1
  2. On the service provider side:

    group=group1
    group_domain_name=domain1
    'group1' scopes to 'project1'
  3. Mapping used:

    testcase1_1.json

    [
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group1",
               "domain":{
                 "name": "domain1"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       }
    ]
  4. Expected result: The federated user will scope to project1.

Test Case 2: A federated user in a specific domain in the identity provider maps to two different groups in the service provider

  1. On the identity provider side:

    hostname=myidp.com
    username=user1
    user_domain_name=Default
  2. On the service provider side:

    group=group1
    group_domain_name=domain1
    'group1' scopes to 'project1'
    group=group2
    group_domain_name=domain2
    'group2' scopes to 'project2'
  3. Mapping used:

    testcase1_2.json

    [
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group1",
               "domain":{
                 "name": "domain1"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       },
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group2",
               "domain":{
                 "name": "domain2"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "openstack_user_domain",
          "any_one_of": [
              "Default"
          ]
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       }
    ]
  4. Expected result: The federated user will scope to both project1 and project2.

Test Case 3: A federated user with a specific project in the identity provider maps to a specific group in the service provider

  1. On the identity provider side:

    hostname=myidp.com
    username=user4
    user_project_name=test1
  2. On the service provider side:

    group=group4
    group_domain_name=domain4
    'group4' scopes to 'project4'
  3. Mapping used:

    testcase1_3.json

    [
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group4",
               "domain":{
                 "name": "domain4"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "openstack_project",
          "any_one_of": [
              "test1"
          ]
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       },
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group5",
               "domain":{
                 "name": "domain5"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "openstack_roles",
          "not_any_of": [
              "member"
          ]
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       }
    ]
  4. Expected result: The federated user will scope to project4.

Test Case 4: A federated user with a specific role in the identity provider maps to a specific group in the service provider

  1. On the identity provider side:

    hostname=myidp.com, username=user5, role_name=member
  2. On the service provider side:

    group=group5, group_domain_name=domain5, 'group5' scopes to 'project5'
  3. Mapping used:

    testcase1_3.json

    [
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group4",
               "domain":{
                 "name": "domain4"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "openstack_project",
          "any_one_of": [
              "test1"
          ]
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       },
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group5",
               "domain":{
                 "name": "domain5"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "openstack_roles",
          "not_any_of": [
              "member"
          ]
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       }
    ]
  4. Expected result: The federated user will scope to project5.

Test Case 5: Retain the previous scope for a federated user

  1. On the identity provider side:

    hostname=myidp.com, username=user1, user_domain_name=Default
  2. On the service provider side:

    group=group1, group_domain_name=domain1, 'group1' scopes to 'project1'
  3. Mapping used:

    testcase1_1.json

    [
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group1",
               "domain":{
                 "name": "domain1"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       }
    ]
  4. Expected result: The federated user will scope to project1. Later, we would like to scope federated users who have the default domain in the identity provider to project2 in addition to project1.

  5. On the identity provider side:

    hostname=myidp.com, username=user1, user_domain_name=Default
  6. On the service provider side:

    group=group1
    group_domain_name=domain1
    'group1' scopes to 'project1'
    group=group2
    group_domain_name=domain2
    'group2' scopes to 'project2'
  7. Mapping used:

    testcase1_2.json

    [
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group1",
               "domain":{
                 "name": "domain1"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       },
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group2",
               "domain":{
                 "name": "domain2"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "openstack_user_domain",
          "any_one_of": [
              "Default"
          ]
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       }
    ]
  8. Expected result: The federated user will scope to project1 and project2.

Test Case 6: Scope a federated user to a domain

  1. On the identity provider side:

    hostname=myidp.com, username=user1
  2. On the service provider side:

    group=group1, group_domain_name=domain1, 'group1' scopes to 'project1'
  3. Mapping used:

    testcase1_1.json

    [
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group1",
               "domain":{
                 "name": "domain1"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       }
    ]
  4. Expected result:

    • The federated user will scope to project1.

    • User uses CLI/Curl to assign any existing role to group1 on domain1.

    • User uses CLI/Curl to remove project1 scope from group1.

  5. Final result: The federated user will scope to domain1.

Test Case 7: Test five remote attributes for mapping

  1. Test all five different remote attributes, as follows, with similar test cases as noted previously.

    • openstack_user

    • openstack_user_domain

    • openstack_roles

    • openstack_project

    • openstack_project_domain

    The attribute openstack_user does not make much sense for testing because it is mapped only to a specific username. The preceding test cases have already covered the attributes openstack_user_domain, openstack_roles, and openstack_project.

Note that similar tests have also been run for two identity providers with one service provider, and for one identity provider with two service providers.

5.10.6 Known Issues and Limitations Edit source

Keep the following points in mind:

  • When a user is disabled in the identity provider, the issued federated token from the service provider still remains valid until the token is expired based on the keystone expiration setting.

  • An already issued federated token will retain its scope until its expiration. Any changes in the mapping on the service provider will not impact the scope of an already issued federated token. For example, if an already issued federated token was mapped to group1, which has scope on project1, and the mapping is changed to group2, which has scope on project2, the previously issued federated token still has scope on project1.

  • Access to service provider resources is provided only through the python-keystoneclient CLI or the keystone API. No horizon web interface support is currently available.

  • Domains, projects, groups, roles, and quotas are created per the service provider cloud. Support for federated projects, groups, roles, and quotas is currently not available.

  • keystone-to-keystone federation and WebSSO cannot be configured by putting both sets of configuration attributes in the same config file; they will overwrite each other. Consequently, they need to be configured individually.

  • Scoping the federated user to a domain is not supported by default in the playbook. Please follow the steps at Section 5.10.7, “Scope Federated User to Domain”.

5.10.7 Scope Federated User to Domain Edit source

Use the following steps to scope a federated user to a domain:

  1. On the IdP side, set hostname=myidp.com and username=user1.

  2. On the service provider side, set: group=group1, group_domain_name=domain1, group1 scopes to project1.

  3. Mapping used: testcase1_1.json.

    testcase1_1.json

    [
      {
        "local": [
          {
            "user": {
              "name": "{0}"
            }
          },
          {
            "group": {
               "name": "group1",
               "domain":{
                 "name": "domain1"
               }
            }
          }
        ],
        "remote":[{
          "type": "openstack_user"
        },
        {
          "type": "Shib-Identity-Provider",
          "any_one_of":[
             "https://myidp.com:5000/v3/OS-FEDERATION/saml2/idp"
          ]
         }
        ]
       }
    ]
  4. Expected result: The federated user will scope to project1. Use the CLI or curl to assign any existing role to group1 on domain1, and to remove the project1 scope from group1 (example commands are sketched after this procedure).

  5. Result: The federated user will scope to domain1.
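
The CLI commands referenced in step 4 could look like the following sketch. The role name member is a placeholder for any existing role in your deployment; the group, domain, and project names match the mapping above:

ardana > openstack role add member --group group1 --group-domain domain1 --domain domain1
ardana > openstack role remove member --group group1 --group-domain domain1 --project project1 --project-domain domain1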

5.11 Configuring Web Single Sign-On Edit source

Important

The external-name in ~/openstack/my_cloud/definition/data/network_groups.yml must be set to a valid DNS-resolvable FQDN.

This topic explains how to implement web single sign-on.

5.11.1 What is WebSSO? Edit source

WebSSO, or web single sign-on, is a method for web browsers to receive current authentication information from an identity provider system without requiring a user to log in again to the application displayed by the browser. Users initially access the identity provider web page and supply their credentials. If the user successfully authenticates with the identity provider, the authentication credentials are then stored in the user’s web browser and automatically provided to all web-based applications, such as the horizon dashboard in SUSE OpenStack Cloud 9. If users have not yet authenticated with an identity provider or their credentials have timed out, they are automatically redirected to the identity provider to renew their credentials.

5.11.2 Limitations Edit source

  • The WebSSO function supports only horizon web authentication. It is not supported for direct API or CLI access.

  • WebSSO works only with the Fernet token provider. See Section 5.8.4, “Fernet Tokens”.

  • The SUSE OpenStack Cloud WebSSO function was tested with Microsoft Active Directory Federation Services (AD FS). The instructions provided are pertinent to ADFS and are intended to provide a sample configuration for deploying WebSSO with an external identity provider. If you have a different identity provider such as Ping Identity or IBM Tivoli, consult with those vendors for specific instructions for those products.

  • The SUSE OpenStack Cloud WebSSO function with OpenID method was tested with Google OAuth 2.0 APIs, which conform to the OpenID Connect specification. The interaction between Keystone and the external Identity Provider (IdP) is handled by the Apache2 auth_openidc module. Please consult with the specific OpenID Connect vendor on whether they support auth_openidc.

  • Both SAML and OpenID methods are supported for WebSSO federation in SUSE OpenStack Cloud 9.

  • WebSSO has a change password option in User Settings, but note that this function is not accessible for users authenticating with external systems such as LDAP or SAML Identity Providers.

5.11.3 Enabling WebSSO Edit source

SUSE OpenStack Cloud 9 provides WebSSO support for the horizon web interface. This support requires several configuration steps including editing the horizon configuration file as well as ensuring that the correct keystone authentication configuration is enabled to receive the authentication assertions provided by the identity provider.

WebSSO supports both the SAML and OpenID methods. The following workflow depicts how horizon and keystone support WebSSO via the SAML method if no current authentication assertion is available.

  1. horizon redirects the web browser to the keystone endpoint.

  2. keystone automatically redirects the web browser to the correct identity provider authentication web page based on the keystone configuration file.

  3. The user authenticates with the identity provider.

  4. The identity provider automatically redirects the web browser back to the keystone endpoint.

  5. keystone generates the required Javascript code to POST a token back to horizon.

  6. keystone automatically redirects the web browser back to horizon and the user can then access projects and resources assigned to the user.

The following diagram provides more details on the WebSSO authentication workflow.

Note that the horizon dashboard service never talks directly to the keystone identity service until the end of the sequence, after the federated unscoped token negotiation has completed. The browser interacts with the horizon dashboard service, the keystone identity service, and ADFS on their respective public endpoints.

The following sequence of events is depicted in the diagram.

  1. The user's browser reaches the horizon dashboard service's login page. The user selects ADFS login from the drop-down menu.

  2. The horizon dashboard service issues an HTTP Redirect (301) to redirect the browser to the keystone identity service's (public) SAML2 Web SSO endpoint (/auth/OS-FEDERATION/websso/saml2). The endpoint is protected by Apache mod_shib (shibboleth).

  3. The browser talks to the keystone identity service. Because the user's browser does not have an active session with AD FS, the keystone identity service issues an HTTP Redirect (301) to the browser, along with the required SAML2 request, to the ADFS endpoint.

  4. The browser talks to AD FS. ADFS returns a login form. The browser presents it to the user.

  5. The user enters credentials (such as username and password) and submits the form to AD FS.

  6. Upon successful validation of the user's credentials, ADFS issues an HTTP Redirect (301) to the browser, along with the SAML2 assertion, to the keystone identity service's (public) SAML2 endpoint (/auth/OS-FEDERATION/websso/saml2).

  7. The browser talks to the keystone identity service. the keystone identity service validates the SAML2 assertion and issues a federated unscoped token. the keystone identity service returns JavaScript code to be executed by the browser, along with the federated unscoped token in the headers.

  8. Upon execution of the JavaScript code, the browser is redirected to the horizon dashboard service with the federated unscoped token in the header.

  9. The browser talks to the horizon dashboard service with the federated unscoped token.

  10. With the unscoped token, the horizon dashboard service talks to the keystone identity service's (internal) endpoint to get a list of projects the user has access to.

  11. The horizon dashboard service rescopes the token to the first project in the list. At this point, the user is successfully logged in.

The sequence of events for WebSSO using the OpenID method is similar to that of the SAML method.

5.11.4 Prerequisites Edit source

5.11.4.1 WebSSO Using SAML Method Edit source

5.11.4.1.1 Creating ADFS metadata Edit source

For information about creating Active Directory Federation Services metadata, see the section To create edited ADFS 2.0 metadata with an added scope element of https://technet.microsoft.com/en-us/library/gg317734.

  1. On the ADFS computer, use a browser such as Internet Explorer to view https://<adfs_server_hostname>/FederationMetadata/2007-06/FederationMetadata.xml.

  2. On the File menu, click Save as, and then navigate to the Windows desktop and save the file with the name adfs_metadata.xml. Make sure to change the Save as type drop-down box to All Files (*.*).

  3. Use Windows Explorer to navigate to the Windows desktop, right-click adfs_metadata.xml, and then click Edit.

  4. In Notepad, insert the following XML in the first element. Before editing, the EntityDescriptor appears as follows:

    <EntityDescriptor ID="abc123" entityID=http://WIN-CAICP35LF2I.vlan44.domain/adfs/services/trust xmlns="urn:oasis:names:tc:SAML:2.0:metadata" >

    After editing, it should look like this:

    <EntityDescriptor ID="abc123" entityID="http://WIN-CAICP35LF2I.vlan44.domain/adfs/services/trust" xmlns="urn:oasis:names:tc:SAML:2.0:metadata" xmlns:shibmd="urn:mace:shibboleth:metadata:1.0">
  5. In Notepad, on the Edit menu, click Find. In Find what, type IDPSSO, and then click Find Next.

  6. Insert the following XML in this section: Before editing, the IDPSSODescriptor appears as follows:

    <IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"><KeyDescriptor use="encryption">

    After editing, it should look like this:

    <IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"><Extensions><shibmd:Scope regexp="false">vlan44.domain</shibmd:Scope></Extensions><KeyDescriptor use="encryption">
  7. Delete the metadata document signature section of the file (the ds:Signature element shown in the following code). Because you have edited the document, the signature will now be invalid. Before editing, the signature appears as follows:

    <EntityDescriptor ID="abc123" entityID="http://FSWEB.contoso.com/adfs/services/trust" xmlns="urn:oasis:names:tc:SAML:2.0:metadata" xmlns:shibmd="urn:mace:shibboleth:metadata:1.0">
    <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
        SIGNATURE DATA
    </ds:Signature>
    <RoleDescriptor xsi:type=…>

    After editing it should look like this:

    <EntityDescriptor ID="abc123" entityID="http://FSWEB.contoso.com/adfs/services/trust" xmlns="urn:oasis:names:tc:SAML:2.0:metadata" xmlns:shibmd="urn:mace:shibboleth:metadata:1.0">
    <RoleDescriptor xsi:type=…>
  8. Save and close adfs_metadata.xml.

  9. Copy adfs_metadata.xml to the Cloud Lifecycle Manager node and place it into /var/lib/ardana/openstack/my_cloud/config/keystone/ directory and put it under revision control.

    ardana > cd ~/openstack
    ardana > git checkout site
    ardana > git add my_cloud/config/keystone/adfs_metadata.xml
    ardana > git commit -m "Add ADFS metadata file for WebSSO authentication"
5.11.4.1.2 Setting Up WebSSO Edit source

Start by creating a config file adfs_config.yml with the following parameters and place it in the /var/lib/ardana/openstack/my_cloud/config/keystone/ directory on your Cloud Lifecycle Manager node.

keystone_trusted_idp: adfs
keystone_sp_conf:
    idp_metadata_file: /var/lib/ardana/openstack/my_cloud/config/keystone/adfs_metadata.xml
    shib_sso_application_entity_id: http://sp_uri_entityId
    shib_sso_idp_entity_id: http://default_idp_uri_entityId
    target_domain:
        name: domain1
        description: my domain
    target_project:
        name: project1
        description: my project
    target_group:
        name: group1
        description: my group
    role:
        name: service
    identity_provider:
        id: adfs_idp1
        description: This is the ADFS identity provider.
    mapping:
        id: mapping1
        rules_file: /var/lib/ardana/openstack/my_cloud/config/keystone/adfs_mapping.json
    protocol:
        id: saml2
    attribute_map:
        -
          name: http://schemas.xmlsoap.org/claims/Group
          id: ADFS_GROUP
        -
          name: urn:oid:1.3.6.1.4.1.5923.1.1.1.6
          id: ADFS_LOGIN

A sample config file like this exists in roles/KEY-API/files/samples/websso/keystone_configure_adfs_sample.yml. Here are some detailed descriptions for each of the config options:

keystone_trusted_idp: A flag to indicate if this configuration is used for WebSSO or K2K. The value can be either 'adfs' or 'k2k'.
keystone_sp_conf:
    shib_sso_idp_entity_id: The ADFS URI used as an entity Id to identify the IdP.
    shib_sso_application_entity_id: The Service Provider URI used as an entity Id. It can be any URI here for Websso as long as it is unique to the SP.
    target_domain: A domain where the group will be created from.
        name: Any domain name. If it does not exist, it will be created or be updated.
        description: Any description.
    target_project: A project scope that the group has.
        name: Any project name. If it does not exist, it will be created or be updated.
        description: Any description.
    target_group: A group will be created from 'target_domain'.
        name: Any group name. If it does not exist, it will be created or be updated.
        description: Any description.
    role: A role will be assigned on 'target_project'. This role impacts the idp user scoped token permission at sp side.
        name: It has to be an existing role.
    idp_metadata_file: A reference to the ADFS metadata file that validates the SAML2 assertion.
    identity_provider: An ADFS IdP
        id: Any Id. If it does not exist, it will be created or be updated. This Id needs to be shared with the client so that the right mapping will be selected.
        description: Any description.
    mapping: A mapping in json format that maps a federated user to a corresponding group.
        id: Any Id. If it does not exist, it will be created or be updated.
        rules_file: A reference to the file that has the mapping in json.
    protocol: The supported federation protocol.
        id: 'saml2' is the only supported protocol for Websso.
    attribute_map: A shibboleth mapping defined additional attributes to map the attributes from the SAML2 assertion to the Websso mapping that SP understands.
        -
          name: An attribute name from the SAML2 assertion.
          id: An Id that the above name will be mapped to.
  1. Create a mapping file, adfs_mapping.json, that is referenced from the preceding config file in /var/lib/ardana/openstack/my_cloud/config/keystone/.

         rules_file: /var/lib/ardana/openstack/my_cloud/config/keystone/adfs_mapping.json

    The following is an example of the mapping file, existing in roles/KEY-API/files/samples/websso/adfs_sp_mapping.json:

    [
      {
        "local": [{
          "user": {
            "name": "{0}"
          }
        }],
        "remote": [{
          "type": "ADFS_LOGIN"
        }]
      },
      {
        "local": [{
          "group": {
            "id": "GROUP_ID"
          }
        }],
        "remote": [{
          "type": "ADFS_GROUP",
          "any_one_of": [
            "Domain Users"
          ]
        }]
      }
    ]

    You can find more details about how the WebSSO mapping works at http://docs.openstack.org. Also see Section 5.11.4.1.3, “Mapping rules” for more information.

  2. Add adfs_config.yml and adfs_mapping.json to revision control.

    ardana > cd ~/openstack
    ardana > git checkout site
    ardana > git add my_cloud/config/keystone/adfs_config.yml
    ardana > git add my_cloud/config/keystone/adfs_mapping.json
    ardana > git commit -m "Add ADFS config and mapping."
  3. Go to ~/scratch/ansible/next/ardana/ansible and run the following playbook to enable WebSSO in the keystone identity service:

    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml -e@/var/lib/ardana/openstack/my_cloud/config/keystone/adfs_config.yml
  4. Enable WebSSO in the horizon dashboard service by setting the horizon_websso_enabled flag to True in roles/HZN-WEB/defaults/main.yml, and then run the horizon-reconfigure playbook:

    ardana > ansible-playbook -i hosts/verb_hosts horizon-reconfigure.yml
5.11.4.1.3 Mapping rules Edit source

Each IdP-SP pair has only one mapping. The last mapping that you configure is the one used; it overwrites the previous mapping setting. Therefore, if the example mapping adfs_sp_mapping.json is used, the following behavior is expected, because it maps the federated user only to the one group configured in keystone_configure_adfs_sample.yml.

  • Configure domain1/project1/group1 with mapping1, then log in to horizon via WebSSO: you see project1.

  • Reconfigure to domain1/project2/group1 with mapping1, then log in to horizon via WebSSO: you see project1 and project2.

  • Reconfigure to domain3/project3/group3 with mapping1, then log in to horizon via WebSSO: you only see project3, because the IdP mapping now maps the federated user to group3, which only has privileges on project3.

If you need a more complex mapping, you can use a custom mapping file, which needs to be specified in keystone_configure_adfs_sample.yml -> rules_file.

You can use different attributes of the ADFS user in order to map to different or multiple groups.
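
If you point rules_file at your own custom mapping file, apply the change the same way as the initial setup: commit the updated configuration and mapping files, then re-run the keystone-reconfigure playbook with the configuration file passed as extra variables. This is a minimal sketch that reuses only commands shown earlier in this section:

    ardana > cd ~/openstack
    ardana > git add my_cloud/config/keystone/adfs_config.yml my_cloud/config/keystone/adfs_mapping.json
    ardana > git commit -m "Update WebSSO mapping"
    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml -e@/var/lib/ardana/openstack/my_cloud/config/keystone/adfs_config.yml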

An example of a more complex mapping file is adfs_sp_mapping_multiple_groups.json, as follows.

adfs_sp_mapping_multiple_groups.json

[
  {
    "local": [
      {
        "user": {
          "name": "{0}"
        }
      },
      {
        "group": {
           "name": "group1",
           "domain":{
             "name": "domain1"
           }
        }
      }
    ],
    "remote":[{
      "type": "ADFS_LOGIN"
    },
    {
      "type": "ADFS_GROUP",
      "any_one_of":[
         "Domain Users"
      ]
     }
    ]
   },
  {
    "local": [
      {
        "user": {
          "name": "{0}"
        }
      },
      {
        "group": {
           "name": "group2",
           "domain":{
             "name": "domain2"
           }
        }
      }
    ],
    "remote":[{
      "type": "ADFS_LOGIN"
    },
    {
      "type": "ADFS_SCOPED_AFFILIATION",
      "any_one_of": [
          "member@contoso.com"
      ]
    }
    ]
   }
]

The adfs_sp_mapping_multiple_groups.json file must be used together with keystone_configure_mutiple_groups_sample.yml, which adds a new attribute to the Shibboleth attribute map. That file is as follows:

keystone_configure_mutiple_groups_sample.yml

#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
#
---

keystone_trusted_idp: adfs
keystone_sp_conf:
    identity_provider:
        id: adfs_idp1
        description: This is the ADFS identity provider.
    idp_metadata_file: /var/lib/ardana/openstack/my_cloud/config/keystone/adfs_metadata.xml

    shib_sso_application_entity_id: http://blabla
    shib_sso_idp_entity_id: http://WIN-CAICP35LF2I.vlan44.domain/adfs/services/trust

    target_domain:
        name: domain2
        description: my domain

    target_project:
        name: project6
        description: my project

    target_group:
        name: group2
        description: my group

    role:
        name: admin

    mapping:
        id: mapping1
        rules_file: /var/lib/ardana/openstack/my_cloud/config/keystone/adfs_sp_mapping_multiple_groups.json

    protocol:
        id: saml2

    attribute_map:
        -
          name: http://schemas.xmlsoap.org/claims/Group
          id: ADFS_GROUP
        -
          name: urn:oid:1.3.6.1.4.1.5923.1.1.1.6
          id: ADFS_LOGIN
        -
          name: urn:oid:1.3.6.1.4.1.5923.1.1.1.9
          id: ADFS_SCOPED_AFFILIATION

5.11.4.2 Setting up the ADFS server as the identity provider Edit source

For ADFS to be able to communicate with the keystone identity service, you need to add the keystone identity service as a trusted relying party for ADFS and also specify the user attributes that you want to send to the keystone identity service when users authenticate via WebSSO.

For more information, see the Microsoft ADFS wiki, section "Step 2: Configure ADFS 2.0 as the identity provider and shibboleth as the Relying Party".

Log in to the ADFS server.

Add a relying party using metadata

  1. From Server Manager Dashboard, click Tools on the upper right, then ADFS Management.

  2. Right-click ADFS, and then select Add Relying Party Trust.

  3. Click Start, leave the already selected option Import data about the relying party published online or on a local network.

  4. In the Federation metadata address field, type <keystone_publicEndpoint>/Shibboleth.sso/Metadata (your keystone identity service Metadata endpoint), and then click Next. Alternatively, you can import the metadata from a file: create a file with the output of the following curl command

    curl <keystone_publicEndpoint>/Shibboleth.sso/Metadata

    and then choose this file for importing the metadata for the relying party.

  5. In the Specify Display Name page, choose a proper name to identify this trust relationship, and then click Next.

  6. On the Choose Issuance Authorization Rules page, leave the default Permit all users to access the relying party selected, and then click Next.

  7. Click Next, and then click Close.

Edit claim rules for relying party trust

  1. The Edit Claim Rules dialog box should already be open. If not, in the ADFS center pane, under Relying Party Trusts, right-click your newly created trust, and then click Edit Claim Rules.

  2. On the Issuance Transform Rules tab, click Add Rule.

  3. On the Select Rule Template page, select Send LDAP Attributes as Claims, and then click Next.

  4. On the Configure Rule page, in the Claim rule name box, type Get Data.

  5. In the Attribute Store list, select Active Directory.

  6. In the Mapping of LDAP attributes section, create the following mappings.

    LDAP Attribute                      Outgoing Claim Type
    Token-Groups – Unqualified Names    Group
    User-Principal-Name                 UPN
  7. Click Finish.

  8. On the Issuance Transform Rules tab, click Add Rule.

  9. On the Select Rule Template page, select Send Claims Using a Custom Rule, and then click Next.

  10. On the Configure Rule page, in the Claim rule name box, type Transform UPN to epPN.

  11. In the Custom Rule window, type or copy and paste the following:

    c:[Type == "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/upn"]
    => issue(Type = "urn:oid:1.3.6.1.4.1.5923.1.1.1.6", Value = c.Value, Properties["http://schemas.xmlsoap.org/ws/2005/05/identity/claimproperties/attributename"] = "urn:oasis:names:tc:SAML:2.0:attrname-format:uri");
  12. Click Finish.

  13. On the Issuance Transform Rules tab, click Add Rule.

  14. On the Select Rule Template page, select Send Claims Using a Custom Rule, and then click Next.

  15. On the Configure Rule page, in the Claim rule name box, type Transform Group to epSA.

  16. In the Custom Rule window, type or copy and paste the following:

    c:[Type == "http://schemas.xmlsoap.org/claims/Group", Value == "Domain Users"]
    => issue(Type = "urn:oid:1.3.6.1.4.1.5923.1.1.1.9", Value = "member@contoso.com", Properties["http://schemas.xmlsoap.org/ws/2005/05/identity/claimproperties/attributename"] = "urn:oasis:names:tc:SAML:2.0:attrname-format:uri");
  17. Click Finish, and then click OK.

This list of Claim Rules is just an example and can be modified or enhanced based on your needs and the specifics of your ADFS setup.

Create a sample user on the ADFS server

  1. From the Server Manager Dashboard, click Tools on the upper right, then Active Directory Users and Computers.

  2. Right-click Users, then select New, and then User.

  3. Follow the on-screen instructions.

You can test the horizon dashboard service "Login with ADFS" by opening a browser at the horizon dashboard service URL and choosing Authenticate using: ADFS Credentials. You should be redirected to the ADFS login page and be able to log in to the horizon dashboard service with your ADFS credentials.
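
Before testing in a browser, you can optionally confirm that the keystone service provider is serving its SAML metadata. This quick check simply reuses the metadata endpoint shown earlier in this section:

    ardana > curl <keystone_publicEndpoint>/Shibboleth.sso/Metadata

If the command returns an XML EntityDescriptor document, the Shibboleth service provider is running and the browser-based login can be tested.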

5.11.5 WebSSO Using OpenID Method Edit source

The interaction between Keystone and the external Identity Provider (IdP) is handled by the Apache2 auth_openidc module.

There are two steps to enable the feature.

  1. Configure Keystone with the required OpenID Connect provider information.

  2. Create the Identity Provider, protocol, and mapping in Keystone, using OpenStack Command Line Tool.

5.11.5.1 Configuring Keystone Edit source

  1. Log in to the Cloud Lifecycle Manager node and edit the ~/openstack/my_cloud/config/keystone/keystone_deploy_config.yml file with the "keystone_openid_connect_conf" variable. For example:

    keystone_openid_connect_conf:
        identity_provider: google
        response_type: id_token
        scope: "openid email profile"
        metadata_url: https://accounts.google.com/.well-known/openid-configuration
        client_id: [Replace with your client ID]
        client_secret: [Replace with your client secret]
        redirect_uri: https://www.myenterprise.com:5000/v3/OS-FEDERATION/identity_providers/google/protocols/openid/auth
        crypto_passphrase: ""

    Where:

    • identity_provider: name of the OpenID Connect identity provider. This must be the same as the identity provider to be created in Keystone using the OpenStack Command Line Tool. For example, if the identity provider is foo, you must create the identity provider with that name:

      openstack identity provider create foo
    • response_type: corresponding to auth_openidc OIDCResponseType. In most cases, it should be "id_token".

    • scope: corresponding to auth_openidc OIDCScope.

    • metadata_url: corresponding to auth_openidc OIDCProviderMetadataURL.

    • client_id: corresponding to auth_openidc OIDCClientID.

    • client_secret: corresponding to auth_openidc OIDCClientSecret.

    • redirect_uri: corresponding to auth_openidc OIDCRedirectURI. This must be the Keystone public endpoint for the given OpenID Connect identity provider, for example "https://keystone-public-endpoint.foo.com/v3/OS-FEDERATION/identity_providers/foo/protocols/openid/auth".

      Warning

      Some OpenID Connect IdPs such as Google require the hostname in the "redirect_uri" to be a public FQDN. In that case, the hostname in Keystone public endpoint must also be a public FQDN and must match the one specified in the "redirect_uri".

    • crypto_passphrase: corresponding to auth_openidc OIDCCryptoPassphrase. If left blank, a random crypto passphrase will be generated.

  2. Commit the changes to your local git repository.

    cd ~/openstack/ardana/ansible
    git add -A
    git commit -m "add OpenID Connect configuration"
  3. Run keystone-reconfigure Ansible playbook.

    cd ~/scratch/ansible/next/ardana/ansible
    ansible-playbook -i hosts/verb_hosts keystone-reconfigure.yml

5.11.5.2 Configure Horizon Edit source

Complete the following steps to configure horizon to support WebSSO with OpenID method.

  1. Edit the ~/openstack/ardana/ansible/roles/HZN-WEB/defaults/main.yml file and set the following parameter to True.

    horizon_websso_enabled: True
  2. Locate the horizon_websso_choices setting at the end of the ~/openstack/ardana/ansible/roles/HZN-WEB/defaults/main.yml file. The default configuration should look like the following:

    horizon_websso_choices:
      - {protocol: saml2, description: "ADFS Credentials"}
    • If your cloud does not have ADFS enabled, then replace the saml2 entry under horizon_websso_choices: with the following.

      - {protocol: openid, description: "OpenID Connect"}

      The resulting block should look like the following.

      horizon_websso_choices:
          - {protocol: openid, description: "OpenID Connect"}
    • If your cloud does have ADFS enabled, then add the following entry to the horizon_websso_choices: section. Do not replace the default saml2 entry; add the following line to the existing block.

      - {protocol: openid, description: "OpenID Connect"}

      If your cloud has ADFS enabled, the final block of your ~/openstack/ardana/ansible/roles/HZN-WEB/defaults/main.yml should have the following entries.

      horizon_websso_choices:
          - {protocol: openid, description: "OpenID Connect"}
          - {protocol: saml2, description: "ADFS Credentials"}
  3. Run the following commands to add your changes to the local git repository, and reconfigure the horizon service, enabling the changes made in Step 1:

    cd ~/openstack
    git add -A
    git commit -m "Configured WebSSO using OpenID Connect"
    cd ~/openstack/ardana/ansible/
    ansible-playbook -i hosts/localhost config-processor-run.yml
    ansible-playbook -i hosts/localhost ready-deployment.yml
    cd ~/scratch/ansible/next/ardana/ansible
    ansible-playbook -i hosts/verb_hosts horizon-reconfigure.yml

5.11.5.3 Create Identity Provider, Protocol, and Mapping Edit source

To fully enable OpenID Connect, the Identity Provider, Protocol, and Mapping for the given IdP must be created in Keystone. This is done with the OpenStack Command Line Tool, using the Keystone admin credential.

  1. Log in to the Cloud Lifecycle Manager node and source the keystone.osrc file.

    source ~/keystone.osrc
  2. Create the Identity Provider. For example:

    openstack identity provider create foo
    Warning

    The name of the Identity Provider must be exactly the same as the "identity_provider" attribute given when configuring Keystone in the previous section.

  3. Next, create the Mapping for the Identity Provider. Before creating the Mapping, make sure you understand Mapping Combinations, as an incorrect mapping can have serious security implications. Here is an example of a mapping file.

    [
        {
            "local": [
                {
                    "user": {
                        "name": "{0}",
                        "email": "{1}",
                        "type": "ephemeral"
                    },
                    "group": {
                        "domain": {
                            "name": "Default"
                        },
                        "name": "openidc_demo"
                    }
                }
            ],
            "remote": [
                {
                    "type": "REMOTE_USER"
                },
                {
                    "type": "HTTP_OIDC_EMAIL"
                }
            ]
        }
    ]

    Once the mapping file is created, now create the Mapping resource in Keystone. For example:

    openstack mapping create --rules oidc_mapping.json oidc_mapping
  4. Lastly, create the Protocol for the Identity Provider and its mapping. For OpenID Connect, the protocol name must be openid. For example:

    openstack federation protocol create --identity-provider google --mapping oidc_mapping openid
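
To confirm that the federation resources were created as expected, you can list them with the OpenStack Command Line Tool. This is a minimal verification sketch, assuming the keystone admin credential is still sourced and that google was used as the Identity Provider name:

    openstack identity provider list
    openstack mapping list
    openstack federation protocol list --identity-provider google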

5.12 Identity Service Notes and Limitations Edit source

5.12.1 Notes Edit source

This topic describes limitations of and important notes pertaining to the identity service.

Domains

  • Domains can be created and managed by the horizon web interface, keystone API and OpenStackClient CLI.

  • The configuration of external authentication systems requires the creation and usage of Domains.

  • All configurations are managed by creating and editing specific configuration files.

  • End users can authenticate to a particular project and domain via the horizon web interface, keystone API and OpenStackClient CLI.

  • A new horizon login page that requires a Domain entry is now installed by default.

keystone-to-keystone Federation

  • keystone-to-keystone (K2K) Federation provides the ability to authenticate once with one cloud and then use these credentials to access resources on other federated clouds.

  • All configurations are managed by creating and editing specific configuration files.

Multi-Factor Authentication (MFA)

  • The keystone architecture provides support for MFA deployments.

  • MFA provides the ability to deploy non-password-based authentication; for example, token-providing hardware and text messages.

Hierarchical Multitenancy

  • Provides the ability to create sub-projects within a Domain-Project hierarchy.

Hash Algorithm Configuration

  • The default hash algorithm is bcrypt, which has a built-in limitation of 72 characters. As keystone defaults to a secret length of 86 characters, customers may choose to change the keystone hash algorithm to one that supports the full length of their secret.

  • Process for changing the hash algorithm configuration:

    1. Update the identity section of keystone.conf.j2 to reference the desired algorithm:

      [identity]
      password_hash_algorithm=pbkdf2_sha512
    2. Commit the changes (see the consolidated command sketch after this list).

    3. Run the keystone-redeploy.yml playbook:

      ansible-playbook -i hosts/verb_hosts keystone-redeploy.yml
    4. Verify that existing users retain access by logging into Horizon.
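
The following is a consolidated sketch of the commit-and-redeploy sequence, following the same git and playbook conventions used for other configuration changes in this guide; depending on your workflow, the configuration processor steps may not be required:

    ardana > cd ~/openstack
    ardana > git add -A
    ardana > git commit -m "Change keystone password hash algorithm"
    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts keystone-redeploy.yml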

5.12.2 Limitations Edit source

Authentication with external authentication systems (LDAP, Active Directory (AD) or Identity Providers)

  • No horizon web portal support currently exists for the creation and management of external authentication system configurations.

Integration with LDAP services

SUSE OpenStack Cloud 9 domain-specific configuration:

  • No Global User Listing: Once domain-specific driver configuration is enabled, listing all users and listing all groups are not supported operations. Those calls require a specific domain filter and a domain-scoped token for the target domain.

  • You cannot have both a file store and a database store for domain-specific driver configuration in a single identity service instance. Once a database store is enabled within the identity service instance, any file store will be ignored, and vice versa.

  • The identity service allows a list limit configuration to globally set the maximum number of entities that will be returned in an identity collection per request but it does not support per-domain list limit setting at this time.

  • Each time a new domain is configured with LDAP integration the single CA file gets overwritten. Ensure that you place certs for all the LDAP back-end domains in the cacert parameter. Detailed CA file inclusion instructions are provided in the comments of the sample YAML configuration file keystone_configure_ldap_my.yml (see Section 5.9.2, “Set up domain-specific driver configuration - file store”).

  • LDAP is only supported for identity operations (reading users and groups from LDAP).

  • keystone assignment operations from LDAP records, such as managing or assigning roles and projects, are not currently supported.

  • The SUSE OpenStack Cloud 'default' domain is pre-configured to store service account users and is authenticated locally against the identity service. Domains configured for external LDAP integration are non-default domains.

  • When using the current OpenStackClient CLI you must use the user ID rather than the user name when working with a non-default domain.

  • Each LDAP connection with the identity service is for read-only operations. Configurations that require identity service write operations (to create users, groups, etc.) are not currently supported.

SUSE OpenStack Cloud 9 API-based domain-specific configuration management

  • No GUI dashboard for domain-specific driver configuration management

  • API-based domain-specific configuration does not validate the type of an option.

  • API-based domain-specific configuration does not validate whether an option value is supported.

  • The API-based domain configuration method does not provide retrieval of the default values of domain-specific configuration options.

  • Status: Domain-specific driver configuration database store is a non-core feature for SUSE OpenStack Cloud 9.

5.12.3 keystone-to-keystone federation Edit source

  • When a user is disabled in the identity provider, the issued federated token from the service provider still remains valid until the token is expired based on the keystone expiration setting.

  • An already issued federated token will retain its scope until its expiration. Any changes in the mapping on the service provider will not impact the scope of an already issued federated token. For example, if an already issued federated token was mapped to group1 that has scope on project1, and the mapping is changed to group2 that has scope on project2, the previously issued federated token still has scope on project1.

  • Access to service provider resources is provided only through the python-keystone CLI client or the keystone API. No horizon web interface support is currently available.

  • Domains, projects, groups, roles, and quotas are created per the service provider cloud. Support for federated projects, groups, roles, and quotas is currently not available.

  • keystone-to-keystone federation and WebSSO cannot be configured by putting both sets of configuration attributes in the same config file; they will overwrite each other. Consequently, they need to be configured individually.

  • Scoping the federated user to a domain is not supported by default in the playbook. To enable it, see the steps in Section 5.10.7, “Scope Federated User to Domain”.

  • No horizon web portal support currently exists for the creation and management of federation configurations.

  • All end user authentication is available only via the keystone API and OpenStackClient CLI.

  • Additional information can be found at http://docs.openstack.org.

WebSSO

  • The WebSSO function supports only horizon web authentication. It is not supported for direct API or CLI access.

  • WebSSO works only with Fernet token provider. See Section 5.8.4, “Fernet Tokens”.

  • The SUSE OpenStack Cloud WebSSO function with SAML method was tested with Microsoft Active Directory Federation Services (ADFS). The instructions provided are pertinent to ADFS and are intended to provide a sample configuration for deploying WebSSO with an external identity provider. If you have a different identity provider such as Ping Identity or IBM Tivoli, consult with those vendors for specific instructions for those products.

  • The SUSE OpenStack Cloud WebSSO function with OpenID method was tested with Google OAuth 2.0 APIs, which conform to the OpenID Connect specification. The interaction between keystone and the external Identity Provider (IdP) is handled by the Apache2 auth_openidc module. Consult with the specific OpenID Connect vendor on whether they support auth_openidc.

  • Both SAML and OpenID methods are supported for WebSSO federation in SUSE OpenStack Cloud 9.

  • WebSSO has a change password option in User Settings, but note that this function is not accessible for users authenticating with external systems such as LDAP or SAML Identity Providers.

Multi-factor authentication (MFA)

Hierarchical multitenancy

Missing quota information for compute resources

Note

An error message will appear on the default horizon page if you are running a swift-only deployment (no Compute service). In this configuration, you will not see any quota information for Compute resources, and the following error message appears:

The Compute service is not installed or is not configured properly. No information is available for Compute resources.

This error message is expected, as no Compute service is configured for this deployment. You can ignore it.

The following performance benchmark is based on 150 concurrent requests run over 10-minute periods of stable load.

Operation           In SUSE OpenStack Cloud 9 (secs/request)    In SUSE OpenStack Cloud 3.0 (secs/request)
Token Creation      0.86                                         0.42
Token Validation    0.47                                         0.41

Considering that token creation operations do not happen as frequently as token validation operations, you are likely to experience less of a performance problem regardless of the extended time for token creation.

5.12.4 System cron jobs need setup Edit source

keystone relies on two cron jobs: one to periodically clean up expired tokens and one for token revocation cleanup. The following is how the cron jobs appear on the system:

1 1 * * * /opt/stack/service/keystone/venv/bin/keystone-manage token_flush
1 1,5,10,15,20 * * * /opt/stack/service/keystone/venv/bin/revocation_cleanup.sh

By default, the two cron jobs are enabled on controller node 1 only, not on the other two nodes. When controller node 1 is down or has failed for any reason, these two cron jobs must be manually set up on one of the other two nodes.
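
The following is a minimal sketch of replicating the jobs on another controller node. It assumes the entries belong in the root crontab; check which user owns them on controller node 1 (crontab -l) and replicate them for the same user:

    root # (crontab -l 2>/dev/null; echo '1 1 * * * /opt/stack/service/keystone/venv/bin/keystone-manage token_flush'; echo '1 1,5,10,15,20 * * * /opt/stack/service/keystone/venv/bin/revocation_cleanup.sh') | crontab -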

6 Managing Compute Edit source

Information about managing and configuring the Compute service.

6.1 Managing Compute Hosts using Aggregates and Scheduler Filters Edit source

OpenStack nova has the concepts of availability zones and host aggregates that enable you to segregate your Compute hosts. Availability zones are used to specify logical separation within your cloud based on the physical isolation or redundancy you have set up. Host aggregates are used to group compute hosts together based upon common features, such as operating system. For more information, see Scaling and Segregating your Cloud.

The nova scheduler also has a filter scheduler, which supports both filtering and weighting to make decisions on where new compute instances should be created. For more information, see Filter Scheduler and Scheduling.

This section shows you how to set up a nova host aggregate and how to configure the filter scheduler to further segregate your compute hosts.

6.1.1 Creating a nova Aggregate Edit source

These steps will show you how to create a nova aggregate and how to add a compute host to it. You can run these steps on any machine that contains the OpenStackClient that also has network access to your cloud environment. These requirements are met by the Cloud Lifecycle Manager.

  1. Log in to the Cloud Lifecycle Manager.

  2. Source the administrative creds:

    ardana > source ~/service.osrc
  3. List your current nova aggregates:

    ardana > openstack aggregate list
  4. Create a new nova aggregate with this syntax:

    ardana > openstack aggregate create AGGREGATE-NAME

    If you wish to have the aggregate appear as an availability zone, then specify an availability zone with this syntax:

    ardana > openstack aggregate create AGGREGATE-NAME AVAILABILITY-ZONE-NAME

    So, for example, if you wish to create a new aggregate for your SUSE Linux Enterprise compute hosts and you wanted that to show up as the SLE availability zone, you could use this command:

    ardana > openstack aggregate create SLE SLE

    This would produce an output similar to this:

    +----+------+-------------------+-------+--------------------------+
    | Id | Name | Availability Zone | Hosts | Metadata                 |
    +----+------+-------------------+-------+--------------------------+
    | 12 | SLE  | SLE               |       | 'availability_zone=SLE'  |
    +----+------+-------------------+-------+--------------------------+
  5. Next, you need to add compute hosts to this aggregate, so start by listing your current hosts. You can view the current list of hosts running the compute service like this:

    ardana > openstack hypervisor list
  6. You can then add host(s) to your aggregate with this syntax:

    ardana > openstack aggregate add host AGGREGATE-NAME HOST
  7. Then you can confirm that this has been completed by listing the details of your aggregate:

    ardana > openstack aggregate show AGGREGATE-NAME

    You can also list out your availability zones using this command:

    ardana > openstack availability zone list
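
Putting these steps together, a minimal worked example might look like the following. The hypervisor name comp0001-mgmt is a hypothetical value; substitute a name from your own openstack hypervisor list output:

    ardana > openstack aggregate create SLE SLE
    ardana > openstack aggregate add host SLE comp0001-mgmt
    ardana > openstack aggregate show SLE
    ardana > openstack availability zone list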

6.1.2 Using nova Scheduler Filters Edit source

The nova scheduler has two filters that can help with differentiating between different compute hosts that we'll describe here.

Filter: AggregateImagePropertiesIsolation

Isolates compute hosts based on image properties and aggregate metadata. You can use commas to specify multiple values for the same property. The filter will then ensure at least one value matches.

Filter: AggregateInstanceExtraSpecsFilter

Checks that the aggregate metadata satisfies any extra specifications associated with the instance type. This uses the aggregate_instance_extra_specs prefix.

Note

For details about other available filters, see Filter Scheduler.

Using the AggregateImagePropertiesIsolation Filter

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the ~/openstack/my_cloud/config/nova/nova.conf.j2 file and add AggregateImagePropertiesIsolation to the scheduler_default_filters list, as shown in the example below:

    # Scheduler
    ...
    scheduler_available_filters = nova.scheduler.filters.all_filters
    scheduler_default_filters = AvailabilityZoneFilter,RetryFilter,ComputeFilter,
     DiskFilter,RamFilter,ImagePropertiesFilter,ServerGroupAffinityFilter,
     ServerGroupAntiAffinityFilter,ComputeCapabilitiesFilter,NUMATopologyFilter,
     AggregateImagePropertiesIsolation
    ...

    Optionally, you can also add these lines:

    aggregate_image_properties_isolation_namespace = <a prefix string>
    aggregate_image_properties_isolation_separator = <a separator character>

    (the separator defaults to .)

    If these are added, the filter will only match image properties starting with the name space and separator - for example, setting to my_name_space and : would mean the image property my_name_space:image_type=SLE matches metadata image_type=SLE, but an_other=SLE would not be inspected for a match at all.

    If these are not added all image properties will be matched against any similarly named aggregate metadata.

  3. Add image properties to the images that should be scheduled using the above filter (see the sketch after this procedure).

  4. Commit the changes to git:

    ardana > git add -A
    ardana > git commit -a -m "editing nova schedule filters"
  5. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  6. Run the ready deployment playbook:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Run the nova reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts nova-reconfigure.yml
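
For example, to have this filter steer SLE images onto the SLE aggregate created earlier, you could tag both the aggregate and the image with the same property. This is a minimal sketch; IMAGE-ID is a placeholder for one of your glance images:

    ardana > openstack aggregate set --property image_type=SLE SLE
    ardana > openstack image set --property image_type=SLE IMAGE-ID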

Using the AggregateInstanceExtraSpecsFilter Filter

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the ~/openstack/my_cloud/config/nova/nova.conf.j2 file and add AggregateInstanceExtraSpecsFilter to the scheduler_default_filters list, as shown in the example below:

    # Scheduler
    ...
    scheduler_available_filters = nova.scheduler.filters.all_filters
     scheduler_default_filters = AvailabilityZoneFilter,RetryFilter,ComputeFilter,
     DiskFilter,RamFilter,ImagePropertiesFilter,ServerGroupAffinityFilter,
     ServerGroupAntiAffinityFilter,ComputeCapabilitiesFilter,NUMATopologyFilter,
     AggregateInstanceExtraSpecsFilter
    ...
  3. There is no additional configuration needed because the following is true:

    1. The filter assumes : is a separator

    2. The filter will match all simple keys in extra_specs plus all keys with a separator if the prefix is aggregate_instance_extra_specs - for example, image_type=SLE and aggregate_instance_extra_specs:image_type=SLE will both be matched against aggregate metadata image_type=SLE

  4. Add extra_specs to the flavors that should be scheduled according to the above (see the sketch after this procedure).

  5. Commit the changes to git:

    ardana > git add -A
    ardana > git commit -a -m "Editing nova scheduler filters"
  6. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  7. Run the ready deployment playbook:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  8. Run the nova reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts nova-reconfigure.yml
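
For example, to tie a flavor to the SLE aggregate using this filter, set matching metadata on the aggregate and an extra spec on the flavor. This is a minimal sketch; FLAVOR-NAME is a placeholder for one of your flavors:

    ardana > openstack aggregate set --property image_type=SLE SLE
    ardana > openstack flavor set FLAVOR-NAME --property aggregate_instance_extra_specs:image_type=SLE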

6.2 Using Flavor Metadata to Specify CPU Model Edit source

Libvirt is a collection of software used in OpenStack to manage virtualization. It has the ability to emulate a host CPU model in a guest VM. In SUSE OpenStack Cloud nova, the ComputeCapabilitiesFilter limits this ability by checking the exact CPU model of the compute host against the requested compute instance model. It will only pick compute hosts that have the cpu_model requested by the instance model, and if the selected compute host does not have that cpu_model, the ComputeCapabilitiesFilter moves on to find another compute host that matches, if possible. Selecting an unavailable vCPU model may cause nova to fail with no valid host found.

To assist, there is a nova scheduler filter that captures cpu_models as a subset of a particular CPU family. The filter determines if the host CPU model is capable of emulating the guest CPU model by maintaining the mapping of the vCPU models and comparing it with the host CPU model.

There is a limitation when a particular cpu_model is specified with hw:cpu_model via a compute flavor: the cpu_mode will be set to custom. This mode ensures that a persistent guest virtual machine will see the same hardware no matter what host physical machine the guest virtual machine is booted on. This allows easier live migration of virtual machines. Because of this limitation, only some of the features of a CPU are exposed to the guest. Requesting particular CPU features is not supported.

6.2.1 Editing the flavor metadata in the horizon dashboard Edit source

These steps can be used to edit a flavor's metadata in the horizon dashboard to add the extra_specs for a cpu_model:

  1. Access the horizon dashboard and log in with admin credentials.

  2. Access the Flavors menu by (A) clicking on the menu button, (B) navigating to the Admin section, and then (C) clicking on Flavors:

  3. In the list of flavors, choose the flavor you wish to edit and click on the entry under the Metadata column:

    Note

    You can also create a new flavor and then choose that one to edit.

  4. In the Custom field, enter hw:cpu_model and then click on the + (plus) sign to continue:

  5. Enter the CPU model you wish to use into the field, and then click Save:
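
If you prefer the command line over the dashboard, the same flavor metadata can be set with the OpenStackClient. This is a minimal sketch; FLAVOR-NAME and CPU-MODEL are placeholders for your flavor and the CPU model you wish to request:

    ardana > openstack flavor set FLAVOR-NAME --property hw:cpu_model=CPU-MODEL
    ardana > openstack flavor show FLAVOR-NAME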

6.3 Forcing CPU and RAM Overcommit Settings Edit source

SUSE OpenStack Cloud supports overcommitting of CPU and RAM resources on compute nodes. Overcommitting is a technique of allocating more virtualized CPUs and/or memory than there are physical resources.

The default settings for this are:

cpu_allocation_ratio (default value: 16)

Virtual CPU to physical CPU allocation ratio which affects all CPU filters. This configuration specifies a global ratio for CoreFilter. For AggregateCoreFilter, it will fall back to this configuration value if no per-aggregate setting found.

Note

This can be set per-compute, or if set to 0.0, the value set on the scheduler node(s) will be used and defaulted to 16.0.

ram_allocation_ratio (default value: 1.0)

Virtual RAM to physical RAM allocation ratio which affects all RAM filters. This configuration specifies a global ratio for RamFilter. For AggregateRamFilter, it will fall back to this configuration value if no per-aggregate setting found.

Note

This can be set per-compute, or if set to 0.0, the value set on the scheduler node(s) will be used and defaulted to 1.5.

disk_allocation_ratio (default value: 1.0)

This is the virtual disk to physical disk allocation ratio used by the disk_filter.py script to determine if a host has sufficient disk space to fit a requested instance. A ratio greater than 1.0 will result in over-subscription of the available physical disk, which can be useful for more efficiently packing instances created with images that do not use the entire virtual disk, such as sparse or compressed images. It can be set to a value between 0.0 and 1.0 in order to preserve a percentage of the disk for uses other than instances.

Note

This can be set per-compute, or if set to 0.0, the value set on the scheduler node(s) will be used and defaulted to 1.0.

6.3.1 Changing the overcommit ratios for your entire environment Edit source

If you wish to change the CPU and/or RAM overcommit ratio settings for your entire environment then you can do so via your Cloud Lifecycle Manager with these steps.

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the nova configuration settings located in this file:

    ~/openstack/my_cloud/config/nova/nova.conf.j2
  3. Add or edit the following lines to specify the ratios you wish to use:

    cpu_allocation_ratio = 16
    ram_allocation_ratio = 1.0
  4. Commit your configuration to the Git repository (Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”), as follows:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "setting nova overcommit settings"
  5. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  6. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Run the nova reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts nova-reconfigure.yml

6.4 Enabling the Nova Resize and Migrate Features Edit source

The nova resize and migrate features are disabled by default. They are disabled because they require passwordless SSH access between Compute hosts, with the user having access to the file systems to perform the copy. If you wish to utilize these features, the steps below show you how to enable them in your cloud.

6.4.1 Enabling Nova Resize and Migrate Edit source

If you wish to enable these features, use these steps on your lifecycle manager. This will deploy a set of public and private SSH keys to the Compute hosts, allowing the nova user SSH access between each of your Compute hosts.

  1. Log in to the Cloud Lifecycle Manager.

  2. Run the nova reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts nova-reconfigure.yml --extra-vars nova_migrate_enabled=true
  3. To ensure that the resize and migration options show up in the horizon dashboard, run the horizon reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts horizon-reconfigure.yml
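
Once enabled, you can verify the feature with a test instance. This is a minimal sketch; INSTANCE-ID is a placeholder, and a cold migration must still be confirmed (from the dashboard or the CLI) once the instance reaches the VERIFY_RESIZE state:

    ardana > openstack server migrate INSTANCE-ID
    ardana > openstack server show INSTANCE-ID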

6.4.2 Disabling Nova Resize and Migrate Edit source

This feature is disabled by default. However, if you have previously enabled it and wish to re-disable it, you can use these steps on your lifecycle manager. This will remove the set of public and private SSH keys that were previously added to the Compute hosts, removing the nova user's SSH access between each of your Compute hosts.

  1. Log in to the Cloud Lifecycle Manager.

  2. Run the nova reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts nova-reconfigure.yml --extra-vars nova_migrate_enabled=false
  3. To ensure that the resize and migrate options are removed from the horizon dashboard, run the horizon reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts horizon-reconfigure.yml

6.5 Enabling ESX Compute Instance(s) Resize Feature Edit source

Resizing of ESX compute instances is disabled by default. If you want to utilize this option, these steps will show you how to configure and enable it in your cloud.

The following feature is disabled by default:

  • Resize - this feature allows you to change the size of a Compute instance by changing its flavor. See the OpenStack User Guide for more details on its use.

6.5.1 Procedure Edit source

If you want to configure and resize ESX compute instance(s), perform the following steps:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the ~/openstack/my_cloud/config/nova/nova.conf.j2 file to add the following parameter under Policy:

    # Policy
    allow_resize_to_same_host=True
  3. Commit your configuration:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "<commit message>"
  4. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  5. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml

    By default the nova resize feature is disabled. To enable nova resize, refer to Section 6.4, “Enabling the Nova Resize and Migrate Features”.

    By default an ESX console log is not set up. For more details about Hypervisor setup, refer to the OpenStack documentation.

6.6 GPU passthrough Edit source

GPU passthrough for SUSE OpenStack Cloud provides the nova instance direct access to the GPU device for increased performance.

This section demonstrates the steps to pass through an Nvidia GPU card supported by SUSE OpenStack Cloud.

Note

Resizing the VM to the same host with the same PCI card is not supported with PCI passthrough.

The following steps are necessary to leverage PCI passthrough on a SUSE OpenStack Cloud 9 Compute Node: preparing the Compute Node, preparing nova via the input model updates, and preparing glance. Ensure you follow the procedures below in sequence:

Procedure 6.1: Preparing the Compute Node
  1. There should be no kernel drivers or binaries with direct access to the PCI device. If there are kernel modules, ensure they are blacklisted.

    For example, it is common to have a nouveau driver from when the node was installed. This driver is a graphics driver for Nvidia-based GPUs. It must be blacklisted as shown in this example:

    root # echo 'blacklist nouveau' >> /etc/modprobe.d/nouveau-default.conf

    The file location and its contents are important; however, the name of the file is your choice. Other drivers can be blacklisted in the same manner, including Nvidia drivers.

  2. On the host, iommu_groups is necessary and may already be enabled. To check if IOMMU is enabled, run the following commands:

    root # virt-host-validate
            .....
            QEMU: Checking if IOMMU is enabled by kernel: WARN (IOMMU appears to be disabled in kernel. Add intel_iommu=on to kernel cmdline arguments)
            .....

    To modify the kernel command line as suggested in the warning, edit /etc/default/grub and append intel_iommu=on to the GRUB_CMDLINE_LINUX_DEFAULT variable. Run:

    root #  update-bootloader

    Reboot to enable iommu_groups.

  3. After the reboot, check that IOMMU is enabled:

    root # virt-host-validate
            .....
            QEMU: Checking if IOMMU is enabled by kernel: PASS
            .....
  4. Confirm IOMMU groups are available by finding the group associated with your PCI device (for example Nvidia GPU):

    ardana > lspci -nn | grep -i nvidia
            84:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100 PCIe 16GB] [10de:1db4] (rev a1)

    In this example, 84:00.0 is the address of the PCI device. The vendorID is 10de. The product ID is 1db4.

  5. Confirm that the devices are available for passthrough:

    ardana > ls -ld /sys/kernel/iommu_groups/*/devices/*84:00.?/
            drwxr-xr-x 3 root root 0 Nov 19 17:00 /sys/kernel/iommu_groups/56/devices/0000:84:00.0/

6.6.1 Preparing nova via the input model updates Edit source

To implement the required configuration, log into the Cloud Lifecycle Manager node and update the Cloud Lifecycle Manager model files to enable GPU passthrough for compute nodes.

Edit servers.yml

Add the pass-through section after the definition of the servers section in the servers.yml file. The following example shows only the relevant sections:

        ---
        product:
          version: 2

        baremetal:
          netmask: 255.255.255.0
          subnet: 192.168.100.0


        servers:
        .
        .
        .
        .

          - id: compute-0001
            ip-addr: 192.168.75.5
            role: COMPUTE-ROLE
            server-group: RACK3
            nic-mapping: HP-DL360-4PORT
            ilo-ip: ****
            ilo-user: ****
            ilo-password: ****
            mac-addr: ****
          .
          .
          .

          - id: compute-0008
            ip-addr: 192.168.75.7
            role: COMPUTE-ROLE
            server-group: RACK2
            nic-mapping: HP-DL360-4PORT
            ilo-ip: ****
            ilo-user: ****
            ilo-password: ****
            mac-addr: ****

        pass-through:
          servers:
            - id: compute-0001
              data:
                gpu:
                  - vendor_id: 10de
                    product_id: 1db4
                    bus_address: 0000:84:00.0
                    pf_mode: type-PCI
                    name: a1
                  - vendor_id: 10de
                    product_id: 1db4
                    bus_address: 0000:85:00.0
                    pf_mode: type-PCI
                    name: b1
            - id: compute-0008
              data:
                gpu:
                  - vendor_id: 10de
                    product_id: 1db4
                    pf_mode: type-PCI
                    name: c1
  1. Check out the site branch of the local git repository and change to the correct directory:

    ardana > cd ~/openstack
            ardana > git checkout site
            ardana > cd ~/openstack/my_cloud/definition/data/
  2. Open the file containing the servers list, for example servers.yml, with your chosen editor. Save the changes to the file and commit to the local git repository:

    ardana > git add -A

    Confirm that the changes to the tree are relevant changes and commit:

    ardana > git status
            ardana > git commit -m "your commit message goes here in quotes"
  3. Enable your changes by running the necessary playbooks:

    ardana > cd ~/openstack/ardana/ansible
            ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
            ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
            ardana > cd ~/scratch/ansible/next/ardana/ansible

    If you are enabling GPU passthrough for your compute nodes during your initial installation, run the following command:

    ardana > ansible-playbook -i hosts/verb_hosts site.yml

    If you are enabling GPU passthrough for your compute nodes post-installation, run the following command:

    ardana > ansible-playbook -i hosts/verb_hosts nova-reconfigure.yml

The above procedure updates the configuration for the nova api, nova compute and scheduler as defined in https://docs.openstack.org/nova/rocky/admin/pci-passthrough.html.

The following is the PCI configuration for the compute-0001 node using the above example after the playbook run:

        [pci]
        passthrough_whitelist = [{"address": "0000:84:00.0"}, {"address": "0000:85:00.0"}]
        alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
        alias = {"vendor_id": "10de", "name": "b1", "device_type": "type-PCI", "product_id": "1db4"}

The following is the PCI configuration for the compute-0008 node using the above example after the playbook run:

        [pci]
        passthrough_whitelist = [{"vendor_id": "10de", "product_id": "1db4"}]
        alias = {"vendor_id": "10de", "name": "c1", "device_type": "type-PCI", "product_id": "1db4"}
Note

After running the site.yml playbook above, reboot the compute nodes that are configured with Intel PCI devices.

6.6.2 Create a flavor Edit source

For GPU passthrough, set the pci_passthrough:alias property. You can do so for an existing flavor or create a new flavor as shown in the example below:

        ardana > openstack flavor create --ram 8192 --disk 100 --vcpu 8 gpuflavor
        ardana > openstack flavor set gpuflavor --property "pci_passthrough:alias"="a1:1"

Here, a1 references the alias name provided in the model, while the 1 tells nova that a single GPU should be assigned.

Boot an instance using the flavor created above:

         ardana > openstack server create --flavor gpuflavor --image sles12sp4 --key-name key --nic net-id=$net_id gpu-instance-1
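
To confirm that the passthrough worked, check the instance and then verify the device from inside the guest. This is a minimal sketch using the instance created above:

    ardana > openstack server show gpu-instance-1
    # log in to the instance and confirm that the GPU is visible:
    # lspci -nn | grep -i nvidia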

6.7 Configuring the Image Service Edit source

The Image service, based on OpenStack glance, works out of the box and does not need any special configuration. However, a few features detailed below require some additional configuration if you choose to use them: this section shows you how to enable glance image caching and how to configure your environment to allow the glance copy-from feature.

Warning

glance images are assigned IDs upon creation, either automatically or specified by the user. The ID of an image should be unique, so if a user assigns an ID which already exists, a conflict (409) will occur.

This only becomes a problem if users can publicize or share images with others. If users can share images AND cannot publicize images then your system is not vulnerable. If the system has also been purged (via glance-manage db purge) then it is possible for deleted image IDs to be reused.

If deleted image IDs can be reused then recycling of public and shared images becomes a possibility. This means that a new (or modified) image can replace an old image, which could be malicious.

If this is a problem for you, please contact Sales Engineering.

6.7.1 How to enable glance image caching Edit source

In SUSE OpenStack Cloud 9, the glance image caching option is not enabled by default. These steps will show you how to enable it.

The main benefits of using image caching are that the glance service can return images faster and that supplying images places less load on other services.

In order to use the image caching option you will need to supply a logical volume for the service to use for the caching.

If you wish to use the glance image caching option, you will see the section below in your ~/openstack/my_cloud/definition/data/disks_controller.yml file. You will specify the mount point for the logical volume you wish to use for this.

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit your ~/openstack/my_cloud/definition/data/disks_controller.yml file and specify the volume and mount point for your glance-cache. Here is an example:

    # glance cache: if a logical volume with consumer usage glance-cache
    # is defined glance caching will be enabled. The logical volume can be
    # part of an existing volume group or a dedicated volume group.
     - name: glance-vg
       physical-volumes:
         - /dev/sdx
       logical-volumes:
         - name: glance-cache
           size: 95%
           mount: /var/lib/glance/cache
           fstype: ext4
           mkfs-opts: -O large_file
           consumer:
             name: glance-api
             usage: glance-cache

    If you are enabling image caching during your initial installation, prior to running site.yml the first time, then continue with the installation steps. However, if you are making this change post-installation then you will need to commit your changes with the steps below.

  3. Commit your configuration to the Git repository (Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”), as follows:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "My config or other commit message"
  4. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  5. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  6. Run the glance reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts glance-reconfigure.yml

An existing volume image cache is not properly deleted when cinder detects the source image has changed. After updating any source image, delete the cache volume so that the cache is refreshed.

The volume image cache must be deleted before trying to use the associated source image in any other volume operations. This includes creating bootable volumes or booting an instance with create volume enabled and the updated image as the source image.

6.7.2 Allowing the glance copy-from option in your environment Edit source

When creating images, one of the options you have is to copy the image from a remote location to your local glance store. You do this by specifying the --copy-from option when creating the image. To use this feature, ensure the following conditions are met (an example invocation follows the procedure below):

  • The server hosting the glance service must have network access to the remote location that is hosting the image.

  • There cannot be a proxy between glance and the remote location.

  • The glance v1 API must be enabled, as v2 does not currently support the copy-from function.

  • The http glance store must be enabled in the environment, following the steps below.

Enabling the HTTP glance Store

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the ~/openstack/my_cloud/config/glance/glance-api.conf.j2 file and add http to the list of glance stores in the [glance_store] section, as shown below:

    [glance_store]
    stores = {{ glance_stores }}, http
  3. Commit your configuration to the Git repository (Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”), as follows:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "My config or other commit message"
  4. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  5. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  6. Run the glance reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts glance-reconfigure.yml
  7. Run the horizon reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts horizon-reconfigure.yml
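With the http store enabled (and the glance v1 API available, as noted above), you can copy an image from a remote location into your local glance store. The following is a minimal sketch; the image name and URL are placeholders:

ardana > glance --os-image-api-version 1 image-create --name remote-image \
  --disk-format qcow2 --container-format bare \
  --copy-from http://images.example.com/my-image.qcow2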

7 Managing ESX Edit source

Information about managing and configuring the ESX service.

7.1 Networking for ESXi Hypervisor (OVSvApp) Edit source

To provide network as a service for tenant VMs hosted on the ESXi hypervisor, a service VM called the OVSvApp VM is deployed on each ESXi hypervisor within a cluster managed by OpenStack nova.

The OVSvApp VM runs SLES as its guest operating system and has Open vSwitch 2.1.0 or above installed. It also runs an agent called the OVSvApp agent, which dynamically creates the port groups for the tenant VMs and manages the OVS bridges that contain the flows related to security groups and L2 networking.

To facilitate fault tolerance and mitigate data path loss for tenant VMs, the neutron-ovsvapp-agent-monitor process runs as part of the neutron-ovsvapp-agent service and is responsible for monitoring the Open vSwitch module within the OVSvApp VM. It also uses an nginx server to provide the health status of the Open vSwitch module to the neutron server for mitigation actions. A systemd script keeps the neutron-ovsvapp-agent service alive.

When an OVSvApp service VM crashes, an agent monitoring mechanism starts a cluster mitigation process. You can mitigate data path traffic loss for VMs on the failed ESX host in that cluster by putting the failed ESX host into maintenance mode. This, in turn, triggers vCenter DRS to migrate the tenant VMs to other ESX hosts within the same cluster, ensuring data path continuity for tenant VM traffic.
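To confirm that these components are healthy on a particular OVSvApp VM, you can check the agent service and the Open vSwitch configuration from within the appliance. This is a sketch; the exact systemd unit name may differ in your build:

tux > sudo systemctl status neutron-ovsvapp-agent
tux > sudo ovs-vsctl show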

View Cluster Mitigation

Important

Install python-networking-vsphere so that neutron ovsvapp commands will work properly.

ardana > sudo zypper in python-networking-vsphere

An administrator can view cluster mitigation status using the following commands.

  • neutron ovsvapp-mitigated-cluster-list

    Lists all the clusters where at least one round of host mitigation has happened.

    Example:

    ardana > neutron ovsvapp-mitigated-cluster-list
    +----------------+--------------+-----------------------+---------------------------+
    | vcenter_id     | cluster_id   | being_mitigated       | threshold_reached         |
    +----------------+--------------+-----------------------+---------------------------+
    | vcenter1       | cluster1     | True                  | False                     |
    | vcenter2       | cluster2     | False                 | True                      |
    +----------------+--------------+-----------------------+---------------------------+
  • neutron ovsvapp-mitigated-cluster-show --vcenter-id <VCENTER_ID> --cluster-id <CLUSTER_ID>

    Shows the status of a particular cluster.

    Example:

    ardana > neutron ovsvapp-mitigated-cluster-show --vcenter-id vcenter1 --cluster-id cluster1
    +---------------------------+-------------+
    | Field                     | Value       |
    +---------------------------+-------------+
    | being_mitigated           | True        |
    | cluster_id                | cluster1    |
    | threshold_reached         | False       |
    | vcenter_id                | vcenter1    |
    +---------------------------+-------------+

    There can be instances where a triggered mitigation does not succeed and the neutron server is not informed of the failure (for example, if the agent selected to mitigate the host goes down before finishing the task). In this case, the cluster will be locked. To unlock the cluster for further mitigations, use the update command.

  • neutron ovsvapp-mitigated-cluster-update --vcenter-id <VCENTER_ID> --cluster-id <CLUSTER_ID>

    • Update the status of a mitigated cluster:

      Modify the value of being-mitigated from True to False to unlock the cluster.

      Example:

      ardana > neutron ovsvapp-mitigated-cluster-update --vcenter-id vcenter1 --cluster-id cluster1 --being-mitigated False
    • Update the threshold value:

      Update the threshold-reached value to True if no further migration is required in the selected cluster.

      Example:

      ardana > neutron ovsvapp-mitigated-cluster-update --vcenter-id vcenter1 --cluster-id cluster1 --being-mitigated False --threshold-reached True

    REST API

    • ardana > curl -i -X GET http://<ip>:9696/v2.0/ovsvapp_mitigated_clusters \
        -H "User-Agent: python-neutronclient" -H "Accept: application/json" -H \
        "X-Auth-Token: <token_id>"

7.1.1 More Information Edit source

For more information on the Networking for ESXi Hypervisor (OVSvApp), see the following references:

7.2 Validating the neutron Installation Edit source

You can validate that the ESX compute cluster is added to the cloud successfully using the following command:

# openstack network agent list

+------------------+----------------------+-----------------------+-------------------+-------+----------------+---------------------------+
| id               | agent_type           | host                  | availability_zone | alive | admin_state_up | binary                    |
+------------------+----------------------+-----------------------+-------------------+-------+----------------+---------------------------+
| 05ca6ef...999c09 | L3 agent             | doc-cp1-comp0001-mgmt | nova              | :-)   | True           | neutron-l3-agent          |
| 3b9179a...28e2ef | Metadata agent       | doc-cp1-comp0001-mgmt |                   | :-)   | True           | neutron-metadata-agent    |
| 4e8f84f...c9c58f | Metadata agent       | doc-cp1-comp0002-mgmt |                   | :-)   | True           | neutron-metadata-agent    |
| 55a5791...c17451 | L3 agent             | doc-cp1-c1-m1-mgmt    | nova              | :-)   | True           | neutron-vpn-agent         |
| 5e3db8f...87f9be | Open vSwitch agent   | doc-cp1-c1-m1-mgmt    |                   | :-)   | True           | neutron-openvswitch-agent |
| 6968d9a...b7b4e9 | L3 agent             | doc-cp1-c1-m2-mgmt    | nova              | :-)   | True           | neutron-vpn-agent         |
| 7b02b20...53a187 | Metadata agent       | doc-cp1-c1-m2-mgmt    |                   | :-)   | True           | neutron-metadata-agent    |
| 8ece188...5c3703 | Open vSwitch agent   | doc-cp1-comp0002-mgmt |                   | :-)   | True           | neutron-openvswitch-agent |
| 8fcb3c7...65119a | Metadata agent       | doc-cp1-c1-m1-mgmt    |                   | :-)   | True           | neutron-metadata-agent    |
| 9f48967...36effe | OVSvApp agent        | doc-cp1-comp0002-mgmt |                   | :-)   | True           | ovsvapp-agent             |
| a2a0b78...026da9 | Open vSwitch agent   | doc-cp1-comp0001-mgmt |                   | :-)   | True           | neutron-openvswitch-agent |
| a2fbd4a...28a1ac | DHCP agent           | doc-cp1-c1-m2-mgmt    | nova              | :-)   | True           | neutron-dhcp-agent        |
| b2428d5...ee60b2 | DHCP agent           | doc-cp1-c1-m1-mgmt    | nova              | :-)   | True           | neutron-dhcp-agent        |
| c0983a6...411524 | Open vSwitch agent   | doc-cp1-c1-m2-mgmt    |                   | :-)   | True           | neutron-openvswitch-agent |
| c32778b...a0fc75 | L3 agent             | doc-cp1-comp0002-mgmt | nova              | :-)   | True           | neutron-l3-agent          |
+------------------+----------------------+-----------------------+-------------------+-------+----------------+---------------------------+
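To narrow this output to the ESX-related agents, you can filter the same listing. A simple sketch:

# openstack network agent list | grep -i ovsvapp

A healthy deployment shows an OVSvApp agent for each activated cluster with :-) in the alive column.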

7.3 Removing a Cluster from the Compute Resource Pool Edit source

7.3.1 Prerequisites Edit source

Write down the hostname and ESXi configuration IP addresses of the OVSvApp VMs of that ESX cluster before deleting the VMs. These IP addresses and hostnames will be used to clean up the monasca alarm definitions.

Perform the following steps:

  1. Log in to the vSphere client.

  2. Select the ovsvapp node running on each ESXi host and click the Summary tab.

    Similarly, you can retrieve the compute-proxy node information.

7.3.2 Removing an existing cluster from the compute resource pool Edit source

Perform the following steps to remove an existing cluster from the compute resource pool.

  1. Run the following command to check for the instances launched in that cluster:

    # openstack server list --host <hostname>
    +--------------------------------------+------+--------+------------+-------------+------------------+
    | ID                                   | Name | Status | Task State | Power State | Networks         |
    +--------------------------------------+------+--------+------------+-------------+------------------+
    | 80e54965-758b-425e-901b-9ea756576331 | VM1  | ACTIVE | -          | Running     | private=10.0.0.2 |
    +--------------------------------------+------+--------+------------+-------------+------------------+

    where:

    • hostname: Specifies the hostname of the compute proxy present in that cluster.

  2. Delete all instances spawned in that cluster:

    # openstack server delete <server> [<server ...>]

    where:

    • server: Specifies the name or ID of the server(s). A scripted sketch for deleting all instances on the compute proxy follows this procedure.

    OR

    Migrate all instances spawned in that cluster.

    # openstack server migrate <server>
  3. Run the following playbooks to stop the Compute (nova) and Networking (neutron) services:

    ardana > ansible-playbook -i hosts/verb_hosts nova-stop.yml --limit <hostname>
    ardana > ansible-playbook -i hosts/verb_hosts neutron-stop.yml --limit <hostname>

    where:

    • hostname: Specifies the hostname of the compute proxy present in that cluster.
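If the cluster hosts many instances, deleting them one at a time is tedious. The following is a scripted sketch of step 2, assuming admin credentials are sourced and <hostname> is the compute proxy hostname; review the output of openstack server list before running the deletes:

ardana > for id in $(openstack server list --host <hostname> -f value -c ID); do
  openstack server delete "$id"
done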

7.3.3 Cleanup monasca-agent for OVSvAPP Service Edit source

Perform the following procedure to cleanup monasca agents for ovsvapp-agent service.

  1. If the monasca API is installed on a different node, copy service.osrc from the Cloud Lifecycle Manager to the monasca API server.

    ardana > scp service.osrc $USER@ardana-cp1-mtrmon-m1-mgmt:
  2. SSH to the monasca API server. You must SSH to each monasca API server for cleanup.

    For example:

    ardana > ssh ardana-cp1-mtrmon-m1-mgmt
  3. Edit /etc/monasca/agent/conf.d/host_alive.yaml file to remove the reference to the OVSvAPP you removed. This requires sudo access.

    sudo vi /etc/monasca/agent/conf.d/host_alive.yaml

    A sample of host_alive.yaml:

    - alive_test: ping
      built_by: HostAlive
      host_name: esx-cp1-esx-ovsvapp0001-mgmt
      name: esx-cp1-esx-ovsvapp0001-mgmt ping
      target_hostname: esx-cp1-esx-ovsvapp0001-mgmt

    where host_name and target_hostname are taken from the DNS Name field in the vSphere client. (Refer to Section 7.3.1, “Prerequisites”).

  4. After removing the reference on each of the monasca API servers, restart the monasca-agent on each of those servers by executing the following command.

    tux > sudo service openstack-monasca-agent restart
  5. With the OVSvAPP references removed and the monasca-agent restarted, you can delete the corresponding alarm to complete the cleanup process. We recommend using the monasca CLI which is installed on each of your monasca API servers by default. Execute the following command from the monasca API server (for example: ardana-cp1-mtrmon-mX-mgmt).

    monasca alarm-list --metric-name host_alive_status --metric-dimensions hostname=<ovsvapp deleted>

    For example, you can execute the following command to get the alarm ID if the OVSvApp appears as in the preceding example.

    monasca alarm-list --metric-name host_alive_status --metric-dimensions hostname=MCP-VCP-cpesx-esx-ovsvapp0001-mgmt
    +--------------------------------------+--------------------------------------+-----------------------+-------------------+-------------------------------------------+----------+-------+-----------------+------+--------------------------+--------------------------+--------------------------+
    | id                                   | alarm_definition_id                  | alarm_definition_name | metric_name       | metric_dimensions                         | severity | state | lifecycle_state | link | state_updated_timestamp  | updated_timestamp        | created_timestamp        |
    +--------------------------------------+--------------------------------------+-----------------------+-------------------+-------------------------------------------+----------+-------+-----------------+------+--------------------------+--------------------------+--------------------------+
    | cfc6bfa4-2485-4319-b1e5-0107886f4270 | cca96c53-a927-4b0a-9bf3-cb21d28216f3 | Host Status           | host_alive_status | service: system                           | HIGH     | OK    | None            | None | 2016-10-27T06:33:04.256Z | 2016-10-27T06:33:04.256Z | 2016-10-23T13:41:57.258Z |
    |                                      |                                      |                       |                   | cloud_name: entry-scale-kvm-esx-mml       |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | test_type: ping                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | hostname: ardana-cp1-esx-ovsvapp0001-mgmt |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | control_plane: control-plane-1            |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | cluster: mtrmon                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | observer_host: ardana-cp1-mtrmon-m1-mgmt  |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       | host_alive_status | service: system                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | cloud_name: entry-scale-kvm-esx-mml       |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | test_type: ping                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | hostname: ardana-cp1-esx-ovsvapp0001-mgmt |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | control_plane: control-plane-1            |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | cluster: mtrmon                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | observer_host: ardana-cp1-mtrmon-m3-mgmt  |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       | host_alive_status | service: system                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | cloud_name: entry-scale-kvm-esx-mml       |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | test_type: ping                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | hostname: ardana-cp1-esx-ovsvapp0001-mgmt |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | control_plane: control-plane-1            |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | cluster: mtrmon                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | observer_host: ardana-cp1-mtrmon-m2-mgmt  |          |       |                 |      |                          |                          |                          |
    +--------------------------------------+--------------------------------------+-----------------------+-------------------+-------------------------------------------+----------+-------+-----------------+------+--------------------------+--------------------------+--------------------------+
  6. Delete the monasca alarm.

    monasca alarm-delete <alarm ID>

    For example:

    monasca alarm-delete cfc6bfa4-2485-4319-b1e5-0107886f4270
    Successfully deleted alarm

    After deleting the alarms and updating the monasca-agent configuration, those alarms will be removed from the Operations Console UI. You can log in to the Operations Console and view the status.

7.3.4 Removing the Compute Proxy from Monitoring Edit source

Once you have removed the Compute proxy, the alarms against it will still trigger. To resolve this, perform the following steps.

  1. SSH to monasca API server. You must SSH to each monasca API server for cleanup.

    For example:

    ssh ardana-cp1-mtrmon-m1-mgmt
  2. Edit /etc/monasca/agent/conf.d/host_alive.yaml file to remove the reference to the Compute proxy you removed. This requires sudo access.

    sudo vi /etc/monasca/agent/conf.d/host_alive.yaml

    A sample of host_alive.yaml file.

    - alive_test: ping
      built_by: HostAlive
      host_name: MCP-VCP-cpesx-esx-comp0001-mgmt
      name: MCP-VCP-cpesx-esx-comp0001-mgmt ping
  3. Once you have removed the references on each of your monasca API servers, execute the following command to restart the monasca-agent on each of those servers.

    tux > sudo service openstack-monasca-agent restart
  4. With the Compute proxy references removed and the monasca-agent restarted, delete the corresponding alarm to complete the cleanup process. We recommend using the monasca CLI, which is installed on each of your monasca API servers by default.

    monasca alarm-list --metric-dimensions hostname=<compute node deleted>

    For example, you can execute the following command to get the alarm ID if the Compute proxy appears as in the preceding example.

    monasca alarm-list --metric-dimensions hostname=ardana-cp1-comp0001-mgmt
  5. Delete the monasca alarm.

    monasca alarm-delete <alarm ID>

7.3.5 Cleaning the monasca Alarms Related to ESX Proxy and vCenter Cluster Edit source

Perform the following procedure:

  1. Using the ESX proxy hostname, execute the following command to list all alarms.

    monasca alarm-list --metric-dimensions hostname=COMPUTE_NODE_DELETED

    where COMPUTE_NODE_DELETED is the hostname taken from the vSphere client (refer to Section 7.3.1, “Prerequisites”).

    Note

    Make a note of all the alarm IDs that are displayed after executing the preceding command.

    For example, if the compute proxy hostname is R28N6340-701-cp1-esx-comp0001-mgmt:

    ardana > monasca alarm-list --metric-dimensions hostname=R28N6340-701-cp1-esx-comp0001-mgmt
    +--------------------------------------+--------------------------------------+------------------------+------------------------+--------------------------------------------------+----------+-------+-----------------+------+--------------------------+--------------------------+--------------------------+
    | id                                   | alarm_definition_id                  | alarm_definition_name  | metric_name            | metric_dimensions                                | severity | state | lifecycle_state | link | state_updated_timestamp  | updated_timestamp        | created_timestamp        |
    +--------------------------------------+--------------------------------------+------------------------+------------------------+--------------------------------------------------+----------+-------+-----------------+------+--------------------------+--------------------------+--------------------------+
    | 02342bcb-da81-40db-a262-09539523c482 | 3e302297-0a36-4f0e-a1bd-03402b937a4e | HTTP Status            | http_status            | service: compute                                 | HIGH     | OK    | None            | None | 2016-11-11T06:58:11.717Z | 2016-11-11T06:58:11.717Z | 2016-11-10T08:55:45.136Z |
    |                                      |                                      |                        |                        | cloud_name: entry-scale-esx-kvm                  |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                        |                        | url: https://10.244.209.9:8774                   |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                        |                        | hostname: R28N6340-701-cp1-esx-comp0001-mgmt     |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                        |                        | component: nova-api                              |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                        |                        | control_plane: control-plane-1                   |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                        |                        | cluster: esx-compute                             |          |       |                 |      |                          |                          |                          |
    | 04cb36ce-0c7c-4b4c-9ebc-c4011e2f6c0a | 15c593de-fa54-4803-bd71-afab95b980a4 | Disk Usage             | disk.space_used_perc   | mount_point: /proc/sys/fs/binfmt_misc            | HIGH     | OK    | None            | None | 2016-11-10T08:52:52.886Z | 2016-11-10T08:52:52.886Z | 2016-11-10T08:51:29.197Z |
    |                                      |                                      |                        |                        | service: system                                  |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                        |                        | cloud_name: entry-scale-esx-kvm                  |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                        |                        | hostname: R28N6340-701-cp1-esx-comp0001-mgmt     |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                        |                        | control_plane: control-plane-1                   |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                        |                        | cluster: esx-compute                             |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                        |                        | device: systemd-1                                |          |       |                 |      |                          |                          |                          |
    +--------------------------------------+--------------------------------------+------------------------+------------------------+--------------------------------------------------+----------+-------+-----------------+------+--------------------------+--------------------------+--------------------------+
  2. Delete the alarm using the alarm IDs.

    monasca alarm-delete <alarm ID>

    Perform this step for all alarm IDs listed in the preceding step (Step 1); a scripted sketch that deletes them in one pass follows this procedure.

    For example:

    monasca alarm-delete 1cc219b1-ce4d-476b-80c2-0cafa53e1a12
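Deleting many alarms one at a time can also be scripted. The following sketch assumes the table layout shown above: it extracts the values of the first (id) column and deletes each alarm. Review the extracted IDs before running the deletes:

ardana > monasca alarm-list --metric-dimensions hostname=COMPUTE_NODE_DELETED \
  | awk -F'|' '{print $2}' \
  | grep -Eo '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' \
  | while read id; do monasca alarm-delete "$id"; done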

7.4 Removing an ESXi Host from a Cluster Edit source

This topic describes how to remove an existing ESXi host from a cluster and clean up the services for the OVSvApp VM.

Note

Before performing this procedure, wait until vCenter migrates all the tenant VMs to other active hosts in that same cluster.

7.4.1 Prerequisite Edit source

Write down the hostname and ESXi configuration IP addresses of the OVSvApp VMs of that ESX cluster before deleting the VMs. These IP addresses and hostnames will be used to clean up the monasca alarm definitions.

  1. Log in to the vSphere client.

  2. Select the ovsvapp node running on the ESXi host and click the Summary tab.

7.4.2 Procedure Edit source

  1. Right-click the host and put it in maintenance mode. This will automatically migrate all the tenant VMs except the OVSvApp VM.

  2. Cancel the maintenance mode task.

  3. Right-click the ovsvapp VM (IP Address) node, select Power, and then click Power Off.

  4. Right-click the node and then click Delete from Disk.

  5. Right-click the Host, and then click Enter Maintenance Mode.

  6. Disconnect the VM. Right-click the VM, and then click Disconnect.

The ESXi node is removed from vCenter.

7.4.3 Clean up neutron-agent for OVSvAPP Service Edit source

After removing ESXi node from a vCenter, perform the following procedure to clean up neutron agents for ovsvapp-agent service.

  1. Log in to the Cloud Lifecycle Manager.

  2. Source the credentials.

    ardana > source service.osrc
  3. Execute the following command.

    ardana > openstack network agent list | grep <OVSvapp hostname>

    For example:

    ardana > openstack network agent list | grep MCP-VCP-cpesx-esx-ovsvapp0001-mgmt
    | 92ca8ada-d89b-43f9-b941-3e0cd2b51e49 | OVSvApp Agent      | MCP-VCP-cpesx-esx-ovsvapp0001-mgmt |                   | :-)   | True           | ovsvapp-agent             |
  4. Delete the OVSvAPP agent.

    ardana > openstack network agent delete <Agent -ID>

    For example:

    ardana > openstack network agent delete 92ca8ada-d89b-43f9-b941-3e0cd2b51e49

If you have more than one host, perform the preceding procedure for all the hosts.

7.4.4 Clean up monasca-agent for OVSvAPP Service Edit source

Perform the following procedure to clean up monasca agents for ovsvapp-agent service.

  1. If the monasca API is installed on a different node, copy service.osrc from the Cloud Lifecycle Manager to the monasca API server.

    ardana > scp service.osrc $USER@ardana-cp1-mtrmon-m1-mgmt:
  2. SSH to the monasca API server. You must SSH to each monasca API server for cleanup.

    For example:

    ardana > ssh ardana-cp1-mtrmon-m1-mgmt
  3. Edit /etc/monasca/agent/conf.d/host_alive.yaml file to remove the reference to the OVSvAPP you removed. This requires sudo access.

    sudo vi /etc/monasca/agent/conf.d/host_alive.yaml

    A sample of host_alive.yaml:

    - alive_test: ping
      built_by: HostAlive
      host_name: MCP-VCP-cpesx-esx-ovsvapp0001-mgmt
      name: MCP-VCP-cpesx-esx-ovsvapp0001-mgmt ping
      target_hostname: MCP-VCP-cpesx-esx-ovsvapp0001-mgmt

    where host_name and target_hostname are taken from the DNS Name field in the vSphere client. (Refer to Section 7.4.1, “Prerequisite”).

  4. After removing the reference on each of the monasca API servers, restart the monasca-agent on each of those servers by executing the following command.

    tux > sudo service openstack-monasca-agent restart
  5. With the OVSvAPP references removed and the monasca-agent restarted, you can delete the corresponding alarm to complete the cleanup process. We recommend using the monasca CLI which is installed on each of your monasca API servers by default. Execute the following command from the monasca API server (for example: ardana-cp1-mtrmon-mX-mgmt).

    ardana > monasca alarm-list --metric-name host_alive_status --metric-dimensions hostname=<ovsvapp deleted>

    For example, you can execute the following command to get the alarm ID if the OVSvApp appears as in the preceding example.

    ardana > monasca alarm-list --metric-name host_alive_status --metric-dimensions hostname=MCP-VCP-cpesx-esx-ovsvapp0001-mgmt
    +--------------------------------------+--------------------------------------+-----------------------+-------------------+-------------------------------------------+----------+-------+-----------------+------+--------------------------+--------------------------+--------------------------+
    | id                                   | alarm_definition_id                  | alarm_definition_name | metric_name       | metric_dimensions                         | severity | state | lifecycle_state | link | state_updated_timestamp  | updated_timestamp        | created_timestamp        |
    +--------------------------------------+--------------------------------------+-----------------------+-------------------+-------------------------------------------+----------+-------+-----------------+------+--------------------------+--------------------------+--------------------------+
    | cfc6bfa4-2485-4319-b1e5-0107886f4270 | cca96c53-a927-4b0a-9bf3-cb21d28216f3 | Host Status           | host_alive_status | service: system                           | HIGH     | OK    | None            | None | 2016-10-27T06:33:04.256Z | 2016-10-27T06:33:04.256Z | 2016-10-23T13:41:57.258Z |
    |                                      |                                      |                       |                   | cloud_name: entry-scale-kvm-esx-mml       |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | test_type: ping                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | hostname: ardana-cp1-esx-ovsvapp0001-mgmt |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | control_plane: control-plane-1            |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | cluster: mtrmon                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | observer_host: ardana-cp1-mtrmon-m1-mgmt  |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       | host_alive_status | service: system                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | cloud_name: entry-scale-kvm-esx-mml       |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | test_type: ping                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | hostname: ardana-cp1-esx-ovsvapp0001-mgmt |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | control_plane: control-plane-1            |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | cluster: mtrmon                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | observer_host: ardana-cp1-mtrmon-m3-mgmt  |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       | host_alive_status | service: system                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | cloud_name: entry-scale-kvm-esx-mml       |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | test_type: ping                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | hostname: ardana-cp1-esx-ovsvapp0001-mgmt |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | control_plane: control-plane-1            |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | cluster: mtrmon                           |          |       |                 |      |                          |                          |                          |
    |                                      |                                      |                       |                   | observer_host: ardana-cp1-mtrmon-m2-mgmt  |          |       |                 |      |                          |                          |                          |
    +--------------------------------------+--------------------------------------+-----------------------+-------------------+-------------------------------------------+----------+-------+-----------------+------+--------------------------+--------------------------+--------------------------+
  6. Delete the monasca alarm.

    ardana > monasca alarm-delete <alarm ID>

    For example:

    ardana > monasca alarm-delete cfc6bfa4-2485-4319-b1e5-0107886f4270
    Successfully deleted alarm

    After deleting the alarms and updating the monasca-agent configuration, those alarms will be removed from the Operations Console UI. You can log in to the Operations Console and view the status.

7.4.5 Clean up the entries of the OVSvApp VM from /etc/hosts Edit source

Perform the following procedure to clean up the entries of OVSvAPP VM from /etc/hosts.

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit /etc/hosts.

    ardana > sudo vi /etc/hosts

    For example, the MCP-VCP-cpesx-esx-ovsvapp0001-mgmt VM is present in /etc/hosts:

    192.168.86.17    MCP-VCP-cpesx-esx-ovsvapp0001-mgmt
  3. Delete the OVSvApp entries from /etc/hosts, as shown in the sketch below.
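Instead of editing the file by hand, the entry can be removed with sed. This is a sketch; back up the file first and substitute the hostname of the OVSvApp VM you removed:

ardana > sudo cp /etc/hosts /etc/hosts.bak
ardana > sudo sed -i '/MCP-VCP-cpesx-esx-ovsvapp0001-mgmt/d' /etc/hosts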

7.4.6 Remove the OVSVAPP VM from the servers.yml and pass_through.yml files and run the Configuration Processor Edit source

Complete these steps from the Cloud Lifecycle Manager to remove the OVSvAPP VM:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the servers.yml file to remove references to the OVSvApp VM(s) you want to remove:

    ~/openstack/my_cloud/definition/data/servers.yml

    For example:

    - ip-addr: 192.168.86.17
      server-group: AZ1
      role: OVSVAPP-ROLE
      id: 6afaa903398c8fc6425e4d066edf4da1a0f04388
  3. Edit the ~/openstack/my_cloud/definition/data/pass_through.yml file to remove the OVSvApp VM references, using the server id from the section above to find them.

    - data:
        vmware:
          vcenter_cluster: Clust1
          cluster_dvs_mapping: 'DC1/host/Clust1:TRUNK-DVS-Clust1'
          esx_hostname: MCP-VCP-cpesx-esx-ovsvapp0001-mgmt
          vcenter_id: 0997E2ED9-5E4F-49EA-97E6-E2706345BAB2
      id: 6afaa903398c8fc6425e4d066edf4da1a0f04388
  4. Commit the changes to git:

    ardana > git commit -a -m "Remove ESXi host <name>"
  5. Run the configuration processor. You may want to use the remove_deleted_servers and free_unused_addresses switches to free up the resources when running the configuration processor. See Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 7 “Other Topics”, Section 7.3 “Persisted Data” for more details.

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml -e remove_deleted_servers="y" -e free_unused_addresses="y"
  6. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml

7.4.7 Clean Up nova Agent for ESX Proxy Edit source

  1. Log in to the Cloud Lifecycle Manager.

  2. Source the credentials.

    ardana > source service.osrc
  3. Find the nova service ID for the ESX Proxy with openstack compute service list (a sketch follows this procedure).

  4. Delete the ESX Proxy service.

    ardana > openstack compute service delete ESX_PROXY_ID

If you have more than one host, perform the preceding procedure for all the hosts.
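To find the ESX_PROXY_ID referenced above, you can filter the service listing by the compute proxy hostname noted earlier. A sketch, with the hostname as a placeholder:

ardana > openstack compute service list --service nova-compute | grep <compute proxy hostname>

The ID in the first column of the matching row is the ESX_PROXY_ID to delete.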

7.4.8 Clean Up monasca Agent for ESX Proxy Edit source

  1. Using the ESX proxy hostname, execute the following command to list all alarms.

    ardana > monasca alarm-list --metric-dimensions hostname=COMPUTE_NODE_DELETED

    where COMPUTE_NODE_DELETED is the hostname taken from the vSphere client (refer to Section 7.3.1, “Prerequisites”).

    Note

    Make a note of all the alarm IDs that are displayed after executing the preceding command.

  2. Delete the ESX Proxy alarm using the alarm IDs.

    monasca alarm-delete <alarm ID>

    This step has to be performed for all alarm IDs listed with the monasca alarm-list command.

7.4.9 Clean Up ESX Proxy Entries in /etc/hosts Edit source

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the /etc/hosts file, removing ESX Proxy entries.

7.4.10 Remove ESX Proxy from servers.yml and pass_through.yml files; run the Configuration Processor Edit source

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit servers.yml file to remove references to ESX Proxy:

    ~/openstack/my_cloud/definition/data/servers.yml
  3. Edit the ~/openstack/my_cloud/definition/data/pass_through.yml file to remove the ESX Proxy references, using the server id from the servers.yml file.

  4. Commit the changes to git:

    ardana > git commit -a -m "Remove ESX Proxy references"
  5. Run the configuration processor. You may want to use the remove_deleted_servers and free_unused_addresses switches to free up the resources when running the configuration processor. See Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 7 “Other Topics”, Section 7.3 “Persisted Data” for more details.

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml \
    -e remove_deleted_servers="y" -e free_unused_addresses="y"
  6. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml

7.4.11 Remove Distributed Resource Scheduler (DRS) Rules Edit source

Perform the following procedure to remove the DRS rules, which are added by the OVSvApp installer to ensure that the OVSvApp VM does not get migrated to other hosts.

  1. Log in to vCenter.

  2. Right-click the cluster and select Edit Settings.

    A cluster settings page appears.

  3. Click DRS Groups Manager on the left-hand side of the pop-up box. Select the group that was created for the deleted OVSvApp and click Remove.

  4. Click Rules on the left-hand side of the pop-up box, select the checkbox for the deleted OVSvApp, and click Remove.

  5. Click OK.

7.5 Configuring Debug Logging Edit source

7.5.1 To Modify the OVSVAPP VM Log Level Edit source

To change the OVSVAPP log level to DEBUG, do the following:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the file below:

    ~/openstack/ardana/ansible/roles/neutron-common/templates/ovsvapp-agent-logging.conf.j2
  3. Set the logging level value of the logger_root section to DEBUG, like this:

    [logger_root]
    qualname: root
    handlers: watchedfile, logstash
    level: DEBUG
  4. Commit your configuration to the Git repository (Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”), as follows:

    cd ~/openstack/ardana/ansible
    git add -A
    git commit -m "My config or other commit message"
  5. Run the configuration processor:

    cd ~/openstack/ardana/ansible
    ansible-playbook -i hosts/localhost config-processor-run.yml
  6. Update your deployment directory:

    cd ~/openstack/ardana/ansible
    ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Deploy your changes:

    cd ~/scratch/ansible/next/ardana/ansible
    ansible-playbook -i hosts/verb_hosts neutron-reconfigure.yml

7.5.2 To Enable OVSVAPP Service for Centralized Logging Edit source

To enable OVSVAPP Service for centralized logging:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the file below:

    ~/openstack/my_cloud/config/logging/vars/neutron-ovsvapp-clr.yml
  3. Set the value of centralized_logging to true as shown in the following sample:

    logr_services:
      neutron-ovsvapp:
        logging_options:
        - centralized_logging:
            enabled: true
            format: json
            ...
  4. Commit your configuration to the Git repository (Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”), as follows:

    cd ~/openstack/ardana/ansible
    git add -A
    git commit -m "My config or other commit message"
  5. Run the configuration processor:

    cd ~/openstack/ardana/ansible
    ansible-playbook -i hosts/localhost config-processor-run.yml
  6. Update your deployment directory:

    cd ~/openstack/ardana/ansible
    ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Deploy your changes, specifying the hostname for your OVSvApp host:

    cd ~/scratch/ansible/next/ardana/ansible
    ansible-playbook -i hosts/verb_hosts neutron-reconfigure.yml --limit <hostname>

    The hostname of the node can be found in the list generated from the output of the following command:

    grep hostname ~/openstack/my_cloud/info/server_info.yml

7.6 Making Scale Configuration Changes Edit source

This procedure describes how to make the recommended configuration changes to achieve 8,000 virtual machine instances.

Note

In a scale environment for ESX computes, the configuration of the vCenter Proxy VM has to be increased to 8 vCPUs and 16 GB RAM. By default, it is 4 vCPUs and 4 GB RAM.

  1. Change to the directory containing the nova.conf.j2 template:

    cd ~/openstack/ardana/ansible/roles/nova-common/templates
  2. Edit the DEFAULT section in the nova.conf.j2 file as below:

    [DEFAULT]
    rpc_response_timeout = 180
    service_down_time = 300
    report_interval = 30
  3. Commit your configuration:

    cd ~/openstack/ardana/ansible
    git add -A
    git commit -m "<commit message>"
  4. Prepare your environment for deployment:

    ansible-playbook -i hosts/localhost ready-deployment.yml
    cd ~/scratch/ansible/next/ardana/ansible
  5. Execute the nova-reconfigure playbook:

    ansible-playbook -i hosts/verb_hosts nova-reconfigure.yml

7.7 Monitoring vCenter Clusters Edit source

Remote monitoring of activated ESX clusters is enabled through the monasca vCenter plugin. The monasca-agent running on each ESX compute proxy node is configured with the vcenter plugin to monitor the cluster.

Alarm definitions are created with the default threshold values, and whenever a threshold limit is breached, the respective alarm state (OK/ALARM/UNDETERMINED) is generated.

The configuration file details are given below:

init_config: {}
instances:
  - vcenter_ip: <vcenter-ip>
    username: <vcenter-username>
    password: <vcenter-password>
    clusters: <[cluster list]>

Metrics

The metrics posted to monasca by the vCenter plugin are listed below:

  • vcenter.cpu.total_mhz

  • vcenter.cpu.used_mhz

  • vcenter.cpu.used_perc

  • vcenter.cpu.total_logical_cores

  • vcenter.mem.total_mb

  • vcenter.mem.used_mb

  • vcenter.mem.used_perc

  • vcenter.disk.total_space_mb

  • vcenter.disk.total_used_space_mb

  • vcenter.disk.total_used_space_perc

monasca measurement-list --dimensions esx_cluster_id=domain-c7.D99502A9-63A8-41A2-B3C3-D8E31B591224 vcenter.disk.total_used_space_mb 2016-08-30T11:20:08

+----------------------------------------------+----------------------------------------------------------------------------------------------+-----------------------------------+------------------+-----------------+
| name                                         | dimensions                                                                                   | timestamp                         | value            | value_meta      |
+----------------------------------------------+----------------------------------------------------------------------------------------------+-----------------------------------+------------------+-----------------+
| vcenter.disk.total_used_space_mb             | vcenter_ip: 10.1.200.91                                                                      | 2016-08-30T11:20:20.703Z          | 100371.000       |                 |
|                                              | esx_cluster_id: domain-c7.D99502A9-63A8-41A2-B3C3-D8E31B591224                               | 2016-08-30T11:20:50.727Z          | 100371.000       |                 |
|                                              | hostname: MCP-VCP-cpesx-esx-comp0001-mgmt                                                    | 2016-08-30T11:21:20.707Z          | 100371.000       |                 |
|                                              |                                                                                              | 2016-08-30T11:21:50.700Z          | 100371.000       |                 |
|                                              |                                                                                              | 2016-08-30T11:22:20.700Z          | 100371.000       |                 |
|                                              |                                                                                              | 2016-08-30T11:22:50.700Z          | 100371.000       |                 |
|                                              |                                                                                              | 2016-08-30T11:23:20.620Z          | 100371.000       |                 |
+----------------------------------------------+----------------------------------------------------------------------------------------------+-----------------------------------+------------------+-----------------+

Dimensions

Each metric has the following dimensions:

vcenter_ip

FQDN/IP address of the registered vCenter server

esx_cluster_id

clusterName.vCenter-id, as seen in the openstack hypervisor list

hostname

ESX compute proxy name

Alarms

Alarms are created for monitoring CPU, memory, and disk usage for each activated cluster; a quick CLI check follows the table. The alarm definition details are:

+--------------------------+-----------------------------------------+----------+----------------+
| Name                     | Expression                              | Severity | Match_by       |
+--------------------------+-----------------------------------------+----------+----------------+
| ESX cluster CPU Usage    | avg(vcenter.cpu.used_perc) > 90 times 3 | High     | esx_cluster_id |
| ESX cluster Memory Usage | avg(vcenter.mem.used_perc) > 90 times 3 | High     | esx_cluster_id |
| ESX cluster Disk Usage   | vcenter.disk.total_used_space_perc > 90 | High     | esx_cluster_id |
+--------------------------+-----------------------------------------+----------+----------------+
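A quick way to confirm that these alarm definitions exist is to list them from a node where the monasca CLI is configured; a simple sketch:

ardana > monasca alarm-definition-list | grep "ESX cluster"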

7.8 Monitoring Integration with OVSvApp Appliance Edit source

7.8.1 Processes Monitored with monasca-agent Edit source

Using the monasca agent, the following services are monitored on the OVSvApp appliance:

  • neutron_ovsvapp_agent service - This is the neutron agent that runs in the appliance and helps enable networking for the tenant virtual machines.

  • openvswitch - This service is used by the neutron_ovsvapp_agent service to enable the data path and security for the tenant virtual machines.

  • ovsdb-server - This service is used by the neutron_ovsvapp_agent service.

If any of these three processes fails to run on the OVSvApp appliance, networking for the tenant virtual machines is disrupted; this is why they are monitored.

The monasca-agent periodically reports the status of these processes, along with metrics data (for example, 'load' - cpu.load_avg_1min, 'process' - process.pid_count, 'memory' - mem.usable_perc, 'disk' - disk.space_used_perc, 'cpu' - cpu.idle_perc), to the monasca server.
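You can query these measurements for a specific appliance in the same way as the vCenter metrics shown earlier. The following is a sketch; the hostname and start time are placeholders:

ardana > monasca measurement-list --dimensions hostname=MCP-VCP-cpesx-esx-ovsvapp0001-mgmt process.pid_count 2016-08-30T11:20:08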

7.8.2 How It Works Edit source

Once the vApp is configured and up, the monasca-agent will attempt to register with the monasca server. After successful registration, the monitoring begins on the processes listed above and you will be able to see status updates on the server side.

The monasca-agent monitors the processes at the system level, so in the case of a failure of any of the configured processes, updates should be seen immediately from monasca.

To check the events from the server side, log into the Operations Console.

8 Managing Block Storage Edit source

Information about managing and configuring the Block Storage service.

8.1 Managing Block Storage using Cinder Edit source

SUSE OpenStack Cloud Block Storage volume operations use the OpenStack cinder service to manage storage volumes, which includes creating volumes, attaching/detaching volumes to nova instances, creating volume snapshots, and configuring volumes.

SUSE OpenStack Cloud supports the following storage back ends for block storage volumes and backup datastore configuration:

  • Volumes

    • SUSE Enterprise Storage; for more information, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 35 “Integrations”, Section 35.3 “SUSE Enterprise Storage Integration”.

    • 3PAR FC or iSCSI; for more information, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 35 “Integrations”, Section 35.1 “Configuring for 3PAR Block Storage Backend”.

  • Backup

    • swift

8.1.1 Setting Up Multiple Block Storage Back-ends Edit source

SUSE OpenStack Cloud supports setting up multiple block storage backends and multiple volume types.

Whether you have a single or multiple block storage back-ends defined in your cinder.conf.j2 file, you can create one or more volume types using the specific attributes associated with the back-end. You can find details on how to do that for each of the supported back-end types here:

  • Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 35 “Integrations”, Section 35.3 “SUSE Enterprise Storage Integration”

  • Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 35 “Integrations”, Section 35.1 “Configuring for 3PAR Block Storage Backend”

8.1.2 Creating a Volume Type for your Volumes Edit source

Creating volume types allows you to create standard specifications for your volumes.

Volume types are used to specify a standard Block Storage back-end and collection of extra specifications for your volumes. This allows an administrator to give its users a variety of options while simplifying the process of creating volumes.

The tasks involved in this process are:

8.1.2.1 Create a Volume Type for your Volumes Edit source

The default volume type will be thin provisioned and will have no fault tolerance (RAID 0). You should configure cinder to fully provision volumes, and you may want to configure fault tolerance. Follow the instructions below to create a new volume type that is fully provisioned and fault tolerant:

Perform the following steps to create a volume type using the horizon GUI:

  1. Log in to the horizon dashboard.

  2. Ensure that you are scoped to your admin Project. Then under the Admin menu in the navigation pane, click on Volumes under the System subheading.

  3. Select the Volume Types tab and then click the Create Volume Type button to display a dialog box.

  4. Enter a unique name for the volume type and then click the Create Volume Type button to complete the action.

The newly created volume type will be displayed in the Volume Types list confirming its creation.
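
Alternatively, a volume type can be created with the OpenStack command-line client instead of the dashboard. This is a minimal sketch; MY_VOLUME_TYPE is a placeholder name and it assumes admin credentials have been sourced (for example from ~/service.osrc):

ardana > openstack volume type create MY_VOLUME_TYPE
ardana > openstack volume type list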

Important
Important

You must set a default_volume_type in cinder.conf.j2, whether it is default_type or one you have created. For more information, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 35 “Integrations”, Section 35.1 “Configuring for 3PAR Block Storage Backend”, Section 35.1.4 “Configure 3PAR FC as a Cinder Backend”.

8.1.2.2 Associate the Volume Type to the Back-end Edit source

After the volume type(s) have been created, you can assign extra specification attributes to the volume types. Each Block Storage back-end option has unique attributes that can be used.

To map a volume type to a back-end, do the following:

  1. Log into the horizon dashboard.

  2. Ensure that you are scoped to your admin Project (for more information, see Section 5.10.7, “Scope Federated User to Domain”). Then, under the Admin menu in the navigation pane, click on Volumes under the System subheading.

  3. Click the Volume Types tab to list the volume types.

  4. In the Actions column of the Volume Type you created earlier, click the drop-down option and select View Extra Specs which will bring up the Volume Type Extra Specs options.

  5. Click the Create button on the Volume Type Extra Specs screen.

  6. In the Key field, enter one of the key values in the table in the next section. In the Value box, enter its corresponding value. Once you have completed that, click the Create button to create the extra volume type specs.

Once the volume type is mapped to a back-end, you can create volumes with this volume type.
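
The same mapping can also be performed with the OpenStack command-line client. This is a minimal sketch; MY_VOLUME_TYPE and MY_BACKEND are placeholders for the volume type created earlier and the volume_backend_name value from your cinder.conf.j2 file:

ardana > openstack volume type set --property volume_backend_name=MY_BACKEND MY_VOLUME_TYPE
ardana > openstack volume type show MY_VOLUME_TYPE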

8.1.2.3 Extra Specification Options for 3PAR Edit source

3PAR supports volume creation with additional attributes. These attributes can be specified using the extra specs options for your volume type. The administrator is expected to define the appropriate extra specs for the 3PAR volume type according to the guidelines provided at http://docs.openstack.org/liberty/config-reference/content/hp-3par-supported-ops.html.

The following cinder Volume Type extra-specs options enable control over the 3PAR storage provisioning type:

Key: volume_backend_name
Value: the volume back-end name
Description: The name of the back-end to which you want to associate the volume type, which you also specified earlier in the cinder.conf.j2 file.

Key: hp3par:provisioning (optional)
Value: thin, full, or dedup

For more information, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 35 “Integrations”, Section 35.1 “Configuring for 3PAR Block Storage Backend”.

8.1.3 Managing cinder Volume and Backup Services Edit source

Important
Important: Use Only When Needed

If the host running the cinder-volume service fails for any reason, it should be restarted as quickly as possible. Often, the host running cinder services also runs high availability (HA) services such as MariaDB and RabbitMQ. These HA services are at risk while one of the nodes in the cluster is down. If it will take a significant amount of time to recover the failed node, then you may migrate the cinder-volume service and its backup service to one of the other controller nodes. When the node has been recovered, you should migrate the cinder-volume service and its backup service to the original (default) node.

The cinder-volume service and its backup service migrate as a pair. If you migrate the cinder-volume service, its backup service will also be migrated.

8.1.3.1 Migrating the cinder-volume service Edit source

The following steps will migrate the cinder-volume service and its backup service.

  1. Log in to the Cloud Lifecycle Manager node.

  2. Determine the host index number for each of your control plane nodes. These index numbers are used in a later step and can be obtained by running this playbook:

    cd ~/scratch/ansible/next/ardana/ansible
    ansible-playbook -i hosts/verb_hosts cinder-show-volume-hosts.yml

    Here is an example snippet showing the output for a single three-node control plane. The host index numbers appear in the Index fields:

    TASK: [_CND-CMN | show_volume_hosts | Show cinder Volume hosts index and hostname] ***
    ok: [ardana-cp1-c1-m1] => (item=(0, 'ardana-cp1-c1-m1')) => {
        "item": [
            0,
            "ardana-cp1-c1-m1"
        ],
        "msg": "Index 0 Hostname ardana-cp1-c1-m1"
    }
    ok: [ardana-cp1-c1-m1] => (item=(1, 'ardana-cp1-c1-m2')) => {
        "item": [
            1,
            "ardana-cp1-c1-m2"
        ],
        "msg": "Index 1 Hostname ardana-cp1-c1-m2"
    }
    ok: [ardana-cp1-c1-m1] => (item=(2, 'ardana-cp1-c1-m3')) => {
        "item": [
            2,
            "ardana-cp1-c1-m3"
        ],
        "msg": "Index 2 Hostname ardana-cp1-c1-m3"
    }
  3. Locate the control plane fact file for the control plane you need to migrate the service from. It will be located in the following directory:

    /etc/ansible/facts.d/

    These fact files use the following naming convention:

    cinder_volume_run_location_<control_plane_name>.fact
  4. Edit the fact file to include the host index number of the control plane node you wish to migrate the cinder-volume services to. For example, if they currently reside on your first controller node, host index 0, and you wish to migrate them to your second controller, you would change the value in the fact file to 1.
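
    For example, before editing, you can list the fact files and display the current index value. The control plane name cp1 below is only an illustration; substitute the name used in your own fact file. Root privileges may be required to read or edit files under /etc/ansible/facts.d/:

    ardana > ls /etc/ansible/facts.d/
    ardana > sudo cat /etc/ansible/facts.d/cinder_volume_run_location_cp1.fact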

  5. If you are using data encryption on your Cloud Lifecycle Manager, ensure you have included the encryption key in your environment variables. For more information see Book “Security Guide”, Chapter 10 “Encryption of Passwords and Sensitive Data”.

    export HOS_USER_PASSWORD_ENCRYPT_KEY=<encryption key>
  6. After you have edited the control plane fact file, run the cinder volume migration playbook for the control plane nodes involved in the migration. At a minimum, this includes the node on which cinder-volume is to be started and the node on which it is to be stopped:

    cd ~/scratch/ansible/next/ardana/ansible
    ansible-playbook -i hosts/verb_hosts cinder-migrate-volume.yml --limit=<limit_pattern1,limit_pattern2>
    Note
    Note

    <limit_pattern> is the pattern used to limit the hosts that are selected to those within a specific control plane. For example, with the nodes in the snippet shown above, use --limit=ardana-cp1-c1-m1,ardana-cp1-c1-m2

  7. Even if the playbook summary reports no errors, informational messages such as the following may appear and can be disregarded:

    msg: Marking ardana_notify_cinder_restart_required to be cleared from the fact cache
  8. Once your maintenance or other tasks are completed, ensure that you migrate the cinder-volume services back to their original node using these same steps.

9 Managing Object Storage Edit source

Information about managing and configuring the Object Storage service.

The Object Storage service may be deployed in a full-fledged manner, with proxy nodes engaging rings for managing the accounts, containers, and objects being stored. Or, it may simply be deployed as a front-end to SUSE Enterprise Storage, offering Object Storage APIs with an external back-end.

In the former case, managing your Object Storage environment includes tasks such as ensuring that your swift rings stay balanced; these and other topics are discussed in more detail in this section. swift includes many commands and utilities for these purposes.

When used as a front-end to SUSE Enterprise Storage, many swift constructs such as rings and ring balancing, replica dispersion, etc. do not apply, as swift itself is not responsible for the mechanics of object storage.

9.1 Running the swift Dispersion Report Edit source

swift contains a tool called swift-dispersion-report that can be used to determine whether your containers and objects have the expected number of replicas (three by default). This tool works by populating a percentage of partitions in the system with containers and objects (using swift-dispersion-populate) and then running the report to see if all the replicas of these containers and objects are in the correct place. For a more detailed explanation of this tool in OpenStack swift, see the OpenStack swift - Administrator's Guide.

9.1.1 Configuring the swift dispersion populate Edit source

Once a swift system has been fully deployed in SUSE OpenStack Cloud 9, you can set up swift-dispersion-report using the default parameters found in ~/openstack/ardana/ansible/roles/swift-dispersion/templates/dispersion.conf.j2. The default populates 1% of the partitions on the system; if you are happy with this figure, proceed to step 2 below. Otherwise, follow step 1 to edit the configuration file.

  1. If you wish to change the dispersion coverage percentage, then connect to the Cloud Lifecycle Manager server and change the value of dispersion_coverage in the ~/openstack/ardana/ansible/roles/swift-dispersion/templates/dispersion.conf.j2 file to the value you wish to use. In the example below we have altered the file to create 5% dispersion:

    ...
    [dispersion]
    auth_url = {{ keystone_identity_uri }}/v3
    auth_user = {{ swift_dispersion_tenant }}:{{ swift_dispersion_user }}
    auth_key = {{ swift_dispersion_password  }}
    endpoint_type = {{ endpoint_type }}
    auth_version = {{ disp_auth_version }}
    # Set this to the percentage coverage. We recommend a value
    # of 1%. You can increase this to get more coverage. However, if you
    # decrease the value, the dispersion containers and objects are
    # not deleted.
    dispersion_coverage = 5.0
  2. Commit your configuration to the Git repository (Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”), as follows:

    ardana > git add -A
    ardana > git commit -m "My config or other commit message"
  3. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  4. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  5. Reconfigure the swift servers:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml
  6. Run this playbook to populate your swift system for the health check:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-dispersion-populate.yml

9.1.2 Running the swift dispersion report Edit source

Check the status of the swift system by running the swift dispersion report with this playbook:

ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts swift-dispersion-report.yml

The output of the report will look similar to this:

TASK: [swift-dispersion | report | Display dispersion report results] *********
ok: [padawan-ccp-c1-m1-mgmt] => {
    "var": {
        "dispersion_report_result.stdout_lines": [
            "Using storage policy: General ",
            "",
            "[KQueried 40 containers for dispersion reporting, 0s, 0 retries",
            "100.00% of container copies found (120 of 120)",
            "Sample represents 0.98% of the container partition space",
            "",
            "[KQueried 40 objects for dispersion reporting, 0s, 0 retries",
            "There were 40 partitions missing 0 copies.",
            "100.00% of object copies found (120 of 120)",
            "Sample represents 0.98% of the object partition space"
        ]
    }
}
...

In addition to running the report manually as shown above, a cron job scheduled to run every two hours on the primary proxy node of your cloud environment runs swift-dispersion-report and saves the results to the following location on its local file system:

/var/cache/swift/dispersion-report

When interpreting the results of this report, we recommend referring to swift Administrator's Guide - Cluster Health.
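
To view the most recent results saved by this cron job, log on to the primary proxy node and display the cached file (the formatting of the raw file depends on how the cron job invokes the report):

tux > sudo cat /var/cache/swift/dispersion-report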

9.2 Gathering Swift Data Edit source

The swift-recon command retrieves data from swift servers and displays the results. To use this command, log on as a root user to any node which is running the swift-proxy service.

9.2.1 Notes Edit source

For help with the swift-recon command you can use this:

tux > sudo swift-recon --help
Warning
Warning

The --driveaudit option is not supported.

Warning
Warning

SUSE OpenStack Cloud does not support ec_type isa_l_rs_vand and ec_num_parity_fragments greater than or equal to 5 in the storage-policy configuration. This particular policy is known to harm data durability.

9.2.2 Using the swift-recon Command Edit source

The following command retrieves and displays disk usage information:

tux > sudo swift-recon --diskusage

For example:

tux > sudo swift-recon --diskusage
===============================================================================
--> Starting reconnaissance on 3 hosts
===============================================================================
[2015-09-14 16:01:40] Checking disk usage now
Distribution Graph:
 10%    3 *********************************************************************
 11%    1 ***********************
 12%    2 **********************************************
Disk usage: space used: 13745373184 of 119927734272
Disk usage: space free: 106182361088 of 119927734272
Disk usage: lowest: 10.39%, highest: 12.96%, avg: 11.4613798613%
===============================================================================

In the above example, the results for several nodes are combined together. You can also view the results from individual nodes by adding the -v option as shown in the following example:

tux > sudo swift-recon --diskusage -v
===============================================================================
--> Starting reconnaissance on 3 hosts
===============================================================================
[2015-09-14 16:12:30] Checking disk usage now
-> http://192.168.245.3:6000/recon/diskusage: [{'device': 'disk1', 'avail': 17398411264, 'mounted': True, 'used': 2589544448, 'size': 19987955712}, {'device': 'disk0', 'avail': 17904222208, 'mounted': True, 'used': 2083733504, 'size': 19987955712}]
-> http://192.168.245.2:6000/recon/diskusage: [{'device': 'disk1', 'avail': 17769721856, 'mounted': True, 'used': 2218233856, 'size': 19987955712}, {'device': 'disk0', 'avail': 17793581056, 'mounted': True, 'used': 2194374656, 'size': 19987955712}]
-> http://192.168.245.4:6000/recon/diskusage: [{'device': 'disk1', 'avail': 17912147968, 'mounted': True, 'used': 2075807744, 'size': 19987955712}, {'device': 'disk0', 'avail': 17404235776, 'mounted': True, 'used': 2583719936, 'size': 19987955712}]
Distribution Graph:
 10%    3 *********************************************************************
 11%    1 ***********************
 12%    2 **********************************************
Disk usage: space used: 13745414144 of 119927734272
Disk usage: space free: 106182320128 of 119927734272
Disk usage: lowest: 10.39%, highest: 12.96%, avg: 11.4614140152%
===============================================================================

By default, swift-recon uses the object-0 ring for information about nodes and drives. For some commands, it is appropriate to specify account, container, or object to indicate the type of ring. For example, to check the checksum of the account ring, use the following:

tux > sudo swift-recon --md5 account
===============================================================================
--> Starting reconnaissance on 3 hosts
===============================================================================
[2015-09-14 16:17:28] Checking ring md5sums
3/3 hosts matched, 0 error[s] while checking hosts.
===============================================================================
[2015-09-14 16:17:28] Checking swift.conf md5sum
3/3 hosts matched, 0 error[s] while checking hosts.
===============================================================================

9.3 Gathering Swift Monitoring Metrics Edit source

The swiftlm-scan command is the mechanism used to gather metrics for the monasca system. These metrics are used to derive alarms. For a list of alarms that can be generated from this data, see Section 18.1.1, “Alarm Resolution Procedures”.

To view the metrics, use the swiftlm-scan command directly. Log on to the swift node as the root user. The following example shows the command and a snippet of the output:

tux > sudo swiftlm-scan --pretty
. . .
  {
    "dimensions": {
      "device": "sdc",
      "hostname": "padawan-ccp-c1-m2-mgmt",
      "service": "object-storage"
    },
    "metric": "swiftlm.swift.drive_audit",
    "timestamp": 1442248083,
    "value": 0,
    "value_meta": {
      "msg": "No errors found on device: sdc"
    }
  },
. . .
Note
Note

To make the JSON file easier to read, use the --pretty option.

The fields are as follows:

metric

Specifies the name of the metric.

dimensions

Provides information about the source or location of the metric. The dimensions differ depending on the metric in question. The following dimensions are used by swiftlm-scan:

  • service: This is always object-storage.

  • component: This identifies the component. For example, swift-object-server indicates that the metric is about the swift-object-server process.

  • hostname: This is the name of the node the metric relates to. This is not necessarily the name of the current node.

  • url: If the metric is associated with a URL, this is the URL.

  • port: If the metric relates to connectivity to a node, this is the port used.

  • device: This is the block device a metric relates to.

value

The value of the metric. For many metrics, this is simply the measured value. However, if value_meta contains a msg field, the value indicates a status. The following status values are used:

  • 0 - no error

  • 1 - warning

  • 2 - failure

value_meta

Additional information. The msg field is the most useful of this information.
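
If you only want to see metrics that report a warning or failure status, the JSON output can be filtered with a tool such as jq. This is a minimal sketch, not part of the swiftlm tooling; it assumes jq is installed on the node and relies on the convention described above that a metric whose value_meta contains a msg field carries a status value of 0, 1, or 2:

tux > sudo swiftlm-scan --pretty | jq '.[] | select(.value_meta.msg != null and .value >= 1)'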

9.3.1 Optional Parameters Edit source

You can focus on specific sets of metrics by using one of the following optional parameters:

--replication

Checks replication and health status.

--file-ownership

Checks that swift owns its relevant files and directories.

--drive-audit

Checks for logged events about corrupted sectors (unrecoverable read errors) on drives.

--connectivity

Checks connectivity to various servers used by the swift system, including:

  • Checks that this node can connect to all memcached servers

  • Checks that this node can connect to the keystone service (only applicable if this is a proxy server node)

--swift-services

Checks that the relevant swift processes are running.

--network-interface

Checks NIC speed and reports statistics for each interface.

--check-mounts

Checks that the node has correctly mounted drives used by swift.

--hpssacli

If this server uses a Smart Array Controller, this checks the operation of the controller and disk drives.
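
For example, to restrict the scan to connectivity checks and print readable JSON:

tux > sudo swiftlm-scan --connectivity --pretty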

9.4 Using the swift Command-line Client (CLI) Edit source

OpenStackClient (OSC) is a command-line client for OpenStack with a uniform command structure for OpenStack services. Some swift commands do not have OSC equivalents. The swift utility (or swift CLI) is installed on the Cloud Lifecycle Manager node and also on all other nodes running the swift proxy service. To use this utility on the Cloud Lifecycle Manager, you can use the ~/service.osrc file as a basis and then edit it with the credentials of another user if you need to.

ardana > cp ~/service.osrc ~/swiftuser.osrc

Then you can use your preferred editor to edit swiftuser.osrc so you can authenticate using the OS_USERNAME, OS_PASSWORD, and OS_PROJECT_NAME you wish to use. For example, if you want to use the demo user that is created automatically for you, it would look like this:

unset OS_DOMAIN_NAME
export OS_IDENTITY_API_VERSION=3
export OS_AUTH_VERSION=3
export OS_PROJECT_NAME=demo
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USERNAME=demo
export OS_USER_DOMAIN_NAME=Default
export OS_PASSWORD=<password>
export OS_AUTH_URL=<auth_URL>
export OS_ENDPOINT_TYPE=internalURL
# OpenStackClient uses OS_INTERFACE instead of OS_ENDPOINT_TYPE
export OS_INTERFACE=internal
export OS_CACERT=/etc/ssl/certs/ca-certificates.crt
export OS_COMPUTE_API_VERSION=2

You must use the appropriate password for the demo user and select the correct endpoint for the OS_AUTH_URL value, which should be in the ~/service.osrc file you copied.
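
Source the edited file so that the subsequent commands authenticate with these credentials:

ardana > source ~/swiftuser.osrc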

You can then examine the following account data using this command:

ardana > openstack object store account show

Example showing an environment with no containers or objects:

ardana > openstack object store account show
        Account: AUTH_205804d000a242d385b8124188284998
     Containers: 0
        Objects: 0
          Bytes: 0
X-Put-Timestamp: 1442249536.31989
     Connection: keep-alive
    X-Timestamp: 1442249536.31989
     X-Trans-Id: tx5493faa15be44efeac2e6-0055f6fb3f
   Content-Type: text/plain; charset=utf-8

Use the following command to create a container:

ardana > openstack container create CONTAINER_NAME

Example, creating a container named documents:

ardana > openstack container create documents

The newly created container appears. But there are no objects:

ardana > openstack container show documents
         Account: AUTH_205804d000a242d385b8124188284998
       Container: documents
         Objects: 0
           Bytes: 0
        Read ACL:
       Write ACL:
         Sync To:
        Sync Key:
   Accept-Ranges: bytes
X-Storage-Policy: General
      Connection: keep-alive
     X-Timestamp: 1442249637.69486
      X-Trans-Id: tx1f59d5f7750f4ae8a3929-0055f6fbcc
    Content-Type: text/plain; charset=utf-8

Upload a document:

ardana > openstack object create CONTAINER_NAME FILENAME

Example:

ardana > openstack object create documents mydocument
mydocument

List objects in the container:

ardana > openstack object list CONTAINER_NAME

Example using a container called documents:

ardana > openstack object list documents
mydocument
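
To download the object again, use openstack object save. A brief example using the container and object from above; the file is written to the current directory:

ardana > openstack object save documents mydocument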
Note
Note

This is a brief introduction to the swift CLI. Use the swift --help command for more information. You can also use the OpenStack CLI, see openstack -h for more information.

9.5 Managing swift Rings Edit source

swift rings are a machine-readable description of which disk drives are used by the Object Storage service (for example, a drive is used to store account or object data). Rings also specify the policy for data storage (for example, defining the number of replicas). The rings are automatically built during the initial deployment of your cloud, with the configuration provided during setup of the SUSE OpenStack Cloud Input Model. For more information, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 5 “Input Model”.

After successful deployment of your cloud, you may want to change or modify the configuration for swift. For example, you may want to add or remove swift nodes, add additional storage policies, or upgrade the size of the disk drives. For instructions, see Section 9.5.5, “Applying Input Model Changes to Existing Rings” and Section 9.5.6, “Adding a New Swift Storage Policy”.

Note
Note

The process of modifying or adding a configuration is similar to other configuration or topology changes in the cloud. Generally, you make the changes to the input model files at ~/openstack/my_cloud/definition/ on the Cloud Lifecycle Manager and then run Ansible playbooks to reconfigure the system.

Changes to the rings require several phases to complete; therefore, you may need to run the playbooks several times over several days.

The following topics cover ring management.

9.5.1 Rebalancing Swift Rings Edit source

The swift ring building process tries to distribute data evenly among the available disk drives. The data is stored in partitions. (For more information, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 11 “Modifying Example Configurations for Object Storage using Swift”, Section 11.10 “Understanding Swift Ring Specifications”.) If you, for example, double the number of disk drives in a ring, you need to move 50% of the partitions to the new drives so that all drives contain the same number of partitions (and hence same amount of data). However, it is not possible to move the partitions in a single step. It can take minutes to hours to move partitions from the original drives to their new drives (this process is called the replication process).

If you moved all partitions at once, there would be a period during which swift expects to find partitions on the new drives even though the data has not yet been replicated there, so swift could not return that data to the user. In the middle of such a replication, some data has finished moving while other data is still in its old location. It is therefore considered best practice to move only one replica at a time. If the replica count is 3, you could first move 16.6% of the partitions and then wait until all data has replicated. Then move another 16.6% of the partitions. Wait again and then finally move the remaining 16.6% of the partitions. For any given object, only one of its replicas is moved at a time.

9.5.1.1 Reasons to Move Partitions Gradually Edit source

Due to the following factors, you must move the partitions gradually:

  • Not all devices are of the same size. SUSE OpenStack Cloud 9 automatically assigns different weights to drives so that smaller drives store fewer partitions than larger drives.

  • The process attempts to keep replicas of the same partition in different servers.

  • Making a large change in one step (for example, doubling the number of drives in the ring) would result in a lot of network traffic due to the replication process, and system performance would suffer. There are two ways to mitigate this: make the change in smaller phases, and use the weight-step attribute described in Section 9.5.2, “Using the Weight-Step Attributes to Prepare for Ring Changes”.

9.5.2 Using the Weight-Step Attributes to Prepare for Ring Changes Edit source

swift rings are built during a deployment and this process sets the weights of disk drives such that smaller disk drives have a smaller weight than larger disk drives. When making changes in the ring, you should limit the amount of change that occurs. SUSE OpenStack Cloud 9 does this by limiting the weights of the new drives to a smaller value and then building new rings. Once the replication process has finished, SUSE OpenStack Cloud 9 will increase the weight and rebuild rings to trigger another round of replication. (For more information, see Section 9.5.1, “Rebalancing Swift Rings”.)

In addition, you should become familiar with how the replication process behaves on your system during normal operation. Before making ring changes, use the swift-recon command to determine the typical oldest replication times for your system. For instructions, see Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring”.

In SUSE OpenStack Cloud, the weight-step attribute is set in the ring specification of the input model. The weight-step value specifies a maximum value for the change of the weight of a drive in any single rebalance. For example, if you add a drive of 4 TB, you would normally assign a weight of 4096. However, if the weight-step attribute is set to 1024, then when you add that drive its weight is initially set to 1024. The next time you rebalance the ring, the weight is set to 2048. The subsequent rebalance then sets the weight to the final value of 4096.

The value of the weight-step attribute depends on the size of the drives, the number of servers being added, and how experienced you are with the replication process. A common starting value is 20% of the weight of an individual drive. For example, when adding 4 TB drives (weight 4096), a value of 820 would be appropriate. As you gain more experience with your system, you may increase or reduce this value.

9.5.2.1 Setting the weight-step attribute Edit source

Perform the following steps to set the weight-step attribute:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the ~/openstack/my_cloud/definition/data/swift/swift_config.yml file containing the ring-specifications for the account, container, and object rings.

    Add the weight-step attribute to the ring in this format:

    - name: account
      weight-step: WEIGHT_STEP_VALUE
      display-name: Account Ring
      min-part-hours: 16
      ...

    For example, to set weight-step to 820, add the attribute like this:

    - name: account
      weight-step: 820
      display-name: Account Ring
      min-part-hours: 16
      ...
  3. Repeat step 2 for the other rings, if necessary (container, object-0, etc).

  4. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  5. Use the playbook to create a deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  6. To complete the configuration, use the ansible playbooks documented in Section 9.5.3, “Managing Rings Using swift Playbooks”.

9.5.3 Managing Rings Using swift Playbooks Edit source

The following table describes how playbooks relate to ring management.

All of these playbooks will be run from the Cloud Lifecycle Manager from the ~/scratch/ansible/next/ardana/ansible directory.

swift-update-from-model-rebalance-rings.yml

There are two steps in this playbook:

  • Make delta

    It processes the input model and compares it against the existing rings. After comparison, it produces a list of differences between the input model and the existing rings. This is called the ring delta. The ring delta covers drives being added, drives being removed, weight changes, and replica count changes.

  • Rebalance

    The ring delta is then converted into a series of commands (such as add) to the swift-ring-builder program. Finally, the rebalance command is issued to the swift-ring-builder program.

This playbook performs its actions on the first node running the swift-proxy service. (For more information, see Section 18.6.2.4, “Identifying the Swift Ring Building Server”.) However, it also scans all swift nodes to find the size of disk drives.

If there are no changes in the ring delta, the rebalance command is still executed to rebalance the rings. If min-part-hours has not yet elapsed or if no partitions need to be moved, new rings are not written.

swift-compare-model-rings.yml

There are two steps in this playbook:

  • Make delta

    This is the same as described for swift-update-from-model-rebalance-rings.yml.

  • Report

    This prints a summary of the proposed changes that will be made to the rings (that is, what would happen if you rebalanced).

The playbook reports any issues or problems it finds with the input model.

This playbook can be useful to confirm that there are no errors in the input model. It also allows you to check that, when you change the input model, the proposed ring changes are as expected. For example, if you have added a server to the input model, but this playbook reports that no drives are being added, you should determine the cause.

There is troubleshooting information related to the information that you receive in this report that you can view on this page: Section 18.6.2.3, “Interpreting Swift Input Model Validation Errors”.

swift-deploy.yml

swift-deploy.yml is responsible for installing software and configuring swift on nodes. As part of installing and configuring, it runs the swift-update-from-model-rebalance-rings.yml and swift-reconfigure.yml playbooks.

This playbook is included in the ardana-deploy.yml and site.yml playbooks, so if you run either of those playbooks, the swift-deploy.yml playbook is also run.

swift-reconfigure.yml

swift-reconfigure.yml takes rings that the swift-update-from-model-rebalance-rings.yml playbook has changed and copies those rings to all swift nodes.

Every time that you directly use the swift-update-from-model-rebalance-rings.yml playbook, you must copy these rings to the system using the swift-reconfigure.yml playbook. If you forget and run swift-update-from-model-rebalance-rings.yml twice, the process may move two replicas of some partitions at the same time.

9.5.3.1 Optional Ansible variables related to ring management Edit source

The following optional variables may be specified when running the playbooks outlined above. They are specified using the --extra-vars option.

limit_ring

Limit changes to the named ring. Other rings will not be examined or updated. This option may be used with any of the swift playbooks. For example, to only update the object-1 ring, use the following command:

ardana > ansible-playbook -i hosts/verb_hosts swift-update-from-model-rebalance-rings.yml --extra-vars "limit_ring=object-1"
drive_detail

Used only with the swift-compare-model-rings.yml playbook. The playbook will include details of changes to every drive where the model and existing rings differ. If you omit the drive_detail variable, only summary information is provided. The following shows how to use the drive_detail variable:

ardana > ansible-playbook -i hosts/verb_hosts swift-compare-model-rings.yml --extra-vars "drive_detail=yes"

9.5.3.2 Interpreting the report from the swift-compare-model-rings.yml playbook Edit source

The swift-compare-model-rings.yml playbook compares the existing swift rings with the input model and prints a report telling you how the rings and the model differ. Specifically, it will tell you what actions will take place when you next run the swift-update-from-model-rebalance-rings.yml playbook (or a playbook such as ardana-deploy.yml that runs swift-update-from-model-rebalance-rings.yml).

The swift-compare-model-rings.yml playbook will make no changes, but is just an advisory report.

Here is an example output from the playbook. The report is between "report.stdout_lines" and "PLAY RECAP":

TASK: [swiftlm-ring-supervisor | validate-input-model | Print report] *********
ok: [ardana-cp1-c1-m1-mgmt] => {
    "var": {
        "report.stdout_lines": [
            "Rings:",
            "  ACCOUNT:",
            "    ring exists (minimum time to next rebalance: 8:07:33)",
            "    will remove 1 devices (18.00GB)",
            "    ring will be rebalanced",
            "  CONTAINER:",
            "    ring exists (minimum time to next rebalance: 8:07:35)",
            "    no device changes",
            "    ring will be rebalanced",
            "  OBJECT-0:",
            "    ring exists (minimum time to next rebalance: 8:07:34)",
            "    no device changes",
            "    ring will be rebalanced"
        ]
    }
}

The following describes the report in more detail:


ring exists

The ring already exists on the system.

ring will be created

The ring does not yet exist on the system.

no device changes

The devices in the ring exactly match the input model. There are no servers being added or removed and the weights are appropriate for the size of the drives.

minimum time to next rebalance

If this time is 0:00:00 and you run one of the swift playbooks that update rings, the ring will be rebalanced.

If the time is non-zero, it means that not enough time has elapsed since the ring was last rebalanced. Even if you run a swift playbook that attempts to change the ring, the ring will not actually rebalance. This time is determined by the min-part-hours attribute.

set-weight ardana-ccp-c1-m1-mgmt:disk0:/dev/sdc 8.00 > 12.00 > 18.63

The weight of disk0 (mounted on /dev/sdc) on server ardana-ccp-c1-m1-mgmt is currently set to 8.00 but should be 18.63 given the size of the drive. However, in this example, we cannot go directly from 8.00 to 18.63 because of the weight-step attribute. Hence, the proposed weight change is from 8.00 to 12.00.

This information is only shown when you use the drive_detail=yes argument when running the playbook.

will change weight on 12 devices (6.00TB)

The weight of 12 devices will be increased. This might happen for example, if a server had been added in a prior ring update. However, with use of the weight-step attribute, the system gradually increases the weight of these new devices. In this example, the change in weight represents 6TB of total available storage. For example, if your system currently has 100TB of available storage, when the weight of these devices is changed, there will be 106TB of available storage. If your system is 50% utilized, this means that when the ring is rebalanced, up to 3TB of data may be moved by the replication process. This is an estimate - in practice, because only one copy of a given replica is moved in any given rebalance, it may not be possible to move this amount of data in a single ring rebalance.

add: ardana-ccp-c1-m1-mgmt:disk0:/dev/sdc

The disk0 device will be added to the ardana-ccp-c1-m1-mgmt server. This happens when a server is added to the input model or if a disk model is changed to add additional devices.

This information is only shown when you use the drive_detail=yes argument when running the playbook.

remove: ardana-ccp-c1-m1-mgmt:disk0:/dev/sdc

The device is no longer in the input model and will be removed from the ring. This happens if a server is removed from the model, a disk drive is removed from a disk model or the server is marked for removal using the pass-through feature.

This information is only shown when you use the drive_detail=yes argument when running the playbook.

will add 12 devices (6TB)

There are 12 devices in the input model that have not yet been added to the ring. Usually this is because one or more servers have been added. In this example, this could be one server with 12 drives or two servers, each with 6 drives. The size in the report is the change in total available capacity. When the weight-step attribute is used, this may be a fraction of the total size of the disk drives. In this example, 6TB of capacity is being added. For example, if your system currently has 100TB of available storage, when these devices are added, there will be 106TB of available storage. If your system is 50% utilized, this means that when the ring is rebalanced, up to 3TB of data may be moved by the replication process. This is an estimate - in practice, because only one copy of a given replica is moved in any given rebalance, it may not be possible to move this amount of data in a single ring rebalance.

will remove 12 devices (6TB)

There are 12 devices in rings that no longer appear in the input model. Usually this is because one or more servers have been removed. In this example, this could be one server with 12 drives or two servers, each with 6 drives. The size in the report is the change in total removed capacity. In this example, 6TB of capacity is being removed. For example, if your system currently has 100TB of available storage, when these devices are removed, there will be 94TB of available storage. If your system is 50% utilized, this means that when the ring is rebalanced, approximately 3TB of data must be moved by the replication process.

min-part-hours will be changed

The min-part-hours attribute has been changed in the ring specification in the input model.

replica-count will be changed

The replica-count attribute has been changed in the ring specification in the input model.

ring will be rebalanced

This is always reported. Every time the swift-update-from-model-rebalance-rings.yml playbook is run, it will execute the swift-ring-builder rebalance command. This happens even if there were no input model changes. If the ring is already well balanced, the swift-ring-builder will not rewrite the ring.

9.5.4 Determining When to Rebalance and Deploy a New Ring Edit source

Before deploying a new ring, you must be sure the change that has been applied to the last ring is complete (that is, all the partitions are in their correct location). There are four aspects to this:

  • Is the replication system busy?

    You might want to postpone a ring change until after replication has finished. If the replication system is busy repairing a failed drive, a ring change will place additional load on the system. To check that replication has finished, use the swift-recon command with the --replication argument. (For more information, see Section 9.2, “Gathering Swift Data”.) The oldest completion time indicates how busy the replication process is. If the oldest completion is more than 15 or 20 minutes old, the object replication process is probably still very busy. The following example indicates that the oldest completion was 120 seconds ago, so the replication process is probably not busy:

    root # swift-recon --replication
    ===============================================================================
    --> Starting reconnaissance on 3 hosts
    ===============================================================================
    [2015-10-02 15:31:45] Checking on replication
    [replication_time] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 3
    Oldest completion was 2015-10-02 15:31:32 (120 seconds ago) by 192.168.245.4:6000.
    Most recent completion was 2015-10-02 15:31:43 (10 seconds ago) by 192.168.245.3:6000.
    ===============================================================================
  • Are there drive or server failures?

    A drive failure does not preclude deploying a new ring. In principle, there should be two copies elsewhere. However, another drive failure in the middle of replication might make data temporarily unavailable. If possible, postpone ring changes until all servers and drives are operating normally.

  • Has min-part-hours elapsed?

    The swift-ring-builder will refuse to build a new ring until the min-part-hours has elapsed since the last time it built rings. You must postpone changes until this time has elapsed.

    You can determine how long you must wait by running the swift-compare-model-rings.yml playbook, which will tell you how long until the min-part-hours has elapsed (see also the sketch after this list). For more details, see Section 9.5.3, “Managing Rings Using swift Playbooks”.

    You can change the value of min-part-hours. (For instructions, see Section 9.5.7, “Changing min-part-hours in Swift”).

  • Is the swift dispersion report clean?

    Run the swift-dispersion-report.yml playbook (as described in Section 9.1, “Running the swift Dispersion Report”) and examine the results. If the replication process has not yet replicated partitions that were moved to new drives in the last ring rebalance, the dispersion report will indicate that some containers or objects are missing a copy.

    For example:

    There were 462 partitions missing one copy.

    Assuming all servers and disk drives are operational, the reason for the missing partitions is that the replication process has not yet managed to copy a replica into the partitions.

    You should wait an hour and rerun the dispersion report process and examine the report. The number of partitions missing one copy should have reduced. Continue to wait until this reaches zero before making any further ring rebalances.

    Note
    Note

    It is normal to see partitions missing one copy if disk drives or servers are down. If all servers and disk drives are mounted, and you did not recently perform a ring rebalance, you should investigate whether there are problems with the replication process. You can use the Operations Console to investigate replication issues.

    Important
    Important

    If there are any partitions missing two copies, you must reboot or repair any failed servers and disk drives as soon as possible. Do not shutdown any swift nodes in this situation. Assuming a replica count of 3, if you are missing two copies you are in danger of losing the only remaining copy.
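
The following is a minimal sketch of checking min-part-hours and the time since the last rebalance directly with swift-ring-builder on the swift ring building node. The builder file path shown matches the example output later in this section (/etc/swiftlm/cloud1/cp1/builder_dir/) and will differ if your cloud, control plane, or ring names differ:

tux > sudo swift-ring-builder /etc/swiftlm/cloud1/cp1/builder_dir/object-0.builder

Running swift-ring-builder with only a builder file argument prints a summary that includes the ring's partition count, replica count, and min_part_hours.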

9.5.5 Applying Input Model Changes to Existing Rings Edit source

This page describes a general approach for making changes to your existing swift rings. This approach applies to actions such as adding and removing a server and replacing and upgrading disk drives, and must be performed as a series of phases, as shown below:

9.5.5.1 Changing the Input Model Configuration Files Edit source

The first step to apply new changes to the swift environment is to update the configuration files. Follow these steps:

  1. Log in to the Cloud Lifecycle Manager.

  2. Set the weight-step attribute, as needed, for the nodes you are altering. (For instructions, see Section 9.5.2, “Using the Weight-Step Attributes to Prepare for Ring Changes”).

  3. Edit the configuration files as part of the Input Model as appropriate. (For general information about the Input Model, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 6 “Configuration Objects”, Section 6.14 “Networks”. For more specific information about the swift parts of the configuration files, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 11 “Modifying Example Configurations for Object Storage using Swift”)

  4. Once you have completed all of the changes, commit your configuration to the local Git repository (for more information, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”):

    ardana > git add -A
    ardana > git commit -m "commit message"
  5. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  6. Create a deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Run the swift playbook that will validate your configuration files and give you a report as an output:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-compare-model-rings.yml
  8. Use the report to validate that the number of drives proposed to be added or deleted, or the weight change, is correct. Fix any errors in your input model. At this stage, no changes have been made to rings.

9.5.5.2 First phase of Ring Rebalance Edit source

To begin the rebalancing of the swift rings, follow these steps:

  1. After going through the steps in the section above, deploy your changes to all of the swift nodes in your environment by running this playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-deploy.yml
  2. Wait until replication has finished or min-part-hours has elapsed (whichever is longer). For more information, see Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring”

9.5.5.3 Weight Change Phase of Ring Rebalance Edit source

At this stage, no changes have been made to the input model. However, when you set the weight-step attribute, the rings that were rebuilt in the previous rebalance phase have weights that are different than their target/final value. You gradually move to the target/final weight by rebalancing a number of times as described on this page. For more information about the weight-step attribute, see Section 9.5.2, “Using the Weight-Step Attributes to Prepare for Ring Changes”.

To begin the re-balancing of the rings, follow these steps:

  1. Rebalance the rings by running the playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-update-from-model-rebalance-rings.yml
  2. Run the reconfiguration:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml
  3. Wait until replication has finished or min-part-hours has elapsed (whichever is longer). For more information, see Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring”

  4. Run the following command and review the report:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-compare-model-rings.yml --limit SWF*

    The following is an example of the output after executing the above command. In the example no weight changes are proposed:

    TASK: [swiftlm-ring-supervisor | validate-input-model | Print report] *********
    ok: [padawan-ccp-c1-m1-mgmt] => {
        "var": {
            "report.stdout_lines": [
                "Need to add 0 devices",
                "Need to remove 0 devices",
                "Need to set weight on 0 devices"
            ]
        }
    }
  5. When there are no proposed weight changes, you proceed to the final phase.

  6. If there are proposed weight changes, repeat this phase.

9.5.5.4 Final Rebalance Phase Edit source

The final rebalance phase moves all replicas to their final destination.

  1. Rebalance the rings by running the playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-update-from-model-rebalance-rings.yml | tee /tmp/rebalance.log
    Note
    Note

    The output is saved for later reference.

  2. Review the output from the previous step. If the output for all rings is similar to the following, the rebalance had no effect. That is, the rings are balanced and no further changes are needed. In addition, the ring files were not changed so you do not need to deploy them to the swift nodes:

    "Running: swift-ring-builder /etc/swiftlm/cloud1/cp1/builder_dir/account.builder rebalance 999",
          "NOTE: No partitions could be reassigned.",
          "Either none need to be or none can be due to min_part_hours [16]."

    The text No partitions could be reassigned indicates that no further rebalances are necessary. If this is true for all the rings, you have completed the final phase.

    Note
    Note

    You must have allowed enough time to elapse since the last rebalance. As mentioned in the above example, min_part_hours [16] means that you must wait at least 16 hours since the last rebalance. If not, you should wait until enough time has elapsed and repeat this phase.

  3. Run the swift-reconfigure.yml playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml
  4. Wait until replication has finished or min-part-hours has elapsed (whichever is longer). For more information see Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring”

  5. Repeat the above steps until the ring is rebalanced.

9.5.5.5 System Changes that Change Existing Rings Edit source

There are many system changes ranging from adding servers to replacing drives, which might require you to rebuild and rebalance your rings.

Adding Server(s)
Removing Server(s)

In SUSE OpenStack Cloud, when you remove servers from the input model, the disk drives are removed from the ring - the weight is not gradually reduced using the weight-step attribute.

  • Remove servers in phases:

    • This reduces the impact of the changes on your system.

    • If your rings use swift zones, ensure you remove the same number of servers for each zone at each phase.

Adding Disk Drive(s)
Replacing Disk Drive(s)

When a drive fails, replace it as soon as possible. Do not attempt to remove it from the ring - this creates operator overhead. swift will continue to store the correct number of replicas by handing off objects to other drives instead of the failed drive.

If the disk drives are of the same size as the original when the drive is replaced, no ring changes are required. You can confirm this by running the swift-update-from-model-rebalance-rings.yml playbook. It should report that no weight changes are needed.

For a single drive replacement, even if the drive is significantly larger than the original drives, you do not need to rebalance the ring (however, the extra space on the drive will not be used).

Upgrading Disk Drives

If the drives are a different size (for example, you are upgrading your system), you can proceed as follows:

  • If not already done, set the weight-step attribute

  • Replace drives in phases:

    • Avoid replacing too many drives at once.

    • If your rings use swift zones, upgrade a number of drives in the same zone at the same time - not drives in several zones.

    • It is also safer to upgrade one server instead of drives in several servers at the same time.

    • Remember that the final size of all swift zones must be the same, so you may need to replace a small number of drives in one zone, then a small number in second zone, then return to the first zone and replace more drives, etc.

Removing Disk Drive(s)

When removing a disk drive from the input model, keep in mind that this drops the disk out of the ring without allowing swift to move the data off it first. While this should be fine in a properly replicated, healthy cluster, we do not recommend this approach. A better approach is to step the drive's weight down to 0 (using the weight-step attribute) so that swift can move the data off the drive before it is removed.

9.5.6 Adding a New Swift Storage Policy Edit source

This page describes how to add an additional storage policy to an existing system. For an overview of storage policies, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 11 “Modifying Example Configurations for Object Storage using Swift”, Section 11.11 “Designing Storage Policies”.

To Add a Storage Policy

Perform the following steps to add the storage policy to an existing system.

  1. Log in to the Cloud Lifecycle Manager.

  2. Select a storage policy index and ring name.

    For example, if you already have object-0 and object-1 rings in your ring-specifications (usually in the ~/openstack/my_cloud/definition/data/swift/swift_config.yml file), the next index is 2 and the ring name is object-2.

  3. Select a user-visible name so that you can see it when you examine container metadata or when you want to specify the storage policy used when you create a container. The name should be a single word (hyphens are allowed).

  4. Decide if this new policy will be the default for all new containers.

  5. Decide on other attributes such as partition-power and replica-count if you are using a standard replication ring. However, if you are using an erasure coded ring, you also need to decide on other attributes: ec-type, ec-num-data-fragments, ec-num-parity-fragments, and ec-object-segment-size. For more details on the required attributes, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 11 “Modifying Example Configurations for Object Storage using Swift”, Section 11.10 “Understanding Swift Ring Specifications”.

  6. Edit the ring-specifications attribute (usually in the ~/openstack/my_cloud/definition/data/swift/swift_config.yml file) and add the new ring specification. If this policy is to be the default storage policy for new containers, set the default attribute to yes.
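
    For example, the new ring entry within your existing ring-specifications might look like the following fragment (the attribute values shown are illustrative; set default to yes only if this policy is to become the default for new containers):

        - name: object-2
          display-name: General
          default: no
          min-part-hours: 16
          partition-power: 17
          replication-policy:
            replica-count: 3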

    Note
    Note
    1. Ensure that only one object ring has the default attribute set to yes. If you set two rings as default, swift processes will not start.

    2. Do not specify the weight-step attribute for the new object ring. Since this is a new ring there is no need to gradually increase device weights.

  7. Update the appropriate disk model to use the new storage policy (for example, the data/disks_swobj.yml file). The following sample shows that the object-2 has been added to the list of existing rings that use the drives:

    disk-models:
    - name: SWOBJ-DISKS
      ...
      device-groups:
      - name: swobj
        devices:
           ...
        consumer:
            name: swift
            attrs:
                rings:
                - object-0
                - object-1
                - object-2
      ...
    Note
    Note

    You must use the new object ring on at least one node that runs the swift-object service. If you skip this step and continue to run the swift-compare-model-rings.yml or swift-deploy.yml playbooks, they will fail with an error There are no devices in this ring, or all devices have been deleted, as shown below:

    TASK: [swiftlm-ring-supervisor | build-rings | Build ring (make-delta, rebalance)] ***
    failed: [padawan-ccp-c1-m1-mgmt] => {"changed": true, "cmd": ["swiftlm-ring-supervisor", "--make-delta", "--rebalance"], "delta": "0:00:03.511929", "end": "2015-10-07 14:02:03.610226", "rc": 2, "start": "2015-10-07 14:02:00.098297", "warnings": []}
    ...
    Running: swift-ring-builder /etc/swiftlm/cloud1/cp1/builder_dir/object-2.builder rebalance 999
    ERROR: -------------------------------------------------------------------------------
    An error has occurred during ring validation. Common
    causes of failure are rings that are empty or do not
    have enough devices to accommodate the replica count.
    Original exception message:
    There are no devices in this ring, or all devices have been deleted
    -------------------------------------------------------------------------------
  8. Commit your configuration:

    ardana > git add -A
    ardana > git commit -m "commit message"
  9. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  10. Create a deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  11. Validate the changes by running the swift-compare-model-rings.yml playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-compare-model-rings.yml

    If any errors occur, correct them. For instructions, see Section 18.6.2.3, “Interpreting Swift Input Model Validation Errors”. Then, re-run steps 5 - 10.

  12. Create the new ring (for example, object-2). Then verify the swift service status and reconfigure the swift node to use a new storage policy, by running these playbooks:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-status.yml
    ardana > ansible-playbook -i hosts/verb_hosts swift-deploy.yml

After adding a storage policy, there is no need to rebalance the ring.

9.5.7 Changing min-part-hours in Swift Edit source

The min-part-hours parameter specifies the number of hours you must wait before swift will allow a given partition to be moved. In other words, it constrains how often you perform ring rebalance operations. Before changing this value, you should get some experience with how long it takes your system to perform replication after you make ring changes (for example, when you add servers).

See Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring” for more information about determining when replication has completed.

9.5.7.1 Changing the min-part-hours Value Edit source

To change the min-part-hours value, follow these steps:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit your ~/openstack/my_cloud/definition/data/swift/swift_config.yml file and change the value(s) of min-part-hours for the rings you desire. The value is expressed in hours and a value of zero is not allowed.
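
    For example, after the change a ring entry might look like the following fragment (the other attribute values are illustrative):

        - name: object-0
          display-name: General
          min-part-hours: 24
          partition-power: 17
          replication-policy:
            replica-count: 3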

  3. Commit your configuration to the local Git repository (Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”), as follows:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "My config or other commit message"
  4. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  5. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  6. Apply the changes by running this playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-deploy.yml

9.5.8 Changing Swift Zone Layout Edit source

Before changing the number of swift zones or the assignment of servers to specific zones, you must ensure that your system has sufficient storage available to perform the operation. Specifically, if you are adding a new zone, you may need additional storage. There are two reasons for this:

  • You cannot simply change the swift zone number of disk drives in the ring. Instead, you need to remove the server(s) from the ring and then re-add the server(s) with a new swift zone number to the ring. At the point where the servers are removed from the ring, there must be sufficient spare capacity on the remaining servers to hold the data that was originally hosted on the removed servers.

  • The total amount of storage in each swift zone must be the same. This is because new data is added to each zone at the same rate. If one zone has a lower capacity than the other zones, once that zone becomes full, you cannot add more data to the system – even if there is unused space in the other zones.

As mentioned above, you cannot simply change the swift zone number of disk drives in an existing ring. Instead, you must remove and then re-add servers. This is a summary of the process:

  1. Identify appropriate server groups that correspond to the desired swift zone layout.

  2. Remove the servers in a server group from the rings. You may need to spread this process over time, either by removing servers in small batches or by using the weight-step attribute, so that you limit the amount of replication traffic happening at once.

  3. Once all the targeted servers are removed, edit the swift-zones attribute in the ring specifications to add or remove a swift zone.

  4. Re-add the servers you had temporarily removed to the rings. Again you may need to do this in batches or rely on the weight-step attribute.

  5. Continue removing and re-adding servers until you reach your final configuration.

9.5.8.1 Process for Changing Swift Zones Edit source

This section describes the detailed process of reorganizing swift zones. As a concrete example, we assume we start with a single swift zone and the target is three swift zones. The same general process applies if you are reducing the number of zones.

The process is as follows:

  1. Identify the appropriate server groups that represent the desired final state. In this example, we are going to change the swift zone layout as follows:

    Original layout:

    swift-zones:
      - id: 1
        server-groups:
           - AZ1
           - AZ2
           - AZ3

    Target layout:

    swift-zones:
      - id: 1
        server-groups:
           - AZ1
      - id: 2
        server-groups:
           - AZ2
      - id: 3
        server-groups:
           - AZ3

    The plan is to move servers from server groups AZ2 and AZ3 to a new swift zone number. The servers in AZ1 will remain in swift zone 1.

  2. If you have not already done so, consider setting the weight-step attribute as described in Section 9.5.2, “Using the Weight-Step Attributes to Prepare for Ring Changes”.

  3. Identify the servers in the AZ2 server group. You may remove all servers at once or remove them in batches. If this is the first time you have performed a major ring change, we suggest you remove only one or two servers in the first batch. When you see how long this takes and the impact replication has on your system, you can use that experience to decide whether to remove a larger batch of servers, or to increase or decrease the weight-step attribute, for the next server-removal cycle. To remove a server, use steps 2-9 as described in Section 15.1.5.1.4, “Removing a Swift Node”, ensuring that you do not remove the servers from the input model.

  4. This process may take a number of ring rebalance cycles until the disk drives are removed from the ring files. Once this happens, you can edit the ring specifications and add swift zone 2 as shown in this example:

    swift-zones:
      - id: 1
        server-groups:
          - AZ1
          - AZ3
      - id: 2
        server-groups:
          - AZ2
  5. The server removal process in step #3 set the "remove" attribute in the pass-through attribute of the servers in server group AZ2. Edit the input model files and remove this pass-through attribute. This signals to the system that the servers should be used the next time we rebalance the rings (that is, the server should be added to the rings).

  6. Commit your configuration to the local Git repository (Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 22 “Using Git for Configuration Management”), as follows:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "My config or other commit message"
  7. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  8. Use the playbook to create a deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  9. Rebuild and deploy the swift rings containing the re-added servers by running this playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-deploy.yml
  10. Wait until replication has finished. For more details, see Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring”.

  11. You may need to continue to rebalance the rings. For instructions, see the "Final Rebalance Stage" steps at Section 9.5.5, “Applying Input Model Changes to Existing Rings”.

  12. At this stage, the servers in server group AZ2 are responsible for swift zone 2. Repeat the process in steps #3-9 to remove the servers in server group AZ3 from the rings and then re-add them to swift zone 3. The ring specifications for zones (step 4) should be as follows:

    swift-zones:
      - id: 1
        server-groups:
          - AZ1
      - id: 2
        server-groups:
          - AZ2
      - id: 3
        server-groups:
          - AZ3
  13. Once complete, all data should be dispersed across the swift zones as specified in the input model (that is, each replica is located in a different swift zone).

9.6 Configuring your swift System to Allow Container Sync Edit source

swift has a feature where all the contents of a container can be mirrored to another container through background synchronization. swift operators configure their system to allow/accept sync requests to/from other systems, and the user specifies where to sync their container to along with a secret synchronization key. For an overview of this feature, refer to OpenStack swift - Container to Container Synchronization.

9.6.1 Notes and limitations Edit source

Container synchronization is performed as a background action. When you put an object into the source container, it takes some time before it becomes visible in the destination container. The storage services do not copy objects in any particular order, so they may be transferred in a different order from that in which they were created.

Container sync may not be able to keep up with a moderate upload rate to a container. For example, if the average object upload rate to a container is greater than one object per second, then container sync may not be able to keep the objects synced.

If container sync is enabled on a container that already holds a large number of objects, the initial sync may take a long time. For example, at a rate of roughly one object per second, a container with one million 1 KB objects could take more than 11 days to complete a sync.

You may operate on the destination container just like any other container -- adding or deleting objects -- including the objects that were copied there from the source container. To decide how to handle object creation, replacement, or deletion, the system uses timestamps: in general, the latest timestamp "wins". That is, if you create an object, replace it, delete it, and then re-create it, the destination container will eventually contain the most recently created object. However, if you also create and delete objects in the destination container, you get some subtle behaviors, as follows:

  • If an object is copied to the destination container and then deleted, it remains deleted in the destination even though there is still a copy in the source container. If you modify the object (replace or change its metadata) in the source container, it will reappear in the destination again.

  • The same applies to a replacement or metadata modification of an object in the destination container -- the object will remain as-is unless there is a replacement or modification in the source container.

  • If you replace or modify metadata of an object in the destination container and then delete it in the source container, it is not deleted from the destination. This is because your modified object has a later timestamp than the object you deleted in the source.

  • If you create an object in the source container and, before the system has a chance to copy it to the destination, you also create an object with the same name in the destination, then the object in the destination is not overwritten by the source container's object.

Segmented objects

Segmented objects (objects larger than 5 GB) do not work seamlessly with container synchronization. If the manifest object is copied to the destination container before the object segments, a GET operation on the manifest object may fail to find some or all of the object segments. If your manifest and object segments are in different containers, remember that both containers must be synchronized and that the container name of the object segments must be the same on both source and destination.

9.6.2 Prerequisites Edit source

Container to container synchronization requires that SSL certificates are configured on both the source and destination systems. For more information on how to implement SSL, see Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 41 “Configuring Transport Layer Security (TLS)”.

9.6.3 Configuring container sync Edit source

Container to container synchronization requires that both the source and destination swift systems involved be configured to allow/accept this. In the context of container to container synchronization, swift uses the term cluster to denote a swift system. swift clusters correspond to Control Planes in OpenStack terminology.

Gather the public API endpoints for both swift systems

Gather information about the external/public URL used by each system, as follows:

  1. On the Cloud Lifecycle Manager of one system, get the public API endpoint of the system by running the following commands:

    ardana > source ~/service.osrc
    ardana > openstack endpoint list | grep swift

    The output of the command will look similar to this:

    ardana > openstack endpoint list | grep swift
    | 063a84b205c44887bc606c3ba84fa608 | region0 | swift           | object-store    | True    | admin     | https://10.13.111.176:8080/v1/AUTH_%(tenant_id)s |
    | 3c46a9b2a5f94163bb5703a1a0d4d37b | region0 | swift           | object-store    | True    | public    | https://10.13.120.105:8080/v1/AUTH_%(tenant_id)s |
    | a7b2f4ab5ad14330a7748c950962b188 | region0 | swift           | object-store    | True    | internal  | https://10.13.111.176:8080/v1/AUTH_%(tenant_id)s |

    The portion that you want is the public endpoint up to, but not including, the AUTH part. In the example above, this is https://10.13.120.105:8080/v1.
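
    Alternatively, assuming your client supports endpoint filtering, you can list just the public swift endpoint:

    ardana > openstack endpoint list --service swift --interface public -f value -c URL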

  2. Repeat these steps on the other swift system so you have both of the public API endpoints for them.

Validate connectivity between both systems

The swift nodes running the swift-container service must be able to connect to the public API endpoints of each other for the container sync to work. You can validate connectivity on each system using these steps.

For the sake of the examples, we use the terms source and destination to denote the nodes doing the synchronization.

  1. Log in to a swift node running the swift-container service on the source system. You can determine which servers run this service by looking at the service list in your ~/openstack/my_cloud/info/service_info.yml file.

  2. Verify the SSL certificates by running this command against the destination swift server:

    echo | openssl s_client -connect PUBLIC_API_ENDPOINT:8080 -CAfile /etc/ssl/certs/ca-certificates.crt

    If the connection was successful you should see a return code of 0 (ok) similar to this:

    ...
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
  3. Also verify that the source node can connect to the destination swift system using this command:

    ardana > curl -k https://DESTINATION_IP_OR_HOSTNAME:8080/healthcheck

    If the connection was successful, you should see a response of OK.

  4. Repeat these verification steps on any system involved in your container synchronization setup.

Configure container to container synchronization

Both the source and destination swift systems must be configured the same way, using sync realms. For more details on how sync realms work, see OpenStack swift - Configuring Container Sync.

To configure one of the systems, follow these steps:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the ~/openstack/my_cloud/config/swift/container-sync-realms.conf.j2 file and uncomment the sync realm section.

    Here is a sample showing this section in the file:

    #Add sync realms here, for example:
    # [realm1]
    # key = realm1key
    # key2 = realm1key2
    # cluster_name1 = https://host1/v1/
    # cluster_name2 = https://host2/v1/
  3. Add in the details for your source and destination systems. Each realm you define is a set of clusters that have agreed to allow container syncing between them. These values are case sensitive.

    Only one key is required. The second key is optional and can be provided to allow an operator to rotate keys if desired. The cluster entries must have names prefixed with cluster_ and must be populated with the public API endpoints of the systems.
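
    For example, a populated realm section might look like this (the realm name, key, and the endpoint of the second system are illustrative placeholders; cluster_cloud1 reuses the public endpoint gathered earlier):

    [cloudsync]
    key = 1a2b3c4d5e
    # key2 =
    cluster_cloud1 = https://10.13.120.105:8080/v1/
    cluster_cloud2 = https://10.245.10.20:8080/v1/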

  4. Commit the changes to git:

    ardana > git add -A
    ardana > git commit -a -m "Add node <name>"
  5. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  6. Update the deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Run the swift reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible/
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml
  8. Run this command to validate that your container synchronization is configured:

    ardana > source ~/service.osrc
    ardana > swift capabilities

    Here is a snippet of the output showing the container sync information. This should be populated with your cluster names:

    ...
    Additional middleware: container_sync
     Options:
      realms: {u'INTRACLUSTER': {u'clusters': {u'THISCLUSTER': {}}}}
  9. Repeat these steps on any other swift systems that will be involved in your sync realms.

9.6.4 Configuring Intra Cluster Container Sync Edit source

It is possible to use the swift container sync functionality to sync objects between containers within the same swift system. swift is automatically configured to allow intra cluster container sync. Each swift PAC server will have an intracluster container sync realm defined in /etc/swift/container-sync-realms.conf.

For example:

# The intracluster realm facilitates syncing containers on this system
[intracluster]
key = lQ8JjuZfO
# key2 =
cluster_thiscluster = http://SWIFT-PROXY-VIP:8080/v1/

The keys defined in /etc/swift/container-sync-realms.conf are used by the container-sync daemon to determine trust between clusters. In addition, the two containers to be kept in sync need a separate shared key, which both define in their container metadata, to establish trust between each other.

  1. Create two containers, for example container-src and container-dst. In this example we will sync one way from container-src to container-dst.

    ardana > openstack container create container-src
    ardana > openstack container create container-dst
  2. Determine your swift account. In the following example, it is AUTH_1234.

    ardana > swift stat
                                     Account: AUTH_1234
                                  Containers: 3
                                     Objects: 42
                                       Bytes: 21692421
    Containers in policy "erasure-code-ring": 3
       Objects in policy "erasure-code-ring": 42
         Bytes in policy "erasure-code-ring": 21692421
                                Content-Type: text/plain; charset=utf-8
                 X-Account-Project-Domain-Id: default
                                 X-Timestamp: 1472651418.17025
                                  X-Trans-Id: tx81122c56032548aeae8cd-0057cee40c
                               Accept-Ranges: bytes
  3. Configure container-src to sync to container-dst using a key specified by both containers. Replace KEY with your key.

    ardana > openstack container set -t '//intracluster/thiscluster/AUTH_1234/container-dst' -k 'KEY' container-src
  4. Configure container-dst to accept synced objects with this key

    ardana > openstack container set -k 'KEY' container-dst
  5. Upload objects to container-src. Within a number of minutes the objects should be automatically synced to container-dst.

Changing the intracluster realm key

The intracluster realm key used by container sync to sync objects between containers in the same swift system is automatically generated. The process for changing passwords is described in Section 5.7, “Changing Service Passwords”.

The steps to change the intracluster realm key are as follows.

  1. On the Cloud Lifecycle Manager, create a file called ~/openstack/change_credentials/swift_data_metadata.yml with the contents included below. The consuming-cp and cp entries are the name of the control plane specified in ~/openstack/my_cloud/definition/data/control_plane.yml where the swift-container service is running.

    swift_intracluster_sync_key:
     metadata:
     - clusters:
       - swpac
       component: swift-container
       consuming-cp: control-plane-1
       cp: control-plane-1
     version: '2.0'
  2. Run the following commands

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  3. Reconfigure the swift credentials

    ardana > cd ~/scratch/ansible/next/ardana/ansible/
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure-credentials-change.yml
  4. Delete ~/openstack/change_credentials/swift_data_metadata.yml

    ardana > rm ~/openstack/change_credentials/swift_data_metadata.yml
  5. On a swift PAC server check that the intracluster realm key has been updated in /etc/swift/container-sync-realms.conf

    # The intracluster realm facilitates syncing containers on this system
    [intracluster]
    key = aNlDn3kWK
  6. Update any containers using the intracluster container sync to use the new intracluster realm key

    ardana > openstack container set -k 'aNlDn3kWK' container-src
    ardana > openstack container set -k 'aNlDn3kWK' container-dst

10 Managing Networking Edit source

Information about managing and configuring the Networking service.

10.1 SUSE OpenStack Cloud Firewall Edit source

Firewall as a Service (FWaaS) provides the ability to assign network-level, port security for all traffic entering an existing tenant network. More information on this service can be found in the public OpenStack documentation located at http://specs.openstack.org/openstack/neutron-specs/specs/api/firewall_as_a_service__fwaas_.html. The following documentation provides command-line interface example instructions for configuring and testing a SUSE OpenStack Cloud firewall. FWaaS can also be configured and managed by the horizon web interface.

With SUSE OpenStack Cloud, FWaaS is implemented directly in the L3 agent (neutron-l3-agent). However if VPNaaS is enabled, FWaaS is implemented in the VPNaaS agent (neutron-vpn-agent). Because FWaaS does not use a separate agent process or start a specific service, there currently are no monasca alarms for it.

If DVR is enabled, the firewall service currently does not filter traffic between OpenStack private networks, also known as east-west traffic; it only filters traffic from external networks, also known as north-south traffic.

Note
Note

The L3 agent must be restarted on each compute node hosting a DVR router when removing the FWaaS or adding a new FWaaS. This condition only applies when updating existing instances connected to DVR routers. For more information, see the upstream bug.

10.1.1 Overview of the SUSE OpenStack Cloud Firewall configuration Edit source

The following instructions provide information about how to identify and modify the overall SUSE OpenStack Cloud firewall that is configured in front of the control services. This firewall is administered only by a cloud admin and is not available for tenant use for private network firewall services.

During the installation process, the configuration processor will automatically generate "allow" firewall rules for each server based on the services deployed and block all other ports. These are populated in ~/openstack/my_cloud/info/firewall_info.yml, which includes a list of all the ports by network, including the addresses on which the ports will be opened. This is described in more detail in Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 5 “Input Model”, Section 5.2 “Concepts”, Section 5.2.10 “Networking”, Section 5.2.10.5 “Firewall Configuration”.

The firewall_rules.yml file in the input model allows you to define additional rules for each network group. You can read more about this in Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 6 “Configuration Objects”, Section 6.15 “Firewall Rules”.

The purpose of this document is to show you how to make post-installation changes to the firewall rules if the need arises.

Important
Important

This process is not to be confused with Firewall-as-a-Service, which is a separate service that enables the ability for SUSE OpenStack Cloud tenants to create north-south, network-level firewalls to provide stateful protection to all instances in a private, tenant network. This service is optional and is tenant-configured.

10.1.2 SUSE OpenStack Cloud 9 FWaaS Configuration Edit source

Check for an enabled firewall.

  1. Check whether the firewall extension is enabled. The output of openstack extension list should contain a firewall entry.

    openstack extension list
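
    For example, you can filter the output for the firewall entry:

    openstack extension list | grep -i firewall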
  2. Assuming the external network is already created by the admin, this command will show the external network.

    openstack network list

Create required assets.

Before creating firewalls, you need to create a network, subnet, and router, add security group rules, start an instance, and assign it a floating IP address.

  1. Create the network, subnet and router.

    openstack network create private
    openstack subnet create --network private --subnet-range 10.0.0.0/24 --gateway 10.0.0.1 sub
    openstack router create router
    openstack router add subnet router sub
    openstack router set --external-gateway ext-net router
  2. Create security group rules. Security group rules filter traffic at VM level.

    openstack security group rule create default --protocol icmp
    openstack security group rule create default --protocol tcp --dst-port 22
    openstack security group rule create default --protocol tcp --dst-port 80
  3. Boot a VM.

    NET=$(openstack network list | awk '/private/ {print $2}')
    openstack server create --flavor 1 --image <image> --nic net-id=$NET --wait vm1
  4. Verify if the instance is ACTIVE and is assigned an IP address.

    openstack server list
  5. Get the port id of the vm1 instance.

    fixedip=$(openstack server list | awk '/vm1/ {print $12}' | awk -F '=' '{print $2}' | awk -F ',' '{print $1}')
    vmportuuid=$(openstack port list | grep $fixedip | awk '{print $2}')
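
    Alternatively, assuming your client supports filtering ports by server, you can obtain the port ID directly:

    vmportuuid=$(openstack port list --server vm1 -f value -c ID)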
  6. Create and associate a floating IP address to the vm1 instance.

    openstack floating ip create --port $vmportuuid ext-net
  7. Verify if the floating IP is assigned to the instance. The following command should show an assigned floating IP address from the external network range.

    openstack server show vm1
  8. Verify that the instance is reachable from the external network. SSH into the instance from a node in (or with a route to) the external network.

    ssh cirros@FIP-VM1
    password: <password>

Create and attach the firewall.

Note
Note

By default, an internal "drop all" rule is applied in iptables if none of the defined rules match the data packets.

  1. Create new firewall rules using firewall-rule-create command and providing the protocol, action (allow, deny, reject) and name for the new rule.

    Firewall actions define how matching traffic is handled. An allow rule lets traffic pass through the firewall, deny silently drops the traffic, and reject drops the traffic and returns a destination-unreachable response. Using reject speeds up failure detection dramatically for legitimate users, since they do not have to wait for retransmission timeouts or submit retries. Use deny when you want to hinder port scanners and similar probes by hostile attackers, since dropping packets without any response makes reconnaissance more difficult. The default firewall action is deny.

    The example below demonstrates how to allow icmp and ssh while denying access to http. See the OpenStackClient command-line reference at https://docs.openstack.org/python-openstackclient/rocky/ on additional options such as source IP, destination IP, source port and destination port.

    Note
    Note

    You can create firewall rules with identical names; each instance will have a unique ID associated with it. However, for clarity, this is not recommended.

    neutron firewall-rule-create --protocol icmp --action allow --name allow-icmp
    neutron firewall-rule-create --protocol tcp --destination-port 80 --action deny --name deny-http
    neutron firewall-rule-create --protocol tcp --destination-port 22 --action allow --name allow-ssh
  2. Once the rules are created, create the firewall policy by using the firewall-policy-create command with the --firewall-rules option and rules to include in quotes, followed by the name of the new policy. The order of the rules is important.

    neutron firewall-policy-create --firewall-rules "allow-icmp deny-http allow-ssh" policy-fw
  3. Finish the firewall creation by using the firewall-create command, the policy name and the new name you want to give to your new firewall.

    neutron firewall-create policy-fw --name user-fw
  4. You can view the details of your new firewall by using the firewall-show command and the name of your firewall. This will verify that the status of the firewall is ACTIVE.

    neutron firewall-show user-fw

Verify the FWaaS is functional.

  1. Since the allow-icmp firewall rule is set, you can ping the floating IP address of the instance from the external network.

    ping <FIP-VM1>
  2. Similarly, you can connect via ssh to the instance due to the allow-ssh firewall rule.

    ssh cirros@<FIP-VM1>
    password: <password>
  3. Run a web server on the vm1 instance that listens on port 80, accepts requests, and sends a WELCOME response.

    $ vi webserv.sh
    
    #!/bin/bash
    
    MYIP=$(/sbin/ifconfig eth0|grep 'inet addr'|awk -F: '{print $2}'| awk '{print $1}');
    while true; do
      echo -e "HTTP/1.0 200 OK
    
    Welcome to $MYIP" | sudo nc -l -p 80
    done
    
    # Give it Exec rights
    $ chmod 755 webserv.sh
    
    # Execute the script
    $ ./webserv.sh
  4. You should expect to see curl fail over port 80 because of the deny-http firewall rule. If curl succeeds, the firewall is not blocking incoming http requests.

    curl -vvv <FIP-VM1>
Warning
Warning

When using the reference implementation, new networks, floating IPs, and routers created after the firewall was created are not automatically updated with firewall rules. In that case, run the firewall-update command, passing both the current and the new router IDs, so that the rules are reconfigured across all of the routers (both current and new).

For example, if router-1 was created before the firewall and router-2 was created after it:

$ neutron firewall-update --router <router-1-id> --router <router-2-id> <firewall-name>

10.1.3 Making Changes to the Firewall Rules Edit source

  1. Log in to your Cloud Lifecycle Manager.

  2. Edit your ~/openstack/my_cloud/definition/data/firewall_rules.yml file and add the lines necessary to allow the port(s) needed through the firewall.

    In this example we are going to open up port range 5900-5905 to allow VNC traffic through the firewall:

      - name: VNC
        network-groups:
        - MANAGEMENT
        rules:
        - type: allow
          remote-ip-prefix: 0.0.0.0/0
          port-range-min: 5900
          port-range-max: 5905
          protocol: tcp
    Note
    Note

    The example above shows a remote-ip-prefix of 0.0.0.0/0, which opens the ports to all IP ranges. To be more secure, you can instead specify the CIDR of the local IP address range from which you will be making VNC connections.

  3. Commit those changes to your local git:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "firewall rule update"
  4. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  5. Create the deployment directory structure:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  6. Change to the deployment directory and run the osconfig-iptables-deploy.yml playbook to update your iptable rules to allow VNC:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts osconfig-iptables-deploy.yml

You can repeat these steps as needed to add, remove, or edit any of these firewall rules.

10.1.4 More Information Edit source

Firewalls are based on iptables settings.

Each firewall that is created is known as an instance.

A firewall instance can be deployed on selected project routers. If no specific project router is selected, a firewall instance is automatically applied to all project routers.

Only 1 firewall instance can be applied to a project router.

Only 1 firewall policy can be applied to a firewall instance.

Multiple firewall rules can be added and applied to a firewall policy.

Firewall rules can be shared across different projects via the Share API flag.

Firewall rules supersede the Security Group rules that are applied at the Instance level for all traffic entering or leaving a private, project network.

For more information on the command-line interface (CLI) and firewalls, see the OpenStack networking command-line client reference: https://docs.openstack.org/python-openstackclient/rocky/

10.2 Using VPN as a Service (VPNaaS) Edit source

SUSE OpenStack Cloud 9 VPNaaS Configuration

This document describes the configuration process and requirements for the SUSE OpenStack Cloud 9 Virtual Private Network (VPN) as a Service (VPNaaS) module.

10.2.1 Prerequisites Edit source

  1. SUSE OpenStack Cloud must be installed.

  2. Before setting up VPNaaS, you will need to have created an external network and a subnet with access to the internet. Information on how to create the external network and subnet can be found in Section 10.2.4, “More Information”.

  3. This document assumes 172.16.0.0/16 as the ext-net CIDR.

10.2.2 Considerations Edit source

Using the neutron plugin-based VPNaaS causes additional processes to be run on the Network Service Nodes. One of these processes, the ipsec charon process from StrongSwan, runs as root and listens on an external network. A vulnerability in that process could lead to remote root compromise of the Network Service Nodes. If this is a concern, consider using a VPN solution other than the neutron plugin-based VPNaaS and/or deploying additional protection mechanisms.

10.2.3 Configuration Edit source

Setting up networks. You can set up VPN as a Service (VPNaaS) by first creating networks, subnets, and routers using the command line. The VPNaaS module makes it possible to extend access between private networks across two different SUSE OpenStack Cloud clouds or between a SUSE OpenStack Cloud cloud and a non-cloud network. VPNaaS is based on the open source software application StrongSwan (more information is available at http://www.strongswan.org/), an IPsec implementation that provides basic VPN gateway functionality.

Note
Note

You can execute the included commands from any shell with access to the service APIs. In the included examples, the commands are executed from the lifecycle manager, however you could execute the commands from the controller node or any other shell with aforementioned service API access.

Note
Note

The use of floating IPs is not possible with the current version of VPNaaS when DVR is enabled. Ensure that no floating IP is associated with instances that will be using VPNaaS behind a DVR router. Floating IPs associated with instances are fine when using a CVR router.

  1. From the Cloud Lifecycle Manager, create first private network, subnet and router assuming that ext-net is created by admin.

    openstack network create privateA
    openstack subnet create --network privateA --subnet-range 10.1.0.0/24 --gateway 10.1.0.1 subA
    openstack router create router1
    openstack router add subnet router1 subA
    openstack router set --external-gateway ext-net router1
  2. Create second private network, subnet and router.

    openstack network create privateB
    openstack subnet create --network privateB --subnet-range 10.2.0.0/24 --gateway 10.2.0.1 subB
    openstack router create router2
    openstack router add subnet router2 subB
    openstack router set --external-gateway ext-net router2
Procedure 10.1: Starting Virtual Machines
  1. From the Cloud Lifecycle Manager run the following to start the virtual machines. Begin with adding secgroup rules for SSH and ICMP.

    openstack security group rule create default --protocol icmp
    openstack security group rule create default --protocol tcp --dst-port 22
  2. Start the virtual machine in the privateA subnet. Use the image ID (for example, from openstack image list) rather than the image name when booting. After executing this step, it is recommended that you wait approximately 10 seconds to allow the virtual machine to become active.

    NETA=$(openstack network list | awk '/privateA/ {print $2}')
    openstack server create --flavor 1 --image <id> --nic net-id=$NETA vm1
  3. Start the virtual machine in the privateB subnet.

    NETB=$(openstack network list | awk '/privateB/ {print $2}')
    openstack server create --flavor 1 --image <id> --nic net-id=$NETB vm2
  4. Verify that private IPs are allocated to the respective VMs. Take note of the IPs for later use.

    openstack server show vm1
    openstack server show vm2
Procedure 10.2: Create VPN
  1. You can set up the VPN by executing the commands below from the lifecycle manager or any shell with access to the service APIs. Begin by creating the policies with vpn-ikepolicy-create and vpn-ipsecpolicy-create.

    neutron vpn-ikepolicy-create ikepolicy
    neutron vpn-ipsecpolicy-create ipsecpolicy
  2. Create the VPN service at router1.

    neutron vpn-service-create --name myvpnA --description "My vpn service" router1 subA
  3. Wait at least 5 seconds and then run ipsec-site-connection-create to create an ipsec-site connection. Note that --peer-address is the assigned ext-net IP of router2 and --peer-cidr is the subB CIDR.

    neutron ipsec-site-connection-create --name vpnconnection1 --vpnservice-id myvpnA \
    --ikepolicy-id ikepolicy --ipsecpolicy-id ipsecpolicy --peer-address 172.16.0.3 \
    --peer-id 172.16.0.3 --peer-cidr 10.2.0.0/24 --psk secret
  4. Create the VPN service at router2.

    neutron vpn-service-create --name myvpnB --description "My vpn serviceB" router2 subB
  5. Wait at least 5 seconds and then run ipsec-site-connection-create to create an ipsec-site connection. Note that --peer-address is the assigned ext-net IP of router1 and --peer-cidr is the subA CIDR.

    neutron ipsec-site-connection-create --name vpnconnection2 --vpnservice-id myvpnB \
    --ikepolicy-id ikepolicy --ipsecpolicy-id ipsecpolicy --peer-address 172.16.0.2 \
    --peer-id 172.16.0.2 --peer-cidr 10.1.0.0/24 --psk secret
  6. On the Cloud Lifecycle Manager, run the ipsec-site-connection-list command to see the active connections. Be sure to check that the vpn-services are ACTIVE. You can check this by running vpn-service-list and then checking the status of the ipsec-site-connections. Expect that both the vpn-services and the ipsec-site-connections may take as long as 1 to 3 minutes to become ACTIVE.

    neutron ipsec-site-connection-list
    +--------------------------------------+----------------+--------------+---------------+------------+-----------+--------+
    | id                                   | name           | peer_address | peer_cidrs    | route_mode | auth_mode | status |
    +--------------------------------------+----------------+--------------+---------------+------------+-----------+--------+
    | 1e8763e3-fc6a-444c-a00e-426a4e5b737c | vpnconnection2 | 172.16.0.2   | "10.1.0.0/24" | static     | psk       | ACTIVE |
    | 4a97118e-6d1d-4d8c-b449-b63b41e1eb23 | vpnconnection1 | 172.16.0.3   | "10.2.0.0/24" | static     | psk       | ACTIVE |
    +--------------------------------------+----------------+--------------+---------------+------------+-----------+--------+

Verify the VPN. As a non-admin user, you can verify the VPN connection by pinging the virtual machines.

  1. Check the VPN connections.

    Note
    Note

    vm1-ip and vm2-ip denote the private IPs of vm1 and vm2 respectively, obtained as described in Step 4 above. If you are unable to SSH to the private network due to a lack of direct access, the VM console can be accessed through horizon.

    ssh cirros@vm1-ip
    password: <password>
    
    # ping the private IP address of vm2
    ping ###.###.###.###
  2. In another terminal.

    ssh cirros@vm2-ip
    password: <password>
    
    # ping the private IP address of vm1
    ping ###.###.###.###
  3. You should see ping responses from both virtual machines.

As the admin user, you should check that a route exists between the router gateways. Once the gateways have been checked, packet encryption can be verified by using a traffic analyzer (tcpdump), tapping the respective namespace (qrouter-* in the case of non-DVR and snat-* in the case of DVR) and tapping the right interface (qg-***).

Note
Note

When using DVR namespaces, all the occurrences of qrouter-xxxxxx in the following commands should be replaced with respective snat-xxxxxx.

  1. Check if the route exists between the two router gateways. You can get the right qrouter namespace ID by executing sudo ip netns. Once you have the qrouter namespace ID, you can find the interface by executing sudo ip netns exec qrouter-xxxxxxxx ip addr.

    sudo ip netns
    sudo ip netns exec qrouter-<router1 UUID> ping <router2 gateway>
    sudo ip netns exec qrouter-<router2 UUID> ping <router1 gateway>
  2. Initiate a tcpdump on the interface.

    sudo ip netns exec qrouter-xxxxxxxx tcpdump -i qg-xxxxxx
  3. Check the VPN connection.

    ssh cirros@vm1-ip
    password: <password>
    
    # ping the private IP address of vm2
    ping ###.###.###.###
  4. Repeat for other namespace and right tap interface.

    sudo ip netns exec qrouter-xxxxxxxx tcpdump -i qg-xxxxxx
  5. In another terminal.

    ssh cirros@vm2-ip
    password: <password>
    
    # ping the private IP address of vm1
    ping ###.###.###.###
  6. You will find encrypted packets containing ‘ESP’ in the tcpdump trace.

10.2.4 More Information Edit source

VPNaaS currently only supports Pre-shared Keys (PSK) security between VPN gateways. A different VPN gateway solution should be considered if stronger, certificate-based security is required.

For more information on the neutron command-line interface (CLI) and VPN as a Service (VPNaaS), see the OpenStack networking command-line client reference: https://docs.openstack.org/python-openstackclient/rocky/

For information on how to create an external network and subnet, see the OpenStack manual: http://docs.openstack.org/user-guide/dashboard_create_networks.html

10.3 DNS Service Overview Edit source

SUSE OpenStack Cloud DNS service provides multi-tenant Domain Name Service with REST API management for domains and records.

Warning
Warning

The DNS Service is not intended to be used as an internal or private DNS service. The name records in DNSaaS should be treated as public information that anyone could query. There are controls to prevent tenants from creating records for domains they do not own. TSIG provides a transaction signature to ensure integrity during zone transfers to other DNS servers.

10.3.1 For More Information Edit source

10.3.2 designate Initial Configuration Edit source

After the SUSE OpenStack Cloud installation has been completed, designate requires initial configuration to operate.

10.3.2.1 Identifying Name Server Public IPs Edit source

Depending on the back-end, the method used to identify the name servers' public IPs will differ.

10.3.2.1.1 InfoBlox Edit source

If InfoBlox acts as your public name server, consult the InfoBlox management UI to identify the name server IPs.

10.3.2.1.2 BIND Back-end Edit source

You can find the name server IPs in /etc/hosts by looking for the ext-api addresses, which are the addresses of the controllers. For example:

192.168.10.1 example-cp1-c1-m1-extapi
192.168.10.2 example-cp1-c1-m2-extapi
192.168.10.3 example-cp1-c1-m3-extapi
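
For example, assuming the host naming convention shown above, you can list them with:

ardana > grep extapi /etc/hosts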
10.3.2.1.3 Creating Name Server A Records Edit source

Each name server requires a public name, for example ns1.example.com., to which designate-managed domains will be delegated. There are two common locations where these may be registered: either within a zone hosted on designate itself, or within a zone hosted on an external DNS service.

If you are using an externally managed zone for these names:

  • For each name server public IP, create the necessary A records in the external system.

If you are using a designate-managed zone for these names:

  1. Create the zone in designate which will contain the records:

    ardana > openstack zone create --email hostmaster@example.com example.com.
    +----------------+--------------------------------------+
    | Field          | Value                                |
    +----------------+--------------------------------------+
    | action         | CREATE                               |
    | created_at     | 2016-03-09T13:16:41.000000           |
    | description    | None                                 |
    | email          | hostmaster@example.com               |
    | id             | 23501581-7e34-4b88-94f4-ad8cec1f4387 |
    | masters        |                                      |
    | name           | example.com.                         |
    | pool_id        | 794ccc2c-d751-44fe-b57f-8894c9f5c842 |
    | project_id     | a194d740818942a8bea6f3674e0a3d71     |
    | serial         | 1457529400                           |
    | status         | PENDING                              |
    | transferred_at | None                                 |
    | ttl            | 3600                                 |
    | type           | PRIMARY                              |
    | updated_at     | None                                 |
    | version        | 1                                    |
    +----------------+--------------------------------------+
  2. For each name server public IP, create an A record. For example:

    ardana > openstack recordset create --records 192.168.10.1 --type A example.com. ns1.example.com.
    +-------------+--------------------------------------+
    | Field       | Value                                |
    +-------------+--------------------------------------+
    | action      | CREATE                               |
    | created_at  | 2016-03-09T13:18:36.000000           |
    | description | None                                 |
    | id          | 09e962ed-6915-441a-a5a1-e8d93c3239b6 |
    | name        | ns1.example.com.                     |
    | records     | 192.168.10.1                         |
    | status      | PENDING                              |
    | ttl         | None                                 |
    | type        | A                                    |
    | updated_at  | None                                 |
    | version     | 1                                    |
    | zone_id     | 23501581-7e34-4b88-94f4-ad8cec1f4387 |
    +-------------+--------------------------------------+
  3. When records have been added, list the record sets in the zone to validate:

    ardana > openstack recordset list example.com.
    +--------------+------------------+------+---------------------------------------------------+
    | id           | name             | type | records                                           |
    +--------------+------------------+------+---------------------------------------------------+
    | 2d6cf...655b | example.com.     | SOA  | ns1.example.com. hostmaster.example.com 145...600 |
    | 33466...bd9c | example.com.     | NS   | ns1.example.com.                                  |
    | da98c...bc2f | example.com.     | NS   | ns2.example.com.                                  |
    | 672ee...74dd | example.com.     | NS   | ns3.example.com.                                  |
    | 09e96...39b6 | ns1.example.com. | A    | 192.168.10.1                                      |
    | bca4f...a752 | ns2.example.com. | A    | 192.168.10.2                                      |
    | 0f123...2117 | ns3.example.com. | A    | 192.168.10.3                                      |
    +--------------+------------------+------+---------------------------------------------------+
  4. Contact your domain registrar to request that glue records be registered in the com. zone for the name server and public IP address pairs above. If you are using a sub-zone of an existing company zone (for example, ns1.cloud.mycompany.com.), the glue records must be placed in the mycompany.com. zone.

10.3.2.1.4 For More Information Edit source

For additional DNS integration and configuration information, see the OpenStack designate documentation at https://docs.openstack.org/designate/rocky/.

For more information on creating servers, domains and examples, see the OpenStack REST API documentation at https://developer.openstack.org/api-ref/dns/.

10.3.3 DNS Service Monitoring Support Edit source

10.3.3.1 DNS Service Monitoring Support Edit source

Additional monitoring support for the DNS Service (designate) has been added to SUSE OpenStack Cloud.

In the Networking section of the Operations Console, you can see alarms for all of the DNS Service (designate) components, such as designate-zone-manager, designate-api, designate-pool-manager, designate-mdns, and designate-central, after running the designate-stop.yml playbook.

You can run the designate-start.yml playbook to start the DNS Services back up; the alarms will then change from a red status to green and be removed from the New Alarms panel of the Operations Console.
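
For example, assuming the standard deployment directory used elsewhere in this guide, the playbooks are run as follows:

ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts designate-stop.yml
ardana > ansible-playbook -i hosts/verb_hosts designate-start.yml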

An example of the generated alarms from the Operations Console is provided below after running designate-stop.yml:

ALARM:  STATE:  ALARM ID:  LAST CHECK:  DIMENSION:
Process Check
0f221056-1b0e-4507-9a28-2e42561fac3e 2016-10-03T10:06:32.106Z hostname=ardana-cp1-c1-m1-mgmt,
service=dns,
cluster=cluster1,
process_name=designate-zone-manager,
component=designate-zone-manager,
control_plane=control-plane-1,
cloud_name=entry-scale-kvm

Process Check
50dc4c7b-6fae-416c-9388-6194d2cfc837 2016-10-03T10:04:32.086Z hostname=ardana-cp1-c1-m1-mgmt,
service=dns,
cluster=cluster1,
process_name=designate-api,
component=designate-api,
control_plane=control-plane-1,
cloud_name=entry-scale-kvm

Process Check
55cf49cd-1189-4d07-aaf4-09ed08463044 2016-10-03T10:05:32.109Z hostname=ardana-cp1-c1-m1-mgmt,
service=dns,
cluster=cluster1,
process_name=designate-pool-manager,
component=designate-pool-manager,
control_plane=control-plane-1,
cloud_name=entry-scale-kvm

Process Check
c4ab7a2e-19d7-4eb2-a9e9-26d3b14465ea 2016-10-03T10:06:32.105Z hostname=ardana-cp1-c1-m1-mgmt,
service=dns,
cluster=cluster1,
process_name=designate-mdns,
component=designate-mdns,
control_plane=control-plane-1,
cloud_name=entry-scale-kvm
HTTP Status
c6349bbf-4fd1-461a-9932-434169b86ce5 2016-10-03T10:05:01.731Z service=dns,
cluster=cluster1,
url=http://100.60.90.3:9001/,
hostname=ardana-cp1-c1-m3-mgmt,
component=designate-api,
control_plane=control-plane-1,
api_endpoint=internal,
cloud_name=entry-scale-kvm,
monitored_host_type=instance

Process Check
ec2c32c8-3b91-4656-be70-27ff0c271c89 2016-10-03T10:04:32.082Z hostname=ardana-cp1-c1-m1-mgmt,
service=dns,
cluster=cluster1,
process_name=designate-central,
component=designate-central,
control_plane=control-plane-1,
cloud_name=entry-scale-kvm

10.4 Networking Service Overview Edit source

SUSE OpenStack Cloud Networking is a virtual Networking service that leverages the OpenStack neutron service to provide network connectivity and addressing to SUSE OpenStack Cloud Compute service devices.

The Networking service also provides an API to configure and manage a variety of network services.

You can use the Networking service to connect guest servers or you can define and configure your own virtual network topology.

10.4.1 Installing the Networking Service Edit source

SUSE OpenStack Cloud Network Administrators are responsible for planning the neutron Networking service and, once it is installed, for configuring it to meet the needs of their cloud network users.

10.4.2 Working with the Networking service Edit source

To perform tasks using the Networking service, you can use the dashboard, API or CLI.

10.4.3 Reconfiguring the Networking service Edit source

If you change any of the network configuration after installation, it is recommended that you reconfigure the Networking service by running the neutron-reconfigure playbook.

On the Cloud Lifecycle Manager:

ardana > cd ~/openstack/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts neutron-reconfigure.yml

10.4.4 For more information Edit source

For information on how to operate your cloud we suggest you read the OpenStack Operations Guide. The Architecture section contains useful information about how an OpenStack Cloud is put together. However, SUSE OpenStack Cloud takes care of these details for you. The Operations section contains information on how to manage the system.

10.4.5 Neutron External Networks Edit source

10.4.5.1 External networks overview Edit source

This topic explains how to create a neutron external network.

External networks provide access to the internet.

The typical use is to provide an IP address that can be used to reach a VM from an external network, which can be a public network such as the internet or a network that is private to an organization.

10.4.5.2 Using the Ansible Playbook Edit source

This playbook queries the Networking service for an existing external network and creates one if you do not already have one. The resulting external network has the name ext-net, with a subnet matching the CIDR you specify in the command below.

If you need more granularity, for example to specify an allocation pool for the subnet, use the procedure in Section 10.4.5.3, “Using the python-neutronclient CLI”.

ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts neutron-cloud-configure.yml -e EXT_NET_CIDR=<CIDR>

The table below shows the optional switch that you can use as part of this playbook to specify environment-specific information:

Switch: -e EXT_NET_CIDR=<CIDR>

Description: Optional. You can use this switch to specify the external network CIDR. If you choose not to use this switch, or use an incorrect value, the VMs will not be accessible over the network. This CIDR will be from the EXTERNAL VM network.
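
For example, with a hypothetical external CIDR of 172.16.0.0/24 (substitute the CIDR of your own EXTERNAL VM network):

ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts neutron-cloud-configure.yml -e EXT_NET_CIDR=172.16.0.0/24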

10.4.5.3 Using the python-neutronclient CLI Edit source

For more granularity, you can use the OpenStackClient tool to create your external network.

  1. Log in to the Cloud Lifecycle Manager.

  2. Source the Admin creds:

    ardana > source ~/service.osrc
  3. Create the external network and then the subnet using the commands below.

    Creating the network:

    ardana > openstack network create --external EXTERNAL-NETWORK-NAME

    Creating the subnet:

    ardana > openstack subnet create --network EXTERNAL-NETWORK-NAME --subnet-range CIDR --gateway GATEWAY --allocation-pool start=IP_START,end=IP_END [--no-dhcp] SUBNET-NAME

    Where:

    EXTERNAL-NETWORK-NAME: The name given to your external network. This is a unique value that you will choose. The value ext-net is usually used.

    SUBNET-NAME: The name given to the subnet of the external network, for example ext-subnet.

    CIDR: The external network CIDR, passed with the --subnet-range option. If you do not supply it, or use an incorrect value, the VMs will not be accessible over the network. This CIDR will be from the EXTERNAL VM network.

    --gateway: Optional switch to specify the gateway IP for your subnet. If this is not included, the first available IP is chosen.

    --allocation-pool start=IP_START,end=IP_END: Optional switch to specify start and end IP addresses to use as the allocation pool for this subnet.

    --no-dhcp: Optional switch if you want to disable DHCP on this subnet. If this is not specified, DHCP will be enabled.
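
    As a worked example, the following commands create an external network named ext-net and a hypothetical subnet named ext-subnet on 172.16.0.0/24 (all values are examples; substitute your own):

    ardana > openstack network create --external ext-net
    ardana > openstack subnet create --network ext-net --subnet-range 172.16.0.0/24 --gateway 172.16.0.1 --allocation-pool start=172.16.0.10,end=172.16.0.250 --no-dhcp ext-subnet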

10.4.5.4 Multiple External Networks Edit source

SUSE OpenStack Cloud supports multiple external networks by using Network Service (neutron) provider networks as external networks. You can configure SUSE OpenStack Cloud to allow the use of provider VLANs as external networks by following these steps.

  1. Do NOT include the neutron.l3_agent.external_network_bridge tag in the network_groups definition for your cloud. This results in the external_network_bridge setting in l3_agent.ini being set to an empty value (rather than the traditional br-ex).

  2. Configure your cloud to use provider VLANs, by specifying the provider_physical_network tag on one of the network_groups defined for your cloud.

    For example, to run provider VLANs over the EXAMPLE network group (some attributes omitted for brevity):

    network-groups:
    
      - name: EXAMPLE
        tags:
          - neutron.networks.vlan:
              provider-physical-network: physnet1
  3. After the cloud has been deployed, you can create external networks using provider VLANs.

    For example, using the OpenStackClient:

    1. Create external network 1 on vlan101

      ardana > openstack network create --provider-network-type vlan
      --provider-physical-network physnet1 --provider-segment 101 --external ext-net1
    2. Create external network 2 on vlan102

      ardana > openstack network create --provider-network-type vlan
      --provider-physical-network physnet1 --provider-segment 102 --external ext-net2
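
    To verify that both networks were created and are flagged as external, you can run (a quick check; it assumes admin credentials are sourced):

    ardana > openstack network list --external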

10.4.6 Neutron Provider Networks Edit source

This topic explains how to create a neutron provider network.

A provider network is a virtual network created in the SUSE OpenStack Cloud environment that is consumed by SUSE OpenStack Cloud services. The distinctive element of a provider network is that it does not create a virtual router; rather, it depends on L3 routing that is provided by the infrastructure.

A provider network is created by adding the specification to the SUSE OpenStack Cloud input model. It consists of at least one network and one or more subnets.

10.4.6.1 SUSE OpenStack Cloud input model Edit source

The input model is the primary mechanism a cloud admin uses to define a SUSE OpenStack Cloud installation. It exists as a directory with a data subdirectory that contains YAML files. By convention, any service that creates a neutron provider network creates a subdirectory under the data directory, and the name of that subdirectory is the project name. For example, the Octavia project uses neutron provider networks, so it has a subdirectory named 'octavia', and the config file that specifies the neutron network exists in that subdirectory.

├── cloudConfig.yml
├── data
│   ├── control_plane.yml
│   ├── disks_compute.yml
│   ├── disks_controller_1TB.yml
│   ├── disks_controller.yml
│   ├── firewall_rules.yml
│   ├── net_interfaces.yml
│   ├── network_groups.yml
│   ├── networks.yml
│   ├── neutron
│   │   └── neutron_config.yml
│   ├── nic_mappings.yml
│   ├── server_groups.yml
│   ├── server_roles.yml
│   ├── servers.yml
│   ├── swift
│   │   └── swift_config.yml
│   └── octavia
│       └── octavia_config.yml
├── README.html
└── README.md

10.4.6.2 Network/Subnet specification Edit source

The elements required in the input model for you to define a network are:

  • name

  • network_type

  • physical_network

Elements that are optional when defining a network are:

  • segmentation_id

  • shared

Required elements for the subnet definition are:

  • cidr

Optional elements for the subnet definition are:

  • allocation_pools, which requires start and end addresses

  • host_routes, which requires a destination and a nexthop

  • gateway_ip

  • no_gateway

  • enable-dhcp

NOTE: Only IPv4 is supported at the present time.

10.4.6.3 Network details Edit source

The following table outlines the network values to be set, and what they represent.

Attribute        | Required/Optional | Allowed Values       | Usage
name             | Required          |                      |
network_type     | Required          | flat, vlan, vxlan    | The type of desired network
physical_network | Required          | Valid                | Name of physical network that is overlaid with the virtual network
segmentation_id  | Optional          | vlan or vxlan ranges | VLAN ID for vlan or tunnel ID for vxlan
shared           | Optional          | True                 | Shared by all projects or private to a single project

10.4.6.4 Subnet details Edit source

The following table outlines the subnet values to be set, and what they represent.

Attribute        | Req/Opt  | Allowed Values                   | Usage
cidr             | Required | Valid CIDR range                 | For example, 172.30.0.0/24
allocation_pools | Optional | See ALLOCATION_POOLS table below |
host_routes      | Optional | See HOST_ROUTES table below      |
gateway_ip       | Optional | Valid IP address                 | Subnet gateway to other networks
no_gateway       | Optional | True                             | No distribution of gateway
enable-dhcp      | Optional | True                             | Enable DHCP for this subnet

10.4.6.5 ALLOCATION_POOLS details Edit source

The following table explains allocation pool settings.

Attribute | Req/Opt  | Allowed Values   | Usage
start     | Required | Valid IP address | First IP address in the pool
end       | Required | Valid IP address | Last IP address in the pool

10.4.6.6 HOST_ROUTES details Edit source

The following table explains host route settings.

Attribute   | Req/Opt  | Allowed Values   | Usage
destination | Required | Valid CIDR       | Destination subnet
nexthop     | Required | Valid IP address | Hop to take to the destination subnet
Note

Multiple destination/nexthop values can be used.

10.4.6.7 Examples Edit source

The following examples show the configuration file settings for neutron and Octavia.

Octavia configuration

This file defines the mapping. It does not need to be edited unless you want to change the name of your VLAN.

Path: ~/openstack/my_cloud/definition/data/octavia/octavia_config.yml

---
  product:
    version: 2

  configuration-data:
    - name: OCTAVIA-CONFIG-CP1
      services:
        - octavia
      data:
        amp_network_name: OCTAVIA-MGMT-NET

neutron configuration

Input your network configuration information for your provider VLANs in neutron_config.yml found here:

~/openstack/my_cloud/definition/data/neutron/.

---
  product:
    version: 2

  configuration-data:
    - name:  NEUTRON-CONFIG-CP1
      services:
        - neutron
      data:
        neutron_provider_networks:
        - name: OCTAVIA-MGMT-NET
          provider:
            - network_type: vlan
              physical_network: physnet1
              segmentation_id: 2754
          cidr: 10.13.189.0/24
          no_gateway:  True
          enable_dhcp: True
          allocation_pools:
            - start: 10.13.189.4
              end: 10.13.189.252
          host_routes:
            # route to MANAGEMENT-NET
            - destination: 10.13.111.128/26
              nexthop:  10.13.189.5

10.4.6.8 Implementing your changes Edit source

  1. Commit the changes to git:

    ardana > git add -A
    ardana > git commit -a -m "configuring provider network"
  2. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  3. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  4. Then continue with your clean cloud installation.

  5. If you are only adding a neutron Provider network to an existing model, then run the neutron-deploy.yml playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts neutron-deploy.yml

10.4.6.9 Multiple Provider Networks Edit source

The physical network infrastructure must be configured to convey the provider VLAN traffic as tagged VLANs to the cloud compute nodes and network service network nodes. Configuration of the physical network infrastructure is outside the scope of the SUSE OpenStack Cloud 9 software.

SUSE OpenStack Cloud 9 automates the server networking configuration and the Network Service configuration based on information in the cloud definition. To configure the system for provider VLANs, specify the neutron.networks.vlan tag with a provider-physical-network attribute on one or more network groups. For example (some attributes omitted for brevity):

network-groups:

  - name: NET_GROUP_A
    tags:
      - neutron.networks.vlan:
          provider-physical-network: physnet1

  - name: NET_GROUP_B
    tags:
      - neutron.networks.vlan:
          provider-physical-network: physnet2

A network group is associated with a server network interface via an interface model. For example (some attributes omitted for brevity):

interface-models:
  - name: INTERFACE_SET_X
    network-interfaces:
      - device:
          name: bond0
        network-groups:
          - NET_GROUP_A
      - device:
          name: eth3
        network-groups:
          - NET_GROUP_B

A network group used for provider VLANs may contain only a single SUSE OpenStack Cloud network, because that VLAN must span all compute nodes and any Network Service network nodes/controllers (that is, it is a single L2 segment). The SUSE OpenStack Cloud network must be defined with tagged-vlan false, otherwise a Linux VLAN network interface will be created. For example:

networks:

  - name: NET_A
    tagged-vlan: false
    network-group: NET_GROUP_A

  - name: NET_B
    tagged-vlan: false
    network-group: NET_GROUP_B

When the cloud is deployed, SUSE OpenStack Cloud 9 will create the appropriate bridges on the servers, and set the appropriate attributes in the neutron configuration files (for example, bridge_mappings).

After the cloud has been deployed, create Network Service network objects for each provider VLAN. For example, using the OpenStackClient:

ardana > openstack network create --provider-network-type vlan --provider-physical-network physnet1 --provider-segment 101 mynet101
ardana > openstack network create --provider-network-type vlan --provider-physical-network physnet2 --provider-segment 234 mynet234
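
To confirm that the provider attributes were applied as intended, you can inspect one of the networks (the -c options simply limit the output to the provider-related columns):

ardana > openstack network show mynet101 -c "provider:network_type" -c "provider:physical_network" -c "provider:segmentation_id"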

10.4.6.10 More Information Edit source

For more information on the Network Service command-line interface (CLI), see the OpenStack networking command-line client reference: http://docs.openstack.org/cli-reference/content/neutronclient_commands.html

10.4.7 Using IPAM Drivers in the Networking Service Edit source

This topic describes how to choose and implement an IPAM driver.

10.4.7.1 Selecting and implementing an IPAM driver Edit source

Beginning with the Liberty release, OpenStack networking includes a pluggable interface for the IP Address Management (IPAM) function. This interface creates a driver framework for the allocation and de-allocation of subnets and IP addresses, enabling the integration of alternate IPAM implementations or third-party IP Address Management systems.

There are three possible IPAM driver options:

  • Non-pluggable driver. This option is the default when the ipam_driver parameter is not specified in neutron.conf.

  • Pluggable reference IPAM driver. The pluggable IPAM driver interface was introduced in SUSE OpenStack Cloud 9 (OpenStack Liberty). It is a refactoring of the Kilo non-pluggable driver to use the new pluggable interface. The setting in neutron.conf to specify this driver is ipam_driver = internal.

  • Pluggable Infoblox IPAM driver. The pluggable Infoblox IPAM driver is a third-party implementation of the pluggable IPAM interface. The corresponding setting in neutron.conf to specify this driver is ipam_driver = networking_infoblox.ipam.driver.InfobloxPool.

    Note

    You can use either the non-pluggable IPAM driver or a pluggable one. However, you cannot use both.

10.4.7.2 Using the Pluggable reference IPAM driver Edit source

To indicate that you want to use the Pluggable reference IPAM driver, the only parameter needed is ipam_driver. You can set it by locating the commented line in the neutron.conf.j2 template (ipam_driver = internal), uncommenting it, and committing the file. After following the standard steps to deploy neutron, neutron will be configured to run using the Pluggable reference IPAM driver.

As stated, the file you must edit is neutron.conf.j2 on the Cloud Lifecycle Manager in the directory ~/openstack/my_cloud/config/neutron. Here is the relevant section where you can see the ipam_driver parameter commented out:

[DEFAULT]
  ...
  l3_ha_net_cidr = 169.254.192.0/18

  # Uncomment the line below if the Reference Pluggable IPAM driver is to be used
  # ipam_driver = internal
  ...
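
If you prefer to make and verify the edit from the command line, here is a minimal sketch (it assumes the template path shown above and simply removes the comment marker from that line):

ardana > sed -i 's|# ipam_driver = internal|ipam_driver = internal|' ~/openstack/my_cloud/config/neutron/neutron.conf.j2
ardana > grep ipam_driver ~/openstack/my_cloud/config/neutron/neutron.conf.j2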

After uncommenting the line ipam_driver = internal, commit the file using git commit from the openstack/my_cloud directory:

ardana > git commit -a -m 'My config for enabling the internal IPAM Driver'

Then follow the steps to deploy SUSE OpenStack Cloud in the Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 13 “Overview” appropriate to your cloud configuration.

Note

Currently there is no migration path from the non-pluggable driver to a pluggable IPAM driver because changes are needed to database tables and neutron currently cannot make those changes.

10.4.7.3 Using the Infoblox IPAM driver Edit source

As suggested above, using the Infoblox IPAM driver requires changes to existing parameters in nova.conf and neutron.conf. If you want to use the Infoblox appliance, you need to add the infoblox-ipam-agent service component to the service role containing the neutron API server. To use the Infoblox appliance for IPAM, both the agent and the Infoblox IPAM driver are required. The infoblox-ipam-agent should be deployed on the same node where the neutron-server component is running; usually this is a controller node.

  1. Have the Infoblox appliance running on the management network (the Infoblox appliance admin or the datacenter administrator should know how to perform this step).

  2. Change the control plane definition to add infoblox-ipam-agent as a service component in the controller node cluster (the added line is shown in the example below). Make the changes in control_plane.yml, found here: ~/openstack/my_cloud/definition/data/control_plane.yml

    ---
      product:
        version: 2
    
      control-planes:
        - name: ccp
          control-plane-prefix: ccp
     ...
          clusters:
            - name: cluster0
              cluster-prefix: c0
              server-role: ARDANA-ROLE
              member-count: 1
              allocation-policy: strict
              service-components:
                - lifecycle-manager
            - name: cluster1
              cluster-prefix: c1
              server-role: CONTROLLER-ROLE
              member-count: 3
              allocation-policy: strict
              service-components:
                - ntp-server
    ...
                - neutron-server
                - infoblox-ipam-agent
    ...
                - designate-client
                - bind
          resources:
            - name: compute
              resource-prefix: comp
              server-role: COMPUTE-ROLE
              allocation-policy: any
  3. Modify the ~/openstack/my_cloud/config/neutron/neutron.conf.j2 file on the Cloud Lifecycle Manager to comment and uncomment the lines noted below to enable use with the Infoblox appliance:

    [DEFAULT]
                ...
                l3_ha_net_cidr = 169.254.192.0/18
    
    
                # Uncomment the line below if the Reference Pluggable IPAM driver is to be used
                # ipam_driver = internal
    
    
                # Comment out the line below if the Infoblox IPAM Driver is to be used
                # notification_driver = messaging
    
                # Uncomment the lines below if the Infoblox IPAM driver is to be used
                ipam_driver = networking_infoblox.ipam.driver.InfobloxPool
                notification_driver = messagingv2
    
    
                # Modify the infoblox sections below to suit your cloud environment
    
                [infoblox]
                cloud_data_center_id = 1
                # This name of this section is formed by "infoblox-dc:<infoblox.cloud_data_center_id>"
                # If cloud_data_center_id is 1, then the section name is "infoblox-dc:1"
    
                [infoblox-dc:0]
                http_request_timeout = 120
                http_pool_maxsize = 100
                http_pool_connections = 100
                ssl_verify = False
                wapi_version = 2.2
                admin_user_name = admin
                admin_password = infoblox
                grid_master_name = infoblox.localdomain
                grid_master_host = 1.2.3.4
    
    
                [QUOTAS]
                ...
  4. Change nova.conf.j2 to replace the notification driver "messaging" with "messagingv2":

     ...
    
     # Oslo messaging
     notification_driver = log
    
     #  Note:
     #  If the infoblox-ipam-agent is to be deployed in the cloud, change the
     #  notification_driver setting from "messaging" to "messagingv2".
     notification_driver = messagingv2
     notification_topics = notifications
    
     # Policy
     ...
  5. Commit the changes:

    ardana > cd ~/openstack/my_cloud
    ardana > git commit -a -m 'My config for enabling the Infoblox IPAM driver'
  6. Deploy the cloud with the changes. Due to changes to the control_plane.yml, you will need to rerun the config-processor-run.yml playbook if you have run it already during the install process.

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts site.yml

10.4.7.4 Configuration parameters for using the Infoblox IPAM driver Edit source

Changes required in the notification parameters in nova.conf:

Parameter name: notify_on_state_change
Section in nova.conf: DEFAULT
Default value: None
Current value: vm_and_task_state
Description: Send compute.instance.update notifications on instance state changes. vm_and_task_state means notify on VM and task state changes. Infoblox requires the value to be vm_state (notify on VM state change); thus NO CHANGE is needed for Infoblox.

Parameter name: notification_topics
Section in nova.conf: DEFAULT
Default value: empty list
Current value: notifications
Description: NO CHANGE is needed for Infoblox. The Infoblox installation guide requires the notification topics to be "notifications".

Parameter name: notification_driver
Section in nova.conf: DEFAULT
Default value: None
Current value: messaging
Description: Change needed. The Infoblox installation guide requires the notification driver to be "messagingv2".

Changes to existing parameters in neutron.conf

Parameter name: ipam_driver
Section in neutron.conf: DEFAULT
Default value: None
Current value: None (the parameter is undeclared in neutron.conf)
Description: Pluggable IPAM driver to be used by the neutron API server. For Infoblox, the value is "networking_infoblox.ipam.driver.InfobloxPool".

Parameter name: notification_driver
Section in neutron.conf: DEFAULT
Default value: empty list
Current value: messaging
Description: The driver used to send notifications from the neutron API server to the neutron agents. The installation guide for networking-infoblox calls for the notification_driver to be "messagingv2".

Parameter name: notification_topics
Section in neutron.conf: DEFAULT
Default value: None
Current value: notifications
Description: No change needed. This row is included to show the neutron parameters described in the installation guide for networking-infoblox.

Parameters specific to the Networking Infoblox Driver. All the parameters for the Infoblox IPAM driver must be defined in neutron.conf.

Parameter Name        | Section in neutron.conf            | Default Value | Description
cloud_data_center_id  | infoblox                           | 0             | ID for selecting a particular grid from one or more grids that serve networks in the Infoblox back end
ipam_agent_workers    | infoblox                           | 1             | Number of Infoblox IPAM agent workers to run
grid_master_host      | infoblox-dc:<cloud_data_center_id> | empty string  | IP address of the grid master. WAPI requests are sent to the grid_master_host
ssl_verify            | infoblox-dc:<cloud_data_center_id> | False         | Whether WAPI requests sent over HTTPS require SSL verification
wapi_version          | infoblox-dc:<cloud_data_center_id> | 1.4           | The WAPI version. The value should be 2.2
admin_user_name       | infoblox-dc:<cloud_data_center_id> | empty string  | Admin user name to access the grid master or cloud platform appliance
admin_password        | infoblox-dc:<cloud_data_center_id> | empty string  | Admin user password
http_pool_connections | infoblox-dc:<cloud_data_center_id> | 100           |
http_pool_maxsize     | infoblox-dc:<cloud_data_center_id> | 100           |
http_request_timeout  | infoblox-dc:<cloud_data_center_id> | 120           |

In this configuration, nova compute sends notifications to the infoblox-ipam-agent.

10.4.7.5 Limitations Edit source

  • There is no IPAM migration path from the non-pluggable to a pluggable IPAM driver (https://bugs.launchpad.net/neutron/+bug/1516156). This means there is no way to reconfigure the neutron database if you want to change neutron to use a pluggable IPAM driver. Unless you change the default non-pluggable IPAM configuration to a pluggable driver at install time, you will have no other opportunity to make that change, because reconfiguring SUSE OpenStack Cloud 9 from the default non-pluggable IPAM configuration to a pluggable IPAM driver is not supported.

  • Upgrading from previous versions of SUSE OpenStack Cloud to SUSE OpenStack Cloud 9 in order to use a pluggable IPAM driver is not supported.

  • The Infoblox appliance does not allow for overlapping IPs. For example, only one tenant can have a CIDR of 10.0.0.0/24.

  • The Infoblox IPAM driver fails to create a subnet when no gateway IP is supplied. For example, the command openstack subnet create ... --no-gateway ... will fail.

10.4.8 Configuring Load Balancing as a Service (LBaaS) Edit source

SUSE OpenStack Cloud 9 LBaaS Configuration

Load Balancing as a Service (LBaaS) is an advanced networking service that allows load balancing of multi-node environments. It provides the ability to spread requests across multiple servers, thereby reducing the load on any single server. This document describes the installation steps and the configuration for LBaaS v2.

Warning

The LBaaS architecture is based on a driver model to support different load balancers. LBaaS-compatible drivers are provided by load balancer vendors, including F5 and Citrix. A software load balancer driver called "Octavia" was introduced in the OpenStack Liberty release. The Octavia driver deploys a software load balancer called HAProxy. Octavia is the default load balancing provider in SUSE OpenStack Cloud 9 for LBaaS v2. Until Octavia is configured, the creation of load balancers will fail with an error. Refer to Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 43 “Configuring Load Balancer as a Service” for information on installing Octavia.

Warning

Before upgrading to SUSE OpenStack Cloud 9, contact F5 and SUSE to determine which F5 drivers have been certified for use with SUSE OpenStack Cloud. Loading drivers not certified by SUSE may result in failure of your cloud deployment.

With Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 43 “Configuring Load Balancer as a Service”, LBaaS v2 offers a software load balancing solution that supports both a highly available control plane and data plane. However, selecting an external hardware load balancer can provide additional performance and availability.

Reasons to select LBaaS v2:

  1. Your vendor already has a driver that supports LBaaS v2. Many hardware load balancer vendors already support LBaaS v2 and this list is growing all the time.

  2. You intend to script your load balancer creation and management so a UI is not important right now (horizon support will be added in a future release).

  3. You intend to support TLS termination at the load balancer.

  4. You intend to use the Octavia software load balancer (adding HA and scalability).

  5. You do not want to take your load balancers offline to perform subsequent LBaaS upgrades.

  6. You intend in future releases to need L7 load balancing.

Reasons not to select LBaaS v2:

  1. Your LBaaS vendor does not have a v2 driver.

  2. You must be able to manage your load balancers from horizon.

  3. You have legacy software which utilizes the LBaaS v1 API.

LBaaS v2 is installed by default with SUSE OpenStack Cloud and requires minimal configuration to start the service.

Note

The LBaaS v2 API includes automatic failover of a deployed load balancer with Octavia. More information about this driver can be found in Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 43 “Configuring Load Balancer as a Service”.

10.4.8.1 Prerequisites Edit source

SUSE OpenStack Cloud LBaaS v2

  1. SUSE OpenStack Cloud must be installed for LBaaS v2.

  2. Follow the instructions in Book “Deployment Guide using Cloud Lifecycle Manager”, Chapter 43 “Configuring Load Balancer as a Service”.

10.4.9 Load Balancer: Octavia Driver Administration Edit source

This document provides instructions on how to enable and manage various components of the Octavia load balancer driver, if that driver is enabled.

10.4.9.1 Monasca Alerts Edit source

The monasca-agent has the following Octavia-related plugins:

  • Process checks – checks if octavia processes are running. When it starts, it detects which processes are running and then monitors them.

  • http_connect check – checks if it can connect to octavia api servers.

Alerts are displayed in the Operations Console.
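
For example, you can manually confirm on a node where Octavia is deployed (typically a controller) that the monitored processes are present; the process names correspond to the configuration files listed in the next section:

ardana > ps -ef | grep -E 'octavia-(api|worker|health-manager|housekeeping)' | grep -v grep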

10.4.9.2 Tuning Octavia Installation Edit source

Homogeneous Compute Configuration

Octavia works only with homogeneous compute node configurations. Currently, Octavia does not support multiple nova flavors. If Octavia needs to be supported on multiple compute nodes, all of the compute nodes should carry the same set of physnets (which will be used for Octavia).

Octavia and Floating IPs

Due to a neutron limitation, Octavia works only with CVR (centralized virtual routing) routers. Another option is to use VLAN provider networks, which do not require a router.

You cannot currently assign a floating IP address as the VIP (user facing) address for a load balancer created by the Octavia driver if the underlying neutron network is configured to support Distributed Virtual Router (DVR). The Octavia driver uses a neutron function known as allowed address pairs to support load balancer fail over.

There is currently a neutron bug that prevents this function from working in a DVR configuration.

Octavia Configuration Files

The system comes pre-tuned and should not need any adjustments for most customers. If, in rare instances, manual tuning is needed, follow these steps:

Warning

Changes might be lost during SUSE OpenStack Cloud upgrades.

Edit the Octavia configuration files in my_cloud/config/octavia. It is recommended that any changes be made in all of the Octavia configuration files.

  • octavia-api.conf.j2

  • octavia-health-manager.conf.j2

  • octavia-housekeeping.conf.j2

  • octavia-worker.conf.j2

After the changes are made to the configuration files, redeploy the service.

  1. Commit changes to git.

    ardana > cd ~/openstack
    ardana > git add -A
    ardana > git commit -m "My Octavia Config"
  2. Run the configuration processor and ready deployment.

    ardana > cd ~/openstack/ardana/ansible/
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  3. Run the Octavia reconfigure.

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts octavia-reconfigure.yml

Spare Pools

The Octavia driver provides support for creating spare pools of the HAProxy software installed in VMs. This means that instead of building a new load balancer from scratch, a create load balancer call pulls a load balancer from the spare pool. Because the spare pool feature consumes resources, the number of load balancers in the spare pool is set to 0 by default, which disables the feature.

Reasons to enable a load balancing spare pool in SUSE OpenStack Cloud

  1. You expect a large number of load balancers to be provisioned all at once (puppet scripts, or ansible scripts) and you want them to come up quickly.

  2. You want to reduce the wait time a customer has while requesting a new load balancer.

To increase the number of load balancers in your spare pool, edit the Octavia configuration files by uncommenting spare_amphora_pool_size and setting it to the number of load balancers you would like to keep in the spare pool.

# Pool size for the spare pool
# spare_amphora_pool_size = 0
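
For example, to set a spare pool of two amphorae in all four configuration templates at once (a sketch; adjust the number to your needs, then follow the redeploy steps above):

ardana > cd ~/openstack/my_cloud/config/octavia
ardana > sed -i 's|# spare_amphora_pool_size = 0|spare_amphora_pool_size = 2|' octavia-*.conf.j2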

10.4.9.3 Managing Amphora Edit source

Octavia starts a separate VM for each load balancing function. These VMs are called amphora.

Updating the Cryptographic Certificates

Octavia uses two-way SSL encryption for communication between amphora and the control plane. Octavia keeps track of the certificates on the amphora and will automatically recycle them. The certificates on the control plane are valid for one year after installation of SUSE OpenStack Cloud.

You can check on the status of the certificate by logging into the controller node as root and running:

ardana > cd /opt/stack/service/octavia-SOME UUID/etc/certs/
openssl x509 -in client.pem -text -noout

This prints the certificate so that you can check the expiration dates.
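
To print only the expiration date, you can also run the following from the same directory:

ardana > openssl x509 -in client.pem -noout -enddate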

To renew the certificates, reconfigure Octavia. Reconfiguring causes Octavia to automatically generate new certificates and deploy them to the controller hosts.

On the Cloud Lifecycle Manager execute octavia-reconfigure:

ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts octavia-reconfigure.yml

Accessing VM information in nova

You can use openstack project list as an administrative user to obtain information about the tenant or project-id of the Octavia project. In the example below, the Octavia project has a project-id of 37fd6e4feac14741b6e75aba14aea833.

ardana > openstack project list
+----------------------------------+------------------+
| ID                               | Name             |
+----------------------------------+------------------+
| 055071d8f25d450ea0b981ca67f7ccee | glance-swift     |
| 37fd6e4feac14741b6e75aba14aea833 | octavia          |
| 4b431ae087ef4bd285bc887da6405b12 | swift-monitor    |
| 8ecf2bb5754646ae97989ba6cba08607 | swift-dispersion |
| b6bd581f8d9a48e18c86008301d40b26 | services         |
| bfcada17189e4bc7b22a9072d663b52d | cinderinternal   |
| c410223059354dd19964063ef7d63eca | monitor          |
| d43bc229f513494189422d88709b7b73 | admin            |
| d5a80541ba324c54aeae58ac3de95f77 | demo             |
| ea6e039d973e4a58bbe42ee08eaf6a7a | backup           |
+----------------------------------+------------------+

You can then use openstack server list --tenant <project-id> to list the VMs for the Octavia tenant. Take particular note of the IP address on the OCTAVIA-MGMT-NET; in the example below it is 172.30.1.11. For additional nova command-line options see Section 10.4.9.5, “For More Information”.

ardana > openstack server list --tenant 37fd6e4feac14741b6e75aba14aea833
+--------------------------------------+----------------------------------------------+----------------------------------+--------+------------+-------------+------------------------------------------------+
| ID                                   | Name                                         | Tenant ID                        | Status | Task State | Power State | Networks                                       |
+--------------------------------------+----------------------------------------------+----------------------------------+--------+------------+-------------+------------------------------------------------+
| 1ed8f651-de31-4208-81c5-817363818596 | amphora-1c3a4598-5489-48ea-8b9c-60c821269e4c | 37fd6e4feac14741b6e75aba14aea833 | ACTIVE | -          | Running     | private=10.0.0.4; OCTAVIA-MGMT-NET=172.30.1.11 |
+--------------------------------------+----------------------------------------------+----------------------------------+--------+------------+-------------+------------------------------------------------+
Important

The amphora VMs do not have SSH or any other access. In the rare case that there is a problem with the underlying load balancer, the whole amphora will need to be replaced.

Initiating Failover of an Amphora VM

Under normal operations, Octavia monitors the health of the amphorae constantly and automatically fails them over if there are any issues. This helps to minimize any potential downtime for load balancer users. There are, however, a few cases in which a failover needs to be initiated manually:

  1. The Loadbalancer has become unresponsive and Octavia has not detected an error.

  2. A new image has become available and existing load balancers need to start using the new image.

  3. The cryptographic certificates used to control the amphora, and/or the HMAC password used to verify its health information, have been compromised.

To minimize the impact on end users, the existing load balancer is kept working until shortly before the new one has been provisioned. There will be a short interruption of the load balancing service, so keep that in mind when scheduling failovers. To initiate a failover, follow these steps (assuming the management IP from the previous step):

  1. Assign the IP to a SHELL variable for better readability.

    ardana > export MGM_IP=172.30.1.11
  2. Identify the port of the VM on the management network.

    ardana > openstack port list | grep $MGM_IP
    | 0b0301b9-4ee8-4fb6-a47c-2690594173f4 |                                                   | fa:16:3e:d7:50:92 |
    {"subnet_id": "3e0de487-e255-4fc3-84b8-60e08564c5b7", "ip_address": "172.30.1.11"} |
  3. Disable the port to initiate a failover. Note that the load balancer will still function, but it can no longer be controlled by Octavia.

    Note

    Changes made to the load balancer after disabling the port will result in errors.

    ardana > openstack port set --admin-state-up False 0b0301b9-4ee8-4fb6-a47c-2690594173f4
    Updated port: 0b0301b9-4ee8-4fb6-a47c-2690594173f4
  4. You can check whether the amphora has failed over with openstack server list --tenant <project-id>. This may take some time and in some cases may need to be repeated several times. You can tell that the failover has been successful by the changed IP address on the management network.

    ardana > openstack server list --tenant 37fd6e4feac14741b6e75aba14aea833
    +--------------------------------------+----------------------------------------------+----------------------------------+--------+------------+-------------+------------------------------------------------+
    | ID                                   | Name                                         | Tenant ID                        | Status | Task State | Power State | Networks                                       |
    +--------------------------------------+----------------------------------------------+----------------------------------+--------+------------+-------------+------------------------------------------------+
    | 1ed8f651-de31-4208-81c5-817363818596 | amphora-1c3a4598-5489-48ea-8b9c-60c821269e4c | 37fd6e4feac14741b6e75aba14aea833 | ACTIVE | -          | Running     | private=10.0.0.4; OCTAVIA-MGMT-NET=172.30.1.12 |
    +--------------------------------------+----------------------------------------------+----------------------------------+--------+------------+-------------+------------------------------------------------+
Warning

Do not issue too many failovers at once. In a big installation, you might be tempted to initiate several failovers in parallel, for instance to speed up an update of amphora images. This will put a strain on the nova service, and depending on the size of your installation, you might need to throttle the failover rate.

10.4.9.4 Load Balancer: Octavia Administration Edit source

10.4.9.4.1 Removing load balancers Edit source

The following procedures demonstrate how to delete a load balancer that is in the ERROR, PENDING_CREATE, or PENDING_DELETE state.

Procedure 10.3: Manually deleting load balancers created with neutron lbaasv2 (in an upgrade/migration scenario)
  1. Query the Neutron service for the loadbalancer ID:

    tux > neutron lbaas-loadbalancer-list
    neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
    +--------------------------------------+---------+----------------------------------+--------------+---------------------+----------+
    | id                                   | name    | tenant_id                        | vip_address  | provisioning_status | provider |
    +--------------------------------------+---------+----------------------------------+--------------+---------------------+----------+
    | 7be4e4ab-e9c6-4a57-b767-da9af5ba7405 | test-lb | d62a1510b0f54b5693566fb8afeb5e33 | 192.168.1.10 | ERROR               | haproxy  |
    +--------------------------------------+---------+----------------------------------+--------------+---------------------+----------+
  2. Connect to the neutron database:

    Important

    The default database name depends on the life cycle manager. Ardana uses ovs_neutron while Crowbar uses neutron.

    Ardana:

    mysql> use ovs_neutron

    Crowbar:

    mysql> use neutron
  3. Get the pools and healthmonitors associated with the loadbalancer:

    mysql> select id, healthmonitor_id, loadbalancer_id from lbaas_pools where loadbalancer_id = '7be4e4ab-e9c6-4a57-b767-da9af5ba7405';
    +--------------------------------------+--------------------------------------+--------------------------------------+
    | id                                   | healthmonitor_id                     | loadbalancer_id                      |
    +--------------------------------------+--------------------------------------+--------------------------------------+
    | 26c0384b-fc76-4943-83e5-9de40dd1c78c | 323a3c4b-8083-41e1-b1d9-04e1fef1a331 | 7be4e4ab-e9c6-4a57-b767-da9af5ba7405 |
    +--------------------------------------+--------------------------------------+--------------------------------------+
  4. Get the members associated with the pool:

    mysql> select id, pool_id from lbaas_members where pool_id = '26c0384b-fc76-4943-83e5-9de40dd1c78c';
    +--------------------------------------+--------------------------------------+
    | id                                   | pool_id                              |
    +--------------------------------------+--------------------------------------+
    | 6730f6c1-634c-4371-9df5-1a880662acc9 | 26c0384b-fc76-4943-83e5-9de40dd1c78c |
    | 06f0cfc9-379a-4e3d-ab31-cdba1580afc2 | 26c0384b-fc76-4943-83e5-9de40dd1c78c |
    +--------------------------------------+--------------------------------------+
  5. Delete the pool members:

    mysql> delete from lbaas_members where id = '6730f6c1-634c-4371-9df5-1a880662acc9';
    mysql> delete from lbaas_members where id = '06f0cfc9-379a-4e3d-ab31-cdba1580afc2';
  6. Find and delete the listener associated with the loadbalancer:

    mysql> select id, loadbalancer_id, default_pool_id from lbaas_listeners where loadbalancer_id = '7be4e4ab-e9c6-4a57-b767-da9af5ba7405';
    +--------------------------------------+--------------------------------------+--------------------------------------+
    | id                                   | loadbalancer_id                      | default_pool_id                      |
    +--------------------------------------+--------------------------------------+--------------------------------------+
    | 3283f589-8464-43b3-96e0-399377642e0a | 7be4e4ab-e9c6-4a57-b767-da9af5ba7405 | 26c0384b-fc76-4943-83e5-9de40dd1c78c |
    +--------------------------------------+--------------------------------------+--------------------------------------+
    mysql> delete from lbaas_listeners where id = '3283f589-8464-43b3-96e0-399377642e0a';
  7. Delete the pool associated with the loadbalancer:

    mysql> delete from lbaas_pools where id = '26c0384b-fc76-4943-83e5-9de40dd1c78c';
  8. Delete the healthmonitor associated with the pool:

    mysql> delete from lbaas_healthmonitors where id = '323a3c4b-8083-41e1-b1d9-04e1fef1a331';
  9. Delete the loadbalancer:

    mysql> delete from lbaas_loadbalancer_statistics where loadbalancer_id = '7be4e4ab-e9c6-4a57-b767-da9af5ba7405';
    mysql> delete from lbaas_loadbalancers where id = '7be4e4ab-e9c6-4a57-b767-da9af5ba7405';
Procedure 10.4: Manually Deleting Load Balancers Created With Octavia
  1. Query the Octavia service for the loadbalancer ID:

    tux > openstack loadbalancer list --column id --column name --column provisioning_status
    +--------------------------------------+---------+---------------------+
    | id                                   | name    | provisioning_status |
    +--------------------------------------+---------+---------------------+
    | d8ac085d-e077-4af2-b47a-bdec0c162928 | test-lb | ERROR               |
    +--------------------------------------+---------+---------------------+
  2. Query the Octavia service for the amphora IDs (in this example we use ACTIVE/STANDBY topology with 1 spare Amphora):

    tux > openstack loadbalancer amphora list
    +--------------------------------------+--------------------------------------+-----------+--------+---------------+-------------+
    | id                                   | loadbalancer_id                      | status    | role   | lb_network_ip | ha_ip       |
    +--------------------------------------+--------------------------------------+-----------+--------+---------------+-------------+
    | 6dc66d41-e4b6-4c33-945d-563f8b26e675 | d8ac085d-e077-4af2-b47a-bdec0c162928 | ALLOCATED | BACKUP | 172.30.1.7    | 192.168.1.8 |
    | 1b195602-3b14-4352-b355-5c4a70e200cf | d8ac085d-e077-4af2-b47a-bdec0c162928 | ALLOCATED | MASTER | 172.30.1.6    | 192.168.1.8 |
    | b2ee14df-8ac6-4bb0-a8d3-3f378dbc2509 | None                                 | READY     | None   | 172.30.1.20   | None        |
    +--------------------------------------+--------------------------------------+-----------+--------+---------------+-------------+
  3. Query the Octavia service for the loadbalancer pools:

    tux > openstack loadbalancer pool list
    +--------------------------------------+-----------+----------------------------------+---------------------+----------+--------------+----------------+
    | id                                   | name      | project_id                       | provisioning_status | protocol | lb_algorithm | admin_state_up |
    +--------------------------------------+-----------+----------------------------------+---------------------+----------+--------------+----------------+
    | 39c4c791-6e66-4dd5-9b80-14ea11152bb5 | test-pool | 86fba765e67f430b83437f2f25225b65 | ACTIVE              | TCP      | ROUND_ROBIN  | True           |
    +--------------------------------------+-----------+----------------------------------+---------------------+----------+--------------+----------------+
  4. Connect to the octavia database:

    mysql> use octavia
  5. Delete any listeners, pools, health monitors, and members from the load balancer:

    mysql> delete from listener where load_balancer_id = 'd8ac085d-e077-4af2-b47a-bdec0c162928';
    mysql> delete from health_monitor where pool_id = '39c4c791-6e66-4dd5-9b80-14ea11152bb5';
    mysql> delete from member where pool_id = '39c4c791-6e66-4dd5-9b80-14ea11152bb5';
    mysql> delete from pool where load_balancer_id = 'd8ac085d-e077-4af2-b47a-bdec0c162928';
  6. Delete the amphora entries in the database:

    mysql> delete from amphora_health where amphora_id = '6dc66d41-e4b6-4c33-945d-563f8b26e675';
    mysql> update amphora set status = 'DELETED' where id = '6dc66d41-e4b6-4c33-945d-563f8b26e675';
    mysql> delete from amphora_health where amphora_id = '1b195602-3b14-4352-b355-5c4a70e200cf';
    mysql> update amphora set status = 'DELETED' where id = '1b195602-3b14-4352-b355-5c4a70e200cf';
  7. Delete the load balancer instance:

    mysql> update load_balancer set provisioning_status = 'DELETED' where id = 'd8ac085d-e077-4af2-b47a-bdec0c162928';
  8. The following script automates the above steps:

    #!/bin/bash

    if (( $# != 1 )); then
        echo "Please specify a loadbalancer ID"
        exit 1
    fi

    LB_ID=$1

    set -u -e -x

    readarray -t AMPHORAE < <(openstack loadbalancer amphora list \
        --format value \
        --column id \
        --column loadbalancer_id \
        | grep ${LB_ID} \
        | cut -d ' ' -f 1)

    readarray -t POOLS < <(openstack loadbalancer show ${LB_ID} \
        --format value \
        --column pools)

    mysql octavia --execute "delete from listener where load_balancer_id = '${LB_ID}';"
    for p in "${POOLS[@]}"; do
        mysql octavia --execute "delete from health_monitor where pool_id = '${p}';"
        mysql octavia --execute "delete from member where pool_id = '${p}';"
    done
    mysql octavia --execute "delete from pool where load_balancer_id = '${LB_ID}';"
    for a in "${AMPHORAE[@]}"; do
        mysql octavia --execute "delete from amphora_health where amphora_id = '${a}';"
        mysql octavia --execute "update amphora set status = 'DELETED' where id = '${a}';"
    done
    mysql octavia --execute "update load_balancer set provisioning_status = 'DELETED' where id = '${LB_ID}';"

10.4.9.5 For More Information Edit source

For more information on the OpenStackClient and Octavia terminology, see the OpenStackClient guide.

10.4.10 Role-based Access Control in neutron Edit source

This topic explains how to achieve more granular access control for your neutron networks.

Previously in SUSE OpenStack Cloud, a network object was either private to a project or could be used by all projects. If the network's shared attribute was True, the network could be used by every project in the cloud; if it was False, only members of the owning project could use it. There was no way for the network to be shared by only a subset of the projects.

neutron Role Based Access Control (RBAC) solves this problem for networks. Now the network owner can create RBAC policies that give network access to target projects. Members of a targeted project can use the network named in the RBAC policy as if it were owned by the project. Constraints are described in Section 10.4.10.10, “Limitations”.

With RBAC, you can let another tenant use a network that you created, but as the owner of the network, you must create the subnet and the router for it.

10.4.10.1 Creating a Network Edit source

ardana > openstack network create demo-net
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | UP                                   |
| availability_zone_hints   |                                      |
| availability_zones        |                                      |
| created_at                | 2018-07-25T17:43:59Z                 |
| description               |                                      |
| dns_domain                |                                      |
| id                        | 9c801954-ec7f-4a65-82f8-e313120aabc4 |
| ipv4_address_scope        | None                                 |
| ipv6_address_scope        | None                                 |
| is_default                | False                                |
| is_vlan_transparent       | None                                 |
| mtu                       | 1450                                 |
| name                      | demo-net                             |
| port_security_enabled     | False                                |
| project_id                | cb67c79e25a84e328326d186bf703e1b     |
| provider:network_type     | vxlan                                |
| provider:physical_network | None                                 |
| provider:segmentation_id  | 1009                                 |
| qos_policy_id             | None                                 |
| revision_number           | 2                                    |
| router:external           | Internal                             |
| segments                  | None                                 |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   |                                      |
| tags                      |                                      |
| updated_at                | 2018-07-25T17:43:59Z                 |
+---------------------------+--------------------------------------+

10.4.10.2 Creating an RBAC Policy Edit source

Here we will create an RBAC policy in which a member of the project 'demo' shares the network with members of the project 'demo2'.

To create the RBAC policy, run:

ardana > openstack network rbac create  --target-project DEMO2-PROJECT-ID --type network --action access_as_shared demo-net

Here is an example