Applies to SUSE OpenStack Cloud 9

9 Managing Object Storage

Information about managing and configuring the Object Storage service.

The Object Storage service may be deployed in a full-fledged manner, with proxy nodes using rings to manage the accounts, containers, and objects being stored. Alternatively, it may be deployed simply as a front-end to SUSE Enterprise Storage, offering the Object Storage APIs with an external back-end.

In the former case, managing your Object Storage environment includes tasks such as ensuring that your swift rings stay balanced; these and other topics are discussed in more detail in this section. swift includes many commands and utilities for these purposes.

When used as a front-end to SUSE Enterprise Storage, many swift constructs, such as rings, ring balancing, and replica dispersion, do not apply, because swift itself is not responsible for the mechanics of object storage.

9.1 Running the swift Dispersion Report

swift contains a tool called swift-dispersion-report that can be used to determine whether your containers and objects have the expected number of replicas. This tool works by populating a percentage of the partitions in the system with containers and objects (using swift-dispersion-populate) and then running the report to see whether all the replicas of these containers and objects are in the correct place. For a more detailed explanation of this tool in OpenStack swift, see the OpenStack swift - Administrator's Guide.

9.1.1 Configuring the swift dispersion populate

Once a swift system has been fully deployed in SUSE OpenStack Cloud 9, you can set up the swift dispersion report using the default parameters found in ~/openstack/ardana/ansible/roles/swift-dispersion/templates/dispersion.conf.j2. This populates 1% of the partitions on the system. If you are happy with this figure, proceed to step 2 below; otherwise, follow step 1 to edit the configuration file.

  1. If you wish to change the dispersion coverage percentage, then connect to the Cloud Lifecycle Manager server and change the value of dispersion_coverage in the ~/openstack/ardana/ansible/roles/swift-dispersion/templates/dispersion.conf.j2 file to the value you wish to use. In the example below we have altered the file to create 5% dispersion:

    ...
    [dispersion]
    auth_url = {{ keystone_identity_uri }}/v3
    auth_user = {{ swift_dispersion_tenant }}:{{ swift_dispersion_user }}
    auth_key = {{ swift_dispersion_password  }}
    endpoint_type = {{ endpoint_type }}
    auth_version = {{ disp_auth_version }}
    # Set this to the percentage coverage. We recommend a value
    # of 1%. You can increase this to get more coverage. However, if you
    # decrease the value, the dispersion containers and objects are
    # not deleted.
    dispersion_coverage = 5.0
  2. Commit your configuration to the Git repository (Chapter 22, Using Git for Configuration Management), as follows:

    ardana > git add -A
    ardana > git commit -m "My config or other commit message"
  3. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  4. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  5. Reconfigure the swift servers:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml
  6. Run this playbook to populate your swift system for the health check:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-dispersion-populate.yml

9.1.2 Running the swift dispersion report

Check the status of the swift system by running the swift dispersion report with this playbook:

ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts swift-dispersion-report.yml

The output of the report will look similar to this:

TASK: [swift-dispersion | report | Display dispersion report results] *********
ok: [padawan-ccp-c1-m1-mgmt] => {
    "var": {
        "dispersion_report_result.stdout_lines": [
            "Using storage policy: General ",
            "",
            "[KQueried 40 containers for dispersion reporting, 0s, 0 retries",
            "100.00% of container copies found (120 of 120)",
            "Sample represents 0.98% of the container partition space",
            "",
            "[KQueried 40 objects for dispersion reporting, 0s, 0 retries",
            "There were 40 partitions missing 0 copies.",
            "100.00% of object copies found (120 of 120)",
            "Sample represents 0.98% of the object partition space"
        ]
    }
}
...

In addition to being able to run the report on demand as shown above, a cron job scheduled to run every 2 hours is installed on the primary proxy node of your cloud environment. It runs swift-dispersion-report and saves the results to the following location on its local file system:

/var/cache/swift/dispersion-report
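
For example, to view the most recent cached results, log on to the primary proxy node and display that file:

tux > sudo cat /var/cache/swift/dispersion-report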

When interpreting the results of this report, we recommend consulting the swift Administrator's Guide - Cluster Health.

9.2 Gathering Swift Data

The swift-recon command retrieves data from swift servers and displays the results. To use this command, log on as the root user to any node that is running the swift-proxy service.

9.2.1 Notes

For help with the swift-recon command you can use this:

tux > sudo swift-recon --help
Warning

The --driveaudit option is not supported.

Warning

SUSE OpenStack Cloud does not support ec_type isa_l_rs_vand and ec_num_parity_fragments greater than or equal to 5 in the storage-policy configuration. This particular policy is known to harm data durability.

9.2.2 Using the swift-recon Command

The following command retrieves and displays disk usage information:

tux > sudo swift-recon --diskusage

For example:

tux > sudo swift-recon --diskusage
===============================================================================
--> Starting reconnaissance on 3 hosts
===============================================================================
[2015-09-14 16:01:40] Checking disk usage now
Distribution Graph:
 10%    3 *********************************************************************
 11%    1 ***********************
 12%    2 **********************************************
Disk usage: space used: 13745373184 of 119927734272
Disk usage: space free: 106182361088 of 119927734272
Disk usage: lowest: 10.39%, highest: 12.96%, avg: 11.4613798613%
===============================================================================

In the above example, the results for several nodes are combined together. You can also view the results from individual nodes by adding the -v option as shown in the following example:

tux > sudo swift-recon --diskusage -v
===============================================================================
--> Starting reconnaissance on 3 hosts
===============================================================================
[2015-09-14 16:12:30] Checking disk usage now
-> http://192.168.245.3:6000/recon/diskusage: [{'device': 'disk1', 'avail': 17398411264, 'mounted': True, 'used': 2589544448, 'size': 19987955712}, {'device': 'disk0', 'avail': 17904222208, 'mounted': True, 'used': 2083733504, 'size': 19987955712}]
-> http://192.168.245.2:6000/recon/diskusage: [{'device': 'disk1', 'avail': 17769721856, 'mounted': True, 'used': 2218233856, 'size': 19987955712}, {'device': 'disk0', 'avail': 17793581056, 'mounted': True, 'used': 2194374656, 'size': 19987955712}]
-> http://192.168.245.4:6000/recon/diskusage: [{'device': 'disk1', 'avail': 17912147968, 'mounted': True, 'used': 2075807744, 'size': 19987955712}, {'device': 'disk0', 'avail': 17404235776, 'mounted': True, 'used': 2583719936, 'size': 19987955712}]
Distribution Graph:
 10%    3 *********************************************************************
 11%    1 ***********************
 12%    2 **********************************************
Disk usage: space used: 13745414144 of 119927734272
Disk usage: space free: 106182320128 of 119927734272
Disk usage: lowest: 10.39%, highest: 12.96%, avg: 11.4614140152%
===============================================================================

By default, swift-recon uses the object-0 ring for information about nodes and drives. For some commands, it is appropriate to specify account, container, or object to indicate the type of ring. For example, to check the checksum of the account ring, use the following:

tux > sudo swift-recon --md5 account
===============================================================================
--> Starting reconnaissance on 3 hosts
===============================================================================
[2015-09-14 16:17:28] Checking ring md5sums
3/3 hosts matched, 0 error[s] while checking hosts.
===============================================================================
[2015-09-14 16:17:28] Checking swift.conf md5sum
3/3 hosts matched, 0 error[s] while checking hosts.
===============================================================================

9.3 Gathering Swift Monitoring Metrics

The swiftlm-scan command is the mechanism used to gather metrics for the monasca system. These metrics are used to derive alarms. For a list of alarms that can be generated from this data, see Section 18.1.1, “Alarm Resolution Procedures”.

To view the metrics, use the swiftlm-scan command directly. Log on to the swift node as the root user. The following example shows the command and a snippet of the output:

tux > sudo swiftlm-scan --pretty
. . .
  {
    "dimensions": {
      "device": "sdc",
      "hostname": "padawan-ccp-c1-m2-mgmt",
      "service": "object-storage"
    },
    "metric": "swiftlm.swift.drive_audit",
    "timestamp": 1442248083,
    "value": 0,
    "value_meta": {
      "msg": "No errors found on device: sdc"
    }
  },
. . .
Note

To make the JSON file easier to read, use the --pretty option.

The fields are as follows:

metric

Specifies the name of the metric.

dimensions

Provides information about the source or location of the metric. The dimensions differ depending on the metric in question. The following dimensions are used by swiftlm-scan:

  • service: This is always object-storage.

  • component: This identifies the component. For example, swift-object-server indicates that the metric is about the swift-object-server process.

  • hostname: This is the name of the node the metric relates to. This is not necessarily the name of the current node.

  • url: If the metric is associated with a URL, this is the URL.

  • port: If the metric relates to connectivity to a node, this is the port used.

  • device: This is the block device a metric relates to.

value

The value of the metric. For many metrics, this is simply the measured value. However, if value_meta contains a msg field, the value is a status. The following status values are used:

  • 0 - no error

  • 1 - warning

  • 2 - failure

value_meta

Additional information. The msg field is the most useful of this information.
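
Because swiftlm-scan emits JSON, you can post-process its output with standard tools. The following is a minimal example that lists only metrics whose value is non-zero (that is, warnings or failures); it assumes the jq utility is installed on the node and that the full output is a single JSON array, as the sample above suggests:

tux > sudo swiftlm-scan --pretty | jq '.[] | select(.value != 0)'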

9.3.1 Optional Parameters

You can focus on specific sets of metrics by using one of the following optional parameters:

--replication

Checks replication and health status.

--file-ownership

Checks that swift owns its relevant files and directories.

--drive-audit

Checks for logged events about corrupted sectors (unrecoverable read errors) on drives.

--connectivity

Checks connectivity to various servers used by the swift system, including:

  • Checks that this node can connect to all memcached servers

  • Checks that this node can connect to the keystone service (only applicable if this is a proxy server node)

--swift-services

Checks that the relevant swift processes are running.

--network-interface

Checks NIC speed and reports statistics for each interface.

--check-mounts

Checks that the node has correctly mounted drives used by swift.

--hpssacli

If this server uses a Smart Array Controller, this checks the operation of the controller and disk drives.
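
For example, to gather only the connectivity-related metrics, you can combine one of these parameters with the --pretty option described earlier:

tux > sudo swiftlm-scan --connectivity --pretty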

9.4 Using the swift Command-line Client (CLI)

OpenStackClient (OSC) is a command-line client for OpenStack with a uniform command structure for OpenStack services. Some swift commands do not have OSC equivalents. The swift utility (or swift CLI) is installed on the Cloud Lifecycle Manager node and also on all other nodes running the swift proxy service. To use this utility on the Cloud Lifecycle Manager, you can use the ~/service.osrc file as a basis and then edit it with the credentials of another user if you need to.

ardana > cp ~/service.osrc ~/swiftuser.osrc

Then use your preferred editor to edit swiftuser.osrc so that you can authenticate with the OS_USERNAME, OS_PASSWORD, and OS_PROJECT_NAME you wish to use. For example, if you want to use the demo user that is created automatically for you, it would look like this:

unset OS_DOMAIN_NAME
export OS_IDENTITY_API_VERSION=3
export OS_AUTH_VERSION=3
export OS_PROJECT_NAME=demo
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USERNAME=demo
export OS_USER_DOMAIN_NAME=Default
export OS_PASSWORD=<password>
export OS_AUTH_URL=<auth_URL>
export OS_ENDPOINT_TYPE=internalURL
# OpenstackClient uses OS_INTERFACE instead of OS_ENDPOINT
export OS_INTERFACE=internal
export OS_CACERT=/etc/ssl/certs/ca-certificates.crt
export OS_COMPUTE_API_VERSION=2

You must use the appropriate password for the demo user and select the correct endpoint for the OS_AUTH_URL value, which should be in the ~/service.osrc file you copied.
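
For example, you can source the file and verify that the credentials work before running any Object Storage commands (openstack token issue simply requests a token from the Identity service):

ardana > source ~/swiftuser.osrc
ardana > openstack token issue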

You can then examine the following account data using this command:

ardana > openstack object store account show

Example showing an environment with no containers or objects:

ardana > openstack object store account show
        Account: AUTH_205804d000a242d385b8124188284998
     Containers: 0
        Objects: 0
          Bytes: 0
X-Put-Timestamp: 1442249536.31989
     Connection: keep-alive
    X-Timestamp: 1442249536.31989
     X-Trans-Id: tx5493faa15be44efeac2e6-0055f6fb3f
   Content-Type: text/plain; charset=utf-8

Use the following command to create a container:

ardana > openstack container create CONTAINER_NAME

Example, creating a container named documents:

ardana > openstack container create documents

The newly created container appears, but it contains no objects:

ardana > openstack container show documents
         Account: AUTH_205804d000a242d385b8124188284998
       Container: documents
         Objects: 0
           Bytes: 0
        Read ACL:
       Write ACL:
         Sync To:
        Sync Key:
   Accept-Ranges: bytes
X-Storage-Policy: General
      Connection: keep-alive
     X-Timestamp: 1442249637.69486
      X-Trans-Id: tx1f59d5f7750f4ae8a3929-0055f6fbcc
    Content-Type: text/plain; charset=utf-8

Upload a document:

ardana > openstack object create CONTAINER_NAME FILENAME

Example:

ardana > openstack object create documents mydocument
mydocument

List objects in the container:

ardana > openstack object list CONTAINER_NAME

Example using a container called documents:

ardana > openstack object list documents
mydocument
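
You can also download an object again using the standard OpenStackClient command, for example:

ardana > openstack object save documents mydocument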
Note

This is a brief introduction to the swift CLI. Use the swift --help command for more information. You can also use the OpenStack CLI; see openstack -h.

9.5 Managing swift Rings

swift rings are a machine-readable description of which disk drives are used by the Object Storage service (for example, a drive is used to store account or object data). Rings also specify the policy for data storage (for example, defining the number of replicas). The rings are automatically built during the initial deployment of your cloud, with the configuration provided during setup of the SUSE OpenStack Cloud Input Model. For more information, see Chapter 5, Input Model.

After successful deployment of your cloud, you may want to change or modify the configuration for swift. For example, you may want to add or remove swift nodes, add additional storage policies, or upgrade the size of the disk drives. For instructions, see Section 9.5.5, “Applying Input Model Changes to Existing Rings” and Section 9.5.6, “Adding a New Swift Storage Policy”.

Note

The process of modifying or adding a configuration is similar to other configuration or topology changes in the cloud. Generally, you make the changes to the input model files at ~/openstack/my_cloud/definition/ on the Cloud Lifecycle Manager and then run Ansible playbooks to reconfigure the system.

Changes to the rings require several phases to complete, therefore, you may need to run the playbooks several times over several days.

The following topics cover ring management.

9.5.1 Rebalancing Swift Rings

The swift ring building process tries to distribute data evenly among the available disk drives. The data is stored in partitions. (For more information, see Section 11.10, “Understanding Swift Ring Specifications”.) If you, for example, double the number of disk drives in a ring, you need to move 50% of the partitions to the new drives so that all drives contain the same number of partitions (and hence same amount of data). However, it is not possible to move the partitions in a single step. It can take minutes to hours to move partitions from the original drives to their new drives (this process is called the replication process).

If you moved all partitions at once, there would be a period where swift expects to find partitions on the new drives, but the data has not yet been replicated there, so swift could not return the data to the user. In the middle of such a replication, some data has finished replicating while other data is still in its old location, so swift would not be able to find all of the data. It is therefore considered best practice to move only one replica at a time. If the replica count is 3, you could first move 16.6% of the partitions and then wait until all data has replicated. Then move another 16.6% of partitions. Wait again and then finally move the remaining 16.6% of partitions. For any given object, only one of its replicas is moved at a time.

9.5.1.1 Reasons to Move Partitions Gradually

Due to the following factors, you must move the partitions gradually:

  • Not all devices are of the same size. SUSE OpenStack Cloud 9 automatically assigns different weights to drives so that smaller drives store fewer partitions than larger drives.

  • The process attempts to keep replicas of the same partition in different servers.

  • Making a large change in one step (for example, doubling the number of drives in the ring) would result in a lot of network traffic due to the replication process, and system performance would suffer. There are two ways to mitigate this: make the change in smaller phases (for example, add or remove servers in batches) and use the weight-step attribute described in the following section.

9.5.2 Using the Weight-Step Attributes to Prepare for Ring Changes

swift rings are built during a deployment and this process sets the weights of disk drives such that smaller disk drives have a smaller weight than larger disk drives. When making changes in the ring, you should limit the amount of change that occurs. SUSE OpenStack Cloud 9 does this by limiting the weights of the new drives to a smaller value and then building new rings. Once the replication process has finished, SUSE OpenStack Cloud 9 will increase the weight and rebuild rings to trigger another round of replication. (For more information, see Section 9.5.1, “Rebalancing Swift Rings”.)

In addition, you should become familiar with how the replication process behaves on your system during normal operation. Before making ring changes, use the swift-recon command to determine the typical oldest replication times for your system. For instructions, see Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring”.

In SUSE OpenStack Cloud, the weight-step attribute is set in the ring specification of the input model. The weight-step value specifies a maximum value for the change of the weight of a drive in any single rebalance. For example, if you add a drive of 4TB, you would normally assign a weight of 4096. However, if the weight-step attribute is set to 1024 instead then when you add that drive the weight is initially set to 1024. The next time you rebalance the ring, the weight is set to 2048. The subsequent rebalance would then set the weight to the final value of 4096.

The value of the weight-step attribute depends on the size of the drives, the number of servers being added, and how experienced you are with the replication process. A common starting point is 20% of the size of an individual drive. For example, when adding a number of 4TB drives, a value of 820 would be appropriate. As you gain more experience with your system, you may increase or reduce this value.

9.5.2.1 Setting the weight-step attribute

Perform the following steps to set the weight-step attribute:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the ~/openstack/my_cloud/definition/data/swift/swift_config.yml file containing the ring-specifications for the account, container, and object rings.

    Add the weight-step attribute to the ring in this format:

    - name: account
      weight-step: WEIGHT_STEP_VALUE
      display-name: Account Ring
      min-part-hours: 16
      ...

    For example, to set weight-step to 820, add the attribute like this:

    - name: account
      weight-step: 820
      display-name: Account Ring
      min-part-hours: 16
      ...
  3. Repeat step 2 for the other rings, if necessary (container, object-0, etc.).

  4. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  5. Use the playbook to create a deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  6. To complete the configuration, use the ansible playbooks documented in Section 9.5.3, “Managing Rings Using swift Playbooks”.

9.5.3 Managing Rings Using swift Playbooks

The following describes how the playbooks relate to ring management.

All of these playbooks will be run from the Cloud Lifecycle Manager from the ~/scratch/ansible/next/ardana/ansible directory.

swift-update-from-model-rebalance-rings.yml

There are two steps in this playbook:

  • Make delta

    It processes the input model and compares it against the existing rings. After comparison, it produces a list of differences between the input model and the existing rings. This is called the ring delta. The ring delta covers drives being added, drives being removed, weight changes, and replica count changes.

  • Rebalance

    The ring delta is then converted into a series of commands (such as add) to the swift-ring-builder program. Finally, the rebalance command is issued to the swift-ring-builder program.

This playbook performs its actions on the first node running the swift-proxy service. (For more information, see Section 18.6.2.4, “Identifying the Swift Ring Building Server”.) However, it also scans all swift nodes to find the size of disk drives.

If there are no changes in the ring delta, the rebalance command is still executed to rebalance the rings. If min-part-hours has not yet elapsed or if no partitions need to be moved, new rings are not written.

swift-compare-model-rings.yml

There are two steps in this playbook:

  • Make delta

    This is the same as described for swift-update-from-model-rebalance-rings.yml.

  • Report

    This prints a summary of the proposed changes that will be made to the rings (that is, what would happen if you rebalanced).

The playbook reports any issues or problems it finds with the input model.

This playbook can be useful to confirm that there are no errors in the input model. It also allows you to check, when you change the input model, that the proposed ring changes are as expected. For example, if you have added a server to the input model, but this playbook reports that no drives are being added, you should determine the cause.

There is troubleshooting information related to the information that you receive in this report that you can view on this page: Section 18.6.2.3, “Interpreting Swift Input Model Validation Errors”.

swift-deploy.yml

swift-deploy.yml is responsible for installing software and configuring swift on nodes. As part of installing and configuring, it runs the swift-update-from-model-rebalance-rings.yml and swift-reconfigure.yml playbooks.

This playbook is included in the ardana-deploy.yml and site.yml playbooks, so if you run either of those playbooks, the swift-deploy.yml playbook is also run.

swift-reconfigure.yml

swift-reconfigure.yml takes rings that the swift-update-from-model-rebalance-rings.yml playbook has changed and copies those rings to all swift nodes.

Every time that you directly use the swift-update-from-model-rebalance-rings.yml playbook, you must copy these rings to the system using the swift-reconfigure.yml playbook, as shown in the example below. If you forget and run swift-update-from-model-rebalance-rings.yml twice, the process may move two replicas of some partitions at the same time.
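
For example, a manual rebalance run from the Cloud Lifecycle Manager typically executes the two playbooks back to back:

ardana > cd ~/scratch/ansible/next/ardana/ansible
ardana > ansible-playbook -i hosts/verb_hosts swift-update-from-model-rebalance-rings.yml
ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml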

9.5.3.1 Optional Ansible variables related to ring management

The following optional variables may be specified when running the playbooks outlined above. They are specified using the --extra-vars option.

limit_ring

Limit changes to the named ring. Other rings will not be examined or updated. This option may be used with any of the swift playbooks. For example, to only update the object-1 ring, use the following command:

ardana > ansible-playbook -i hosts/verb_hosts swift-update-from-model-rebalance-rings.yml --extra-vars "limit_ring=object-1"
drive_detail

Used only with the swift-compare-model-rings.yml playbook. The playbook will include details of changes to every drive where the model and existing rings differ. If you omit the drive_detail variable, only summary information is provided. The following shows how to use the drive_detail variable:

ardana > ansible-playbook -i hosts/verb_hosts swift-compare-model-rings.yml --extra-vars "drive_detail=yes"

9.5.3.2 Interpreting the report from the swift-compare-model-rings.yml playbook

The swift-compare-model-rings.yml playbook compares the existing swift rings with the input model and prints a report telling you how the rings and the model differ. Specifically, it will tell you what actions will take place when you next run the swift-update-from-model-rebalance-rings.yml playbook (or a playbook such as ardana-deploy.yml that runs swift-update-from-model-rebalance-rings.yml).

The swift-compare-model-rings.yml playbook will make no changes, but is just an advisory report.

Here is an example output from the playbook. The report is between "report.stdout_lines" and "PLAY RECAP":

TASK: [swiftlm-ring-supervisor | validate-input-model | Print report] *********
ok: [ardana-cp1-c1-m1-mgmt] => {
    "var": {
        "report.stdout_lines": [
            "Rings:",
            "  ACCOUNT:",
            "    ring exists (minimum time to next rebalance: 8:07:33)",
            "    will remove 1 devices (18.00GB)",
            "    ring will be rebalanced",
            "  CONTAINER:",
            "    ring exists (minimum time to next rebalance: 8:07:35)",
            "    no device changes",
            "    ring will be rebalanced",
            "  OBJECT-0:",
            "    ring exists (minimum time to next rebalance: 8:07:34)",
            "    no device changes",
            "    ring will be rebalanced"
        ]
    }
}

The following describes the report in more detail:


ring exists

The ring already exists on the system.

ring will be created

The ring does not yet exist on the system.

no device changes

The devices in the ring exactly match the input model. There are no servers being added or removed and the weights are appropriate for the size of the drives.

minimum time to next rebalance

If this time is 0:00:00 and you run one of the swift playbooks that update rings, the ring will be rebalanced.

If the time is non-zero, it means that not enough time has elapsed since the ring was last rebalanced. Even if you run a swift playbook that attempts to change the ring, the ring will not actually rebalance. This time is determined by the min-part-hours attribute.

set-weight ardana-ccp-c1-m1-mgmt:disk0:/dev/sdc 8.00 > 12.00 > 18.63

The weight of disk0 (mounted on /dev/sdc) on server ardana-ccp-c1-m1-mgmt is currently set to 8.0 but should be 18.63 given the size of the drive. However, in this example, we cannot go directly from 8.0 to 18.63 because of the weight-step attribute. Hence, the proposed weight change is from 8.0 to 12.0.

This information is only shown when you use the drive_detail=yes argument when running the playbook.

will change weight on 12 devices (6.00TB)

The weight of 12 devices will be increased. This might happen for example, if a server had been added in a prior ring update. However, with use of the weight-step attribute, the system gradually increases the weight of these new devices. In this example, the change in weight represents 6TB of total available storage. For example, if your system currently has 100TB of available storage, when the weight of these devices is changed, there will be 106TB of available storage. If your system is 50% utilized, this means that when the ring is rebalanced, up to 3TB of data may be moved by the replication process. This is an estimate - in practice, because only one copy of a given replica is moved in any given rebalance, it may not be possible to move this amount of data in a single ring rebalance.

add: ardana-ccp-c1-m1-mgmt:disk0:/dev/sdc

The disk0 device will be added to the ardana-ccp-c1-m1-mgmt server. This happens when a server is added to the input model or if a disk model is changed to add additional devices.

This information is only shown when you use the drive_detail=yes argument when running the playbook.

remove: ardana-ccp-c1-m1-mgmt:disk0:/dev/sdc

The device is no longer in the input model and will be removed from the ring. This happens if a server is removed from the model, a disk drive is removed from a disk model or the server is marked for removal using the pass-through feature.

This information is only shown when you use the drive_detail=yes argument when running the playbook.

will add 12 devices (6TB)

There are 12 devices in the input model that have not yet been added to the ring. Usually this is because one or more servers have been added. In this example, this could be one server with 12 drives or two servers, each with 6 drives. The size in the report is the change in total available capacity. When the weight-step attribute is used, this may be a fraction of the total size of the disk drives. In this example, 6TB of capacity is being added. For example, if your system currently has 100TB of available storage, when these devices are added, there will be 106TB of available storage. If your system is 50% utilized, this means that when the ring is rebalanced, up to 3TB of data may be moved by the replication process. This is an estimate - in practice, because only one copy of a given replica is moved in any given rebalance, it may not be possible to move this amount of data in a single ring rebalance.

will remove 12 devices (6TB)

There are 12 devices in rings that no longer appear in the input model. Usually this is because one or more servers have been removed. In this example, this could be one server with 12 drives or two servers, each with 6 drives. The size in the report is the change in total removed capacity. In this example, 6TB of capacity is being removed. For example, if your system currently has 100TB of available storage, when these devices are removed, there will be 94TB of available storage. If your system is 50% utilized, this means that when the ring is rebalanced, approximately 3TB of data must be moved by the replication process.

min-part-hours will be changed

The min-part-hours attribute has been changed in the ring specification in the input model.

replica-count will be changed

The replica-count attribute has been changed in the ring specification in the input model.

ring will be rebalanced

This is always reported. Every time the swift-update-from-model-rebalance-rings.yml playbook is run, it will execute the swift-ring-builder rebalance command. This happens even if there were no input model changes. If the ring is already well balanced, the swift-ring-builder will not rewrite the ring.

9.5.4 Determining When to Rebalance and Deploy a New Ring

Before deploying a new ring, you must be sure that the change applied in the last ring rebalance is complete (that is, all the partitions are in their correct locations). There are several aspects to this:

  • Is the replication system busy?

    You might want to postpone a ring change until after replication has finished. If the replication system is busy repairing a failed drive, a ring change will place additional load on the system. To check that replication has finished, use the swift-recon command with the --replication argument. (For more information, see Section 9.2, “Gathering Swift Data”.) The oldest completion time indicates how busy the replication process is. If it is more than 15 or 20 minutes ago, the object replication process is probably still very busy. The following example indicates that the oldest completion was 120 seconds ago, so the replication process is probably not busy:

    root # swift-recon --replication
    ===============================================================================
    --> Starting reconnaissance on 3 hosts
    ===============================================================================
    [2015-10-02 15:31:45] Checking on replication
    [replication_time] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 3
    Oldest completion was 2015-10-02 15:31:32 (120 seconds ago) by 192.168.245.4:6000.
    Most recent completion was 2015-10-02 15:31:43 (10 seconds ago) by 192.168.245.3:6000.
    ===============================================================================
  • Are there drive or server failures?

    A drive failure does not preclude deploying a new ring. In principle, there should be two copies elsewhere. However, another drive failure in the middle of replication might make data temporarily unavailable. If possible, postpone ring changes until all servers and drives are operating normally (a quick check for unmounted drives is shown in the example after this list).

  • Has min-part-hours elapsed?

    The swift-ring-builder will refuse to build a new ring until the min-part-hours has elapsed since the last time it built rings. You must postpone changes until this time has elapsed.

    You can determine how long you must wait by running the swift-compare-model-rings.yml playbook, which will tell you how long until the min-part-hours has elapsed. For more details, see Section 9.5.3, “Managing Rings Using swift Playbooks”.

    You can change the value of min-part-hours. (For instructions, see Section 9.5.7, “Changing min-part-hours in Swift”).

  • Is the swift dispersion report clean?

    Run the swift-dispersion-report.yml playbook (as described in Section 9.1, “Running the swift Dispersion Report”) and examine the results. If the replication process has not yet replicated partitions that were moved to new drives in the last ring rebalance, the dispersion report will indicate that some containers or objects are missing a copy.

    For example:

    There were 462 partitions missing one copy.

    Assuming all servers and disk drives are operational, the reason for the missing partitions is that the replication process has not yet managed to copy a replica into the partitions.

    You should wait an hour and rerun the dispersion report process and examine the report. The number of partitions missing one copy should have reduced. Continue to wait until this reaches zero before making any further ring rebalances.

    Note

    It is normal to see partitions missing one copy if disk drives or servers are down. If all servers and disk drives are mounted, and you did not recently perform a ring rebalance, you should investigate whether there are problems with the replication process. You can use the Operations Console to investigate replication issues.

    Important

    If there are any partitions missing two copies, you must reboot or repair any failed servers and disk drives as soon as possible. Do not shut down any swift nodes in this situation. Assuming a replica count of 3, if you are missing two copies you are in danger of losing the only remaining copy.
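
A quick way to check whether all drives are currently mounted, as mentioned in the list above, is the --unmounted option of the swift-recon command. Run it as the root user on a node running the swift-proxy service (see Section 9.2, “Gathering Swift Data”):

tux > sudo swift-recon --unmounted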

9.5.5 Applying Input Model Changes to Existing Rings

This page describes a general approach for making changes to your existing swift rings. This approach applies to actions such as adding and removing a server and replacing and upgrading disk drives, and must be performed as a series of phases, as shown below:

9.5.5.1 Changing the Input Model Configuration Files

The first step to apply new changes to the swift environment is to update the configuration files. Follow these steps:

  1. Log in to the Cloud Lifecycle Manager.

  2. Set the weight-step attribute, as needed, for the nodes you are altering. (For instructions, see Section 9.5.2, “Using the Weight-Step Attributes to Prepare for Ring Changes”).

  3. Edit the configuration files as part of the Input Model as appropriate. (For general information about the Input Model, see Section 6.14, “Networks”. For more specific information about the swift parts of the configuration files, see Chapter 11, Modifying Example Configurations for Object Storage using Swift)

  4. Once you have completed all of the changes, commit your configuration to the local Git repository (Chapter 22, Using Git for Configuration Management), as follows:

    ardana > git add -A
    ardana > git commit -m "commit message"
  5. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  6. Create a deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Run the swift playbook that will validate your configuration files and give you a report as an output:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-compare-model-rings.yml
  8. Use the report to validate that the number of drives proposed to be added or deleted, or the weight change, is correct. Fix any errors in your input model. At this stage, no changes have been made to rings.

9.5.5.2 First phase of Ring Rebalance

To begin the rebalancing of the swift rings, follow these steps:

  1. After going through the steps in the section above, deploy your changes to all of the swift nodes in your environment by running this playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-deploy.yml
  2. Wait until replication has finished or min-part-hours has elapsed (whichever is longer). For more information, see Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring”.

9.5.5.3 Weight Change Phase of Ring Rebalance

At this stage, no changes have been made to the input model. However, when you set the weight-step attribute, the rings that were rebuilt in the previous rebalance phase have weights that are different than their target/final value. You gradually move to the target/final weight by rebalancing a number of times as described on this page. For more information about the weight-step attribute, see Section 9.5.2, “Using the Weight-Step Attributes to Prepare for Ring Changes”.

To begin the re-balancing of the rings, follow these steps:

  1. Rebalance the rings by running the playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-update-from-model-rebalance-rings.yml
  2. Run the reconfiguration:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml
  3. Wait until replication has finished or min-part-hours has elapsed (whichever is longer). For more information, see Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring”.

  4. Run the following command and review the report:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-compare-model-rings.yml --limit SWF*

    The following is an example of the output after executing the above command. In the example no weight changes are proposed:

    TASK: [swiftlm-ring-supervisor | validate-input-model | Print report] *********
    ok: [padawan-ccp-c1-m1-mgmt] => {
        "var": {
            "report.stdout_lines": [
                "Need to add 0 devices",
                "Need to remove 0 devices",
                "Need to set weight on 0 devices"
            ]
        }
    }
  5. When there are no proposed weight changes, you proceed to the final phase.

  6. If there are proposed weight changes, repeat this phase.

9.5.5.4 Final Rebalance Phase

The final rebalance phase moves all replicas to their final destination.

  1. Rebalance the rings by running the playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-update-from-model-rebalance-rings.yml | tee /tmp/rebalance.log
    Note

    The output is saved for later reference.

  2. Review the output from the previous step. If the output for all rings is similar to the following, the rebalance had no effect. That is, the rings are balanced and no further changes are needed. In addition, the ring files were not changed so you do not need to deploy them to the swift nodes:

    "Running: swift-ring-builder /etc/swiftlm/cloud1/cp1/builder_dir/account.builder rebalance 999",
          "NOTE: No partitions could be reassigned.",
          "Either none need to be or none can be due to min_part_hours [16]."

    The text No partitions could be reassigned indicates that no further rebalances are necessary. If this is true for all the rings, you have completed the final phase.

    Note

    You must have allowed enough time to elapse since the last rebalance. As mentioned in the above example, min_part_hours [16] means that you must wait at least 16 hours since the last rebalance. If not, you should wait until enough time has elapsed and repeat this phase.

  3. Run the swift-reconfigure.yml playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml
  4. Wait until replication has finished or min-part-hours has elapsed (whichever is longer). For more information, see Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring”.

  5. Repeat the above steps until the ring is rebalanced.

9.5.5.5 System Changes that Change Existing Rings

There are many system changes ranging from adding servers to replacing drives, which might require you to rebuild and rebalance your rings.

Adding Server(s)
Removing Server(s)

In SUSE OpenStack Cloud, when you remove servers from the input model, the disk drives are removed from the ring - the weight is not gradually reduced using the weight-step attribute.

  • Remove servers in phases:

    • This reduces the impact of the changes on your system.

    • If your rings use swift zones, ensure you remove the same number of servers for each zone at each phase.

Adding Disk Drive(s)
Replacing Disk Drive(s)

When a drive fails, replace it as soon as possible. Do not attempt to remove it from the ring - this creates operator overhead. swift will continue to store the correct number of replicas by handing off objects to other drives instead of the failed drive.

If the replacement disk drive is the same size as the original, no ring changes are required. You can confirm this by running the swift-update-from-model-rebalance-rings.yml playbook. It should report that no weight changes are needed.

For a single drive replacement, even if the drive is significantly larger than the original drives, you do not need to rebalance the ring (however, the extra space on the drive will not be used).

Upgrading Disk Drives

If the drives are a different size (for example, you are upgrading your system), you can proceed as follows:

  • If not already done, set the weight-step attribute

  • Replace drives in phases:

    • Avoid replacing too many drives at once.

    • If your rings use swift zones, upgrade a number of drives in the same zone at the same time - not drives in several zones.

    • It is also safer to upgrade one server instead of drives in several servers at the same time.

    • Remember that the final size of all swift zones must be the same, so you may need to replace a small number of drives in one zone, then a small number in second zone, then return to the first zone and replace more drives, etc.

Removing Disk Drive(s)

When removing a disk drive from the input model, keep in mind that this drops the drive out of the ring without allowing swift to move the data off it first. While this should be fine in a properly replicated, healthy cluster, we do not recommend this approach. A better approach is to first gradually reduce the drive's weight to zero (using the weight-step attribute) so that swift can move the data off the drive before you remove it from the input model.

9.5.6 Adding a New Swift Storage Policy

This page describes how to add an additional storage policy to an existing system. For an overview of storage policies, see Section 11.11, “Designing Storage Policies”.

To Add a Storage Policy

Perform the following steps to add the storage policy to an existing system.

  1. Log in to the Cloud Lifecycle Manager.

  2. Select a storage policy index and ring name.

    For example, if you already have object-0 and object-1 rings in your ring-specifications (usually in the ~/openstack/my_cloud/definition/data/swift/swift_config.yml file), the next index is 2 and the ring name is object-2.

  3. Select a user-visible name so that you can see it when you examine container metadata or when you want to specify the storage policy used when you create a container. The name should be a single word (hyphens and dashes are allowed).

  4. Decide if this new policy will be the default for all new containers.

  5. Decide on other attributes such as partition-power and replica-count if you are using a standard replication ring. However, if you are using an erasure coded ring, you also need to decide on other attributes: ec-type, ec-num-data-fragments, ec-num-parity-fragments, and ec-object-segment-size. For more details on the required attributes, see Section 11.10, “Understanding Swift Ring Specifications”.

  6. Edit the ring-specifications attribute (usually in the ~/openstack/my_cloud/definition/data/swift/swift_config.yml file) and add the new ring specification; a sketch of such an entry is shown at the end of this procedure. If this policy is to be the default storage policy for new containers, set the default attribute to yes.

    Note
    1. Ensure that only one object ring has the default attribute set to yes. If you set two rings as default, swift processes will not start.

    2. Do not specify the weight-step attribute for the new object ring. Since this is a new ring there is no need to gradually increase device weights.

  7. Update the appropriate disk model to use the new storage policy (for example, the data/disks_swobj.yml file). The following sample shows that the object-2 has been added to the list of existing rings that use the drives:

    disk-models:
    - name: SWOBJ-DISKS
      ...
      device-groups:
      - name: swobj
        devices:
           ...
        consumer:
            name: swift
            attrs:
                rings:
                - object-0
                - object-1
                - object-2
      ...
    Note

    You must use the new object ring on at least one node that runs the swift-object service. If you skip this step and continue to run the swift-compare-model-rings.yml or swift-deploy.yml playbooks, they will fail with an error There are no devices in this ring, or all devices have been deleted, as shown below:

    TASK: [swiftlm-ring-supervisor | build-rings | Build ring (make-delta, rebalance)] ***
    failed: [padawan-ccp-c1-m1-mgmt] => {"changed": true, "cmd": ["swiftlm-ring-supervisor", "--make-delta", "--rebalance"], "delta": "0:00:03.511929", "end": "2015-10-07 14:02:03.610226", "rc": 2, "start": "2015-10-07 14:02:00.098297", "warnings": []}
    ...
    Running: swift-ring-builder /etc/swiftlm/cloud1/cp1/builder_dir/object-2.builder rebalance 999
    ERROR: -------------------------------------------------------------------------------
    An error has occurred during ring validation. Common
    causes of failure are rings that are empty or do not
    have enough devices to accommodate the replica count.
    Original exception message:
    There are no devices in this ring, or all devices have been deleted
    -------------------------------------------------------------------------------
  8. Commit your configuration:

    ardana > git add -A
    ardana > git commit -m "commit message"
  9. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  10. Create a deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  11. Validate the changes by running the swift-compare-model-rings.yml playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-compare-model-rings.yml

    If any errors occur, correct them. For instructions, see Section 18.6.2.3, “Interpreting Swift Input Model Validation Errors”. Then, re-run steps 5 - 10.

  12. Create the new ring (for example, object-2). Then verify the swift service status and reconfigure the swift node to use a new storage policy, by running these playbooks:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-status.yml
    ardana > ansible-playbook -i hosts/verb_hosts swift-deploy.yml

After adding a storage policy, there is no need to rebalance the ring.
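
The following is a minimal sketch of the kind of ring specification entry referred to in step 6 for a standard replication ring named object-2. The display name and the partition-power, replica-count, and min-part-hours values are illustrative only and must be chosen to suit your environment:

- name: object-2
  display-name: archive
  partition-power: 17
  replica-count: 3
  min-part-hours: 16
  default: no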

9.5.7 Changing min-part-hours in Swift

The min-part-hours parameter specifies the number of hours you must wait before swift will allow a given partition to be moved. In other words, it constrains how often you perform ring rebalance operations. Before changing this value, you should get some experience with how long it takes your system to perform replication after you make ring changes (for example, when you add servers).

See Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring” for more information about determining when replication has completed.

9.5.7.1 Changing the min-part-hours Value

To change the min-part-hours value, following these steps:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit your ~/openstack/my_cloud/definition/data/swift/swift_config.yml file and change the value(s) of min-part-hours for the rings you desire; a sketch of such a change is shown after this procedure. The value is expressed in hours and a value of zero is not allowed.

  3. Commit your configuration to the local Git repository (Chapter 22, Using Git for Configuration Management), as follows:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "My config or other commit message"
  4. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  5. Update your deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  6. Apply the changes by running this playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-deploy.yml
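
As referenced in step 2, the edited portion of the ring specification might look like the following after the change. The ring name, display name, and the new value of 24 hours are illustrative only:

- name: object-0
  display-name: General
  min-part-hours: 24
  ...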

9.5.8 Changing Swift Zone Layout

Before changing the number of swift zones or the assignment of servers to specific zones, you must ensure that your system has sufficient storage available to perform the operation. Specifically, if you are adding a new zone, you may need additional storage. There are two reasons for this:

  • You cannot simply change the swift zone number of disk drives in the ring. Instead, you need to remove the server(s) from the ring and then re-add the server(s) with a new swift zone number to the ring. At the point where the servers are removed from the ring, there must be sufficient spare capacity on the remaining servers to hold the data that was originally hosted on the removed servers.

  • The total amount of storage in each swift zone must be the same. This is because new data is added to each zone at the same rate. If one zone has a lower capacity than the other zones, once that zone becomes full, you cannot add more data to the system – even if there is unused space in the other zones.

As mentioned above, you cannot simply change the swift zone number of disk drives in an existing ring. Instead, you must remove and then re-add servers. This is a summary of the process:

  1. Identify appropriate server groups that correspond to the desired swift zone layout.

  2. Remove the servers in a server group from the rings. You may need to spread this process over time, either by removing servers in small batches or by using the weight-step attribute, so that you limit the amount of replication traffic that happens at once.

  3. Once all the targeted servers are removed, edit the swift-zones attribute in the ring specifications to add or remove a swift zone.

  4. Re-add the servers you had temporarily removed to the rings. Again you may need to do this in batches or rely on the weight-step attribute.

  5. Continue removing and re-adding servers until you reach your final configuration.

9.5.8.1 Process for Changing Swift Zones

This section describes the detailed process of reorganizing swift zones. As a concrete example, we assume we start with a single swift zone and the target is three swift zones. The same general process applies if you are reducing the number of zones.

The process is as follows:

  1. Identify the appropriate server groups that represent the desired final state. In this example, we are going to change the swift zone layout as follows:

    Original Layout:

    swift-zones:
      - id: 1
        server-groups:
           - AZ1
           - AZ2
           - AZ3

    Target Layout:

    swift-zones:
      - id: 1
        server-groups:
           - AZ1
      - id: 2
        server-groups:
           - AZ2
      - id: 3
        server-groups:
           - AZ3

    The plan is to move servers from server groups AZ2 and AZ3 to a new swift zone number. The servers in AZ1 will remain in swift zone 1.

  2. If you have not already done so, consider setting the weight-step attribute as described in Section 9.5.2, “Using the Weight-Step Attributes to Prepare for Ring Changes”.

  3. Identify the servers in the AZ2 server group. You may remove all servers at once or remove them in batches. If this is the first time you have performed a major ring change, we suggest you remove one or two servers only in the first batch. When you see how long this takes and the impact replication has on your system you can then use that experience to decide whether you can remove a larger batch of servers, or increase or decrease the weight-step attribute for the next server-removal cycle. To remove a server, use steps 2-9 as described in Section 15.1.5.1.4, “Removing a Swift Node” ensuring that you do not remove the servers from the input model.

  4. This process may take a number of ring rebalance cycles until the disk drives are removed from the ring files. Once this happens, you can edit the ring specifications and add swift zone 2 as shown in this example:

    swift-zones:
      - id: 1
        server-groups:
          - AZ1
          - AZ3
      - id: 2
        server-groups:
          - AZ2
  5. The server removal process in step #3 set the "remove" attribute in the pass-through attribute of the servers in server group AZ2. Edit the input model files and remove this pass-through attribute. This signals to the system that the servers should be used the next time we rebalance the rings (that is, the server should be added to the rings).

  6. Commit your configuration to the local Git repository (Chapter 22, Using Git for Configuration Management), as follows:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git add -A
    ardana > git commit -m "My config or other commit message"
  7. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  8. Use the playbook to create a deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  9. Rebuild and deploy the swift rings containing the re-added servers by running this playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-deploy.yml
  10. Wait until replication has finished. For more details, see Section 9.5.4, “Determining When to Rebalance and Deploy a New Ring”. A swift-recon sketch for monitoring replication follows this procedure.

  11. You may need to continue to rebalance the rings. For instructions, see the "Final Rebalance Stage" steps at Section 9.5.5, “Applying Input Model Changes to Existing Rings”.

  12. At this stage, the servers in server group AZ2 are responsible for swift zone 2. Repeat the process in steps #3-9 to remove the servers in server group AZ3 from the rings and then re-add them to swift zone 3. The ring specifications for zones (step 4) should be as follows:

    swift-zones:
      - id: 1
        server-groups:
          - AZ1
      - id: 2
        server-groups:
          - AZ2
      - id: 3
        server-groups:
          - AZ3
  13. Once complete, all data is dispersed across the swift zones as specified in the input model (that is, each replica is located in its designated swift zone).
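While waiting for replication to finish (steps 10 and 11), the standard swift-recon tool can be used to report replication times across the storage nodes. This is a minimal sketch; run it from a node that has the swift rings and recon middleware available (typically a proxy node):

# Report object replication statistics (replication time and last completion)
sudo swift-recon --replication
# The account and container rings can be checked in the same way
sudo swift-recon account --replication
sudo swift-recon container --replication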

9.6 Configuring your swift System to Allow Container Sync

swift has a feature where all the contents of a container can be mirrored to another container through background synchronization. swift operators configure their system to allow/accept sync requests to/from other systems, and the user specifies where to sync their container to along with a secret synchronization key. For an overview of this feature, refer to OpenStack swift - Container to Container Synchronization.
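For example, once both clusters are configured to accept syncing (as described in Section 9.6.3), a user can point a source container at a destination container using the swift client. This is only a sketch; the realm name, cluster name, account, container names, and key are placeholders:

# Tell swift to sync 'my-container' to 'backup-container' in the named realm/cluster
ardana > swift post -t '//REALM/CLUSTER/AUTH_ACCOUNT/backup-container' -k 'SYNC_KEY' my-container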

9.6.1 Notes and limitations

The container synchronization is done as a background action. When you put an object into the source container, it takes some time before it becomes visible in the destination container. Storage services do not necessarily copy objects in any particular order, meaning they may be transferred in a different order from that in which they were created.

Container sync may not be able to keep up with a moderate upload rate to a container. For example, if the average object upload rate to a container is greater than one object per second, then container sync may not be able to keep the objects synced.

If container sync is enabled on a container that already holds a large number of objects, container sync may take a long time to sync the data. For example, at roughly one object per second, a container with one million 1 KB objects needs about 1,000,000 seconds (more than 11 days) to complete a sync.

You may operate on the destination container just like any other container -- adding or deleting objects -- including the objects that were copied from the source container. To decide how to handle object creation, replacement, or deletion, the system uses timestamps: in general, the latest timestamp "wins". That is, if you create an object, replace it, delete it, and then re-create it, the destination container will eventually contain the most recently created object. However, if you also create and delete objects in the destination container, you get some subtle behaviors, as follows:

  • If an object is copied to the destination container and then deleted, it remains deleted in the destination even though there is still a copy in the source container. If you modify the object (replace or change its metadata) in the source container, it will reappear in the destination again.

  • The same applies to a replacement or metadata modification of an object in the destination container -- the object will remain as-is unless there is a replacement or modification in the source container.

  • If you replace or modify metadata of an object in the destination container and then delete it in the source container, it is not deleted from the destination. This is because your modified object has a later timestamp than the object you deleted in the source.

  • If you create an object in the source container and, before the system has had a chance to copy it to the destination, you create an object of the same name in the destination container, then the object in the destination is not overwritten by the source container's object.

Segmented objects

Segmented objects (objects larger than 5GB) will not work seamlessly with container synchronization. If the manifest object is copied to the destination container before the object segments, a GET operation on the manifest object may fail to find some or all of the object segments. If your manifest and object segments are in different containers, do not forget that both containers must be synchronized and that the container name of the object segments must be the same on both source and destination.
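For example, if a manifest lives in a container named docs and its segments in docs_segments, both containers must be set up for synchronization, and the segments container must keep the same name on the destination. A sketch with placeholder realm, cluster, account, and key values:

# Sync the segments container as well as the manifest container
ardana > swift post -t '//REALM/CLUSTER/AUTH_ACCOUNT/docs_segments' -k 'SYNC_KEY' docs_segments
ardana > swift post -t '//REALM/CLUSTER/AUTH_ACCOUNT/docs' -k 'SYNC_KEY' docs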

9.6.2 Prerequisites

Container to container synchronization requires that SSL certificates are configured on both the source and destination systems. For more information on how to implement SSL, see Chapter 41, Configuring Transport Layer Security (TLS).

9.6.3 Configuring container sync

Container to container synchronization requires that both the source and destination swift systems involved be configured to allow/accept this. In the context of container to container synchronization, swift uses the term cluster to denote a swift system. swift clusters correspond to Control Planes in OpenStack terminology.

Gather the public API endpoints for both swift systems

Gather information about the external/public URL used by each system, as follows:

  1. On the Cloud Lifecycle Manager of one system, get the public API endpoint of the system by running the following commands:

    ardana > source ~/service.osrc
    ardana > openstack endpoint list | grep swift

    The output of the command will look similar to this:

    ardana > openstack endpoint list | grep swift
    | 063a84b205c44887bc606c3ba84fa608 | region0 | swift           | object-store    | True    | admin     | https://10.13.111.176:8080/v1/AUTH_%(tenant_id)s |
    | 3c46a9b2a5f94163bb5703a1a0d4d37b | region0 | swift           | object-store    | True    | public    | https://10.13.120.105:8080/v1/AUTH_%(tenant_id)s |
    | a7b2f4ab5ad14330a7748c950962b188 | region0 | swift           | object-store    | True    | internal  | https://10.13.111.176:8080/v1/AUTH_%(tenant_id)s |

    The portion that you want is the endpoint up to, but not including, the AUTH part; in the example above, this is https://10.13.120.105:8080/v1.

  2. Repeat these steps on the other swift system so that you have the public API endpoints of both systems. A command sketch for capturing this value in a single step follows.
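If you prefer to capture just the public object-store URL, without the AUTH suffix, in a single step, a sketch such as the following can be used on each Cloud Lifecycle Manager; the sed expression simply strips the AUTH portion of the URL:

ardana > source ~/service.osrc
ardana > openstack endpoint list --service object-store --interface public -f value -c URL | sed 's|/AUTH_.*||'
https://10.13.120.105:8080/v1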

Validate connectivity between both systems

The swift nodes running the swift-container service must be able to connect to the public API endpoints of each other for container sync to work. You can validate connectivity on each system using the following steps; a combined check sketch follows this procedure.

For the sake of the examples, we use the terms source and destination to denote the systems involved in the synchronization.

  1. Log in to a swift node running the swift-container service on the source system. You can determine which servers run this service by checking the service list in the ~/openstack/my_cloud/info/service_info.yml file.

  2. Verify the SSL certificates by running this command against the destination swift server:

    echo | openssl s_client -connect PUBLIC_API_ENDPOINT:8080 -CAfile /etc/ssl/certs/ca-certificates.crt

    If the connection was successful, you should see a verify return code of 0 (ok), similar to this:

    ...
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
  3. Also verify that the source node can connect to the destination swift system using this command:

    ardana > curl -k DESTINATION_IP_OR_HOSTNAME:8080/healthcheck

    If the connection was successful, you should see a response of OK.

  4. Repeat these verification steps on any system involved in your container synchronization setup.
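If there are several remote endpoints to verify, a small loop can combine both checks. This is a sketch only; replace the host list with the hosts from the public API endpoints you gathered earlier, and note that it assumes HTTPS on port 8080 as in the examples above:

# Hypothetical list of remote swift public API hosts
for host in 10.13.120.105 10.13.120.106; do
    # Certificate check: expect "Verify return code: 0 (ok)"
    echo | openssl s_client -connect ${host}:8080 \
        -CAfile /etc/ssl/certs/ca-certificates.crt 2>/dev/null | grep "Verify return code"
    # Healthcheck: expect "OK"
    curl -k -s https://${host}:8080/healthcheck
    echo
done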

Configure container to container synchronization

Both the source and destination swift systems must be configured the same way, using sync realms. For more details on how sync realms work, see OpenStack swift - Configuring Container Sync.

To configure one of the systems, follow these steps:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the ~/openstack/my_cloud/config/swift/container-sync-realms.conf.j2 file and uncomment the sync realm section.

    Here is a sample showing this section in the file:

    #Add sync realms here, for example:
    # [realm1]
    # key = realm1key
    # key2 = realm1key2
    # cluster_name1 = https://host1/v1/
    # cluster_name2 = https://host2/v1/
  3. Add the details for your source and destination systems. Each realm you define is a set of clusters that have agreed to allow container syncing between them. These values are case sensitive.

    Only one key is required. The second key is optional and can be provided to allow an operator to rotate keys if desired. The cluster values must use the prefix cluster_ and must be populated with the public API endpoints of the systems. A filled-in sample is shown after this procedure.

  4. Commit the changes to git:

    ardana > git add -A
    ardana > git commit -m "Configure container sync realms"
  5. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  6. Update the deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  7. Run the swift reconfigure playbook:

    ardana > cd ~/scratch/ansible/next/ardana/ansible/
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml
  8. Run this command to validate that your container synchronization is configured:

    ardana > source ~/service.osrc
    ardana > swift capabilities

    Here is a snippet of the output showing the container sync information. This should be populated with your cluster names:

    ...
    Additional middleware: container_sync
     Options:
      realms: {u'INTRACLUSTER': {u'clusters': {u'THISCLUSTER': {}}}}
  9. Repeat these steps on any other swift systems that will be involved in your sync realms.
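As a concrete illustration of step 3, a filled-in realm section might look like the following. The realm name, keys, cluster names, and endpoints are placeholders for your own values; only the cluster_ prefix is fixed:

[cloudsync]
key = 1A2b3C4d5e
key2 = 6F7g8H9i0j
cluster_cloud1 = https://10.13.120.105:8080/v1/
cluster_cloud2 = https://10.245.10.12:8080/v1/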

9.6.4 Configuring Intra Cluster Container Sync

It is possible to use the swift container sync functionality to sync objects between containers within the same swift system. swift is automatically configured to allow intra cluster container sync. Each swift PAC server will have an intracluster container sync realm defined in /etc/swift/container-sync-realms.conf.

For example:

# The intracluster realm facilitates syncing containers on this system
[intracluster]
key = lQ8JjuZfO
# key2 =
cluster_thiscluster = http://SWIFT-PROXY-VIP:8080/v1/

The keys defined in /etc/swift/container-sync-realms.conf are used by the container-sync daemon to determine trust between clusters. In addition, the two containers to be synced need a separate shared key, defined in the metadata of both containers, to establish trust between them.

  1. Create two containers, for example container-src and container-dst. In this example we will sync one way from container-src to container-dst.

    ardana > openstack container create container-src
    ardana > openstack container create container-dst
  2. Determine your swift account. In the following example, it is AUTH_1234.

    ardana > swift stat
                                     Account: AUTH_1234
                                  Containers: 3
                                     Objects: 42
                                       Bytes: 21692421
    Containers in policy "erasure-code-ring": 3
       Objects in policy "erasure-code-ring": 42
         Bytes in policy "erasure-code-ring": 21692421
                                Content-Type: text/plain; charset=utf-8
                 X-Account-Project-Domain-Id: default
                                 X-Timestamp: 1472651418.17025
                                  X-Trans-Id: tx81122c56032548aeae8cd-0057cee40c
                               Accept-Ranges: bytes
  3. Configure container-src to sync to container-dst, using a key that both containers specify. Replace KEY with your key.

    ardana > swift post -t '//intracluster/thiscluster/AUTH_1234/container-dst' -k 'KEY' container-src
  4. Configure container-dst to accept synced objects that use this key.

    ardana > swift post -k 'KEY' container-dst
  5. Upload objects to container-src. Within a few minutes, the objects should be automatically synced to container-dst; a verification sketch follows this procedure.
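To verify that the sync in step 5 is working, you can upload a test object to the source container and then list the destination container after a few minutes. A minimal sketch, assuming a local file called test.txt:

ardana > openstack object create container-src test.txt
ardana > openstack object list container-dst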

Changing the intracluster realm key

The intracluster realm key used by container sync to sync objects between containers in the same swift system is automatically generated. The process for changing passwords is described in Section 5.7, “Changing Service Passwords”.

The steps to change the intracluster realm key are as follows.

  1. On the Cloud Lifecycle Manager, create a file called ~/openstack/change_credentials/swift_data_metadata.yml with the contents included below. The consuming-cp and cp values are the name of the control plane specified in ~/openstack/my_cloud/definition/data/control_plane.yml on which the swift-container service runs.

    swift_intracluster_sync_key:
     metadata:
     - clusters:
       - swpac
       component: swift-container
       consuming-cp: control-plane-1
       cp: control-plane-1
     version: '2.0'
  2. Run the following commands:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  3. Reconfigure the swift credentials:

    ardana > cd ~/scratch/ansible/next/ardana/ansible/
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure-credentials-change.yml
  4. Delete ~/openstack/change_credentials/swift_data_metadata.yml

    ardana > rm ~/openstack/change_credentials/swift_data_metadata.yml
  5. On a swift PAC server check that the intracluster realm key has been updated in /etc/swift/container-sync-realms.conf

    # The intracluster realm facilitates syncing containers on this system
    [intracluster]
    key = aNlDn3kWK
  6. Update any containers that use intracluster container sync so that they use the new intracluster realm key; a sketch for updating several containers at once follows these steps.

    ardana > swift post -k 'aNlDn3kWK' container-src
    ardana > swift post -k 'aNlDn3kWK' container-dst
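If many containers use the intracluster realm, the new key can be applied to each of them in a small loop. This is a sketch; the container list is a placeholder, and the key must match the value now present in /etc/swift/container-sync-realms.conf:

# Hypothetical list of containers using the intracluster realm
for container in container-src container-dst; do
    swift post -k 'aNlDn3kWK' "$container"
done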