Applies to SUSE OpenStack Cloud 9

11 Modifying Example Configurations for Object Storage using Swift

This section contains detailed descriptions of the swift-specific parts of the input model. For example input models, see Chapter 9, Example Configurations. For general descriptions of the input model, see Section 6.14, “Networks”. In addition, the swift ring specifications are available in the ~/openstack/my_cloud/definition/data/swift/swift_config.yml file.

Usually, the example models provide most of the data that is required to create a valid input model. However, before you start to deploy, you must review the swift-specific parts of the configuration and adapt them to your environment as described in this chapter.

For further information, read the following sections:

11.1 Object Storage using swift Overview

11.1.1 What is the Object Storage (swift) Service?

The SUSE OpenStack Cloud Object Storage service leverages swift, which uses software-defined storage (SDS) layered on top of industry-standard servers with native storage devices. swift presents an object paradigm over an underlying set of disk drives. The disk drives are managed by a data structure called a "ring", and you can store, retrieve, and delete objects in containers using RESTful APIs.

SUSE OpenStack Cloud Object Storage using swift provides a highly-available, resilient, and scalable storage pool for unstructured data. It has a highly-durable architecture, with no single point of failure. In addition, SUSE OpenStack Cloud includes the concept of cloud models, where the user can modify the cloud input model to provide the configuration required for their environment.

11.1.2 Object Storage (swift) Services

A swift system consists of a number of services:

  • swift-proxy provides the API for all requests to the swift system.

  • Account and container services provide storage management of the accounts and containers.

  • Object services provide storage management for object storage.

These services can be co-located in a number of ways. The following general patterns exist in the example cloud models distributed with SUSE OpenStack Cloud:

  • The swift-proxy, account, container, and object services run on the same (PACO) node type in the control plane. This is used for smaller clouds or where swift is a minor element in a larger cloud. This is the model seen in most of the entry-scale models.

  • The swift-proxy, account, and container services run on one (PAC) node type in a cluster in a control plane and the object services run on another (OBJ) node type in a resource pool. This deployment model, known as the Entry-Scale swift model, is used in larger clouds or where a larger swift system is in use or planned. See Section 9.5.1, “Entry-scale swift Model” for more details.

The swift storage service can be scaled both vertically (nodes with larger or more disks) and horizontally (more swift storage nodes) to handle an increased number of simultaneous user connections and provide larger storage space.

In the SUSE OpenStack Cloud implementation of the OpenStack Object Storage (swift) service, swift is configured through a number of YAML files. The configuration of these YAML files is described in the remaining sections of this chapter.

11.2 Allocating Proxy, Account, and Container (PAC) Servers for Object Storage

A swift proxy, account, and container (PAC) server is a node that runs the swift-proxy, swift-account and swift-container services. It is used to respond to API requests and to store account and container data. The PAC node does not store object data.

This section describes the procedure to allocate PAC servers during the initial deployment of the system.

11.2.1 To Allocate Swift PAC Servers

Perform the following steps to allocate PAC servers:

  • Verify whether the example input model already contains a suitable server role. The server roles are usually described in the data/server_roles.yml file. If a suitable server role is not described, you must add one and allocate disk drives to store the account and container data. For instructions, see Section 11.4, “Creating Roles for swift Nodes” and Section 11.5, “Allocating Disk Drives for Object Storage”.

  • Verify whether the example input model has a cluster assigned to the swift proxy, account, and container servers. This is usually defined in the data/control_plane.yml file. If no suitable cluster is assigned, add one. For instructions, see Section 11.7, “Creating a Swift Proxy, Account, and Container (PAC) Cluster”.

  • Identify the physical servers, their IP addresses, and other details.

    • Add these details to the servers list (usually in the data/servers.yml file); an example entry is shown after this list.

    • As with all servers, you must also verify and/or modify the server-groups information (usually in the data/server_groups.yml file).
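The following is a minimal sketch of what a servers list entry for a PAC node might look like. The server ID, role name, addresses, NIC mapping, and iLO credentials shown here are placeholder values; replace them with values that match your environment and your input model:

servers:
  - id: swpac1
    ip-addr: 192.168.10.15
    role: SWPAC-ROLE
    server-group: AZ1
    nic-mapping: MY-4PORT-SERVER
    mac-addr: "8c:dc:d4:b5:ca:c0"
    ilo-ip: 192.168.9.15
    ilo-user: admin
    ilo-password: password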

The only part of this process that is unique to swift is the allocation of disk drives for use by the account and container rings. For instructions, see Section 11.5, “Allocating Disk Drives for Object Storage”.

11.3 Allocating Object Servers

A swift object server is a node that runs the swift-object service (only) and is used to store object data. It does not run the swift-proxy, swift-account, or swift-container services.

This section describes the procedure to allocate a swift object server during the initial deployment of the system.

11.3.1 To Allocate a Swift Object Server

Perform the following steps to allocate one or more swift object servers:

  • Verify whether the example input model already contains a suitable server role. The server roles are usually described in the data/server_roles.yml file. If a suitable server role is not described, you must add one. For instructions, see Section 11.4, “Creating Roles for swift Nodes”. While adding a server role for the swift object server, you will also allocate drives to store object data. For instructions, see Section 11.5, “Allocating Disk Drives for Object Storage”.

  • Verify whether the example input model has a resource node assigned to swift object servers. The resource nodes are usually assigned in the data/control_plane.yml file. If no resource node is assigned, you must add a suitable one. For instructions, see Section 11.8, “Creating Object Server Resource Nodes”.

  • Identify the physical servers, their IP addresses, and other details. Add the details for the servers to the following YAML files and verify the server-groups information:

    • Add details in the servers list (usually in the data/servers.yml file).

    • As with all servers, you must also verify and/or modify the server-groups information (usually in the data/server_groups.yml file).

    The only part of this process that is unique to swift is the allocation of disk drives for use by the object ring. For instructions, see Section 11.5, “Allocating Disk Drives for Object Storage”.

11.4 Creating Roles for swift Nodes

To create roles for swift nodes, you must edit the data/server_roles.yml file and add an entry to the server-roles list using the following syntax:

server-roles:
- name: PICK-A-NAME
  interface-model: SPECIFY-A-NAME
  disk-model: SPECIFY-A-NAME

The fields for server roles are defined as follows:

name
    Specifies a name assigned for the role. In the following example, SWOBJ-ROLE is the role name.

interface-model
    You can either select an existing interface model or create one specifically for swift object servers. In the following example, SWOBJ-INTERFACES is used. For more information, see Section 11.9, “Understanding Swift Network and Service Requirements”.

disk-model
    You can either select an existing disk model or create one specifically for swift object servers. In the following example, SWOBJ-DISKS is used. For more information, see Section 11.5, “Allocating Disk Drives for Object Storage”.

For example:
server-roles:
- name: SWOBJ-ROLE
  interface-model: SWOBJ-INTERFACES
  disk-model: SWOBJ-DISKS

11.5 Allocating Disk Drives for Object Storage

The disk model describes the configuration of disk drives and their usage. The examples include several disk models. You must always review the disk devices before making any changes to an existing disk model.

11.5.1 Making Changes to a Swift Disk Model

There are several reasons for changing the disk model:

  • If you have additional drives available, you can add them to the devices list.

  • If the disk devices listed in the example disk model have different names on your servers (for example, because of different disk hardware), edit the disk model and change the device names to the correct names.

  • If you prefer to use different disk drives than the ones listed in the model, edit the model accordingly. For example, if /dev/sdb and /dev/sdc are slow hard drives and you have SSD drives available as /dev/sdd and /dev/sde, delete /dev/sdb and /dev/sdc and replace them with /dev/sdd and /dev/sde.

    Note

    Disk drives must not contain labels or file systems from a prior usage. For more information, see Section 11.6, “Swift Requirements for Device Group Drives”.

    Tip

    The terms add and delete in this document mean editing the respective YAML files to add or delete the configuration values.

Swift Consumer Syntax

The consumer field determines the usage of a disk drive or logical volume by swift. The syntax of the consumer field is as follows:

consumer:
    name: swift
    attrs:
        rings:
        - name: RING-NAME
        - name: RING-NAME
        - etc...

The fields for consumer are defined as follows:

name
    Specifies the service that uses the device group. A name field containing swift indicates that the drives or logical volumes are used by swift.

attrs
    Lists the rings that the devices are allocated to. It must contain a rings item.

rings
    Contains a list of ring names. In the rings list, the name field is optional.

The following are the different configurations (patterns) of the proxy, account, container, and object services:

  • Proxy, account, container, and object (PACO) run on same node type.

  • Proxy, account, and container run on a node type (PAC) and the object services run on a dedicated object server (OBJ).

Note

The proxy service does not have any rings associated with it.

Example 11.1: PACO - proxy, account, container, and object run on the same node type.
consumer:
    name: swift
    attrs:
        rings:
        - name: account
        - name: container
        - name: object-0
Example 11.2: PAC - proxy, account, and container run on the same node type.
consumer:
    name: swift
    attrs:
        rings:
        - name: account
        - name: container
Example 11.3: OBJ - Dedicated object server

The following example shows two Storage Policies (object-0 and object-1). For more information, see Section 11.11, “Designing Storage Policies”.

consumer:
    name: swift
    attrs:
        rings:
        - name: object-0
        - name: object-1
Swift Device Groups

You may have several device groups if you have several different uses for different sets of drives.

The following example shows a configuration where one drive is used for account and container rings and the other drives are used by the object-0 ring:

device-groups:

- name: swiftpac
  devices:
  - name: /dev/sdb
  consumer:
      name: swift
      attrs:
          rings:
          - name: account
          - name: container

- name: swiftobj
  devices:
  - name: /dev/sdc
  - name: /dev/sde
  - name: /dev/sdf
  consumer:
      name: swift
      attrs:
          rings:
          - name: object-0
Swift Logical Volumes
Warning

Be careful while using logical volumes to store swift data. The data remains intact during an upgrade, but will be lost if the server is reimaged. If you use logical volumes you must ensure that you only reimage one server at a time. This is to allow the data from the other replicas to be replicated back to the logical volume once the reimage is complete.

swift can use a logical volume. To do this, ensure that the logical volume definition meets the following requirements:

  • Do not specify the mount, mkfs-opts, or fstype attributes.

  • Specify both the name and size attributes.

  • The consumer attribute must have a name field set to swift.
Note

When setting up swift as a logical volume, the configuration processor will give a warning. This warning is normal and does not affect the configuration.

The following is an example of a swift logical volume:

...
   - name: swift
     size: 50%
     consumer:
         name: swift
         attrs:
             rings:
             - name: object-0
             - name: object-1
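For context, the following sketch shows how such a logical volume might sit within a volume group in the disk model. The volume group name ardana-vg, the physical volume /dev/sda_root, and the 50% size are illustrative assumptions, and the other logical volumes normally present (root, log, and so on) are elided. Note that, as required above, the swift logical volume specifies name and size but no mount, mkfs-opts, or fstype attributes:

volume-groups:
  - name: ardana-vg
    physical-volumes:
      - /dev/sda_root
    logical-volumes:
      # other logical volumes (root, log, crash, and so on) elided
      - name: swift
        size: 50%
        consumer:
            name: swift
            attrs:
                rings:
                - name: object-0
                - name: object-1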

11.6 Swift Requirements for Device Group Drives

To install and deploy, swift requires that the disk drives listed in the devices list of the device-groups item in a disk model meet the following criteria (if not, the deployment will fail):

  • The disk device must exist on the server. For example, if you add /dev/sdX to a server with only three devices, then the deploy process will fail.

  • The disk device must be unpartitioned or have a single partition that uses the whole drive.

  • The partition must not be labeled.

  • The XFS file system must not contain a file system label.

  • If a disk drive is already labeled or contains a file system from prior usage, the swiftlm-drive-provision process will assume that the drive has valuable data and will not use or modify the drive.

11.7 Creating a Swift Proxy, Account, and Container (PAC) Cluster

If you already have a cluster with the server-role SWPAC-ROLE, there is no need to proceed through these steps.

11.7.1 Steps to Create a swift Proxy, Account, and Container (PAC) Cluster

To create a cluster for swift proxy, account, and container (PAC) servers, you must identify the control plane and node type/role:

  1. In the ~/openstack/my_cloud/definition/data/control_plane.yml file, identify the control plane that the PAC servers are associated with.

  2. Next, identify the node type/role used by the swift PAC servers. In the following example, server-role is set to SWPAC-ROLE.

    Add an entry to the clusters item in the control-plane section.

    Example:

    control-planes:
      - name: control-plane-1
        control-plane-prefix: cp1
        . . .
        clusters:
        . . .
          - name: swpac
            cluster-prefix: swpac
            server-role: SWPAC-ROLE
            member-count: 3
            allocation-policy: strict
            service-components:
              - ntp-client
              - swift-ring-builder
              - swift-proxy
              - swift-account
              - swift-container
              - swift-client
    Important

    Do not change the name of the cluster swpac; it must remain unique among clusters. Use names such as swpac1, swpac2, and swpac3 for its servers.

  3. If you have more than three servers available that have the SWPAC-ROLE assigned to them, you must change member-count to match the number of servers.

    For example, if you have four servers with a role of SWPAC-ROLE, then the member-count should be 4.

11.7.2 Service Components

A swift PAC server requires the following service components:

  • ntp-client

  • swift-proxy

  • swift-account

  • swift-container

  • swift-ring-builder

  • swift-client

11.8 Creating Object Server Resource Nodes

To create a resource node for swift object servers, you must identify the control plane and node type/role:

  • In the data/control_plane.yml file, identify the control plane that the object servers are associated with.

  • Next, identify the node type/role used by the swift object servers. In the following example, server-role is set to SWOBJ-ROLE:

    Add an entry to the resources item in the control-plane:

    control-planes:
      - name: control-plane-1
        control-plane-prefix: cp1
        region-name: region1
        . . .
        resources:
        . . .
          - name: swobj
            resource-prefix: swobj
            server-role: SWOBJ-ROLE
            allocation-policy: strict
            min-count: 0
            service-components:
              - ntp-client
              - swift-object

Service Components

A swift object server requires the following service components:

  • ntp-client

  • swift-object

  • swift-client is optional; it installs the python-swiftclient package on the server.

Resource nodes do not have a member-count attribute, so the number of servers allocated with the SWOBJ-ROLE role is simply the number of servers in the data/servers.yml file that have a server role of SWOBJ-ROLE.
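As an illustration, if the servers list contained the following three entries (a sketch with hypothetical IDs; other required attributes such as ip-addr, server-group, and nic-mapping are elided), three object servers would be allocated to the swobj resource node:

servers:
  - id: swobj1
    role: SWOBJ-ROLE
    # ip-addr, server-group, nic-mapping, and other attributes elided
  - id: swobj2
    role: SWOBJ-ROLE
  - id: swobj3
    role: SWOBJ-ROLE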

11.9 Understanding Swift Network and Service Requirements

This topic describes swift’s requirements for which service components must exist in the input model and how these relate to the network model. This information is useful if you are creating a cluster or resource node, or when defining the networks used by swift. The network model allows many options and configurations. For smooth swift operation, the following must be true:

  • The following services must have a direct connection to the same network:

    • swift-proxy

    • swift-account

    • swift-container

    • swift-object

    • swift-ring-builder

  • The swift-proxy service must have a direct connection to the same network as the cluster-ip service.

  • The memcached service must be configured on a cluster of the control plane. In small deployments, it is convenient to run it on the same cluster as the horizon service. For larger deployments, with many nodes running the swift-proxy service, it is better to co-locate the swift-proxy and memcached services (see the sketch after this list). The swift-proxy and swift-container services must have a direct connection to the same network as the memcached service.

  • The swift-proxy and swift-ring-builder services must be co-located in the same cluster of the control plane.

  • The ntp-client service must be present on all swift nodes.
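The following sketch shows one way the memcached service might be co-located with the swift-proxy service, by adding it to the service-components list of the swpac cluster from Section 11.7, “Creating a Swift Proxy, Account, and Container (PAC) Cluster”. Whether you do this, or leave memcached on the cluster running horizon, depends on the size of your deployment:

service-components:
  - ntp-client
  - memcached
  - swift-ring-builder
  - swift-proxy
  - swift-account
  - swift-container
  - swift-client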

11.10 Understanding Swift Ring Specifications

In swift, the ring is responsible for mapping data on particular disks. There is a separate ring for account databases, container databases, and each object storage policy, but each ring works similarly. The swift-ring-builder utility is used to build and manage rings. This utility uses a builder file to contain ring information and additional data required to build future rings. In SUSE OpenStack Cloud 9, you will use the cloud model to specify how the rings are configured and used. This model is used to automatically invoke the swift-ring-builder utility as part of the deploy process. (Normally, you will not run the swift-ring-builder utility directly.)

The rings are specified in the input model using the configuration-data key. The configuration-data object referenced in the control-planes definition is given a name that you then use in the swift_config.yml file. If you have several control planes hosting swift services, the ring specifications can use a shared configuration-data object; however, it is considered best practice to give each swift instance its own configuration-data object.
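For example, a control plane that uses the ring specification named SWIFT-CONFIG-CP1 (defined in swift_config.yml as shown in the next section) would reference it from its configuration-data list, along the lines of this sketch:

control-planes:
  - name: control-plane-1
    control-plane-prefix: cp1
    configuration-data:
      - SWIFT-CONFIG-CP1
    . . .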

11.10.1 Ring Specifications in the Input Model

In most models, the ring-specification is mentioned in the ~/openstack/my_cloud/definition/data/swift/swift_config.yml file. For example:

configuration-data:
  - name: SWIFT-CONFIG-CP1
    services:
      - swift
    data:
      control_plane_rings:
        swift-zones:
          - id: 1
            server-groups:
              - AZ1
          - id: 2
            server-groups:
              - AZ2
          - id: 3
            server-groups:
              - AZ3
        rings:
          - name: account
            display-name: Account Ring
            min-part-hours: 16
            partition-power: 12
            replication-policy:
              replica-count: 3

          - name: container
            display-name: Container Ring
            min-part-hours: 16
            partition-power: 12
            replication-policy:
              replica-count: 3

          - name: object-0
            display-name: General
            default: yes
            min-part-hours: 16
            partition-power: 12
            replication-policy:
              replica-count: 3

The sample file above specifies the rings using the configuration-data object SWIFT-CONFIG-CP1, which defines three rings as follows:

  • Account ring: You must always specify a ring called account. The account ring is used by swift to store metadata about the projects in your system. In swift, a keystone project maps to a swift account. The display-name is informational and not used.

  • Container ring: You must always specify a ring called container. The display-name is informational and not used.

  • Object ring: This ring is also known as a storage policy. You must always specify a ring called object-0. It is possible to have multiple object rings, which are known as storage policies. The display-name is the name of the storage policy and can be used by users of the swift system when they create containers. It allows them to specify the storage policy that the container uses. In the example, the storage policy is called General. You can also define aliases for the storage policy name, for example GeneralPolicy and AnotherAliasForGeneral; you could then use General, GeneralPolicy, or AnotherAliasForGeneral to refer to this storage policy (see the sketch after this list). The aliases item is optional. The display-name is required.

  • The min-part-hours, partition-power, replication-policy, and replica-count attributes are described in the following section.
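The following sketch shows how the optional aliases item described above might be added to the object-0 ring specification; the placement shown here is an assumption based on that description:

- name: object-0
  display-name: General
  default: yes
  aliases:
    - GeneralPolicy
    - AnotherAliasForGeneral
  min-part-hours: 16
  partition-power: 12
  replication-policy:
    replica-count: 3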

11.10.2 Replication Ring Parameters

The ring parameters for traditional replication rings are defined as follows:

replica-count
    Defines the number of copies of each object that are created.
    Use this to control the degree of resiliency or availability. The replica-count is normally set to 3 (meaning that swift keeps three copies of accounts, containers, and objects). As a best practice, do not set the value below 3. To achieve higher resiliency, increase the value.

min-part-hours
    Sets the minimum time before a given partition can be moved again. This is the number of hours that the swift-ring-builder tool enforces between ring rebuilds. On a small system, this can be as low as 1 (one hour). The value can be different for each ring.
    In the example above, the swift-ring-builder enforces a minimum of 16 hours between ring rebuilds. However, this time is system-dependent, so you will be unable to determine the appropriate value for min-part-hours until you have more experience with your system.
    A value of 0 (zero) is not allowed.
    In prior releases, this parameter was called min-part-time. The older name is still supported; however, do not specify both min-part-hours and min-part-time in the same files.

partition-power
    The optimal value for this parameter is related to the number of disk drives that you allocate to swift storage. As a best practice, use the same drives for both the account and container rings. In this case, the partition-power value should be the same. For more information, see Section 11.10.4, “Selecting a Partition Power”.

replication-policy
    Specifies that a ring uses replicated storage. Duplicate copies of the object are created and stored on different disk drives. All replicas are identical. If one is lost or corrupted, the system automatically copies one of the remaining replicas to restore the missing replica.

default
    In the sample ring-specification above, the default value is set to yes, which means that the storage policy is enabled to store objects. For more information, see Section 11.11, “Designing Storage Policies”.

11.10.3 Erasure Coded Rings

In the cloud model, a ring-specification is mentioned in the ~/openstack/my_cloud/definition/data/swift/swift_config.yml file. A typical erasure coded ring in this file looks like this:

- name: object-1
  display-name: EC_ring
  default: no
  min-part-hours: 16
  partition-power: 12
  erasure-coding-policy:
    ec-type: jerasure_rs_vand
    ec-num-data-fragments: 10
    ec-num-parity-fragments: 4
    ec-object-segment-size: 1048576

The additional parameters are defined as follows:

ec-type
    The particular erasure coding scheme that is being used. The ec-type values supported in SUSE OpenStack Cloud 9 are:

      • jerasure_rs_vand => Vandermonde Reed-Solomon encoding, based on Jerasure

erasure-coding-policy
    Indicates that the object ring is of type "erasure coding".

ec-num-data-fragments
    Indicates the number of data fragments for an object in the ring.

ec-num-parity-fragments
    Indicates the number of parity fragments for an object in the ring.

ec-object-segment-size
    The amount of data that is buffered up before feeding a segment into the encoder/decoder. The default value is 1048576.

When using an erasure coded ring, the number of devices in the ring must be greater than or equal to the total number of fragments of an object. For example, if you define an erasure coded ring with 10 data fragments and 4 parity fragments, there must be at least 14 (10+4) devices added to the ring.

When using erasure codes, for a PUT of an object to be successful, it must store ec_ndata + 1 fragments to achieve quorum. If the number of data fragments (ec_ndata) is 10, then at least 11 fragments must be saved for the object PUT to be successful. The 11 fragments must be saved to different drives. To tolerate a single object server going down, for example in a system with three object servers, each object server must have at least 6 drives assigned to the erasure coded storage policy. With a single object server down, 12 drives then remain available across the other object servers, which allows an object PUT to save 12 fragments, one more than the minimum needed to achieve quorum.

Unlike replication rings, none of the erasure coded parameters may be edited after the ring is initially created. Otherwise there is potential for permanent loss of access to the data.

On the face of it, you might expect that with an erasure coded configuration using a data-to-parity ratio of 10:4, the storage consumed by an object would be 1.4 times the size of the object, just as x3 replication consumes three times the size of the object. However, for erasure coding this 10:4 ratio is not exact. The efficiency (that is, how much storage is needed to store the object) is poor for small objects and improves as the object size grows, although the improvement is not linear. If all of your files are less than 32K in size, erasure coding takes more space to store them than x3 replication.

11.10.4 Selecting a Partition Power

When storing an object, the object storage system hashes the name. This hash results in a hit on a partition (so a number of different object names result in the same partition number). Generally, the partition is mapped to available disk drives. With a replica count of 3, each partition is mapped to three different disk drives. The hashing algorithm used hashes over a fixed number of partitions. The partition-power attribute determines the number of partitions you have.

Partition power is used to distribute the data uniformly across drives in the swift nodes. It also determines the storage cluster capacity. You must set the partition power value based on the total amount of storage you expect your entire ring to use.

You should select a partition power for a given ring that is appropriate to the number of disk drives you allocate to the ring for the following reasons:

  • If you use a high partition power and have a few disk drives, each disk drive will have thousands of partitions. With too many partitions, audit and other processes in the Object Storage system cannot walk the partitions in a reasonable time and updates will not occur in a timely manner.

  • If you use a low partition power and have many disk drives, you will have tens (or maybe only one) partition on a drive. The Object Storage system does not use size when hashing to a partition - it hashes the name.

    With many partitions on a drive, a large partition is cancelled out by a smaller partition so the overall drive usage is similar. However, with very small numbers of partitions, the uneven distribution of sizes can be reflected in uneven disk drive usage (so one drive becomes full while a neighboring drive is empty).

An ideal number of partitions per drive is 100. If you know the number of drives, select a partition power that will give you approximately 100 partitions per drive. Usually, you install a system with a specific number of drives and add drives as needed. However, you cannot change the value of the partition power. Hence you must select a value that is a compromise between current and planned capacity.
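As a rough worked example (one way to estimate, following the description above, is to divide the total number of partition replicas by the number of drives):

    partitions per drive ≈ (replica-count × 2^partition-power) / total number of drives

With 60 drives and a replica count of 3, a partition power of 11 gives (3 × 2048) / 60 ≈ 102 partitions per drive, which is close to the ideal of 100. These numbers are illustrative only; see the table below for the values recommended for your system size.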

Important

If you are installing a small capacity system and you need to grow to a very large capacity but you cannot fit within any of the ranges in the table, please seek help from Sales Engineering to plan your system.

There are additional factors that can help mitigate the fixed nature of the partition power:

  • Account and container storage represents a small fraction (typically 1 percent) of your object storage needs. Hence, you can select a smaller partition power (relative to object ring partition power) for the account and container rings.

  • For object storage, you can add additional storage policies (that is, another object ring). When you have reached capacity in an existing storage policy, you can add a new storage policy with a higher partition power (because you now have more disk drives in your system). This means that you can install your system using a small partition power appropriate to a small number of initial disk drives. Later, when you have many disk drives, the new storage policy can have a higher value appropriate to the larger number of drives.

However, when you continue to add storage capacity, existing containers will continue to use their original storage policy. Hence, the additional objects must be added to new containers to take advantage of the new storage policy.

Use the following table to select an appropriate partition power for each ring. The partition power of a ring cannot be changed, so it is important to select an appropriate value. This table is based on a replica count of 3. If your replica count is different, or you are unable to find your system in the table, use the guidance earlier in this section to select a partition power.

The table assumes that when you first deploy swift, you have a small number of drives (the minimum column in the table), and later you add drives.

Note
  • Use the total number of drives. For example, if you have three servers, each with two drives, the total number of drives is six.

  • The lookup should be done separately for each of the account, container, and object rings. Because accounts and containers represent approximately 1 to 2 percent of object storage, you will probably use fewer drives for the account and container rings (that is, you will have fewer proxy, account, and container (PAC) servers), so your object rings may have a higher partition power.

  • The largest anticipated number of drives imposes a limit in the minimum drives you can have. (For more information, see Section 11.10.4, “Selecting a Partition Power”.) This means that, if you anticipate significant growth, your initial system can be small, but under a certain limit. For example, if you determine that the maximum number of drives the system will grow to is 40,000, then use a partition power of 17 as listed in the table below. In addition, a minimum of 36 drives is required to build the smallest system with this partition power.

  • The table assumes that disk drives are the same size. The actual size of a drive is not significant.

11.11 Designing Storage Policies

Storage policies enable you to differentiate the way objects are stored.

Reasons to use storage policies include the following:

  • Different types or classes of disk drive

    You can use different drives to store various types of data. For example, you can use 7.5K RPM high-capacity drives for one type of data and fast SSD drives for another type of data.

  • Different redundancy or availability needs

    You can define the redundancy and availability based on your requirement. You can use a replica count of 3 for "normal" data and a replica count of 4 for "critical" data.

  • Growing of cluster capacity

    The storage cluster capacity may grow beyond the recommended partition power, as described in Section 11.10, “Understanding Swift Ring Specifications”.

  • Erasure-coded storage and replicated storage

    You may want to use erasure-coded storage for some objects and replicated storage for other objects.

Storage policies are implemented on a per-container basis. If you want a non-default storage policy to be used for a new container, you can explicitly specify the storage policy to use when you create the container. You can change which storage policy is the default. However, this does not affect existing containers. Once the storage policy of a container is set, the policy for that container cannot be changed.

The disk drives used by storage policies can overlap or be distinct. If the storage policies overlap (that is, two storage policies have disks in common), it is recommended to use the same set of disk drives for both policies. If there is only a partial overlap in disk drives, then because one storage policy may receive more objects than the other, the drives that are common to both policies have to store more objects than the drives allocated to only one storage policy. This can be appropriate when the overlapped disk drives are larger than the non-overlapped drives.
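As an illustration, the following disk model sketch (with hypothetical device group names and device paths) shows a partial overlap: /dev/sdb and /dev/sdc are used by both the object-0 and object-1 rings, while /dev/sdd and /dev/sde are used only by object-1:

device-groups:
- name: sharedDrives
  devices:
  - name: /dev/sdb
  - name: /dev/sdc
  consumer:
      name: swift
      attrs:
          rings:
          - name: object-0
          - name: object-1
- name: object1OnlyDrives
  devices:
  - name: /dev/sdd
  - name: /dev/sde
  consumer:
      name: swift
      attrs:
          rings:
          - name: object-1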

11.11.1 Specifying Storage Policies

There are two places where storage policies are specified in the input model:

  • The attributes of the storage policy are specified in a ring-specification in the data/swift/swift_config.yml file.

  • When associating disk drives with specific rings in a disk model. This specifies which drives and nodes use the storage policy; in other words, where data associated with a storage policy is stored.

A storage policy is specified similar to other rings. However, the following features are unique to storage policies:

  • Storage policies are applicable to object rings only. The account or container rings cannot have storage policies.

  • There is a format for the ring name: object-index, where index is a number in the range 0 to 9 (in this release). For example: object-0.

  • The object-0 ring must always be specified.

  • Once a storage policy is deployed, it should never be deleted. You can remove all disk drives for the storage policy, however the ring specification itself cannot be deleted.

  • You can use the display-name attribute when creating a container to indicate which storage policy you want to use for that container.

  • One of the storage policies can be the default policy. If you do not specify a storage policy when creating a container, objects created in that container use the default storage policy.

  • If you change the default, only containers created later will have that changed default policy.

The following example shows three storage policies in use. Note that the third storage policy example is an erasure coded ring.

rings:
. . .
- name: object-0
  display-name: General
  default: no
  min-part-hours: 16
  partition-power: 12
  replication-policy:
      replica-count: 3
- name: object-1
  display-name: Data
  default: yes
  min-part-hours: 16
  partition-power: 20
  replication-policy:
      replica-count: 3
- name: object-2
  display-name: Archive
  default: no
  min-part-hours: 16
  partition-power: 20
  erasure-coding-policy:
    ec-type: jerasure_rs_vand
    ec-num-data-fragments: 10
    ec-num-parity-fragments: 4
    ec-object-segment-size: 1048576

11.12 Designing Swift Zones

The concept of swift zones allows you to control the placement of replicas on different groups of servers. When constructing rings and allocating replicas to specific disk drives, swift will, where possible, allocate replicas using the following hierarchy so that the greatest amount of resiliency is achieved by avoiding single points of failure:

  • swift will place each replica on a different disk drive within the same server.

  • swift will place each replica on a different server.

  • swift will place each replica in a different swift zone.

If you have three servers and a replica count of three, it is easy for swift to place each replica on a different server. If you only have two servers though, swift will place two replicas on one server (different drives on the server) and one copy on the other server.

With only three servers there is no need to use the swift zone concept. However, if you have more servers than your replica count, the swift zone concept can be used to control the degree of resiliency. The following table shows how data is placed and explains what happens under various failure scenarios. In all cases, a replica count of three is assumed and that there are a total of six servers.

  • One swift zone (all servers in the same zone)

    Replica placement: Replicas are placed on different servers. For any given object, you have no control over which servers the replicas are placed on.

    Failure scenarios:

      • One server fails: you are guaranteed that there are two other replicas.

      • Two servers fail: you are guaranteed that there is one remaining replica.

      • Three servers fail: 1/3 of the objects cannot be accessed; 2/3 of the objects have three replicas.

  • Two swift zones (three servers in each swift zone)

    Replica placement: Half the objects have two replicas in swift zone 1 and one replica in swift zone 2. The other objects are reversed, with one replica in swift zone 1 and two replicas in swift zone 2.

    Failure scenarios:

      • One swift zone fails: you are guaranteed to have at least one replica. Half the objects have two remaining replicas and the other half have a single replica.

  • Three swift zones (two servers in each swift zone)

    Replica placement: Each swift zone contains a replica. For any given object, there is a replica in each swift zone.

    Failure scenarios:

      • One swift zone fails: you are guaranteed to have two replicas of every object.

      • Two swift zones fail: you are guaranteed to have one replica of every object.

The following sections show examples of how to specify the swift zones in your input model.

11.12.1 Using Server Groups to Specify swift Zones

swift zones are specified in the ring specifications using the server group concept. To define a swift zone, you specify:

  • An id - this is the swift zone number

  • A list of associated server groups

Server groups are defined in your input model. The example input models typically define a number of server groups. You can use these pre-defined server groups or create your own.

For example, the following three models use the example server groups CLOUD, AZ1, AZ2 and AZ3. Each of these examples achieves the same effect – creating a single swift zone.

ring-specifications:
  - region: region1
    swift-zones:
      - id: 1
        server-groups:
          - CLOUD
    rings:
    …

ring-specifications:
  - region: region1
    swift-zones:
      - id: 1
        server-groups:
          - AZ1
          - AZ2
          - AZ3
    rings:
    …

server-groups:
  - name: ZONE_ONE
    server-groups:
      - AZ1
      - AZ2
      - AZ3

ring-specifications:
  - region: region1
    swift-zones:
      - id: 1
        server-groups:
          - ZONE_ONE
    rings:
    …

Alternatively, if you omit the swift-zones specification, a single swift zone is used by default for all servers.

In the following example, three swift zones are specified and mapped to the same availability zones that nova uses (assuming you are using one of the example input models):

ring-specifications:
  - region: region1
    swift-zones:
      - id: 1
        server-groups:
          - AZ1
      - id: 2
        server-groups:
          - AZ2
      - id: 3
        server-groups:
          - AZ3

The following example shows a datacenter with four availability zones mapped to two swift zones. This type of setup may be used if you have two buildings, each with a duplicated network infrastructure:

ring-specifications:
  - region: region1
    swift-zones:
      - id: 1
        server-groups:
          - AZ1
          - AZ2
      - id: 2
        server-groups:
          - AZ3
          - AZ4

11.12.2 Specifying Swift Zones at Ring Level

Usually, you would use the same swift zone layout for all rings in your system. However, it is possible to specify a different layout for a given ring. The following example shows that the account, container and object-0 rings have two zones, but the object-1 ring has a single zone.

ring-specifications:
  - region: region1
    swift-zones:
      - id: 1
        server-groups:
          - AZ1
      - id: 2
        server-groups:
          - AZ2
    rings:
      - name: account
        …
      - name: container
        …
      - name: object-0
        …
      - name: object-1
        swift-zones:
          - id: 1
            server-groups:
              - CLOUD
        …

11.13 Customizing Swift Service Configuration Files

SUSE OpenStack Cloud 9 enables you to modify various swift service configuration files. The following swift service configuration files are located on the Cloud Lifecycle Manager in the ~/openstack/my_cloud/config/swift/ directory:

  • account-server.conf.j2

  • container-reconciler.conf.j2

  • container-server.conf.j2

  • container-sync-realms.conf.j2

  • object-expirer.conf.j2

  • object-server.conf.j2

  • proxy-server.conf.j2

  • rsyncd.conf.j2

  • swift.conf.j2

  • swift-recon.j2

There are many configuration options that can be set or changed, including the container rate limit and the logging level:

11.13.1 Configuring Swift Container Rate Limit

The swift container rate limit allows you to limit the number of PUT and DELETE requests on objects based on the number of objects in a container. Setting container_ratelimit_x = r means that for containers of size x (number of objects), PUT and DELETE requests are limited to r per second.

To enable container rate limiting:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the DEFAULT section of ~/openstack/my_cloud/config/swift/proxy-server.conf.j2:

    container_ratelimit_0 = 100
    container_ratelimit_1000000 = 100
    container_ratelimit_5000000 = 50

    This sets the PUT and DELETE object rate limit to 100 requests per second for containers with up to 1,000,000 objects. For containers with between 1,000,000 and 5,000,000 objects, the PUT and DELETE rate varies linearly from 100 down to 50 requests per second as the container object count increases; for example, a container with 3,000,000 objects would be limited to approximately 75 requests per second.

  3. Commit your changes to git:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git commit -m "COMMIT_MESSAGE" \
    ~/openstack/my_cloud/config/swift/proxy-server.conf.j2
  4. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  5. Create a deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  6. Run the swift-reconfigure.yml playbook to reconfigure the swift servers:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml

11.13.2 Configuring swift Account Server Logging Level

By default the swift logging level is set to INFO. As a best practice, do not set the log level to DEBUG for a long period of time. Use it for troubleshooting issues and then change it back to INFO.

Perform the following steps to set the logging level of the account-server to DEBUG:

  1. Log in to the Cloud Lifecycle Manager.

  2. Edit the DEFAULT section of ~/openstack/my_cloud/config/swift/account-server.conf.j2:

    [DEFAULT]
    . . .
    log_level = DEBUG
  3. Commit your changes to git:

    ardana > cd ~/openstack/ardana/ansible
    ardana > git commit -m "COMMIT_MESSAGE" \
    ~/openstack/my_cloud/config/swift/account-server.conf.j2
  4. Run the configuration processor:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost config-processor-run.yml
  5. Create a deployment directory:

    ardana > cd ~/openstack/ardana/ansible
    ardana > ansible-playbook -i hosts/localhost ready-deployment.yml
  6. Run the swift-reconfigure.yml playbook to reconfigure the swift servers:

    ardana > cd ~/scratch/ansible/next/ardana/ansible
    ardana > ansible-playbook -i hosts/verb_hosts swift-reconfigure.yml

11.13.3 For More Information

For more information, see: