21 Ceph Object Gateway #
This chapter introduces details about administration tasks related to Object Gateway, such as checking status of the service, managing accounts, multisite gateways, or LDAP authentication.
21.1 Object Gateway restrictions and naming limitations #
Following is a list of important Object Gateway limits:
21.1.1 Bucket limitations #
When approaching Object Gateway via the S3 API, bucket names are limited to DNS-compliant names with a dash character '-' allowed. When approaching Object Gateway via the Swift API, you may use any combination of UTF-8 supported characters except for a slash character '/'. The maximum length of a bucket name is 255 characters. Bucket names must be unique.
Although you may use any UTF-8 based bucket name via the Swift API, it is recommended to name buckets with regard to the S3 naming limitations to avoid problems accessing the same bucket via the S3 API.
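The naming rules above can be checked before a bucket is created. The following is a minimal sketch, not part of Object Gateway: the function names are hypothetical, and the reading of "DNS-compliant" (lowercase letters, digits, dashes, and dots, with each label starting and ending alphanumerically) is an assumption that may be stricter or looser than what the gateway enforces.

```python
import re

# Assumed reading of "DNS-compliant": every dot-separated label consists of
# lowercase letters, digits and dashes, and starts and ends alphanumerically.
_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")
MAX_BUCKET_NAME_LENGTH = 255  # limit stated in this section


def is_swift_safe(name: str) -> bool:
    """Swift API rule from the text: any characters except a slash."""
    return 0 < len(name) <= MAX_BUCKET_NAME_LENGTH and "/" not in name


def is_s3_safe(name: str) -> bool:
    """Stricter check so the bucket stays reachable through the S3 API."""
    if not is_swift_safe(name):
        return False
    return all(_LABEL.match(label) for label in name.split("."))
```

A name such as my-new-bucket passes both checks, while My_Bucket passes only the Swift check.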
21.1.2 Stored object limitations #
- Maximum number of objects per user
No restriction by default (limited by ~ 2^63).
- Maximum number of objects per bucket
No restriction by default (limited by ~ 2^63).
- Maximum size of an object to upload/store
Single uploads are restricted to 5 GB. Use multipart for larger object sizes. The maximum number of multipart chunks is 10000.
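As an illustration of how these two limits interact, the following sketch decides whether an object needs a multipart upload and whether a chosen part size stays within the chunk limit. The helper name plan_upload is hypothetical, and reading "5 GB" as 5 GiB is an assumption.

```python
GIB = 1024 ** 3
SINGLE_UPLOAD_LIMIT = 5 * GIB  # "5 GB" from the text, read as GiB (assumption)
MAX_PARTS = 10000              # maximum number of multipart chunks


def plan_upload(object_size: int, part_size: int) -> dict:
    """Describe how an object of object_size bytes would have to be uploaded."""
    if object_size <= SINGLE_UPLOAD_LIMIT:
        return {"multipart": False, "parts": 1}
    # Integer ceiling division avoids floating point rounding on large sizes.
    parts = (object_size + part_size - 1) // part_size
    if parts > MAX_PARTS:
        raise ValueError(
            f"{parts} parts exceed the limit of {MAX_PARTS}; use larger parts"
        )
    return {"multipart": True, "parts": parts}
```

For example, a 6 GiB object split into 64 MiB parts needs 96 parts, whereas a 10 GiB object split into 1 MiB parts would exceed the chunk limit.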
21.1.3 HTTP header limitations #
HTTP header and request limitations depend on the Web front-end used. The default front-end, Beast, restricts the size of the HTTP header to 16 kB.
21.2 Deploying the Object Gateway #
The Ceph Object Gateway deployment follows the same procedure as the deployment of other Ceph services—by means of cephadm. For more details, refer to Section 8.2, “Service and placement specification”, specifically to Section 8.3.4, “Deploying Object Gateways”.
21.3 Operating the Object Gateway service #
You can operate the Object Gateways in the same way as other Ceph services: first identify the service name with the ceph orch ps command, and then run the command for operating services, for example:
ceph orch daemon restart OGW_SERVICE_NAME
Refer to Chapter 14, Operation of Ceph services for complete information about operating Ceph services.
21.4 Configuration options #
Refer to Section 28.5, “Ceph Object Gateway” for a list of Object Gateway configuration options.
21.5 Managing Object Gateway access #
You can communicate with Object Gateway using either an S3-compatible or a Swift-compatible interface. The S3 interface is compatible with a large subset of the Amazon S3 RESTful API. The Swift interface is compatible with a large subset of the OpenStack Swift API.
Both interfaces require you to create a specific user and to install the relevant client software, which communicates with the gateway using the user's secret key.
21.5.1 Accessing Object Gateway #
21.5.1.1 S3 interface access #
To access the S3 interface, you need a REST client. S3cmd is a command line S3 client. You can find it in the openSUSE Build Service. The repository contains versions for both SUSE Linux Enterprise and openSUSE based distributions.
If you want to test your access to the S3 interface, you can also write a small Python script. The script connects to Object Gateway, creates a new bucket, and lists all buckets. The values for aws_access_key_id and aws_secret_access_key are taken from the values of access_key and secret_key returned by the radosgw-admin command from Section 21.5.2.1, “Adding S3 and Swift users”.
Install the python-boto package:
#
zypper in python-boto
Create a new Python script called s3test.py with the following content:
import boto
import boto.s3.connection

# Keys returned by radosgw-admin for the S3 user
access_key = '11BS02LGFB6AL6H1ADMW'
secret_key = 'vzCEkuryfn060dfee4fgQPqFrncKEIkh3ZcdOANY'

conn = boto.connect_s3(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    host='HOSTNAME',
    is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

# Create a bucket, then list all buckets with their creation dates
bucket = conn.create_bucket('my-new-bucket')
for bucket in conn.get_all_buckets():
    print("{name}\t{created}".format(
        name=bucket.name,
        created=bucket.creation_date,
    ))
Replace HOSTNAME with the host name of the host where you configured the Object Gateway service, for example gateway_host.
Run the script:
python s3test.py
The script outputs something like the following:
my-new-bucket 2015-07-22T15:37:42.000Z
21.5.1.2 Swift interface access #
To access Object Gateway via the Swift interface, you need the swift command line client. Its manual page man 1 swift tells you more about its command line options.
The package is included in the 'Public Cloud' module for SUSE Linux Enterprise 12 from SP3 and SUSE Linux Enterprise 15. Before installing the package, you need to activate the module and refresh the software repository:
#
SUSEConnect -p sle-module-public-cloud/12/SYSTEM-ARCH
#
zypper refresh
Or
#
SUSEConnect -p sle-module-public-cloud/15/SYSTEM-ARCH
#
zypper refresh
To install the swift command, run the following:
#
zypper in python-swiftclient
The swift access uses the following syntax:
>
swift -A http://IP_ADDRESS/auth/1.0 \
-U example_user:swift -K 'SWIFT_SECRET_KEY' list
Replace IP_ADDRESS with the IP address of the gateway server, and SWIFT_SECRET_KEY with its value from the output of the radosgw-admin key create command executed for the swift user in Section 21.5.2.1, “Adding S3 and Swift users”.
For example:
>
swift -A http://gateway.example.com/auth/1.0 -U example_user:swift \
-K 'r5wWIxjOCeEO7DixD1FjTLmNYIViaC6JVhi3013h' list
The output is:
my-new-bucket
21.5.2 Manage S3 and Swift accounts #
21.5.2.1 Adding S3 and Swift users #
You need to create a user, access key and secret to enable end users to interact with the gateway. There are two types of users: a user and a subuser. While users are used when interacting with the S3 interface, subusers are users of the Swift interface. Each subuser is associated with a user.
To create a Swift user, follow these steps:
To create a Swift user, which is a subuser in our terminology, you need to create the associated user first.
cephuser@adm >
radosgw-admin user create --uid=USERNAME \
--display-name="DISPLAY-NAME" --email=EMAIL
For example:
cephuser@adm >
radosgw-admin user create \
--uid=example_user \
--display-name="Example User" \
--email=penguin@example.com
To create a subuser (Swift interface) for the user, you must specify the user ID (--uid=USERNAME), a subuser ID, and the access level for the subuser.
cephuser@adm >
radosgw-admin subuser create --uid=UID \
--subuser=UID \
--access=[ read | write | readwrite | full ]
For example:
cephuser@adm >
radosgw-admin subuser create --uid=example_user \
--subuser=example_user:swift --access=full
Generate a secret key for the user.
cephuser@adm >
radosgw-admin key create \
--gen-secret \
--subuser=example_user:swift \
--key-type=swift
Both commands will output JSON-formatted data showing the user state. Notice the following lines, and remember the secret_key value:
"swift_keys": [ { "user": "example_user:swift", "secret_key": "r5wWIxjOCeEO7DixD1FjTLmNYIViaC6JVhi3013h"}],
When accessing Object Gateway through the S3 interface you need to create an S3 user by running:
cephuser@adm >
radosgw-admin user create --uid=USERNAME \
--display-name="DISPLAY-NAME" --email=EMAIL
For example:
cephuser@adm >
radosgw-admin user create \
--uid=example_user \
--display-name="Example User" \
--email=penguin@example.com
The command also creates the user's access and secret key. Check its output for the access_key and secret_key keywords and their values:
[...] "keys": [ { "user": "example_user", "access_key": "11BS02LGFB6AL6H1ADMW", "secret_key": "vzCEkuryfn060dfee4fgQPqFrncKEIkh3ZcdOANY"}], [...]
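If you capture the JSON output of these commands in a script, the keys can be extracted programmatically. The following sketch assumes only the "keys" and "swift_keys" arrays shown in this section; the sample document is trimmed, and the function name extract_credentials is hypothetical.

```python
import json

# Trimmed sample modelled on the command output shown in this section.
SAMPLE_OUTPUT = """
{
  "keys": [
    {"user": "example_user",
     "access_key": "11BS02LGFB6AL6H1ADMW",
     "secret_key": "vzCEkuryfn060dfee4fgQPqFrncKEIkh3ZcdOANY"}
  ],
  "swift_keys": [
    {"user": "example_user:swift",
     "secret_key": "r5wWIxjOCeEO7DixD1FjTLmNYIViaC6JVhi3013h"}
  ]
}
"""


def extract_credentials(raw: str) -> dict:
    """Map each S3 user and Swift subuser to its keys."""
    info = json.loads(raw)
    s3 = {
        entry["user"]: (entry["access_key"], entry["secret_key"])
        for entry in info.get("keys", [])
    }
    swift = {
        entry["user"]: entry["secret_key"]
        for entry in info.get("swift_keys", [])
    }
    return {"s3": s3, "swift": swift}


creds = extract_credentials(SAMPLE_OUTPUT)
```

Here creds["s3"]["example_user"] holds the access and secret key pair for the S3 interface, and creds["swift"]["example_user:swift"] holds the Swift secret.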
21.5.2.2 Removing S3 and Swift users #
The procedure for deleting users is similar for S3 and Swift users. In the case of Swift users, however, you may need to delete the user including its subusers.
To remove an S3 or Swift user (including all its subusers), specify user rm and the user ID in the following command:
cephuser@adm >
radosgw-admin user rm --uid=example_user
To remove a subuser, specify subuser rm and the subuser ID.
cephuser@adm >
radosgw-admin subuser rm --uid=example_user:swift
You can make use of the following options:
- --purge-data
Purges all data associated to the user ID.
- --purge-keys
Purges all keys associated to the user ID.
When you remove a subuser, you are removing access to the Swift interface. The user will remain in the system.
21.5.2.3 Changing S3 and Swift user access and secret keys #
The access_key and secret_key parameters identify the Object Gateway user when accessing the gateway. Changing the existing user keys is the same as creating new ones, as the old keys get overwritten.
For S3 users, run the following:
cephuser@adm >
radosgw-admin key create --uid=EXAMPLE_USER --key-type=s3 --gen-access-key --gen-secret
For Swift users, run the following:
cephuser@adm >
radosgw-admin key create --subuser=EXAMPLE_USER:swift --key-type=swift --gen-secret
- --key-type=TYPE
Specifies the type of key. Either swift or s3.
- --gen-access-key
Generates a random access key (for S3 user by default).
- --gen-secret
Generates a random secret key.
- --secret=KEY
Specifies a secret key, for example manually generated.
21.5.2.4 Enabling user quota management #
The Ceph Object Gateway enables you to set quotas on users and buckets owned by users. Quotas include the maximum number of objects in a bucket and the maximum storage size.
Before you enable a user quota, you first need to set its parameters:
cephuser@adm >
radosgw-admin quota set --quota-scope=user --uid=EXAMPLE_USER \
--max-objects=1024 --max-size=1024
- --max-objects
Specifies the maximum number of objects. A negative value disables the check.
- --max-size
Specifies the maximum number of bytes. A negative value disables the check.
- --quota-scope
Sets the scope for the quota. The options are bucket and user. Bucket quotas apply to buckets a user owns. User quotas apply to a user.
Once you set a user quota, you may enable it:
cephuser@adm >
radosgw-admin quota enable --quota-scope=user --uid=EXAMPLE_USER
To disable a quota:
cephuser@adm >
radosgw-admin quota disable --quota-scope=user --uid=EXAMPLE_USER
To list quota settings:
cephuser@adm >
radosgw-admin user info --uid=EXAMPLE_USER
To update quota statistics:
cephuser@adm >
radosgw-admin user stats --uid=EXAMPLE_USER --sync-stats
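When quotas are applied to many users from a script, it can help to assemble the two commands (set, then enable) in one place. The sketch below only builds the argument lists from the options shown above and does not run anything; the helper name quota_commands is hypothetical.

```python
def quota_commands(uid: str, max_objects: int = -1, max_size: int = -1) -> list:
    """Build the argument lists for setting and then enabling a user quota.

    A negative value disables the respective check, as described above.
    """
    set_cmd = [
        "radosgw-admin", "quota", "set",
        "--quota-scope=user", f"--uid={uid}",
        f"--max-objects={max_objects}", f"--max-size={max_size}",
    ]
    enable_cmd = [
        "radosgw-admin", "quota", "enable",
        "--quota-scope=user", f"--uid={uid}",
    ]
    return [set_cmd, enable_cmd]
```

On the admin node, each returned list could be passed to subprocess.run(cmd, check=True) in order.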
21.6 HTTP front-ends #
The Ceph Object Gateway supports two embedded HTTP front-ends: Beast and Civetweb.
The Beast front-end uses the Boost.Beast library for HTTP parsing and the Boost.Asio library for asynchronous network I/O.
The Civetweb front-end uses the Civetweb HTTP library, which is a fork of Mongoose.
You can configure them with the rgw_frontends option. Refer to Section 28.5, “Ceph Object Gateway” for a list of configuration options.
21.7 Enable HTTPS/SSL for Object Gateways #
To enable the Object Gateway to communicate securely using SSL, you need to either have a CA-issued certificate or create a self-signed one.
21.7.1 Creating a self-signed certificate #
Skip this section if you already have a valid certificate signed by a CA.
The following procedure describes how to generate a self-signed SSL certificate on the Salt Master.
If you need your Object Gateway to be known by additional subject identities, add them to the subjectAltName option in the [v3_req] section of the /etc/ssl/openssl.cnf file:
[...]
[ v3_req ]
subjectAltName = DNS:server1.example.com, DNS:server2.example.com
[...]
Tip: IP addresses in subjectAltName
To use IP addresses instead of domain names in the subjectAltName option, replace the example line with the following:
subjectAltName = IP:10.0.0.10, IP:10.0.0.11
Create the key and the certificate using openssl. Enter all data you need to include in your certificate. We recommend entering the FQDN as the common name. Before signing the certificate, verify that 'X509v3 Subject Alternative Name:' is included in the requested extensions, and that the resulting certificate has 'X509v3 Subject Alternative Name:' set.
root@master #
openssl req -x509 -nodes -days 1095 \
-newkey rsa:4096 -keyout rgw.key -out rgw.pem
Append the key to the certificate file:
root@master #
cat rgw.key >> rgw.pem
21.7.2 Configuring Object Gateway with SSL #
To configure Object Gateway to use SSL certificates, use the rgw_frontends option. For example:
cephuser@adm >
ceph config set WHO rgw_frontends \
beast ssl_port=443 ssl_certificate=config://CERT ssl_key=config://KEY
If you do not specify the CERT and KEY configuration keys, then the Object Gateway service will look for the SSL certificate and key under the following configuration keys:
rgw/cert/RGW_REALM/RGW_ZONE.key
rgw/cert/RGW_REALM/RGW_ZONE.crt
If you want to override the default SSL key and certificate location, import them to the configuration database by using the following command:
ceph config-key set CUSTOM_CONFIG_KEY -i PATH_TO_CERT_FILE
Then use your custom configuration keys using the
config://
directive.
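To show how the pieces of the rgw_frontends value fit together, the following sketch composes the string used in the example above from two configuration keys. The helper name beast_ssl_frontend is hypothetical, and the sketch covers only the options that appear in this section.

```python
def beast_ssl_frontend(cert_config_key: str, key_config_key: str, port: int = 443) -> str:
    """Compose an rgw_frontends value that reads the certificate and key
    from the configuration database through the config:// directive."""
    return (
        f"beast ssl_port={port} "
        f"ssl_certificate=config://{cert_config_key} "
        f"ssl_key=config://{key_config_key}"
    )


value = beast_ssl_frontend(
    "rgw/cert/RGW_REALM/RGW_ZONE.crt",
    "rgw/cert/RGW_REALM/RGW_ZONE.key",
)
```

The resulting string is what you would pass as the value in ceph config set WHO rgw_frontends.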
21.8 Synchronization modules #
Object Gateway can be deployed as a multisite service, in which data and metadata are mirrored between the zones. Synchronization modules are built on top of the multisite framework and allow for forwarding data and metadata to a different external tier. A synchronization module allows for a set of actions to be performed whenever a change in data occurs (for example, metadata operations such as bucket or user creation). As the Object Gateway multisite changes are eventually consistent at remote sites, changes are propagated asynchronously. This covers use cases such as backing up the object storage to an external cloud cluster, a custom backup solution using tape drives, or indexing metadata in ElasticSearch.
21.8.1 Configuring synchronization modules #
All synchronization modules are configured in a similar way. You need to create a new zone (refer to Section 21.13, “Multisite Object Gateways” for more details) and set its --tier-type option, for example --tier-type=cloud for the cloud synchronization module:
cephuser@adm >
radosgw-admin zone create --rgw-zonegroup=ZONE-GROUP-NAME \
--rgw-zone=ZONE-NAME \
--endpoints=http://endpoint1.example.com,http://endpoint2.example.com, [...] \
--tier-type=cloud
You can configure the specific tier by using the following command:
cephuser@adm >
radosgw-admin zone modify --rgw-zonegroup=ZONE-GROUP-NAME \
--rgw-zone=ZONE-NAME \
--tier-config=KEY1=VALUE1,KEY2=VALUE2
The KEY in the configuration specifies the configuration variable that you want to update, and the VALUE specifies its new value. Nested values can be accessed using a period. For example:
cephuser@adm >
radosgw-admin zone modify --rgw-zonegroup=ZONE-GROUP-NAME \
--rgw-zone=ZONE-NAME \
--tier-config=connection.access_key=KEY,connection.secret=SECRET
You can access array entries by appending square brackets '[]' containing the index of the referenced entry. You can add a new array entry by using empty square brackets '[]'. An index value of -1 references the last entry in the array. It is not possible to create a new entry and reference it again in the same command. For example, the following commands create a new profile for buckets starting with PREFIX:
cephuser@adm >
radosgw-admin zone modify --rgw-zonegroup=ZONE-GROUP-NAME \
--rgw-zone=ZONE-NAME \
--tier-config=profiles[].source_bucket=PREFIX'*'
cephuser@adm >
radosgw-admin zone modify --rgw-zonegroup=ZONE-GROUP-NAME \
--rgw-zone=ZONE-NAME \
--tier-config=profiles[-1].connection_id=CONNECTION_ID,profiles[-1].acls_id=ACLS_ID
You can add a new tier configuration entry by using the --tier-config-add=KEY=VALUE parameter.
You can remove an existing entry by using --tier-config-rm=KEY.
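The addressing rules for --tier-config keys (periods for nesting, '[]' to append, '[-1]' for the last entry) can be pictured with a toy model that applies one KEY=VALUE pair to a nested Python structure. This is only an illustration of the semantics described above, not the radosgw-admin implementation; the function name apply_tier_config is hypothetical.

```python
import re

# A path element is a name, optionally followed by [] or [INDEX].
_ELEMENT = re.compile(r"^([^\[\]]+)(?:\[(-?\d*)\])?$")


def apply_tier_config(config: dict, assignment: str) -> None:
    """Apply one KEY=VALUE pair to a nested dict, following the rules above."""
    path, value = assignment.split("=", 1)
    parts = path.split(".")
    node = config
    for position, part in enumerate(parts):
        match = _ELEMENT.match(part)
        if match is None:
            raise ValueError(f"cannot parse {part!r}")
        name, index = match.group(1), match.group(2)
        last = position == len(parts) - 1
        if index is None:              # plain key without brackets
            if last:
                node[name] = value
            else:
                node = node.setdefault(name, {})
            continue
        items = node.setdefault(name, [])
        if index == "":                # "[]" appends a new array entry
            items.append(value if last else {})
            node = items[-1]
        elif last:                     # "[N]" as final element overwrites
            items[int(index)] = value
        else:                          # "[N]" selects an existing entry
            node = items[int(index)]


config = {}
apply_tier_config(config, "connection.access_key=KEY")
apply_tier_config(config, "profiles[].source_bucket=PREFIX*")
apply_tier_config(config, "profiles[-1].connection_id=CONNECTION_ID")
```

After these three calls, config contains a connection object with an access_key and a profiles array with one entry that holds both source_bucket and connection_id, which mirrors the two-command example above.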
21.8.2 Synchronizing zones #
A synchronization module configuration is local to a zone. The synchronization module determines whether the zone exports data or can only consume data that was modified in another zone. As of Luminous, the supported synchronization plug-ins are ElasticSearch, rgw, which is the default synchronization plug-in that synchronizes data between the zones, and log, which is a trivial synchronization plug-in that logs the metadata operations that happen in the remote zones. The following sections use the example of a zone with the ElasticSearch synchronization module. The process is similar for configuring any other synchronization plug-in.
rgw is the default synchronization plug-in and there is no need to configure it explicitly.
21.8.2.1 Requirements and assumptions #
Let us assume that a simple multisite configuration as described in Section 21.13, “Multisite Object Gateways” consists of two zones: us-east and us-west. Now we add a third zone us-east-es, which is a zone that only processes metadata from the other sites. This zone can be in the same or a different Ceph cluster than us-east. This zone only consumes metadata from other zones, and Object Gateways in this zone do not serve any end user requests directly.
21.8.2.2 Configuring zones #
Create the third zone similar to the ones described in Section 21.13, “Multisite Object Gateways”, for example:
cephuser@adm >
radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-east-es \
--access-key=SYSTEM-KEY --secret=SECRET --endpoints=http://rgw-es:80
A synchronization module can be configured for this zone via the following:
cephuser@adm >
radosgw-admin zone modify --rgw-zone=ZONE-NAME --tier-type=TIER-TYPE \
--tier-config={set of key=value pairs}
For example, for the ElasticSearch synchronization module:
cephuser@adm >
radosgw-admin zone modify --rgw-zone=ZONE-NAME --tier-type=elasticsearch \
--tier-config=endpoint=http://localhost:9200,num_shards=10,num_replicas=1
For the various supported tier-config options, refer to Section 21.8.3, “ElasticSearch synchronization module”.
Finally, update the period:
cephuser@adm >
radosgw-admin period update --commit
Now start the Object Gateway in the zone:
cephuser@adm >
ceph orch start rgw.REALM-NAME.ZONE-NAME
21.8.3 ElasticSearch synchronization module #
This synchronization module writes the metadata from other zones to ElasticSearch. As of Luminous, the following JSON shows the data fields that are stored in ElasticSearch:
{
  "_index" : "rgw-gold-ee5863d6",
  "_type" : "object",
  "_id" : "34137443-8592-48d9-8ca7-160255d52ade.34137.1:object1:null",
  "_score" : 1.0,
  "_source" : {
    "bucket" : "testbucket123",
    "name" : "object1",
    "instance" : "null",
    "versioned_epoch" : 0,
    "owner" : {
      "id" : "user1",
      "display_name" : "user1"
    },
    "permissions" : [ "user1" ],
    "meta" : {
      "size" : 712354,
      "mtime" : "2017-05-04T12:54:16.462Z",
      "etag" : "7ac66c0f148de9519b8bd264312c4d64"
    }
  }
}
21.8.3.1 ElasticSearch tier type configuration parameters #
- endpoint
Specifies the ElasticSearch server endpoint to access.
- num_shards
(integer) The number of shards that ElasticSearch will be configured with on data synchronization initialization. Note that this cannot be changed after initialization. Any change here requires rebuild of the ElasticSearch index and reinitialization of the data synchronization process.
- num_replicas
(integer) The number of replicas that ElasticSearch will be configured with on data synchronization initialization.
- explicit_custom_meta
(true | false) Specifies whether all user custom metadata will be indexed, or whether the user will need to configure (at the bucket level) which custom metadata entries should be indexed. This is false by default.
- index_buckets_list
(comma separated list of strings) If empty, all buckets will be indexed. Otherwise, only buckets specified here will be indexed. It is possible to provide bucket prefixes (for example 'foo*'), or bucket suffixes (for example '*bar').
- approved_owners_list
(comma separated list of strings) If empty, buckets of all owners will be indexed (subject to other restrictions), otherwise, only buckets owned by specified owners will be indexed. Suffixes and prefixes can also be provided.
- override_index_path
(string) If not empty, this string will be used as the ElasticSearch index path. Otherwise the index path will be determined and generated on synchronization initialization.
- username
Specifies a user name for ElasticSearch if authentication is required.
- password
Specifies a password for ElasticSearch if authentication is required.
21.8.3.2 Metadata queries #
Since the ElasticSearch cluster now stores object metadata, it is important that the ElasticSearch endpoint is not exposed to the public and is only accessible to the cluster administrators. Exposing metadata queries to end users poses a problem: users should only be able to query their own metadata and not that of other users, which would require the ElasticSearch cluster to authenticate users in the same way as RGW does.
As of Luminous, RGW in the metadata master zone can service end user requests. This avoids exposing the ElasticSearch endpoint in public and also solves the authentication and authorization problem, because RGW itself can authenticate the end user requests. For this purpose, RGW introduces a new query in the bucket APIs that can service ElasticSearch requests. All these requests must be sent to the metadata master zone.
- Get an ElasticSearch Query
GET /BUCKET?query=QUERY-EXPR
request params:
max-keys: max number of entries to return
marker: pagination marker
expression := [(]<arg> <op> <value> [)][<and|or> ...]
op is one of the following: <, <=, ==, >=, >
For example:
GET /?query=name==foo
Will return all the indexed keys that user has read permission to, and are named 'foo'. The output will be a list of keys in XML that is similar to the S3 list buckets response.
- Configure custom metadata fields
Define which custom metadata entries should be indexed (under the specified bucket), and what the types of these keys are. If explicit custom metadata indexing is configured, this is needed so that RGW indexes the specified custom metadata values. Otherwise, it is needed in cases where the indexed metadata keys are of a type other than string.
POST /BUCKET?mdsearch
x-amz-meta-search: <key [; type]> [, ...]
Multiple metadata fields must be comma separated. A type can be forced for a field with a ';'. The currently allowed types are string (default), integer, and date. For example, to index the custom object metadata x-amz-meta-year as an integer, x-amz-meta-release-date as a date, and x-amz-meta-title as a string, send the following request:
POST /mybooks?mdsearch
x-amz-meta-search: x-amz-meta-year;int, x-amz-meta-release-date;date, x-amz-meta-title;string
- Delete custom metadata configuration
Delete custom metadata bucket configuration.
DELETE /BUCKET?mdsearch
- Get custom metadata configuration
Retrieve custom metadata bucket configuration.
GET /BUCKET?mdsearch
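The request formats above can be assembled programmatically. The following sketch builds the URL-encoded request path for a metadata query and the value of the x-amz-meta-search header; both helper names are hypothetical, and the sketch does not sign or send the request, which still has to be authenticated like any other S3 request.

```python
from urllib.parse import urlencode


def metadata_query_path(bucket: str, expression: str, max_keys=None, marker=None) -> str:
    """Build the request path for GET /BUCKET?query=QUERY-EXPR."""
    params = {"query": expression}
    if max_keys is not None:
        params["max-keys"] = max_keys
    if marker is not None:
        params["marker"] = marker
    # urlencode percent-encodes characters such as '=' inside the expression.
    return f"/{bucket}?{urlencode(params)}"


def mdsearch_header(fields: dict) -> str:
    """Build the x-amz-meta-search value from a key -> type mapping.

    A type of None leaves the field at its default type (string).
    """
    return ", ".join(
        key if ftype is None else f"{key};{ftype}"
        for key, ftype in fields.items()
    )


path = metadata_query_path("mybooks", "name==foo", max_keys=10)
header = mdsearch_header({
    "x-amz-meta-year": "int",
    "x-amz-meta-release-date": "date",
    "x-amz-meta-title": "string",
})
```

Here path is /mybooks?query=name%3D%3Dfoo&max-keys=10, and header matches the value used in the POST example above.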
21.8.4 Cloud synchronization module #
This section introduces a module that synchronizes the zone data to a remote cloud service. The synchronization is unidirectional only: the data is not synchronized back from the remote zone. The main goal of this module is to enable synchronizing data to multiple cloud service providers. Currently it supports cloud providers that are compatible with AWS (S3).
To synchronize data to a remote cloud service, you need to configure user credentials. Because many cloud services introduce limits on the number of buckets that each user can create, you can configure how source objects and buckets are mapped to targets, for example to different target buckets and bucket prefixes. Note that source access lists (ACLs) will not be preserved. It is possible to map permissions of specific source users to specific destination users.
Because of API limitations, there is no way to preserve original object modification time and HTTP entity tag (ETag). The cloud synchronization module stores these as metadata attributes on the destination objects.
21.8.4.1 Configuring the cloud synchronization module #
Following are examples of a trivial and non-trivial configuration for the cloud synchronization module. Note that the trivial configuration can collide with the non-trivial one.
{
  "connection": {
    "access_key": ACCESS,
    "secret": SECRET,
    "endpoint": ENDPOINT,
    "host_style": path | virtual,
  },
  "acls": [ { "type": id | email | uri,
              "source_id": SOURCE_ID,
              "dest_id": DEST_ID } ... ],
  "target_path": TARGET_PATH,
}

{
  "default": {
    "connection": {
      "access_key": ACCESS,
      "secret": SECRET,
      "endpoint": ENDPOINT,
      "host_style": path | virtual,
    },
    "acls": [ { "type": id | email | uri,   # optional, default is id
                "source_id": ID,
                "dest_id": ID } ... ],
    "target_path": PATH   # optional
  },
  "connections": [
    { "connection_id": ID,
      "access_key": ACCESS,
      "secret": SECRET,
      "endpoint": ENDPOINT,
      "host_style": path | virtual,   # optional
    } ... ],
  "acl_profiles": [
    { "acls_id": ID,   # acl mappings
      "acls": [ { "type": id | email | uri,
                  "source_id": ID,
                  "dest_id": ID } ... ]
    } ],
  "profiles": [
    { "source_bucket": SOURCE,
      "connection_id": CONNECTION_ID,
      "acls_id": MAPPINGS_ID,
      "target_path": DEST,   # optional
    } ... ],
}
Explanation of used configuration terms follows:
- connection
Represents a connection to the remote cloud service. Contains 'connection_id', 'access_key', 'secret', 'endpoint', and 'host_style'.
- access_key
The remote cloud access key that will be used for the specific connection.
- secret
The secret key for the remote cloud service.
- endpoint
URL of remote cloud service endpoint.
- host_style
Type of host style ('path' or 'virtual') to be used when accessing remote cloud endpoint. Default is 'path'.
- acls
Array of access list mappings.
- acl_mapping
Each 'acl_mapping' structure contains 'type', 'source_id', and 'dest_id'. These will define the ACL mutation for each object. An ACL mutation allows converting source user ID to a destination ID.
- type
ACL type: 'id' defines user ID, 'email' defines user by e-mail, and 'uri' defines user by uri (group).
- source_id
ID of user in the source zone.
- dest_id
ID of user in the destination.
- target_path
A string that defines how the target path is created. The target path specifies a prefix to which the source object name is appended. The target path configurable can include any of the following variables:
- SID
A unique string that represents the synchronization instance ID.
- ZONEGROUP
Zonegroup name.
- ZONEGROUP_ID
Zonegroup ID.
- ZONE
Zone name.
- ZONE_ID
Zone ID.
- BUCKET
Source bucket name.
- OWNER
Source bucket owner ID.
For example: target_path = rgwx-ZONE-SID/OWNER/BUCKET
- acl_profiles
An array of access list profiles.
- acl_profile
Each profile contains 'acls_id' that represents the profile, and an 'acls' array that holds a list of 'acl_mappings'.
- profiles
A list of profiles. Each profile contains the following:
- source_bucket
Either a bucket name, or a bucket prefix (if ends with *) that defines the source bucket(s) for this profile.
- target_path
See above for the explanation.
- connection_id
ID of the connection that will be used for this profile.
- acls_id
ID of the ACL profile that will be used for this profile.
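To illustrate how a profile's source_bucket relates to bucket names, the following sketch selects a profile for a given bucket and forms a destination object name. It rests on assumptions that the text does not confirm: that the first matching profile wins, and that the target path and the object name are joined with a slash. Both function names are hypothetical.

```python
def select_profile(profiles: list, bucket: str):
    """Return the first profile whose source_bucket matches, else None.

    A source_bucket ending with '*' is treated as a bucket prefix;
    any other value must equal the bucket name.
    """
    for profile in profiles:
        pattern = profile["source_bucket"]
        if pattern.endswith("*"):
            if bucket.startswith(pattern[:-1]):
                return profile
        elif bucket == pattern:
            return profile
    return None


def destination_name(target_path: str, object_name: str) -> str:
    """Append the source object name to the target path prefix
    (the '/' separator is an assumption)."""
    return f"{target_path}/{object_name}"


profiles = [
    {"source_bucket": "logs-*", "connection_id": "conn-a", "acls_id": "acl-a"},
    {"source_bucket": "reports", "connection_id": "conn-b", "acls_id": "acl-b"},
]
```

With these example profiles, the bucket logs-2024 is handled by the connection conn-a, reports by conn-b, and a bucket named other matches no profile.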
21.8.4.2 S3 specific configurables #
The cloud synchronization module will only work with back-ends that are compatible with AWS S3. There are a few configurables that can be used to tweak its behavior when accessing S3 cloud services:
{ "multipart_sync_threshold": OBJECT_SIZE, "multipart_min_part_size": PART_SIZE }
- multipart_sync_threshold
Objects whose size is equal to or larger than this value will be synchronized with the cloud service using multipart upload.
- multipart_min_part_size
Minimum parts size to use when synchronizing objects using multipart upload.
21.8.5 Archive synchronization module #
The archive sync module uses the versioning feature of S3 objects in Object Gateway. You can configure an archive zone that captures the different versions of S3 objects as they occur over time in other zones. The history of versions that the archive zone keeps can only be eliminated via gateways associated with the archive zone.
With such an architecture, several non-versioned zones can mirror their data and metadata via their zone gateways providing high availability to the end users, while the archive zone captures all the data updates to consolidate them as versions of S3 objects.
By including the archive zone in a multi-zone configuration, you gain the flexibility of an S3 object history in one zone while saving the space that the replicas of the versioned S3 objects would consume in the remaining zones.
21.8.5.1 Configuring the archive synchronization module #
Refer to Section 21.13, “Multisite Object Gateways” for details on configuring multisite gateways.
Refer to Section 21.8, “Synchronization modules” for details on configuring synchronization modules.
To use the archive sync module, you need to create a new zone whose tier type is set to archive:
cephuser@adm >
radosgw-admin zone create --rgw-zonegroup=ZONE_GROUP_NAME \
--rgw-zone=OGW_ZONE_NAME \
--endpoints=http://OGW_ENDPOINT1_URL[,http://OGW_ENDPOINT2_URL,...] \
--tier-type=archive
21.9 LDAP authentication #
Apart from the default local user authentication, Object Gateway can use LDAP server services to authenticate users as well.
21.9.1 Authentication mechanism #
The Object Gateway extracts the user's LDAP credentials from a token. A search filter is constructed from the user name. The Object Gateway uses the configured service account to search the directory for a matching entry. If an entry is found, the Object Gateway attempts to bind to the found distinguished name with the password from the token. If the credentials are valid, the bind will succeed, and the Object Gateway grants access.
You can limit the allowed users by setting the base for the search to a specific organizational unit or by specifying a custom search filter, for example requiring specific group membership, custom object classes, or attributes.
21.9.2 Requirements #
LDAP or Active Directory: A running LDAP instance accessible by the Object Gateway.
Service account: LDAP credentials to be used by the Object Gateway with search permissions.
User account: At least one user account in the LDAP directory.
You should not use the same user names for local users and for users being authenticated by using LDAP. The Object Gateway cannot distinguish them and it treats them as the same user.
Use the ldapsearch utility to verify the service account or the LDAP connection. For example:
>
ldapsearch -x -D "uid=ceph,ou=system,dc=example,dc=com" -W \
-H ldaps://example.com -b "ou=users,dc=example,dc=com" 'uid=*' dn
Make sure to use the same LDAP parameters as in the Ceph configuration file to eliminate possible problems.
21.9.3 Configuring Object Gateway to use LDAP authentication #
The following parameters are related to the LDAP authentication:
- rgw_s3_auth_use_ldap
Set this option to true to enable S3 authentication with LDAP.
- rgw_ldap_uri
Specifies the LDAP server to use. Make sure to use the ldaps://FQDN:PORT parameter to avoid transmitting the plain text credentials openly.
- rgw_ldap_binddn
The Distinguished Name (DN) of the service account used by the Object Gateway.
- rgw_ldap_secret
The password for the service account.
- rgw_ldap_searchdn
Specifies the base in the directory information tree for searching users. This might be your users' Organizational Unit (OU) or a more specific one.
- rgw_ldap_dnattr
The attribute being used in the constructed search filter to match a user name. Depending on your Directory Information Tree (DIT), this would probably be uid or cn.
- rgw_search_filter
If not specified, the Object Gateway automatically constructs the search filter with the rgw_ldap_dnattr setting. Use this parameter to narrow the list of allowed users in very flexible ways. Consult Section 21.9.4, “Using a custom search filter to limit user access” for details.
21.9.4 Using a custom search filter to limit user access #
There are two ways you can use the rgw_search_filter
parameter.
21.9.4.1 Partial filter to further limit the constructed search filter #
An example of a partial filter:
"objectclass=inetorgperson"
The Object Gateway will generate the search filter as usual with the user name from the token and the value of rgw_ldap_dnattr. The constructed filter is then combined with the partial filter from the rgw_search_filter attribute. Depending on the user name and the settings, the final search filter may become:
"(&(uid=hari)(objectclass=inetorgperson))"
In that case, user 'hari' is only granted access if the user is found in the LDAP directory, has an object class of 'inetorgperson', and specified a valid password.
21.9.4.2 Complete filter #
A complete filter must contain a USERNAME token, which is substituted with the user name during the authentication attempt. The rgw_ldap_dnattr parameter is not used in this case. For example, to limit valid users to a specific group, use the following filter:
"(&(uid=USERNAME)(memberOf=cn=ceph-users,ou=groups,dc=mycompany,dc=com))"
Note: memberOf attribute
Using the memberOf attribute in LDAP searches requires server-side support from your specific LDAP server implementation.
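The two filter modes above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the Object Gateway's implementation: the function name `build_search_filter` is hypothetical, and the sketch does not escape special LDAP characters in the user name.

```python
def build_search_filter(username, dnattr="uid", search_filter=None):
    """Illustrative sketch of how the final LDAP search filter is formed.

    dnattr stands in for rgw_ldap_dnattr, search_filter for rgw_search_filter.
    """
    if search_filter is None:
        # No custom filter: only the constructed filter is used.
        return f"({dnattr}={username})"
    if "USERNAME" in search_filter:
        # Complete filter: substitute the token; dnattr is not used.
        return search_filter.replace("USERNAME", username)
    # Partial filter: AND it with the constructed filter.
    partial = search_filter if search_filter.startswith("(") else f"({search_filter})"
    return f"(&({dnattr}={username}){partial})"


if __name__ == "__main__":
    print(build_search_filter("hari", "uid", "objectclass=inetorgperson"))
    print(build_search_filter(
        "hari",
        search_filter="(&(uid=USERNAME)(memberOf=cn=ceph-users,ou=groups,dc=mycompany,dc=com))",
    ))
```

The first call reproduces the partial-filter example, the second the complete-filter example with 'hari' substituted for USERNAME.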
21.9.5 Generating an access token for LDAP authentication #
The radosgw-token utility generates the access token based on the LDAP user name and password. It outputs a base-64 encoded string which is the actual access token. Use your favorite S3 client (refer to Section 21.5.1, “Accessing Object Gateway”) and specify the token as the access key and use an empty secret key.
>
export RGW_ACCESS_KEY_ID="USERNAME"
>
export RGW_SECRET_ACCESS_KEY="PASSWORD"
cephuser@adm >
radosgw-token --encode --ttype=ldap
The access token is a base-64 encoded JSON structure and contains the LDAP credentials in clear text.
For Active Directory, use the --ttype=ad parameter.
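To see why the token must be protected like a password, the following Python sketch encodes and decodes a token-like structure. The JSON field layout used here (`RGW_TOKEN`, `version`, `type`, `id`, `key`) is an assumption made for illustration; check the output of radosgw-token on your system for the actual structure.

```python
import base64
import json


def encode_token(user, password, ttype="ldap"):
    """Build a base-64 encoded JSON token (field names are assumed)."""
    payload = {
        "RGW_TOKEN": {"version": 1, "type": ttype, "id": user, "key": password}
    }
    return base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")


def decode_token(token):
    """Recover the JSON structure; base-64 is an encoding, not encryption."""
    return json.loads(base64.b64decode(token))


if __name__ == "__main__":
    token = encode_token("USERNAME", "PASSWORD")
    # Anyone holding the token can read the credentials back.
    print(decode_token(token)["RGW_TOKEN"]["key"])
```

Because decoding needs no secret, transmit the token only over encrypted connections.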
21.10 Bucket index sharding #
The Object Gateway stores bucket index data in an index pool, which defaults to .rgw.buckets.index. If you put too many (hundreds of thousands) objects into a single bucket and the quota for the maximum number of objects per bucket (rgw bucket default quota max objects) is not set, the performance of the index pool may degrade. Bucket index sharding prevents such performance decreases and allows a high number of objects per bucket.
21.10.1 Bucket index resharding #
If a bucket has grown large and its initial configuration is not sufficient anymore, the bucket's index pool needs to be resharded. You can either use automatic online bucket index resharding (refer to Section 21.10.1.1, “Dynamic resharding”), or reshard the bucket index offline manually (refer to Section 21.10.1.2, “Resharding manually”).
21.10.1.1 Dynamic resharding #
From SUSE Enterprise Storage 5, we support online bucket resharding. This detects if the number of objects per bucket reaches a certain threshold, and automatically increases the number of shards used by the bucket index. This process reduces the number of entries in each bucket index shard.
The detection process runs:
- When new objects are added to the bucket.
- In a background process that periodically scans all the buckets. This is needed in order to deal with existing buckets that are not being updated.
A bucket that requires resharding is added to the reshard_log queue and will be scheduled to be resharded later. The reshard threads run in the background and execute the scheduled resharding, one at a time.
rgw_dynamic_resharding
Enables or disables dynamic bucket index resharding. Possible values are 'true' or 'false'. Defaults to 'true'.
rgw_reshard_num_logs
Number of shards for the resharding log. Defaults to 16.
rgw_reshard_bucket_lock_duration
Duration of lock on the bucket object during resharding. Defaults to 120 seconds.
rgw_max_objs_per_shard
Maximum number of objects per bucket index shard. Defaults to 100000 objects.
rgw_reshard_thread_interval
Maximum time between rounds of reshard thread processing. Defaults to 600 seconds.
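The threshold idea behind dynamic resharding can be illustrated with a short Python sketch. It only mirrors the description above (objects per shard compared against rgw_max_objs_per_shard); the function name `needs_reshard` is hypothetical and the gateway's internal logic may differ.

```python
def needs_reshard(num_objects, num_shards, max_objs_per_shard=100_000):
    """Return True if the average shard would exceed the per-shard limit.

    max_objs_per_shard stands in for rgw_max_objs_per_shard.
    A shard count of 0 is treated as a single, unsharded index.
    """
    shards = max(num_shards, 1)
    return num_objects > shards * max_objs_per_shard


if __name__ == "__main__":
    # 1.2 million objects spread over 11 shards exceed 100000 per shard.
    print(needs_reshard(1_200_000, 11))
```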
- Add a bucket to the resharding queue:
cephuser@adm >
radosgw-admin reshard add \
--bucket BUCKET_NAME \
--num-shards NEW_NUMBER_OF_SHARDS
- List resharding queue:
cephuser@adm >
radosgw-admin reshard list
- Process/schedule a bucket resharding:
cephuser@adm >
radosgw-admin reshard process
- Display the bucket resharding status:
cephuser@adm >
radosgw-admin reshard status --bucket BUCKET_NAME
- Cancel pending bucket resharding:
cephuser@adm >
radosgw-admin reshard cancel --bucket BUCKET_NAME
21.10.1.2 Resharding manually #
Dynamic resharding as mentioned in Section 21.10.1.1, “Dynamic resharding” is supported only for simple Object Gateway configurations. For multisite configurations, use manual resharding as described in this section.
To reshard the bucket index manually offline, use the following command:
cephuser@adm >
radosgw-admin bucket reshard
The bucket reshard command performs the following:
- Creates a new set of bucket index objects for the specified object.
- Spreads all entries of these index objects.
- Creates a new bucket instance.
- Links the new bucket instance with the bucket so that all new index operations go through the new bucket indexes.
- Prints the old and the new bucket ID to the standard output.
When choosing a number of shards, aim for no more than 100000 entries per shard. A prime number of bucket index shards tends to distribute bucket index entries more evenly across the shards. For example, 503 bucket index shards is better than 500 because 503 is prime.
Multi-site configurations do not support resharding a bucket index.
For multi-site configurations, resharding a bucket index requires resynchronizing all data from the master zone to all slave zones. Depending on the bucket size, this can take a considerable amount of time and resources.
Make sure that all operations to the bucket are stopped.
Back up the original bucket index:
cephuser@adm >
radosgw-admin bi list \
--bucket=BUCKET_NAME \
> BUCKET_NAME.list.backup
Reshard the bucket index:
cephuser@adm >
radosgw-admin bucket reshard \
--bucket=BUCKET_NAME \
--num-shards=NEW_SHARDS_NUMBER
Tip: Old bucket ID
As part of its output, this command also prints the new and the old bucket ID.
21.10.2 Bucket index sharding for new buckets #
There are two options that affect bucket index sharding:
- Use the rgw_override_bucket_index_max_shards option for simple configurations.
- Use the bucket_index_max_shards option for multisite configurations.
Setting the options to 0 disables bucket index sharding. A value greater than 0 enables bucket index sharding and sets the maximum number of shards.
The following formula helps you calculate the recommended number of shards:
number_of_objects_expected_in_a_bucket / 100000
Be aware that the maximum number of shards is 7877.
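The formula can be turned into a small helper. The following Python sketch rounds the result up, optionally moves to the next prime number (following the advice on prime shard counts in Section 21.10.1.2), and caps the value at the maximum of 7877. Rounding up and the function names are choices made for this illustration.

```python
import math

MAX_SHARDS = 7877            # maximum number of shards stated above
OBJECTS_PER_SHARD = 100_000  # divisor from the formula above


def is_prime(n):
    """Trial division; sufficient for the small numbers involved here."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def recommended_shards(expected_objects, prefer_prime=True):
    """Apply expected_objects / 100000, rounded up and capped at MAX_SHARDS."""
    shards = max(1, math.ceil(expected_objects / OBJECTS_PER_SHARD))
    if prefer_prime:
        # Move up to the next prime for a more even distribution.
        while not is_prime(shards):
            shards += 1
    return min(shards, MAX_SHARDS)


if __name__ == "__main__":
    print(recommended_shards(50_000_000))                      # 503
    print(recommended_shards(50_000_000, prefer_prime=False))  # 500
```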
21.10.2.1 Multisite configurations #
Multisite configurations can have a different index pool to manage failover. To configure a consistent shard count for zones in one zonegroup, set the bucket_index_max_shards option in the zonegroup's configuration:
Export the zonegroup configuration to the zonegroup.json file:
cephuser@adm >
radosgw-admin zonegroup get > zonegroup.json
Edit the zonegroup.json file and set the bucket_index_max_shards option for each named zone.
Reset the zonegroup:
cephuser@adm >
radosgw-admin zonegroup set < zonegroup.json
Update the period. See Section 21.13.3.6, “Update the period”.
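The editing step can also be scripted instead of done by hand. The following Python sketch assumes that the exported zonegroup.json contains a top-level "zones" list whose entries each carry a bucket_index_max_shards field; verify this against your own export before using it. The function names are hypothetical.

```python
import json


def set_max_shards(zonegroup, shards):
    """Set bucket_index_max_shards on every zone entry of a zonegroup dict."""
    for zone in zonegroup.get("zones", []):
        zone["bucket_index_max_shards"] = shards
    return zonegroup


def update_zonegroup_file(path, shards):
    """Rewrite an exported zonegroup file in place with the new shard count."""
    with open(path) as handle:
        zonegroup = json.load(handle)
    set_max_shards(zonegroup, shards)
    with open(path, "w") as handle:
        json.dump(zonegroup, handle, indent=4)
```

After calling `update_zonegroup_file("zonegroup.json", 503)`, continue with the radosgw-admin zonegroup set and period update steps.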
21.11 OpenStack Keystone integration #
OpenStack Keystone is an identity service for the OpenStack product. You can integrate the Object Gateway with Keystone to set up a gateway that accepts a Keystone authentication token. A user authorized by Keystone to access the gateway will be verified on the Ceph Object Gateway side and automatically created if needed. The Object Gateway queries Keystone periodically for a list of revoked tokens.
21.11.1 Configuring OpenStack #
Before configuring the Ceph Object Gateway, you need to configure the OpenStack Keystone to enable the Swift service and point it to the Ceph Object Gateway:
Set the Swift service. To use OpenStack to validate Swift users, first create the Swift service:
>
openstack service create \
--name=swift \
--description="Swift Service" \
object-store
Set the endpoints. After you create the Swift service, point it to the Ceph Object Gateway. Replace REGION_NAME with the name of the gateway's zonegroup or region.
>
openstack endpoint create --region REGION_NAME \
--publicurl "http://radosgw.example.com:8080/swift/v1" \
--adminurl "http://radosgw.example.com:8080/swift/v1" \
--internalurl "http://radosgw.example.com:8080/swift/v1" \
swift
Verify the settings. After you create the Swift service and set the endpoints, show the endpoints to verify that all the settings are correct.
>
openstack endpoint show object-store
21.11.2 Configuring the Ceph Object Gateway #
21.11.2.1 Configure SSL certificates #
The Ceph Object Gateway queries Keystone periodically for a list of revoked tokens. These requests are encoded and signed. Keystone may also be configured to provide self-signed tokens, which are also encoded and signed. You need to configure the gateway so that it can decode and verify these signed messages. Therefore, the OpenSSL certificates that Keystone uses to create the requests need to be converted to the 'nss db' format:
#
mkdir /var/ceph/nss
#
openssl x509 -in /etc/keystone/ssl/certs/ca.pem \
-pubkey | certutil -d /var/ceph/nss -A -n ca -t "TCu,Cu,Tuw"
#
openssl x509 -in /etc/keystone/ssl/certs/signing_cert.pem \
-pubkey | certutil -A -d /var/ceph/nss -n signing_cert -t "P,P,P"
To allow Ceph Object Gateway to interact with OpenStack Keystone, OpenStack Keystone can use a self-signed SSL certificate. Either install Keystone's SSL certificate on the node running the Ceph Object Gateway, or alternatively set the value of the option rgw keystone verify ssl to 'false'. Setting rgw keystone verify ssl to 'false' means that the gateway will not attempt to verify the certificate.
21.11.2.2 Configure the Object Gateway's options #
You can configure Keystone integration using the following options:
rgw keystone api version
Version of the Keystone API. Valid options are 2 or 3. Defaults to 2.
rgw keystone url
The URL and port number of the administrative RESTful API on the Keystone server. Follows the pattern SERVER_URL:PORT_NUMBER.
rgw keystone admin token
The token or shared secret that is configured internally in Keystone for administrative requests.
rgw keystone accepted roles
The roles required to serve requests. Defaults to 'Member, admin'.
rgw keystone accepted admin roles
The list of roles allowing a user to gain administrative privileges.
rgw keystone token cache size
The maximum number of entries in the Keystone token cache.
rgw keystone revocation interval
The number of seconds before checking revoked tokens. Defaults to 15 * 60.
rgw keystone implicit tenants
Create new users in their own tenants of the same name. Defaults to 'false'.
rgw s3 auth use keystone
If set to 'true', the Ceph Object Gateway will authenticate users using Keystone. Defaults to 'false'.
nss db path
The path to the NSS database.
It is also possible to configure the Keystone service tenant, user, and password for Keystone (for version 2.0 of the OpenStack Identity API), similar to the way OpenStack services tend to be configured. This way you can avoid setting the shared secret rgw keystone admin token in the configuration file, which should be disabled in production environments. The service tenant credentials should have admin privileges. For more details, refer to the official OpenStack Keystone documentation. The related configuration options follow:
rgw keystone admin user
The Keystone administrator user name.
rgw keystone admin password
The Keystone administrator user password.
rgw keystone admin tenant
The Keystone version 2.0 administrator user tenant.
A Ceph Object Gateway user is mapped to a Keystone tenant. A Keystone user has different roles assigned to it, possibly on more than one tenant. When the Ceph Object Gateway gets the ticket, it looks at the tenant and the user roles that are assigned to that ticket, and accepts or rejects the request according to the setting of the rgw keystone accepted roles option.
Although Swift tenants are mapped to the Object Gateway user by default, they can also be mapped to OpenStack tenants via the rgw keystone implicit tenants option. This will make containers use the tenant namespace instead of the S3-like global namespace that the Object Gateway defaults to. We recommend deciding on the mapping method at the planning stage to avoid confusion. The reason for this is that toggling the option later affects only newer requests, which get mapped under a tenant, while older buckets created before still continue to be in a global namespace.
For version 3 of the OpenStack Identity API, you should replace the rgw keystone admin tenant option with:
rgw keystone admin domain
The Keystone administrator user domain.
rgw keystone admin project
The Keystone administrator user project.
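The difference between the version 2.0 and version 3 credential options can be summarized in a short Python sketch. It only assembles the option names listed above into a dictionary; the function name `keystone_admin_options` and all sample values are hypothetical, and how you apply the options to your cluster is not covered here.

```python
def keystone_admin_options(api_version, user, password,
                           tenant=None, domain=None, project=None):
    """Return the admin credential options matching the Keystone API version."""
    options = {
        "rgw keystone api version": str(api_version),
        "rgw keystone admin user": user,
        "rgw keystone admin password": password,
    }
    if api_version == 2:
        # Version 2.0 identifies the administrator by tenant.
        options["rgw keystone admin tenant"] = tenant
    elif api_version == 3:
        # Version 3 replaces the tenant with a domain and a project.
        options["rgw keystone admin domain"] = domain
        options["rgw keystone admin project"] = project
    else:
        raise ValueError("valid Keystone API versions are 2 or 3")
    return options


if __name__ == "__main__":
    for name, value in keystone_admin_options(
            3, "rgw-admin", "PASSWORD", domain="Default", project="service").items():
        print(f"{name} = {value}")
```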
21.12 Pool placement and storage classes #
21.12.1 Displaying placement targets #
Placement targets control which pools are associated with a particular bucket. A bucket's placement target is selected on creation, and cannot be modified. You can display its placement_rule by running the following command:
cephuser@adm >
radosgw-admin bucket stats
The zonegroup configuration contains a list of placement targets with an initial target named 'default-placement'. The zone configuration then maps each zonegroup placement target name onto its local storage. This zone placement information includes the 'index_pool' name for the bucket index, the 'data_extra_pool' name for metadata about incomplete multipart uploads, and a 'data_pool' name for each storage class.
21.12.2 Storage classes #
Storage classes help customize the placement of object data. S3 Bucket Lifecycle rules can automate the transition of objects between storage classes.
Storage classes are defined in terms of placement targets. Each zonegroup placement target lists its available storage classes with an initial class named 'STANDARD'. The zone configuration is responsible for providing a 'data_pool' pool name for each of the zonegroup’s storage classes.
21.12.3 Configuring zonegroups and zones #
Use the radosgw-admin command on the zonegroups and zones to configure their placement. You can query the zonegroup placement configuration using the following command:
cephuser@adm >
radosgw-admin zonegroup get
{
"id": "ab01123f-e0df-4f29-9d71-b44888d67cd5",
"name": "default",
"api_name": "default",
...
"placement_targets": [
{
"name": "default-placement",
"tags": [],
"storage_classes": [
"STANDARD"
]
}
],
"default_placement": "default-placement",
...
}
To query the zone placement configuration, run:
cephuser@adm >
radosgw-admin zone get
{
"id": "557cdcee-3aae-4e9e-85c7-2f86f5eddb1f",
"name": "default",
"domain_root": "default.rgw.meta:root",
...
"placement_pools": [
{
"key": "default-placement",
"val": {
"index_pool": "default.rgw.buckets.index",
"storage_classes": {
"STANDARD": {
"data_pool": "default.rgw.buckets.data"
}
},
"data_extra_pool": "default.rgw.buckets.non-ec",
"index_type": 0
}
}
],
...
}
If you have not done any previous multisite configuration, a 'default' zone and zonegroup are created for you, and changes to the zone/zonegroup will not take effect until you restart the Ceph Object Gateways. If you have created a realm for multisite, the zone/zonegroup changes will take effect after you commit the changes with the radosgw-admin period update --commit command.
21.12.3.1 Adding a placement target #
To create a new placement target named 'temporary', start by adding it to the zonegroup:
cephuser@adm >
radosgw-admin zonegroup placement add \
--rgw-zonegroup default \
--placement-id temporary
Then provide the zone placement info for that target:
cephuser@adm >
radosgw-admin zone placement add \
--rgw-zone default \
--placement-id temporary \
--data-pool default.rgw.temporary.data \
--index-pool default.rgw.temporary.index \
--data-extra-pool default.rgw.temporary.non-ec
21.12.3.2 Adding a storage class #
To add a new storage class named 'COLD' to the 'default-placement' target, start by adding it to the zonegroup:
cephuser@adm >
radosgw-admin zonegroup placement add \
--rgw-zonegroup default \
--placement-id default-placement \
--storage-class COLD
Then provide the zone placement info for that storage class:
cephuser@adm >
radosgw-admin zone placement add \
--rgw-zone default \
--placement-id default-placement \
--storage-class COLD \
--data-pool default.rgw.cold.data \
--compression lz4
21.12.4 Placement customization #
21.12.4.1 Editing default zonegroup placement #
By default, new buckets will use the zonegroup's default_placement target. You can change this zonegroup setting with:
cephuser@adm >
radosgw-admin zonegroup placement default \
--rgw-zonegroup default \
--placement-id new-placement
21.12.4.2 Editing default user placement #
A Ceph Object Gateway user can override the zonegroup's default placement target by setting a non-empty default_placement field in the user info. Similarly, the default_storage_class can override the STANDARD storage class applied to objects by default.
cephuser@adm >
radosgw-admin user info --uid testid
{
...
"default_placement": "",
"default_storage_class": "",
"placement_tags": [],
...
}
If a zonegroup’s placement target contains any tags, users will be unable to create buckets with that placement target unless their user info contains at least one matching tag in its 'placement_tags' field. This can be useful to restrict access to certain types of storage.
The radosgw-admin command cannot modify these fields directly, therefore you need to edit the JSON format manually:
cephuser@adm >
radosgw-admin metadata get user:USER-ID > user.json
>
vi user.json # edit the file as required
cephuser@adm >
radosgw-admin metadata put user:USER-ID < user.json
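If you prefer not to edit user.json in an editor, the change can be scripted. The following Python sketch assumes that the exported metadata keeps the user info fields under a top-level "data" object and falls back to the top level otherwise; inspect your own export to confirm. The function names are hypothetical.

```python
import json


def set_user_placement(metadata, default_placement=None,
                       default_storage_class=None, placement_tags=None):
    """Update placement-related user info fields; None leaves a field as is."""
    info = metadata.get("data", metadata)
    if default_placement is not None:
        info["default_placement"] = default_placement
    if default_storage_class is not None:
        info["default_storage_class"] = default_storage_class
    if placement_tags is not None:
        info["placement_tags"] = list(placement_tags)
    return metadata


def edit_user_file(path, **changes):
    """Apply the changes to an exported user.json file in place."""
    with open(path) as handle:
        metadata = json.load(handle)
    set_user_placement(metadata, **changes)
    with open(path, "w") as handle:
        json.dump(metadata, handle, indent=4)
```

For example, `edit_user_file("user.json", default_placement="temporary")` prepares the file for the radosgw-admin metadata put step.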
21.12.4.3 Editing the S3 default bucket placement #
When creating a bucket with the S3 protocol, a placement target can be provided as part of the LocationConstraint to override the default placement targets from the user and zonegroup.
Normally, the LocationConstraint needs to match the zonegroup's api_name:
<LocationConstraint>default</LocationConstraint>
You can add a custom placement target to the api_name following a colon:
<LocationConstraint>default:new-placement</LocationConstraint>
21.12.4.4 Editing the Swift bucket placement #
When creating a bucket with the Swift protocol, you can provide a placement target in the HTTP header's X-Storage-Policy:
X-Storage-Policy: NEW-PLACEMENT
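The two per-request overrides from Section 21.12.4.3 and Section 21.12.4.4 can be sketched in Python as follows. The sketch only builds the request body and header; sending the request, signing it, and the XML namespace that some S3 clients add to CreateBucketConfiguration are left out. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def s3_create_bucket_body(api_name, placement_target=None):
    """Build the S3 bucket creation body with an optional placement target."""
    constraint = api_name
    if placement_target:
        # The placement target follows the api_name after a colon.
        constraint = f"{api_name}:{placement_target}"
    root = ET.Element("CreateBucketConfiguration")
    ET.SubElement(root, "LocationConstraint").text = constraint
    return ET.tostring(root, encoding="unicode")


def swift_placement_header(placement_target):
    """Build the Swift header that selects a placement target."""
    return {"X-Storage-Policy": placement_target}


if __name__ == "__main__":
    print(s3_create_bucket_body("default", "new-placement"))
    print(swift_placement_header("NEW-PLACEMENT"))
```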
21.12.5 Using storage classes #
All placement targets have a STANDARD storage class which is applied to new objects by default. A user can override this default with the default_storage_class field in the user info.
To create an object in a non-default storage class, provide that storage class name in an HTTP header with the request. The S3 protocol uses the X-Amz-Storage-Class header, while the Swift protocol uses the X-Object-Storage-Class header.
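As a minimal illustration, the following Python sketch picks the header name for the protocol in use. It only returns the header as a dictionary that you could pass to an HTTP client of your choice; the function name `storage_class_header` is hypothetical.

```python
STORAGE_CLASS_HEADERS = {
    "s3": "X-Amz-Storage-Class",
    "swift": "X-Object-Storage-Class",
}


def storage_class_header(protocol, storage_class):
    """Return the request header that selects a storage class."""
    try:
        name = STORAGE_CLASS_HEADERS[protocol.lower()]
    except KeyError:
        raise ValueError(f"unknown protocol: {protocol}") from None
    return {name: storage_class}


if __name__ == "__main__":
    print(storage_class_header("s3", "COLD"))
```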
You can use S3 Object Lifecycle Management to move object data between storage classes using Transition actions.
21.13 Multisite Object Gateways #
Ceph supports several multi-site configuration options for the Ceph Object Gateway:
- Multi-zone
A configuration consisting of one zonegroup and multiple zones, each zone with one or more ceph-radosgw instances. Each zone is backed by its own Ceph Storage Cluster. Multiple zones in a zonegroup provide disaster recovery for the zonegroup should one of the zones experience a significant failure. Each zone is active and may receive write operations. In addition to disaster recovery, multiple active zones may also serve as a foundation for content delivery networks.
- Multi-zone-group
Ceph Object Gateway supports multiple zonegroups, each zonegroup with one or more zones. Objects stored to zones in one zonegroup within the same realm as another zonegroup share a global object namespace, ensuring unique object IDs across zonegroups and zones.
Note: Zonegroups only sync metadata amongst themselves. Data and metadata are replicated between the zones within the zonegroup. No data or metadata is shared across a realm.
- Multiple realms
Ceph Object Gateway supports the notion of realms. A realm is a globally unique namespace. Multiple realms are supported, each of which may encompass a single zonegroup or multiple zonegroups.
You can configure each Object Gateway to work in an active-active zone configuration, allowing for writes to non-master zones. The multi-site configuration is stored within a container called a realm. The realm stores zonegroups, zones, and a time period with multiple epochs for tracking changes to the configuration. The rgw daemons handle the synchronization, eliminating the need for a separate synchronization agent. This approach to synchronization allows the Ceph Object Gateway to operate with an active-active configuration instead of active-passive.
21.13.1 Requirements and assumptions #
A multi-site configuration requires at least two Ceph storage clusters, and at least two Ceph Object Gateway instances, one for each Ceph storage cluster. The following configuration assumes at least two Ceph storage clusters in geographically separate locations. However, the configuration can also work on the same site. The examples use two Ceph Object Gateway hosts named rgw1 and rgw2.
A multi-site configuration requires a master zonegroup and a master zone. A master zone is the source of truth with regard to all metadata operations in a multisite cluster. Additionally, each zonegroup requires a master zone. Zonegroups may have one or more secondary or non-master zones. In this guide, the rgw1 host serves as the master zone of the master zonegroup and the rgw2 host serves as the secondary zone of the master zonegroup.
21.13.2 Limitations #
Multi-site configurations do not support resharding a bucket index.
As a workaround, the bucket can be purged from the slave zones, resharded on the master zone, and then resynchronized. Depending on the contents of the bucket, this can be a time- and resource-intensive operation.
21.13.3 Configuring a master zone #
All gateways in a multi-site configuration retrieve their configuration from a ceph-radosgw daemon on a host within the master zonegroup and master zone. To configure your gateways in a multi-site configuration, select a ceph-radosgw instance to configure the master zonegroup and master zone.
21.13.3.1 Creating a realm #
A realm represents a globally unique namespace consisting of one or more zonegroups containing one or more zones. Zones contain buckets, which in turn contain objects. A realm enables the Ceph Object Gateway to support multiple namespaces and their configuration on the same hardware. A realm contains the notion of periods. Each period represents the state of the zonegroup and zone configuration in time. Each time you make a change to a zonegroup or zone, update the period and commit it. By default, the Ceph Object Gateway does not create a realm for backward compatibility. As a best practice, we recommend creating realms for new clusters.
Create a new realm called gold for the multi-site configuration by opening a command line interface on a host identified to serve in the master zonegroup and zone. Then, execute the following:
cephuser@adm >
radosgw-admin realm create --rgw-realm=gold --default
If the cluster has a single realm, specify the --default flag. If --default is specified, radosgw-admin uses this realm by default. If --default is not specified, adding zonegroups and zones requires specifying either the --rgw-realm flag or the --realm-id flag to identify the realm.
After creating the realm, radosgw-admin returns the realm configuration:
{
  "id": "4a367026-bd8f-40ee-b486-8212482ddcd7",
  "name": "gold",
  "current_period": "09559832-67a4-4101-8b3f-10dfcd6b2707",
  "epoch": 1
}
Ceph generates a unique ID for the realm, which allows the renaming of a realm if the need arises.
21.13.3.2 Creating a master zonegroup #
A realm must have at least one zonegroup to serve as the master zonegroup for the realm. Create a new master zonegroup for the multi-site configuration by opening a command line interface on a host identified to serve in the master zonegroup and zone. Create a master zonegroup called us by executing the following:
cephuser@adm >
radosgw-admin zonegroup create --rgw-zonegroup=us \
--endpoints=http://rgw1:80 --master --default
If the realm only has a single zonegroup, specify the --default flag. If --default is specified, radosgw-admin uses this zonegroup by default when adding new zones. If --default is not specified, adding or modifying zones requires either the --rgw-zonegroup flag or the --zonegroup-id flag to identify the zonegroup.
After creating the master zonegroup, radosgw-admin returns the zonegroup configuration. For example:
{
  "id": "d4018b8d-8c0d-4072-8919-608726fa369e",
  "name": "us",
  "api_name": "us",
  "is_master": "true",
  "endpoints": [
    "http:\/\/rgw1:80"
  ],
  "hostnames": [],
  "hostnames_s3website": [],
  "master_zone": "",
  "zones": [],
  "placement_targets": [],
  "default_placement": "",
  "realm_id": "4a367026-bd8f-40ee-b486-8212482ddcd7"
}
21.13.3.3 Creating a master zone #
Zones need to be created on a Ceph Object Gateway node that will be within the zone.
Create a new master zone for the multi-site configuration by opening a command line interface on a host identified to serve in the master zonegroup and zone. Execute the following:
cephuser@adm >
radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-east-1 \
--endpoints=http://rgw1:80 --access-key=SYSTEM_ACCESS_KEY --secret=SYSTEM_SECRET_KEY
The --access-key and --secret options can be omitted at this point. These settings are added to the zone when the system user is created in Section 21.13.3.5, “Creating system users”.
After creating the master zone, radosgw-admin returns the zone configuration. For example:
{
  "id": "56dfabbb-2f4e-4223-925e-de3c72de3866",
  "name": "us-east-1",
  "domain_root": "us-east-1.rgw.meta:root",
  "control_pool": "us-east-1.rgw.control",
  "gc_pool": "us-east-1.rgw.log:gc",
  "lc_pool": "us-east-1.rgw.log:lc",
  "log_pool": "us-east-1.rgw.log",
  "intent_log_pool": "us-east-1.rgw.log:intent",
  "usage_log_pool": "us-east-1.rgw.log:usage",
  "reshard_pool": "us-east-1.rgw.log:reshard",
  "user_keys_pool": "us-east-1.rgw.meta:users.keys",
  "user_email_pool": "us-east-1.rgw.meta:users.email",
  "user_swift_pool": "us-east-1.rgw.meta:users.swift",
  "user_uid_pool": "us-east-1.rgw.meta:users.uid",
  "otp_pool": "us-east-1.rgw.otp",
  "system_key": {
    "access_key": "1555b35654ad1656d804",
    "secret_key": "h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q=="
  },
  "placement_pools": [
    {
      "key": "us-east-1-placement",
      "val": {
        "index_pool": "us-east-1.rgw.buckets.index",
        "storage_classes": {
          "STANDARD": {
            "data_pool": "us-east-1.rgw.buckets.data"
          }
        },
        "data_extra_pool": "us-east-1.rgw.buckets.non-ec",
        "index_type": 0
      }
    }
  ],
  "metadata_heap": "",
  "realm_id": ""
}
21.13.3.4 Deleting the default zone and group #
The following steps assume a multi-site configuration using newly installed systems that are not storing data yet. Do not delete the default zone and its pools if you are already using it to store data, or the data will be deleted and unrecoverable.
The default installation of Object Gateway creates the default zonegroup called default. Delete the default zone if it exists. Make sure to remove it from the default zonegroup first.
cephuser@adm >
radosgw-admin zonegroup delete --rgw-zonegroup=default
Delete the default pools in your Ceph storage cluster if they exist:
The following step assumes a multi-site configuration using newly installed systems that are not currently storing data. Do not delete the default zonegroup if you are already using it to store data.
cephuser@adm >
ceph osd pool rm default.rgw.control default.rgw.control --yes-i-really-really-mean-it
cephuser@adm >
ceph osd pool rm default.rgw.data.root default.rgw.data.root --yes-i-really-really-mean-it
cephuser@adm >
ceph osd pool rm default.rgw.gc default.rgw.gc --yes-i-really-really-mean-it
cephuser@adm >
ceph osd pool rm default.rgw.log default.rgw.log --yes-i-really-really-mean-it
cephuser@adm >
ceph osd pool rm default.rgw.meta default.rgw.meta --yes-i-really-really-mean-it
If you delete the default zonegroup, you are also deleting the system user. If your admin user keys are not propagated, the Object Gateway management functionality of the Ceph Dashboard will fail. Follow on to the next section to re-create your system user if you go ahead with this step.
21.13.3.5 Creating system users #
The ceph-radosgw daemons must authenticate before pulling realm and period information. In the master zone, create a system user to simplify authentication between daemons:
cephuser@adm >
radosgw-admin user create --uid=zone.user \
--display-name="Zone User" --access-key=SYSTEM_ACCESS_KEY \
--secret=SYSTEM_SECRET_KEY --system
Make a note of the access_key and secret_key as the secondary zones require them to authenticate with the master zone.
Add the system user to the master zone:
cephuser@adm >
radosgw-admin zone modify --rgw-zone=us-east-1 \
--access-key=ACCESS-KEY --secret=SECRET
Update the period to make the changes take effect:
cephuser@adm >
radosgw-admin period update --commit
21.13.3.6 Update the period #
After updating the master zone configuration, update the period:
cephuser@adm >
radosgw-admin period update --commit
After updating the period, radosgw-admin returns the period configuration. For example:
{
  "id": "09559832-67a4-4101-8b3f-10dfcd6b2707",
  "epoch": 1,
  "predecessor_uuid": "",
  "sync_status": [],
  "period_map": {
    "id": "09559832-67a4-4101-8b3f-10dfcd6b2707",
    "zonegroups": [],
    "short_zone_ids": []
  },
  "master_zonegroup": "",
  "master_zone": "",
  "period_config": {
    "bucket_quota": {
      "enabled": false,
      "max_size_kb": -1,
      "max_objects": -1
    },
    "user_quota": {
      "enabled": false,
      "max_size_kb": -1,
      "max_objects": -1
    }
  },
  "realm_id": "4a367026-bd8f-40ee-b486-8212482ddcd7",
  "realm_name": "gold",
  "realm_epoch": 1
}
Updating the period changes the epoch and ensures that other zones receive the updated configuration.
21.13.3.7 Start the gateway #
On the Object Gateway host, start and enable the Ceph Object Gateway service. To identify the unique FSID of the cluster, run ceph fsid. To identify the Object Gateway daemon name, run ceph orch ps --hostname HOSTNAME.
cephuser@ogw >
systemctl start ceph-FSID@DAEMON_NAME
cephuser@ogw >
systemctl enable ceph-FSID@DAEMON_NAME
21.13.4 Configure secondary zones #
Zones within a zonegroup replicate all data to ensure that each zone has the same data. When creating the secondary zone, execute all of the following operations on a host identified to serve the secondary zone.
To add a third zone, follow the same procedures as for adding the secondary zone. Use a different zone name.
You must execute metadata operations, such as user creation, on a host within the master zone. The master zone and the secondary zone can receive bucket operations, but the secondary zone redirects bucket operations to the master zone. If the master zone is down, bucket operations will fail.
21.13.4.1 Pulling the realm #
Using the URL path, access key, and secret of the master zone in the master zonegroup, pull the realm configuration to the host. To pull a non-default realm, specify the realm using the --rgw-realm or --realm-id configuration options.
cephuser@adm >
radosgw-admin realm pull --url=url-to-master-zone-gateway --access-key=access-key --secret=secret
Pulling the realm also retrieves the remote's current period configuration, and makes it the current period on this host as well.
If this realm is the default realm or the only realm, make the realm the default realm.
cephuser@adm >
radosgw-admin realm default --rgw-realm=REALM-NAME
21.13.4.2 Creating a secondary zone #
Create a secondary zone for the multi-site configuration by opening a
command line interface on a host identified to serve the secondary zone.
Specify the zonegroup ID, the new zone name and an endpoint for the zone.
Do not use the --master flag. All zones run in an active-active
configuration by default. If the secondary zone should not accept write
operations, specify the --read-only flag to create an active-passive
configuration between the master zone and the secondary zone.
Additionally, provide the access_key and secret_key of the generated
system user stored in the master zone of the master zonegroup. Execute the
following:
cephuser@adm >
radosgw-admin zone create --rgw-zonegroup=ZONE-GROUP-NAME \
--rgw-zone=ZONE-NAME --endpoints=http://FQDN:80 \
--access-key=SYSTEM-KEY --secret=SECRET \
[--read-only]
For example:
cephuser@adm >
radosgw-admin zone create --rgw-zonegroup=us --endpoints=http://rgw2:80 \
--rgw-zone=us-east-2 --access-key=SYSTEM_ACCESS_KEY --secret=SYSTEM_SECRET_KEY
{
"id": "950c1a43-6836-41a2-a161-64777e07e8b8",
"name": "us-east-2",
"domain_root": "us-east-2.rgw.data.root",
"control_pool": "us-east-2.rgw.control",
"gc_pool": "us-east-2.rgw.gc",
"log_pool": "us-east-2.rgw.log",
"intent_log_pool": "us-east-2.rgw.intent-log",
"usage_log_pool": "us-east-2.rgw.usage",
"user_keys_pool": "us-east-2.rgw.users.keys",
"user_email_pool": "us-east-2.rgw.users.email",
"user_swift_pool": "us-east-2.rgw.users.swift",
"user_uid_pool": "us-east-2.rgw.users.uid",
"system_key": {
"access_key": "1555b35654ad1656d804",
"secret_key": "h7GhxuBLTrlhVUyxSPUKUV8r\/2EI4ngqJxD7iBdBYLhwluN30JaT3Q=="
},
"placement_pools": [
{
"key": "default-placement",
"val": {
"index_pool": "us-east-2.rgw.buckets.index",
"data_pool": "us-east-2.rgw.buckets.data",
"data_extra_pool": "us-east-2.rgw.buckets.non-ec",
"index_type": 0
}
}
],
"metadata_heap": "us-east-2.rgw.meta",
"realm_id": "815d74c2-80d6-4e63-8cfc-232037f7ff5c"
}
The following steps assume a multi-site configuration using newly-installed systems that are not yet storing data. Do not delete the default zone and its pools if you are already using it to store data, or the data will be lost and unrecoverable.
Delete the default zone if needed:
cephuser@adm >
radosgw-admin zone delete --rgw-zone=default
Delete the default pools in your Ceph storage cluster if needed:
cephuser@adm >
ceph osd pool rm default.rgw.control default.rgw.control --yes-i-really-really-mean-it
cephuser@adm >
ceph osd pool rm default.rgw.data.root default.rgw.data.root --yes-i-really-really-mean-it
cephuser@adm >
ceph osd pool rm default.rgw.gc default.rgw.gc --yes-i-really-really-mean-it
cephuser@adm >
ceph osd pool rm default.rgw.log default.rgw.log --yes-i-really-really-mean-it
cephuser@adm >
ceph osd pool rm default.rgw.users.uid default.rgw.users.uid --yes-i-really-really-mean-it
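Because pool removal is irreversible, it can help to generate the commands first and review them before running anything. The following is a minimal sketch for a POSIX shell. The function name print_default_pool_rm is hypothetical. It only prints the commands for the five pools named in this section and does not execute them:

```shell
# Print (do not run) the removal command for each default Object Gateway pool
# named in this section. Review the output before executing any of it.
print_default_pool_rm() {
    for pool in default.rgw.control default.rgw.data.root default.rgw.gc \
                default.rgw.log default.rgw.users.uid; do
        # "ceph osd pool rm" expects the pool name twice as a safety check.
        printf 'ceph osd pool rm %s %s --yes-i-really-really-mean-it\n' "$pool" "$pool"
    done
}
```

Run print_default_pool_rm to list the commands. Only execute them on a newly installed system whose default zone holds no data.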
21.13.4.3 Updating the Ceph configuration file #
Update the Ceph configuration file on the secondary zone hosts by adding
the rgw_zone configuration option and the name of the secondary zone to
the instance entry.
To do so, execute the following command:
cephuser@adm >
ceph config set SERVICE_NAME rgw_zone us-west
21.13.4.4 Updating the period #
After updating the master zone configuration, update the period:
cephuser@adm >
radosgw-admin period update --commit
{
"id": "b5e4d3ec-2a62-4746-b479-4b2bc14b27d1",
"epoch": 2,
"predecessor_uuid": "09559832-67a4-4101-8b3f-10dfcd6b2707",
"sync_status": [ "[...]"
],
"period_map": {
"id": "b5e4d3ec-2a62-4746-b479-4b2bc14b27d1",
"zonegroups": [
{
"id": "d4018b8d-8c0d-4072-8919-608726fa369e",
"name": "us",
"api_name": "us",
"is_master": "true",
"endpoints": [
"http:\/\/rgw1:80"
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "83859a9a-9901-4f00-aa6d-285c777e10f0",
"zones": [
{
"id": "83859a9a-9901-4f00-aa6d-285c777e10f0",
"name": "us-east-1",
"endpoints": [
"http:\/\/rgw1:80"
],
"log_meta": "true",
"log_data": "false",
"bucket_index_max_shards": 0,
"read_only": "false"
},
{
"id": "950c1a43-6836-41a2-a161-64777e07e8b8",
"name": "us-east-2",
"endpoints": [
"http:\/\/rgw2:80"
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 0,
"read_only": "false"
}
],
"placement_targets": [
{
"name": "default-placement",
"tags": []
}
],
"default_placement": "default-placement",
"realm_id": "4a367026-bd8f-40ee-b486-8212482ddcd7"
}
],
"short_zone_ids": [
{
"key": "83859a9a-9901-4f00-aa6d-285c777e10f0",
"val": 630926044
},
{
"key": "950c1a43-6836-41a2-a161-64777e07e8b8",
"val": 4276257543
}
]
},
"master_zonegroup": "d4018b8d-8c0d-4072-8919-608726fa369e",
"master_zone": "83859a9a-9901-4f00-aa6d-285c777e10f0",
"period_config": {
"bucket_quota": {
"enabled": false,
"max_size_kb": -1,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"max_size_kb": -1,
"max_objects": -1
}
},
"realm_id": "4a367026-bd8f-40ee-b486-8212482ddcd7",
"realm_name": "gold",
"realm_epoch": 2
}
Updating the period changes the epoch and ensures that other zones receive the updated configuration.
21.13.4.5 Starting the Object Gateway #
On the Object Gateway host, start and enable the Ceph Object Gateway service:
cephuser@adm >
ceph orch start rgw.us-east-2
21.13.4.6 Checking the synchronization status #
When the secondary zone is up and running, check the synchronization status. Synchronization copies users and buckets created in the master zone to the secondary zone.
cephuser@adm >
radosgw-admin sync status
The output provides the status of synchronization operations. For example:
realm f3239bc5-e1a8-4206-a81d-e1576480804d (gold)
zonegroup c50dbb7e-d9ce-47cc-a8bb-97d9b399d388 (us)
zone 4c453b70-4a16-4ce8-8185-1893b05d346e (us-west)
metadata sync syncing
full sync: 0/64 shards
metadata is caught up with master
incremental sync: 64/64 shards
data sync source: 1ee9da3e-114d-4ae3-a8a4-056e8a17f532 (us-east)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source
Secondary zones accept bucket operations; however, secondary zones redirect bucket operations to the master zone and then synchronize with the master zone to receive the result of the bucket operations. If the master zone is down, bucket operations executed on the secondary zone will fail, but object operations should succeed.
21.13.4.7 Verification of an Object #
By default, objects are not verified again after they have been
synchronized successfully. To enable verification, set the
rgw_sync_obj_etag_verify option to true.
When this option is enabled, an additional MD5 checksum is computed on the
source and on the destination for each synchronized object, and the two
values are compared. This ensures the integrity of objects fetched from a
remote server over HTTP, including multisite sync. Enabling this option can
decrease the performance of the Object Gateways because more computation is
needed.
21.13.5 General Object Gateway maintenance #
21.13.5.1 Checking the synchronization status #
Information about the replication status of a zone can be queried with:
cephuser@adm >
radosgw-admin sync status
realm b3bc1c37-9c44-4b89-a03b-04c269bea5da (gold)
zonegroup f54f9b22-b4b6-4a0e-9211-fa6ac1693f49 (us)
zone adce11c9-b8ed-4a90-8bc5-3fc029ff0816 (us-west)
metadata sync syncing
full sync: 0/64 shards
incremental sync: 64/64 shards
metadata is behind on 1 shards
oldest incremental change not applied: 2017-03-22 10:20:00.0.881361s
data sync source: 341c2d81-4574-4d08-ab0f-5a2a7b168028 (us-east)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source
source: 3b5d1a3f-3f27-4e4a-8f34-6072d4bb1275 (us-3)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source
The output can differ depending on the sync status. The shards are described as two different types during sync:
- Behind shards
Behind shards are shards that are not up to date. They need either a full data synchronization or an incremental data synchronization.
- Recovery shards
Recovery shards are shards that encountered an error during synchronization and are marked for retry. The error is mostly caused by minor issues, such as failing to acquire a lock on a bucket, and typically resolves itself.
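When checking the status from a script, the output can be reduced to a single word. The following is a minimal sketch for a POSIX shell. The function name sync_state is hypothetical, and it relies only on the phrases "is behind on" and "is caught up with" that appear in the example output above. Other output, such as errors, is reported as unknown:

```shell
# Read "radosgw-admin sync status" output on stdin and report whether the zone
# has caught up. Relies only on the status phrases shown in the example output.
sync_state() {
    status=$(cat)
    if printf '%s\n' "$status" | grep -q 'is behind on'; then
        # At least one of metadata or data sync reports lagging shards.
        echo "behind"
    elif printf '%s\n' "$status" | grep -q 'is caught up with'; then
        echo "caught-up"
    else
        echo "unknown"
    fi
}
```

A typical call would be radosgw-admin sync status | sync_state. Checking for "behind" first means that a zone is reported as caught up only if no source is lagging.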
21.13.5.2 Check the logs #
For multi-site only, you can check the metadata log (mdlog), the bucket
index log (bilog) and the data log (datalog). You can list them and also
trim them. Trimming is not needed in most cases because the
rgw_sync_log_trim_interval option is set to 20 minutes by default. Unless
this option has been manually set to 0, you do not need to trim the logs
at any time, and trimming them manually could cause side effects.
21.13.5.3 Changing the metadata master zone #
Be careful when changing which zone is the metadata master. If a zone has
not finished synchronizing metadata from the current master zone, it is
unable to serve any remaining entries when promoted to master and those
changes will be lost. For this reason, we recommend waiting until the
zone's radosgw-admin sync status output shows that metadata
synchronization has caught up before promoting it to master. Similarly, if
changes to metadata are being processed by the current master zone while
another zone is being promoted to master, those changes are likely to be
lost. To avoid this, we recommend shutting down any Object Gateway instances on
the previous master zone. After promoting another zone, its new period
can be fetched with radosgw-admin period pull and the gateway(s) can be
restarted.
To promote a zone (for example, zone us-west in zonegroup us) to metadata
master, run the following commands on that zone:
cephuser@ogw >
radosgw-admin zone modify --rgw-zone=us-west --master
cephuser@ogw >
radosgw-admin zonegroup modify --rgw-zonegroup=us --master
cephuser@ogw >
radosgw-admin period update --commit
This generates a new period, and the Object Gateway instance(s) in zone
us-west send this period to other zones.
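The recommendation to wait for metadata synchronization before promoting a zone can be scripted. The following is a minimal sketch for a POSIX shell. The function name wait_for_metadata_sync and the SYNC_STATUS_CMD override are hypothetical, and the matched phrase is taken from the example sync status output in this chapter. It does not promote the zone itself:

```shell
# Poll the sync status until metadata has caught up with the master zone.
# $1: maximum number of attempts (default 30)
# $2: seconds to wait between attempts (default 10)
# SYNC_STATUS_CMD is a hypothetical override used for dry runs and testing.
wait_for_metadata_sync() {
    attempts=${1:-30}
    delay=${2:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if ${SYNC_STATUS_CMD:-radosgw-admin sync status} 2>/dev/null |
            grep -q 'metadata is caught up with master'; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "metadata sync did not catch up after $attempts attempts" >&2
    return 1
}
```

On the zone to be promoted, the promotion commands above would then be run only if wait_for_metadata_sync returns successfully. The attempt count and delay are arbitrary defaults and should be adapted to the size of the deployment.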
21.13.5.4 Resharding a bucket index #
Resharding a bucket index in a multi-site setup requires a full resynchronization of the bucket content. Depending on the size and number of objects in the bucket, this is a time- and resource-intensive operation.
Make sure that all operations to the bucket are stopped.
Back up the original bucket index:
cephuser@adm >
radosgw-admin bi list \
--bucket=BUCKET_NAME \
> BUCKET_NAME.list.backup
Disable bucket synchronization for the affected bucket:
cephuser@adm >
radosgw-admin bucket sync disable --bucket=BUCKET_NAME
Wait for the synchronization to finish on all zones. Check on master and slave zones with the following command:
cephuser@adm >
radosgw-admin sync status
Stop the Object Gateway instances. First on all slave zones, then on the master zone, too.
cephuser@ogw >
systemctl stop ceph-radosgw@rgw.NODE.service
Reshard the bucket index on the master zone:
cephuser@adm >
radosgw-admin bucket reshard \
--bucket=BUCKET_NAME \
--num-shards=NEW_SHARDS_NUMBER
Tip: Old bucket ID
As part of its output, this command also prints the new and the old bucket ID.
Purge the bucket on all slave zones:
cephuser@adm >
radosgw-admin bucket rm \
--purge-objects \
--bucket=BUCKET_NAME \
--yes-i-really-mean-it
Restart the Object Gateway on the master zone first, then on the slave zones as well.
cephuser@ogw >
systemctl restart ceph-radosgw.target
On the master zone, re-enable bucket synchronization.
cephuser@adm >
radosgw-admin bucket sync enable --bucket=BUCKET_NAME
21.13.6 Performing failover and disaster recovery #
If the master zone fails, fail over to the secondary zone for disaster recovery.
Make the secondary zone the master and default zone. For example:
cephuser@adm >
radosgw-admin zone modify --rgw-zone=ZONE-NAME --master --default
By default, Ceph Object Gateway runs in an active-active configuration. If the cluster was configured to run in an active-passive configuration, the secondary zone is a read-only zone. Remove the --read-only status to allow the zone to receive write operations. For example:
cephuser@adm >
radosgw-admin zone modify --rgw-zone=ZONE-NAME --master --default \
--read-only=false
Update the period to make the changes take effect:
cephuser@adm >
radosgw-admin period update --commit
Restart the Ceph Object Gateway:
cephuser@adm >
ceph orch restart rgw
If the former master zone recovers, revert the operation.
From the recovered zone, pull the latest realm configuration from the current master zone.
cephuser@adm >
radosgw-admin realm pull --url=URL-TO-MASTER-ZONE-GATEWAY \
--access-key=ACCESS-KEY --secret=SECRET
Make the recovered zone the master and default zone:
cephuser@adm >
radosgw-admin zone modify --rgw-zone=ZONE-NAME --master --default
Update the period to make the changes take effect:
cephuser@adm >
radosgw-admin period update --commit
Restart the Ceph Object Gateway in the recovered zone:
cephuser@adm >
ceph orch restart rgw
If the secondary zone needs to be a read-only configuration, update the secondary zone:
cephuser@adm >
radosgw-admin zone modify --rgw-zone=ZONE-NAME --read-only
Update the period to make the changes take effect:
cephuser@adm >
radosgw-admin period update --commit
Restart the Ceph Object Gateway in the secondary zone:
cephuser@adm >
ceph orch restart rgw
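The failover to the secondary zone described at the start of this section follows a fixed order: promote the zone, commit the period, restart the gateway. The following is a minimal sketch for a POSIX shell that prints this sequence for review. The function name print_failover_cmds and its read-only argument are hypothetical. The printed commands are the ones used in this section, and the service name rgw in the last command may need to be replaced with the service name reported by ceph orch ps:

```shell
# Print (do not run) the failover command sequence for a secondary zone.
# $1: zone name
# $2: "read-only" if the zone was configured as active-passive (optional)
print_failover_cmds() {
    zone=$1
    mode=${2:-}
    if [ -z "$zone" ]; then
        echo "usage: print_failover_cmds ZONE-NAME [read-only]" >&2
        return 1
    fi
    if [ "$mode" = "read-only" ]; then
        # Promote the zone and clear its read-only status in one step.
        echo "radosgw-admin zone modify --rgw-zone=$zone --master --default --read-only=false"
    else
        echo "radosgw-admin zone modify --rgw-zone=$zone --master --default"
    fi
    echo "radosgw-admin period update --commit"
    echo "ceph orch restart rgw"
}
```

For example, print_failover_cmds us-west read-only lists the three commands for a zone that was previously read-only. Printing the commands instead of running them leaves the decision to fail over with the administrator.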