Jump to contentJump to page navigation: previous page [access key p]/next page [access key n]
documentation.suse.com / SUSE Enterprise Storage 7.1 Documentation / Administration and Operations Guide / Configuring a Cluster / Ceph cluster configuration
Applies to SUSE Enterprise Storage 7.1

28 Ceph cluster configuration

This chapter describes how to configure the Ceph cluster by means of configuration options.

28.1 Configure the ceph.conf file

cephadm uses a basic ceph.conf file that only contains a minimal set of options for connecting to MONs, authenticating, and fetching configuration information. In most cases, this is limited to the mon_host option (although this can be avoided through the use of DNS SRV records).

Important
Important

The ceph.conf file no longer serves as a central place for storing cluster configuration, in favor of the configuration database (see Section 28.2, “Configuration database”).

If you still need to change cluster configuration via the ceph.conf file—for example, because you use a client that does not support reading options form the configuration database—you need to run the following command, and take care of maintaining and distributing the ceph.conf file across the whole cluster:

cephuser@adm > ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf false

28.1.1 Accessing ceph.conf inside container images

Although Ceph daemons run inside containers, you can still access their ceph.conf configuration file. It is bind-mounted as the following file on the host system:

/var/lib/ceph/CLUSTER_FSID/DAEMON_NAME/config

Replace CLUSTER_FSID with the unique FSID of the running cluster as returned by the ceph fsid command, and DAEMON_NAME with the name of the specific daemon as listed by the ceph orch ps command. For example:

/var/lib/ceph/b4b30c6e-9681-11ea-ac39-525400d7702d/osd.2/config

To modify the configuration of a daemon, edit its config file and restart it:

# systemctl restart ceph-CLUSTER_FSID-DAEMON_NAME

For example:

# systemctl restart ceph-b4b30c6e-9681-11ea-ac39-525400d7702d-osd.2
Important
Important

All custom settings will be lost after cephadm redeploys the daemon.

28.2 Configuration database

Ceph Monitors manage a central database of configuration options that affect the behavior of the whole cluster.

28.2.1 Configuring sections and masks

Configuration options stored by the MON can live in a global section, daemon type section, or a specific daemon section. In addition, options may also have a mask associated with them to further restrict to which daemons or clients the option applies. Masks have two forms:

  • TYPE:LOCATION where TYPE is a CRUSH property such as rack or host, while LOCATION is a value for that property.

    For example, host:example_host will limit the option only to daemons or clients running on a particular host.

  • CLASS:DEVICE_CLASS where DEVICE_CLASS is the name of a CRUSH device class such as hdd or ssd. For example, class:ssd will limit the option only to OSDs backed by SSDs. This mask has no effect for non-OSD daemons or clients.

28.2.2 Setting and reading configuration options

Use the following commands to set or read cluster configuration options. The WHO parameter may be a section name, a mask, or a combination of both separated by a slash (/) character. For example, osd/rack:foo represents all OSD daemons in the rack called foo.

ceph config dump

Dumps the entire configuration database for a whole cluster.

ceph config get WHO

Dumps the configuration for a specific daemon or client (for example, mds.a), as stored in the configuration database.

ceph config set WHO OPTION VALUE

Sets the configuration option to the specified value in the configuration database.

ceph config show WHO

Shows the reported running configuration for a running daemon. These settings may differ from those stored by the monitors if there are also local configuration files in use, or options have been overridden on the command line or at runtime. The source of the option values is reported as part of the output.

ceph config assimilate-conf -i INPUT_FILE -o OUTPUT_FILE

Imports a configuration file specified as INPUT_FILE and stores any valid options into the configuration database. Any settings that are unrecognized, invalid, or cannot be controlled by the monitor will be returned in an abbreviated file stored as OUTPUT_FILE. This command is useful for transitioning from legacy configuration files to centralized monitor-based configuration.

28.2.3 Configuring daemons at runtime

In most cases, Ceph allows you to make changes to the configuration of a daemon at runtime. This is useful, for example, when you need to increase or decrease the amount of logging output, or when performing runtime cluster optimization.

You can update the values of configuration options with the following command:

cephuser@adm > ceph config set DAEMON OPTION VALUE

For example, to adjust the debugging log level on a specific OSD, run:

cephuser@adm > ceph config set osd.123 debug_ms 20
Note
Note

If the same option is also customized in a local configuration file, the monitor setting will be ignored because it has lower priority than the configuration file.

28.2.3.1 Overriding values

You can temporarily modify an option value using the tell or daemon subcommands. Such modification only affect the running process and is discarded after the daemon or process restarts.

There are two ways to override values:

  • Use the tell subcommand to send a message to a specific daemon from any cluster node:

    cephuser@adm > ceph tell DAEMON config set OPTION VALUE

    For example:

    cephuser@adm > ceph tell osd.123 config set debug_osd 20
    Tip
    Tip

    The tell subcommand accepts wild cards as daemon identifiers. For example, to adjust the debug level on all OSD daemons, run:

    cephuser@adm > ceph tell osd.* config set debug_osd 20
  • Use the daemon subcommand to connect to a specific daemon process via a socket in /var/run/ceph from the node where the process is running:

    cephuser@adm > cephadm enter --name osd.ID -- ceph daemon DAEMON config set OPTION VALUE

    For example:

    cephuser@adm > cephadm enter --name osd.4 -- ceph daemon osd.4 config set debug_osd 20
Tip
Tip

When viewing runtime settings with the ceph config show command (see Section 28.2.3.2, “Viewing runtime settings”), temporarily overridden values will be shown with a source override.

28.2.3.2 Viewing runtime settings

To view all options set for a daemon:

cephuser@adm > ceph config show-with-defaults osd.0

To view all non-default options set for a daemon:

cephuser@adm > ceph config show osd.0

To inspect a specific option:

cephuser@adm > ceph config show osd.0 debug_osd

You can also connect to a running daemon from the node where its process is running, and observe its configuration:

cephuser@adm > cephadm enter --name osd.0 -- ceph daemon osd.0 config show

To view only non-default settings:

cephuser@adm > cephadm enter --name osd.0 -- ceph daemon osd.0 config diff

To inspect a specific option:

cephuser@adm > cephadm enter --name osd.0 -- ceph daemon osd.0 config get debug_osd

28.3 config-key store

config-key is a general-purpose service offered by the Ceph Monitors. It simplifies managing configuration keys by storing key-value pairs persistently. config-key is mainly used by Ceph tools and daemons.

Tip
Tip

After you add a new key or modify an existing one, restart the affected service for the changes to take effect. Find more details about operating Ceph services in Chapter 14, Operation of Ceph services.

Use the config-key command to operate the config-key store. The config-key command uses the following subcommands:

ceph config-key rm KEY

Deletes the specified key.

ceph config-key exists KEY

Checks for the existence of the specified key.

ceph config-key get KEY

Retrieves the value of the specified key.

ceph config-key ls

Lists all keys.

ceph config-key dump

Dumps all keys and their values.

ceph config-key set KEY VALUE

Stores the specified key with the given value.

28.3.1 iSCSI Gateway

The iSCSI Gateway uses the config-key store to save or read its configuration options. All iSCSI Gateway related keys are prefixed with the iscsi string, for example:

iscsi/trusted_ip_list
iscsi/api_port
iscsi/api_user
iscsi/api_password
iscsi/api_secure

If you need, for example, two sets of configuration options, extend the prefix with another descriptive keyword, for example datacenterA and datacenterB:

iscsi/datacenterA/trusted_ip_list
iscsi/datacenterA/api_port
[...]
iscsi/datacenterB/trusted_ip_list
iscsi/datacenterB/api_port
[...]

28.4 Ceph OSD and BlueStore

28.4.1 Configuring automatic cache sizing

BlueStore can be configured to automatically resize its caches when tc_malloc is configured as the memory allocator and the bluestore_cache_autotune setting is enabled. This option is currently enabled by default. BlueStore will attempt to keep OSD heap memory usage under a designated target size via the osd_memory_target configuration option. This is a best effort algorithm and caches will not shrink smaller than the amount specified by osd_memory_cache_min. Cache ratios will be chosen based on a hierarchy of priorities. If priority information is not available, the bluestore_cache_meta_ratio and bluestore_cache_kv_ratio options are used as fallbacks.

bluestore_cache_autotune

Automatically tunes the ratios assigned to different BlueStore caches while respecting minimum values. Default is True.

osd_memory_target

When tc_malloc and bluestore_cache_autotune are enabled, try to keep this many bytes mapped in memory.

Note
Note

This may not exactly match the RSS memory usage of the process. While the total amount of heap memory mapped by the process should generally stay close to this target, there is no guarantee that the kernel will actually reclaim memory that has been unmapped.

osd_memory_cache_min

When tc_malloc and bluestore_cache_autotune are enabled, set the minimum amount of memory used for caches.

Note
Note

Setting this value too low can result in significant cache thrashing.

28.5 Ceph Object Gateway

You can influence the Object Gateway behavior by a number of options. If an option is not specified, its default value is used. A complete list of the Object Gateway options follows:

28.5.1 General Settings

rgw_frontends

Configures the HTTP front-end(s). Specify multiple front-ends in a comma-delimited list. Each front-end configuration may include a list of options separated by spaces, where each option is in the form “key=value” or “key”. Default is beast port=7480.

rgw_data

Sets the location of the data files for the Object Gateway. Default is /var/lib/ceph/radosgw/CLUSTER_ID.

rgw_enable_apis

Enables the specified APIs. Default is 's3, swift, swift_auth, admin All APIs'.

rgw_cache_enabled

Enables or disables the Object Gateway cache. Default is true.

rgw_cache_lru_size

The number of entries in the Object Gateway cache. Default is 10000.

rgw_socket_path

The socket path for the domain socket. FastCgiExternalServer uses this socket. If you do not specify a socket path, the Object Gateway will not run as an external server. The path you specify here needs to be the same as the path specified in the rgw.conf file.

rgw_fcgi_socket_backlog

The socket backlog for fcgi. Default is 1024.

rgw_host

The host for the Object Gateway instance. It can be an IP address or a host name. Default is 0.0.0.0.

rgw_port

The port number where the instance listens for requests. If not specified, the Object Gateway runs external FastCGI.

rgw_dns_name

The DNS name of the served domain.

rgw_script_uri

The alternative value for the SCRIPT_URI if not set in the request.

rgw_request_uri

The alternative value for the REQUEST_URI if not set in the request.

rgw_print_continue

Enable 100-continue if it is operational. Default is true.

rgw_remote_addr_param

The remote address parameter. For example, the HTTP field containing the remote address, or the X-Forwarded-For address if a reverse proxy is operational. Default is REMOTE_ADDR.

rgw_op_thread_timeout

The timeout in seconds for open threads. Default is 600.

rgw_op_thread_suicide_timeout

The time timeout in seconds before the Object Gateway process dies. Disabled if set to 0 (default).

rgw_thread_pool_size

Number of threads for the Beast server. Increase to a higher value if you need to serve more requests. Defaults to 100 threads.

rgw_num_rados_handles

The number of RADOS cluster handles for Object Gateway. Each Object Gateway worker thread now gets to pick a RADOS handle for its lifetime. This option may be deprecated and removed in future releases. Default is 1.

rgw_num_control_oids

The number of notification objects used for cache synchronization between different Object Gateway instances. Default is 8.

rgw_init_timeout

The number of seconds before the Object Gateway gives up on initialization. Default is 30.

rgw_mime_types_file

The path and location of the MIME types. Used for Swift auto-detection of object types. Default is /etc/mime.types.

rgw_gc_max_objs

The maximum number of objects that may be handled by garbage collection in one garbage collection processing cycle. Default is 32.

rgw_gc_obj_min_wait

The minimum wait time before the object may be removed and handled by garbage collection processing. Default is 2 * 3600.

rgw_gc_processor_max_time

The maximum time between the beginning of two consecutive garbage collection processing cycles. Default is 3600.

rgw_gc_processor_period

The cycle time for garbage collection processing. Default is 3600.

rgw_s3_success_create_obj_status

The alternate success status response for create-obj. Default is 0.

rgw_resolve_cname

Whether the Object Gateway should use DNS CNAME record of the request host name field (if host name is not equal to the Object Gateway DNS name). Default is false.

rgw_obj_stripe_size

The size of an object stripe for Object Gateway objects. Default is 4 << 20.

rgw_extended_http_attrs

Add a new set of attributes that can be set on an entity (for example, a user, a bucket, or an object). These extra attributes can be set through HTTP header fields when putting the entity or modifying it using the POST method. If set, these attributes will return as HTTP fields when requesting GET/HEAD on the entity. Default is content_foo, content_bar, x-foo-bar.

rgw_exit_timeout_secs

Number of seconds to wait for a process before exiting unconditionally. Default is 120.

rgw_get_obj_window_size

The window size in bytes for a single object request. Default is 16 << 20.

rgw_get_obj_max_req_size

The maximum request size of a single GET operation sent to the Ceph Storage Cluster. Default is 4 << 20.

rgw_relaxed_s3_bucket_names

Enables relaxed S3 bucket name rules for US region buckets. Default is false.

rgw_list_buckets_max_chunk

The maximum number of buckets to retrieve in a single operation when listing user buckets. Default is 1000.

rgw_override_bucket_index_max_shards

Represents the number of shards for the bucket index object. Setting 0 (default) indicates there is no sharding. It is not recommended to set a value too large (for example 1000) as it increases the cost for bucket listing. This variable should be set in the client or global sections so that it is automatically applied to radosgw-admin commands.

rgw_curl_wait_timeout_ms

The timeout in milliseconds for certain curl calls. Default is 1000.

rgw_copy_obj_progress

Enables output of object progress during long copy operations. Default is true.

rgw_copy_obj_progress_every_bytes

The minimum bytes between copy progress output. Default is 1024 * 1024.

rgw_admin_entry

The entry point for an admin request URL. Default is admin.

rgw_content_length_compat

Enable compatibility handling of FCGI requests with both CONTENT_LENGTH AND HTTP_CONTENT_LENGTH set. Default is false.

rgw_bucket_quota_ttl

The amount of time in seconds for which cached quota information is trusted. After this timeout, the quota information will be re-fetched from the cluster. Default is 600.

rgw_user_quota_bucket_sync_interval

The amount of time in seconds for which the bucket quota information is accumulated before synchronizing to the cluster. During this time, other Object Gateway instances will not see the changes in the bucket quota stats related to operations on this instance. Default is 180.

rgw_user_quota_sync_interval

The amount of time in seconds for which user quota information is accumulated before synchronizing to the cluster. During this time, other Object Gateway instances will not see the changes in the user quota stats related to operations on this instance. Default is 180.

rgw_bucket_default_quota_max_objects

Default maximum number of objects per bucket. It is set on new users if no other quota is specified, and has no effect on existing users. This variable should be set in the client or global sections so that it is automatically applied to radosgw-admin commands. Default is -1.

rgw_bucket_default_quota_max_size

Default maximum capacity per bucket in bytes. It is set on new users if no other quota is specified, and has no effect on existing users. Default is -1.

rgw_user_default_quota_max_objects

Default maximum number of objects for a user. This includes all objects in all buckets owned by the user. It is set on new users if no other quota is specified, and has no effect on existing users. Default is -1.

rgw_user_default_quota_max_size

The value for user maximum size quota in bytes set on new users if no other quota is specified. It has no effect on existing users. Default is -1.

rgw_verify_ssl

Verify SSL certificates while making requests. Default is true.

rgw_max_chunk_size

Maximum size of a chunk of data that will be read in a single operation. Increasing the value to 4 MB (4194304) will provide better performance when processing large objects. Default is 128 kB (131072).

Multisite Settings
rgw_zone

The name of the zone for the gateway instance. If no zone is set, a cluster-wide default can be configured with the radosgw-admin zone default command.

rgw_zonegroup

The name of the zonegroup for the gateway instance. If no zonegroup is set, a cluster-wide default can be configured with the radosgw-admin zonegroup default command.

rgw_realm

The name of the realm for the gateway instance. If no realm is set, a cluster-wide default can be configured with theradosgw-admin realm default command.

rgw_run_sync_thread

If there are other zones in the realm to synchronize from, spawn threads to handle the synchronization of data and metadata. Default is true.

rgw_data_log_window

The data log entries window in seconds. Default is 30.

rgw_data_log_changes_size

The number of in-memory entries to hold for the data changes log. Default is 1000.

rgw_data_log_obj_prefix

The object name prefix for the data log. Default is 'data_log'.

rgw_data_log_num_shards

The number of shards (objects) on which to keep the data changes log. Default is 128.

rgw_md_log_max_shards

The maximum number of shards for the metadata log. Default is 64.

Swift Settings
rgw_enforce_swift_acls

Enforces the Swift Access Control List (ACL) settings. Default is true.

rgw_swift_token_expiration

The time in seconds for expiring a Swift token. Default is 24 * 3600.

rgw_swift_url

The URL for the Ceph Object Gateway Swift API.

rgw_swift_url_prefix

The URL prefix for the Swift StorageURL that goes in front of the '/v1' part. This allows to run several Gateway instances on the same host. For compatibility, setting this configuration variable to empty causes the default '/swift' to be used. Use explicit prefix '/' to start StorageURL at the root.

Warning
Warning

Setting this option to '/' will not work if S3 API is enabled. Keep in mind that disabling S3 will make it impossible to deploy the Object Gateway in the multisite configuration.

rgw_swift_auth_url

Default URL for verifying v1 authentication tokens when the internal Swift authentication is not used.

rgw_swift_auth_entry

The entry point for a Swift authentication URL. Default is auth.

rgw_swift_versioning_enabled

Enables the Object Versioning of OpenStack Object Storage API. This allows clients to put the X-Versions-Location attribute on containers that should be versioned. The attribute specifies the name of container storing archived versions. It must be owned by the same user as the versioned container for reasons of access control verification—ACLs are not taken into consideration. Those containers cannot be versioned by the S3 object versioning mechanism. Default is false.

Logging Settings
rgw_log_nonexistent_bucket

Enables the Object Gateway to log a request for a non-existent bucket. Default is false.

rgw_log_object_name

The logging format for an object name. See the manual page man 1 date for details about format specifiers. Default is %Y-%m-%d-%H-%i-%n.

rgw_log_object_name_utc

Whether a logged object name includes a UTC time. If set to false (default), it uses the local time.

rgw_usage_max_shards

The maximum number of shards for usage logging. Default is 32.

rgw_usage_max_user_shards

The maximum number of shards used for a single user’s usage logging. Default is 1.

rgw_enable_ops_log

Enable logging for each successful Object Gateway operation. Default is false.

rgw_enable_usage_log

Enable the usage log. Default is false.

rgw_ops_log_rados

Whether the operations log should be written to the Ceph Storage Cluster back-end. Default is true.

rgw_ops_log_socket_path

The Unix domain socket for writing operations logs.

rgw_ops_log_data_backlog

The maximum data backlog data size for operations logs written to a Unix domain socket. Default is 5 << 20.

rgw_usage_log_flush_threshold

The number of dirty merged entries in the usage log before flushing synchronously. Default is 1024.

rgw_usage_log_tick_interval

Flush pending usage log data every 'n' seconds. Default is 30.

rgw_log_http_headers

Comma-delimited list of HTTP headers to include in log entries. Header names are case-insensitive, and use the full header name with words separated by underscores. For example, 'http_x_forwarded_for', 'http_x_special_k'.

rgw_intent_log_object_name

The logging format for the intent log object name. See the manual page man 1 date for details about format specifiers. Default is '%Y-%m-%d-%i-%n'.

rgw_intent_log_object_name_utc

Whether the intent log object name includes a UTC time. If set to false (default), it uses the local time.

Keystone Settings
rgw_keystone_url

The URL for the Keystone server.

rgw_keystone_api_version

The version (2 or 3) of OpenStack Identity API that should be used for communication with the Keystone server. Default is 2.

rgw_keystone_admin_domain

The name of the OpenStack domain with the administrator privilege when using OpenStack Identity API v3.

rgw_keystone_admin_project

The name of the OpenStack project with the administrator privilege when using OpenStack Identity API v3. If not set, the value of the rgw keystone admin tenant will be used instead.

rgw_keystone_admin_token

The Keystone administrator token (shared secret). In the Object Gateway, authentication with the administrator token has priority over authentication with the administrator credentials (options rgw keystone admin user, rgw keystone admin password, rgw keystone admin tenant, rgw keystone admin project, and rgw keystone admin domain). The administrator token feature is considered as deprecated.

rgw_keystone_admin_tenant

The name of the OpenStack tenant with the administrator privilege (Service Tenant) when using OpenStack Identity API v2.

rgw_keystone_admin_user

The name of the OpenStack user with the administrator privilege for Keystone authentication (Service User) when using OpenStack Identity API v2.

rgw_keystone_admin_password

The password for the OpenStack administrator user when using OpenStack Identity API v2.

rgw_keystone_accepted_roles

The roles required to serve requests. Default is 'Member, admin'.

rgw_keystone_token_cache_size

The maximum number of entries in each Keystone token cache. Default is 10000.

rgw_keystone_revocation_interval

The number of seconds between token revocation checks. Default is 15 * 60.

rgw_keystone_verify_ssl

Verify SSL certificates while making token requests to Keystone. Default is true.

28.5.1.1 Additional notes

rgw_dns_name

Allows clients to use vhost-style buckets.

vhost-style access refers to the use of bucketname.s3-endpoint/object-path. This is in comparison to path-style access: s3-endpoint/bucket/object

If the rgw dns name is set, verify that the S3 client is configured to direct requests to the endpoint specified by rgw dns name.

28.5.2 Configuring HTTP front-ends

28.5.2.1 Beast

port, ssl_port

IPv4 & IPv6 listening port numbers. You can specify multiple port numbers:

port=80 port=8000 ssl_port=8080

Default is 80.

endpoint, ssl_endpoint

The listening addresses in the form 'address[:port]', where the address is an IPv4 address string in dotted decimal form, or an IPv6 address in hexadecimal notation surrounded by square brackets. Specifying an IPv6 endpoint would listen to IPv6 only. The optional port number defaults to 80 for endpoint and 443 for ssl_endpoint. You can specify multiple addresses:

endpoint=[::1] endpoint=192.168.0.100:8000 ssl_endpoint=192.168.0.100:8080
ssl_private_key

Optional path to the private key file used for SSL-enabled endpoints. If not specified, the ssl_certificate file is used as a private key.

tcp_nodelay

If specified, the socket option will disable Nagle's algorithm on the connection. It means that packets will be sent as soon as possible instead of waiting for a full buffer or timeout to occur.

'1' disables Nagle's algorithm for all sockets.

'0' keeps Nagle's algorithm enabled (default).

Example 28.1: Example Beast Configuration
cephuser@adm > ceph config set rgw.myrealm.myzone.ses-node1.kwwazo \
 rgw_frontends beast port=8000 ssl_port=443 \
 ssl_certificate=/etc/ssl/ssl.crt \
 error_log_file=/var/log/radosgw/beast.error.log

28.5.2.2 CivetWeb

port

The listening port number. For SSL-enabled ports, add an 's' suffix (for example, '443s'). To bind a specific IPv4 or IPv6 address, use the form 'address:port'. You can specify multiple endpoints either by joining them with '+' or by providing multiple options:

port=127.0.0.1:8000+443s
port=8000 port=443s

Default is 7480.

num_threads

The number of threads spawned by Civetweb to handle incoming HTTP connections. This effectively limits the number of concurrent connections that the front-end can service.

Default is the value specified by the rgw_thread_pool_size option.

request_timeout_ms

The amount of time in milliseconds that Civetweb will wait for more incoming data before giving up.

Default is 30000 milliseconds.

access_log_file

Path to the access log file. You can specify either a full path, or a path relative to the current working directory. If not specified (default), then accesses are not logged.

error_log_file

Path to the error log file. You can specify either a full path, or a path relative to the current working directory. If not specified (default), then errors are not logged.

Example 28.2: Example Civetweb Configuration in /etc/ceph/ceph.conf
cephuser@adm > ceph config set rgw.myrealm.myzone.ses-node2.ingabw \
 rgw_frontends civetweb port=8000+443s request_timeout_ms=30000 \
 error_log_file=/var/log/radosgw/civetweb.error.log

28.5.2.3 Common Options

ssl_certificate

Path to the SSL certificate file used for SSL-enabled endpoints.

prefix

A prefix string that is inserted into the URI of all requests. For example, a Swift-only front-end could supply a URI prefix of /swift.