14 Tuning I/O performance #
  I/O scheduling controls how input/output operations are submitted to
  storage. SUSE Linux Enterprise Server offers various I/O algorithms, called
  elevators, suiting different workloads.
  Elevators can reduce seek operations and prioritize I/O requests.
 
Choosing the best-suited I/O elevator depends not only on the workload but also on the hardware. Single ATA disk systems, SSDs, RAID arrays, and network storage systems, for example, each require different tuning strategies.
14.1 Switching I/O scheduling #
SUSE Linux Enterprise Server picks a default I/O scheduler at boot-time, which can be changed on the fly per block device. This makes it possible to set different algorithms, for example, for the device hosting the system partition and the device hosting a database.
   The default I/O scheduler is chosen for each device based on whether the
   device reports being a rotational disk or not. For rotational disks, the
   BFQ I/O scheduler is picked.
   Other devices default to MQ-DEADLINE or NONE.
  
To change the elevator for a specific device in the running system, run the following command:
> echo SCHEDULER | sudo tee /sys/block/DEVICE/queue/scheduler
   Here, SCHEDULER is one of
   bfq, none,
   kyber, or mq-deadline.
   DEVICE is the block device
   (sda for example). Note that this change does not
   persist across reboots. To change the I/O scheduler permanently for a
   particular device, copy /usr/lib/udev/rules.d/60-io-scheduler.rules to
   /etc/udev/rules.d/60-io-scheduler.rules, and edit
   the latter file to suit your needs.
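As an illustration, a rule like the following in the copied file would pin BFQ on a specific disk (the device name sda and the rule itself are examples; the shipped rule file's actual contents differ):

```
ACTION=="add|change", KERNEL=="sda", ATTR{queue/scheduler}="bfq"
```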
  
On IBM Z, the default I/O scheduler for a storage device is set by the device driver.
elevator boot parameter removed
     The elevator boot parameter has been removed. The blk-mq I/O path replaces cfq, and does not include the 
     elevator boot parameter.
   
14.2 Available I/O elevators with blk-mq I/O path #
Below is a list of elevators available on SUSE Linux Enterprise Server for devices that use the blk-mq I/O path. If an elevator has tunable parameters, they can be set with the command:
> echo VALUE | sudo tee /sys/block/DEVICE/queue/iosched/TUNABLE
In the command above, VALUE is the desired value for the TUNABLE and DEVICE is the block device.
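For example, with a hypothetical device sda and the MQ-DEADLINE tunable fifo_batch, the sysfs path works out as follows (a sketch; the device and tunable names are placeholders for your own):

```shell
# Compose the sysfs path for a scheduler tunable (names are illustrative).
DEVICE=sda
TUNABLE=fifo_batch
path="/sys/block/$DEVICE/queue/iosched/$TUNABLE"
echo "$path"
# Writing the value needs root, for example:
#   echo 32 | sudo tee "$path"
```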
   To find out what elevators are available for a device
   (sda for example), run the following
   command (the currently selected scheduler is listed in brackets):
  
> cat /sys/block/sda/queue/scheduler
[mq-deadline] kyber bfq none
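In scripts, the active scheduler can be extracted from the bracketed field. A minimal sketch, using the sample output above rather than a live sysfs read:

```shell
# Pull the bracketed (active) scheduler name out of the list.
line="[mq-deadline] kyber bfq none"   # normally: line=$(cat /sys/block/sda/queue/scheduler)
current=$(printf '%s\n' "$line" | sed -n 's/.*\[\(.*\)\].*/\1/p')
echo "$current"
```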
      When switching from legacy block to blk-mq I/O path for a device,
      the none option is roughly comparable to
      noop, mq-deadline is comparable
      to deadline, and bfq is
      comparable to cfq.
     
14.2.1 MQ-DEADLINE #
     MQ-DEADLINE is a
     latency-oriented I/O scheduler. MQ-DEADLINE has the following
     tunable parameters:
   
MQ-DEADLINE tunable parameters #

| File | Description |
|---|---|
| writes_starved | Controls how many times reads are preferred over writes. A value of 2 means that two read batches can be dispatched before a write batch is considered. Default is 2. |
| read_expire | Sets the deadline (current time plus the read_expire value) for read operations in milliseconds. Default is 500. |
| write_expire | Sets the deadline (current time plus the write_expire value) for write operations in milliseconds. Default is 5000. |
| front_merges | Enables (1) or disables (0) attempts to front merge requests. Default is 1. |
| fifo_batch | Sets the maximum number of requests per batch (deadline expiration is only checked for batches). This parameter allows balancing between latency and throughput. When set to 1, behavior is effectively first-in-first-out and latency-oriented; larger values increase throughput at some cost in latency. Default is 16. |
14.2.2 NONE #
     When NONE is selected
     as I/O elevator option for blk-mq, no I/O scheduler
     is used, and I/O requests are passed down to the
     device without further I/O scheduling interaction.
   
     NONE is the default for
     NVM Express devices. With no overhead compared to other I/O
     elevator options, it is considered the fastest way of passing down
     I/O requests on multiple queues to such devices.
   
     There are no tunable parameters for NONE.
   
14.2.3 BFQ (Budget Fair Queueing) #
     BFQ is a
     fairness-oriented scheduler. It is described as "a
     proportional-share storage-I/O scheduling algorithm based on the
     slice-by-slice service scheme of CFQ. But BFQ assigns budgets,
     measured in number of sectors, to processes instead of time
     slices." (Source: 
     linux-4.12/block/bfq-iosched.c)
   
     BFQ makes it possible to assign
     I/O priorities to tasks, which are taken into account during
     scheduling decisions (see Section 9.3.3, “Prioritizing disk access with ionice”).
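As a quick illustration of I/O priorities (assuming util-linux's ionice is installed), running ionice under itself prints the scheduling class a child process inherits:

```shell
# Launch a child in the best-effort class at priority 4; the inner ionice
# (no arguments) reports its own I/O scheduling class and priority.
ionice -c 2 -n 4 ionice
```

Note that the assigned priority only influences scheduling when the device's active elevator honors it, as BFQ does.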
   
     The BFQ scheduler has the
     following tunable parameters:
   
BFQ tunable parameters #

| File | Description |
|---|---|
| slice_idle | Value in milliseconds specifies how long to idle, waiting for the next request on an empty queue. Default is 8. |
| slice_idle_us | Same as slice_idle, but in microseconds. Default is 8000. |
| low_latency | Enables (1) or disables (0) BFQ's low-latency mode, which privileges interactive and soft real-time applications. Default is 1. |
| back_seek_max | Maximum value (in Kbytes) for backward seeking. Default is 16384. |
| back_seek_penalty | Used to compute the cost of backward seeking. Default is 2. |
| fifo_expire_async | Value (in milliseconds) is used to set the timeout of asynchronous requests. Default is 250. |
| fifo_expire_sync | Value in milliseconds specifies the timeout of synchronous requests. Default is 125. |
| timeout_sync | Maximum time in milliseconds that a task (queue) is serviced after it has been selected. Default is 125. |
| max_budget | Limit for the number of sectors that are served at maximum within timeout_sync. Default is 0 (automatic tuning). |
| strict_guarantees | Enables (1) or disables (0) strict service guarantees, which provide stronger fairness at some cost in throughput. Default is 0. |
14.2.4 KYBER #
    KYBER is a
    latency-oriented I/O scheduler. It makes it possible to set target latencies
    for reads and synchronous writes, and it throttles I/O requests
    to try to meet these targets.
   
KYBER tunable parameters #

| File | Description |
|---|---|
| read_lat_nsec | Sets the target latency for read operations in nanoseconds. Default is 2000000 (2 ms). |
| write_lat_nsec | Sets the target latency for synchronous write operations in nanoseconds. Default is 10000000 (10 ms). |
14.3 I/O barrier tuning #
Some file systems (for example, Ext3 or Ext4) send write barriers to disk after fsync or during transaction commits. Write barriers enforce proper ordering of writes, making volatile disk write caches safe to use (at some performance penalty). If your disks are battery-backed in one way or another, disabling barriers can safely improve performance.
nobarrier is deprecated in XFS
  Note that the nobarrier option has been completely deprecated
  for XFS, and it is not a valid mount option in SUSE Linux Enterprise 15 SP2 and upward. Any
  XFS mount command that explicitly specifies the flag will fail to mount the
  file system. To prevent this from happening, make sure that no scripts or
  fstab entries contain the nobarrier option.
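A quick check for leftover nobarrier entries might look like this (the sample fstab line is fabricated so the snippet is self-contained; point the grep at the real /etc/fstab and your mount scripts):

```shell
# Scan an fstab-style file for the deprecated nobarrier option.
fstab=$(mktemp)
printf '%s\n' '/dev/sda2 / xfs defaults,nobarrier 0 0' > "$fstab"
if grep -q 'nobarrier' "$fstab"; then
  found=yes
  echo "nobarrier present"
fi
rm -f "$fstab"
```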
  
   Sending write barriers can be disabled using the
   nobarrier mount option.
  
Disabling barriers when disks cannot guarantee caches are properly written in case of power failure can lead to severe file system corruption and data loss.