Chapter 13. Tuning Per-Device I/O Performance

Contents

13.1. I/O Scheduler -- /sys/block/<device>/queue/scheduler
13.2. I/O Barrier Tuning

13.1. I/O Scheduler -- /sys/block/<device>/queue/scheduler

This parameter allows selecting the I/O scheduler algorithm. There are three options, described in the subsections below.
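When the file is read, the currently active scheduler is shown in brackets; writing the name of another scheduler into the file switches to it at runtime. A minimal example, assuming the device is sda:

  cat /sys/block/sda/queue/scheduler
  noop deadline [cfq]
  echo deadline > /sys/block/sda/queue/scheduler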

13.1.1. CFQ

This is the default option. CFQ is a fairness-oriented I/O scheduler: the algorithm assigns each thread a time slice in which it is allowed to submit I/O to disk, so that each thread gets a fair share of the I/O throughput. This I/O scheduler also allows assigning I/O priorities to tasks, which are taken into account during scheduling decisions (see man 1 ionice), as in the example below.
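For example, ionice can lower the I/O priority of a process so that it interferes less with other I/O (the PID and the command are placeholders):

  ionice -c 2 -n 7 -p 4242   # best-effort class, lowest priority within that class
  ionice -c 3 du -sh /data   # run a command in the idle class

The CFQ scheduler has the following parameters: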

/sys/block/<device>/queue/iosched/slice_idle

When a task has no more I/O to submit within its time slice, the I/O scheduler waits for a while before scheduling the next thread, to improve the locality of I/O. For media where locality does not play a big role (SSDs, SANs with many disks), setting /sys/block/<device>/queue/iosched/slice_idle to 0 can improve the throughput considerably.
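For example, to disable idling on an SSD (assuming the device is sda):

  echo 0 > /sys/block/sda/queue/iosched/slice_idle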

/sys/block/<device>/queue/iosched/quantum

This option limits the maximum number of requests that are dispatched to the device at once. The default value is 4. For storage with several disks, this setting can unnecessarily limit the parallel processing of requests, so increasing the value can improve performance. However, the latency of some I/O may increase because more requests are buffered inside the storage. When changing this value, consider also tuning /sys/block/<device>/queue/iosched/slice_async_rq (the default value is 2), which limits the maximum number of asynchronous requests, usually write requests, that are submitted in one time slice.
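For example, to allow more parallelism on a multi-disk array (the values 8 and 4 are illustrative only; assuming the device is sda):

  echo 8 > /sys/block/sda/queue/iosched/quantum
  echo 4 > /sys/block/sda/queue/iosched/slice_async_rq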

/sys/block/<device>/queue/iosched/low_latency

For workloads where I/O latency is crucial, setting /sys/block/<device>/queue/iosched/low_latency to 1 can help.
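For example (assuming the device is sda):

  echo 1 > /sys/block/sda/queue/iosched/low_latency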

13.1.2. NOOP

A trivial scheduler that just passes the I/O that comes to it down to the device. NOOP is useful for checking whether the complex scheduling decisions of the other I/O schedulers are causing I/O performance regressions.

In some cases it can be helpful for devices that do I/O scheduling themselves, such as intelligent storage, or for devices that do not depend on mechanical movement, such as SSDs. Usually, the DEADLINE I/O scheduler is a better choice for these devices: it is rather lightweight but still does some useful work. However, NOOP may produce better performance for certain workloads.
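Besides switching the scheduler at runtime as shown in Section 13.1, a default scheduler for all block devices can be selected at boot time with the elevator= kernel parameter, for example:

  elevator=noop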

13.1.3. DEADLINE

Latency-oriented I/O scheduler. The algorithm preferentially serves reads before writes. /sys/block/<device>/queue/iosched/writes_starved controls how many read batches can be sent to disk before pending writes are allowed again. The scheduler also tries to observe the deadlines set in /sys/block/<device>/queue/iosched/read_expire for reads and /sys/block/<device>/queue/iosched/write_expire for writes, after which the I/O must be submitted to disk. This I/O scheduler can provide superior throughput over the CFQ I/O scheduler in cases where several threads read and write and fairness is not an issue, for example with several parallel readers from a SAN or with database-like loads.
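For example, to bias the scheduler even more toward reads (the values are illustrative only; the expire times are in milliseconds; assuming the device is sda):

  echo 4 > /sys/block/sda/queue/iosched/writes_starved
  echo 250 > /sys/block/sda/queue/iosched/read_expire
  echo 5000 > /sys/block/sda/queue/iosched/write_expire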

13.2. I/O Barrier Tuning

Most file systems (XFS, ext3, ext4, reiserfs) send write barriers to disk after fsync or during transaction commits. Write barriers enforce proper ordering of writes, making volatile disk write caches safe to use, at some performance penalty. If your disks are battery-backed in one way or another, disabling barriers may safely improve performance.

Sending write barriers can be disabled using the barrier=0 mount option (for ext3, ext4, and reiserfs), or using the nobarrier mount option (for XFS).
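For example, to disable barriers on an already mounted ext3 file system, or to mount an XFS file system without barriers (the devices and mount points are placeholders):

  mount -o remount,barrier=0 /data        # ext3, ext4, reiserfs
  mount -o nobarrier /dev/sdb1 /scratch   # XFS

A corresponding /etc/fstab entry could look as follows:

  /dev/sda2  /data  ext3  defaults,barrier=0  0 2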

[Warning]

Disabling barriers when the disks cannot guarantee that their caches are properly written back in case of a power failure can lead to severe file system corruption and data loss.