Commit 580644e

Merge pull request #55760 from tmalove/etcd-ocpbugs-7283-tlove
[OCPBUGS-7283]: Revert etcd disk latency to 10ms
2 parents d5e0ff5 + 55230af commit 580644e

1 file changed: +2 -2 lines changed

modules/recommended-etcd-practices.adoc

Lines changed: 2 additions & 2 deletions
@@ -11,7 +11,7 @@ Although etcd is not particularly I/O intensive, it requires a low latency block
 
 Those latencies can cause etcd to miss heartbeats, not commit new proposals to the disk on time, and ultimately experience request timeouts and temporary leader loss. High write latencies also lead to an OpenShift API slowness, which affects cluster performance. Because of these reasons, avoid colocating other workloads on the control-plane nodes.
 
-In terms of latency, run etcd on top of a block device that can write at least 50 IOPS of 8000 bytes long sequentially. That is, with a latency of 20ms, keep in mind that uses fdatasync to synchronize each write in the WAL. For heavy loaded clusters, sequential 500 IOPS of 8000 bytes (2 ms) are recommended. To measure those numbers, you can use a benchmarking tool, such as fio.
+In terms of latency, run etcd on top of a block device that can write at least 50 IOPS of 8000 bytes long sequentially. That is, with a latency of 10ms, keep in mind that uses fdatasync to synchronize each write in the WAL. For heavy loaded clusters, sequential 500 IOPS of 8000 bytes (2 ms) are recommended. To measure those numbers, you can use a benchmarking tool, such as fio.
 
 To achieve such performance, run etcd on machines that are backed by SSD or NVMe disks with low latency and high throughput. Consider single-level cell (SLC) solid-state drives (SSDs), which provide 1 bit per memory cell, are durable and reliable, and are ideal for write-intensive workloads.
 
@@ -65,7 +65,7 @@ $ sudo docker run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/openshift-scale
 ----
 --
 
-The output reports whether the disk is fast enough to host etcd by comparing the 99th percentile of the fsync metric captured from the run to see if it is less than 20 ms. A few of the most important etcd metrics that might affected by I/O performance are as follow:
+The output reports whether the disk is fast enough to host etcd by comparing the 99th percentile of the fsync metric captured from the run to see if it is less than 10 ms. A few of the most important etcd metrics that might affected by I/O performance are as follow:
 
 * `etcd_disk_wal_fsync_duration_seconds_bucket` metric reports the etcd's WAL fsync duration
 * `etcd_disk_backend_commit_duration_seconds_bucket` metric reports the etcd backend commit latency duration
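
Note on the changed threshold: the second hunk describes comparing the 99th percentile of the fsync metric against 10 ms. The following is a minimal Python sketch of that comparison, not the benchmarking tool's actual implementation. The function names, the sample values, and the choice of the nearest-rank percentile method are assumptions made for illustration; the diff does not show how the tool computes its percentile.

```python
import math

# Threshold stated in the updated text of this commit (10 ms).
FSYNC_P99_THRESHOLD_MS = 10.0


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method.

    The nearest-rank method is an assumption here; other tools may interpolate.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def disk_fast_enough(fsync_latencies_ms, threshold_ms=FSYNC_P99_THRESHOLD_MS):
    """Return (ok, p99), where ok is True if p99 is strictly below the threshold."""
    p99 = percentile_nearest_rank(fsync_latencies_ms, 99)
    return p99 < threshold_ms, p99


if __name__ == "__main__":
    # Hypothetical fsync durations in milliseconds, not real measurements.
    samples_ms = [1.2, 0.9, 3.4, 2.2, 8.7, 1.1, 0.8, 12.5, 1.0, 1.3]
    ok, p99 = disk_fast_enough(samples_ms)
    print(f"p99 fsync = {p99} ms, fast enough for etcd: {ok}")
```

With so few samples the nearest-rank 99th percentile is simply the slowest write, so a single outlier decides the result; a real run would collect far more samples.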
