Commit b0a4b49

Merge pull request #51709 from sheriff-rh/etcd-fix
2 parents 2ebd84f + 301caa0 commit b0a4b49

2 files changed, +4 -4 lines changed
modules/etcd-defrag.adoc

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@

 For large and dense clusters, etcd can suffer from poor performance if the keyspace grows too large and exceeds the space quota. Periodically maintain and defragment etcd to free up space in the data store. Monitor Prometheus for etcd metrics and defragment it when required; otherwise, etcd can raise a cluster-wide alarm that puts the cluster into a maintenance mode that accepts only key reads and deletes.

-.Monitor these key metrics:
+Monitor these key metrics:

 * `etcd_server_quota_backend_bytes`, which is the current quota limit
 * `etcd_mvcc_db_total_size_in_use_in_bytes`, which indicates the actual database usage after a history compaction
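
As an illustration of how the two metrics in this hunk relate, the following Python sketch compares the in-use database size against the quota. It is not part of the commit or of the documented procedure. The function names and the 0.8 threshold are hypothetical, and the byte values would have to be read from Prometheus separately.

```python
def quota_usage_fraction(in_use_bytes: float, quota_bytes: float) -> float:
    """Return in-use database size as a fraction of the backend quota.

    in_use_bytes: a sample of etcd_mvcc_db_total_size_in_use_in_bytes
    quota_bytes:  a sample of etcd_server_quota_backend_bytes
    """
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    return in_use_bytes / quota_bytes


def needs_attention(in_use_bytes: float, quota_bytes: float,
                    threshold: float = 0.8) -> bool:
    """Flag a member for review once usage reaches the threshold.

    The default threshold is an assumption made for this sketch,
    not a value taken from the documentation.
    """
    return quota_usage_fraction(in_use_bytes, quota_bytes) >= threshold


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Hypothetical sample values: 6 GiB in use against an 8 GiB quota.
    print(quota_usage_fraction(6 * GIB, 8 * GIB))
    print(needs_attention(6 * GIB, 8 * GIB))
```

The check only signals that in-use data is approaching the quota. Whether defragmentation actually recovers space depends on how much allocated space is no longer in use, which these two metrics alone do not show.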

modules/recommended-etcd-practices.adoc

Lines changed: 3 additions & 3 deletions
@@ -60,9 +60,9 @@ $ sudo docker run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/openshift-scale

 The output reports whether the disk is fast enough to host etcd by comparing the 99th percentile of the fsync metric captured from the run to see if it is less than 20 ms. A few of the most important etcd metrics that might be affected by I/O performance are as follows:

-- `etcd_disk_wal_fsync_duration_seconds_bucket` metric reports the etcd's WAL fsync duration.
-- `etcd_disk_backend_commit_duration_seconds_bucket` metric reports the etcd backend commit latency duration.
-- `etcd_server_leader_changes_seen_total` metric reports the leader changes.
+* `etcd_disk_wal_fsync_duration_seconds_bucket` metric reports the etcd's WAL fsync duration
+* `etcd_disk_backend_commit_duration_seconds_bucket` metric reports the etcd backend commit latency duration
+* `etcd_server_leader_changes_seen_total` metric reports the leader changes

 Because etcd replicates the requests among all the members, its performance strongly depends on network input/output (I/O) latency. High network latencies result in etcd heartbeats taking longer than the election timeout, which results in leader elections that are disruptive to the cluster. A key metric to monitor on a deployed {product-title} cluster is the 99th percentile of etcd network peer latency on each etcd cluster member. Use Prometheus to track the metric.
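
The 20 ms check described in this hunk can be sketched in Python as follows. This is only an illustration. It assumes the fsync durations are available as a plain list of seconds and uses the nearest-rank percentile definition, which may differ from how the benchmark tool or Prometheus computes the 99th percentile. The function names are hypothetical.

```python
import math

# 20 ms limit for the 99th percentile of fsync, as stated in the text.
FSYNC_P99_LIMIT_SECONDS = 0.020


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest 1-based rank covering pct percent of samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def disk_fast_enough(fsync_seconds):
    """True when the 99th percentile fsync duration is below the limit."""
    return percentile_nearest_rank(fsync_seconds, 99) < FSYNC_P99_LIMIT_SECONDS


if __name__ == "__main__":
    # Hypothetical run: 99 fast fsyncs and a single 50 ms outlier.
    run = [0.002] * 99 + [0.050]
    print(disk_fast_enough(run))
```

The same percentile helper could be applied to peer round-trip samples when looking at network latency, but the text gives no threshold for that metric, so none is assumed here.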
