Commit 757bf6e

Merge pull request #46777 from sftim/20240611_improve_etcd_management
Reorganize some callouts in etcd admin task
2 parents: d9949e1 + f1a709e

1 file changed: 21 additions, 21 deletions

content/en/docs/tasks/administer-cluster/configure-upgrade-etcd.md

Lines changed: 21 additions & 21 deletions
@@ -385,6 +385,20 @@ for information on how to add members into an existing cluster.
 
 ## Restoring an etcd cluster
 
+{{< caution >}}
+If any API servers are running in your cluster, you should not attempt to
+restore instances of etcd. Instead, follow these steps to restore etcd:
+
+- stop *all* API server instances
+- restore state in all etcd instances
+- restart all API server instances
+
+The Kubernetes project also recommends restarting Kubernetes components (`kube-scheduler`,
+`kube-controller-manager`, `kubelet`) to ensure that they don't rely on some
+stale data. In practice the restore takes a bit of time. During the
+restoration, critical components will lose leader lock and restart themselves.
+{{< /caution >}}
+
 etcd supports restoring from snapshots that are taken from an etcd process of
 the [major.minor](http://semver.org/) version. Restoring a version from a
 different patch version of etcd is also supported. A restore operation is
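
The ordering that the relocated caution prescribes can be illustrated with a small Python sketch. It is not part of this commit and does not touch a cluster; it only builds the list of steps in the order the caution gives (stop all API servers, restore all etcd members, restart the API servers, then restart the other components). The `Step` type, the function names, and the `component@host` labels are hypothetical and exist only for illustration. How an API server is actually stopped or restarted depends on how the cluster was deployed.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Components the caution recommends restarting after the restore.
OTHER_COMPONENTS = ("kube-scheduler", "kube-controller-manager", "kubelet")


@dataclass(frozen=True)
class Step:
    action: str  # "stop", "restore", or "restart"
    target: str  # hypothetical "component@host" label, or a component name


def check_safe_to_restore(running_api_servers: Iterable[str]) -> None:
    """Refuse to continue while any API server instance is still running."""
    running = list(running_api_servers)
    if running:
        raise RuntimeError(
            "stop all API server instances before restoring etcd: "
            + ", ".join(running)
        )


def restore_plan(api_servers: Iterable[str], etcd_members: Iterable[str]) -> List[Step]:
    """Build the restore steps in the order given by the caution."""
    api_servers = list(api_servers)
    plan = [Step("stop", f"kube-apiserver@{host}") for host in api_servers]
    plan += [Step("restore", f"etcd@{member}") for member in etcd_members]
    plan += [Step("restart", f"kube-apiserver@{host}") for host in api_servers]
    plan += [Step("restart", component) for component in OTHER_COMPONENTS]
    return plan
```

A caller would first run `check_safe_to_restore` with whatever list of running API servers its own tooling reports, and only then walk the plan. Keeping the plan as plain data makes the required ordering easy to review before anything is executed.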
@@ -443,42 +457,28 @@ current state. Although the scheduled pods might continue to run, no new pods
 can be scheduled. In such cases, recover the etcd cluster and potentially
 reconfigure Kubernetes API servers to fix the issue.
 
-{{< note >}}
-If any API servers are running in your cluster, you should not attempt to
-restore instances of etcd. Instead, follow these steps to restore etcd:
-
-- stop *all* API server instances
-- restore state in all etcd instances
-- restart all API server instances
-
-We also recommend restarting any components (e.g. `kube-scheduler`,
-`kube-controller-manager`, `kubelet`) to ensure that they don't rely on some
-stale data. Note that in practice, the restore takes a bit of time. During the
-restoration, critical components will lose leader lock and restart themselves.
-{{< /note >}}
 
 ## Upgrading etcd clusters
 
+{{< caution >}}
+Before you start an upgrade, back up your etcd cluster first.
+{{< /caution >}}
 
-For more details on etcd upgrade, please refer to the [etcd upgrades](https://etcd.io/docs/latest/upgrades/) documentation.
-
-{{< note >}}
-Before you start an upgrade, please back up your etcd cluster first.
-{{< /note >}}
+For details on etcd upgrade, refer to the [etcd upgrades](https://etcd.io/docs/latest/upgrades/) documentation.
 
 ## Maintaining etcd clusters
 
 For more details on etcd maintenance, please refer to the [etcd maintenance](https://etcd.io/docs/latest/op-guide/maintenance/) documentation.
 
+### Cluster defragmentation
+
 {{% thirdparty-content single="true" %}}
 
-{{< note >}}
 Defragmentation is an expensive operation, so it should be executed as infrequently
 as possible. On the other hand, it's also necessary to make sure any etcd member
 will not exceed the storage quota. The Kubernetes project recommends that when
 you perform defragmentation, you use a tool such as [etcd-defrag](https://github.com/ahrtr/etcd-defrag).
 
 You can also run the defragmentation tool as a Kubernetes CronJob, to make sure that
 defragmentation happens regularly. See [`etcd-defrag-cronjob.yaml`](https://github.com/ahrtr/etcd-defrag/blob/main/doc/etcd-defrag-cronjob.yaml)
-for details.
-{{< /note >}}
+for details.
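
The trade-off described in the defragmentation text (run it rarely, but never let a member reach its storage quota) can be sketched as a simple decision function. This Python example is not part of the commit and is not how etcd-defrag itself decides; the 80% quota ratio and the 200 MiB reclaimable threshold are assumed values chosen for illustration. The inputs are assumed to come from a member's reported database size and in-use size, such as the `dbSize` and `dbSizeInUse` fields that `etcdctl endpoint status -w json` reports.

```python
MIB = 1024 ** 2

# etcd's default backend quota (--quota-backend-bytes) is 2 GiB.
DEFAULT_QUOTA_BYTES = 2 * 1024 ** 3


def should_defragment(
    db_size: int,
    db_size_in_use: int,
    quota_bytes: int = DEFAULT_QUOTA_BYTES,
    quota_ratio: float = 0.8,          # assumed threshold, not an etcd default
    min_reclaimable: int = 200 * MIB,  # assumed threshold, not an etcd default
) -> bool:
    """Decide whether defragmenting one etcd member looks worthwhile.

    Defragment when a large amount of space can be reclaimed, or when the
    member is close to its quota and at least some space can be reclaimed.
    """
    if db_size_in_use > db_size:
        raise ValueError("in-use size cannot exceed total database size")
    reclaimable = db_size - db_size_in_use
    near_quota = db_size >= quota_ratio * quota_bytes
    return reclaimable >= min_reclaimable or (near_quota and reclaimable > 0)
```

A scheduled job could evaluate a rule like this per member and skip the expensive operation when it returns `False`, which keeps defragmentation infrequent while still guarding the quota.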
