@@ -385,6 +385,20 @@ for information on how to add members into an existing cluster.

## Restoring an etcd cluster

+ {{< caution >}}
+ If any API servers are running in your cluster, you should not attempt to
+ restore instances of etcd. Instead, follow these steps to restore etcd:
+
+ - stop *all* API server instances
+ - restore state in all etcd instances
+ - restart all API server instances
+
+ The Kubernetes project also recommends restarting Kubernetes components
+ (`kube-scheduler`, `kube-controller-manager`, and `kubelet`) to ensure that they
+ don't rely on stale data. In practice, the restore takes a bit of time. During
+ the restoration, critical components will lose their leader lock and restart
+ themselves.
+ {{< /caution >}}
+
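+ As an illustration only, the following sketch walks through that sequence on a
+ kubeadm-style control plane, where the API server runs as a static Pod and the
+ snapshot is restored with `etcdutl`. The manifest paths, snapshot file name, and
+ data directory are assumptions; adjust them to match your cluster.
+
+ ```shell
+ # Sketch only: a kubeadm static-Pod layout is assumed; all paths are examples.
+
+ # 1. Stop all API server instances by moving the kube-apiserver static Pod
+ #    manifest out of the manifests directory (repeat on every control plane node).
+ mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/kube-apiserver.yaml
+
+ # 2. Restore the snapshot into a fresh data directory on each etcd member.
+ etcdutl snapshot restore snapshot.db --data-dir /var/lib/etcd-from-backup
+
+ # 3. Reconfigure each etcd member (static Pod manifest or systemd unit) to use
+ #    the restored data directory, then wait for etcd to report healthy.
+
+ # 4. Restart all API server instances by moving the manifest back.
+ mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/kube-apiserver.yaml
+ ```
+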
etcd supports restoring from snapshots that are taken from an etcd process of
the [major.minor](http://semver.org/) version. Restoring a version from a
different patch version of etcd is also supported. A restore operation is
@@ -443,42 +457,28 @@ current state. Although the scheduled pods might continue to run, no new pods
can be scheduled. In such cases, recover the etcd cluster and potentially
reconfigure Kubernetes API servers to fix the issue.

- {{< note >}}
- If any API servers are running in your cluster, you should not attempt to
- restore instances of etcd. Instead, follow these steps to restore etcd:
-
- - stop *all* API server instances
- - restore state in all etcd instances
- - restart all API server instances
-
- We also recommend restarting any components (e.g. `kube-scheduler`,
- `kube-controller-manager`, `kubelet`) to ensure that they don't rely on some
- stale data. Note that in practice, the restore takes a bit of time. During the
- restoration, critical components will lose leader lock and restart themselves.
- {{< /note >}}

## Upgrading etcd clusters

+ {{< caution >}}
+ Before you start an upgrade, back up your etcd cluster first.
+ {{< /caution >}}

- For more details on etcd upgrade, please refer to the [etcd upgrades](https://etcd.io/docs/latest/upgrades/) documentation.
-
- {{< note >}}
- Before you start an upgrade, please back up your etcd cluster first.
- {{< /note >}}
+ For details on etcd upgrade, refer to the [etcd upgrades](https://etcd.io/docs/latest/upgrades/) documentation.
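+
+ For example, you can take and verify a snapshot before the upgrade begins. In
+ the sketch below, the endpoint, certificate paths, and snapshot location are
+ assumptions; substitute the values for your cluster.
+
+ ```shell
+ # Sketch only: endpoint, certificate paths, and output file are examples.
+ ETCDCTL_API=3 etcdctl \
+   --endpoints=https://127.0.0.1:2379 \
+   --cacert=/etc/kubernetes/pki/etcd/ca.crt \
+   --cert=/etc/kubernetes/pki/etcd/server.crt \
+   --key=/etc/kubernetes/pki/etcd/server.key \
+   snapshot save /var/backups/etcd-before-upgrade.db
+
+ # Confirm that the snapshot is readable before starting the upgrade.
+ etcdutl snapshot status /var/backups/etcd-before-upgrade.db --write-out=table
+ ```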

## Maintaining etcd clusters

For more details on etcd maintenance, please refer to the [etcd maintenance](https://etcd.io/docs/latest/op-guide/maintenance/) documentation.

+ ### Cluster defragmentation
+
{{% thirdparty-content single="true" %}}

- {{< note >}}
Defragmentation is an expensive operation, so it should be executed as infrequently
as possible. On the other hand, it's also necessary to make sure any etcd member
will not exceed the storage quota. The Kubernetes project recommends that when
you perform defragmentation, you use a tool such as [etcd-defrag](https://github.com/ahrtr/etcd-defrag).

You can also run the defragmentation tool as a Kubernetes CronJob, to make sure that
defragmentation happens regularly. See [`etcd-defrag-cronjob.yaml`](https://github.com/ahrtr/etcd-defrag/blob/main/doc/etcd-defrag-cronjob.yaml)
- for details.
- {{< /note >}}
+ for details.
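+
+ If you need a one-off, manual defragmentation instead of a scheduled job, a
+ minimal sketch using `etcdctl` directly is shown below; this is not the
+ etcd-defrag tool referenced above. The endpoint, certificate paths, and timeout
+ are assumptions, and you should defragment one member at a time because the
+ operation blocks that member while it runs.
+
+ ```shell
+ # Sketch only: endpoint and certificate paths are examples.
+ ETCDCTL_API=3 etcdctl \
+   --endpoints=https://127.0.0.1:2379 \
+   --cacert=/etc/kubernetes/pki/etcd/ca.crt \
+   --cert=/etc/kubernetes/pki/etcd/server.crt \
+   --key=/etc/kubernetes/pki/etcd/server.key \
+   --command-timeout=30s \
+   defrag
+
+ # Compare the database size before and after to confirm that space was reclaimed.
+ ETCDCTL_API=3 etcdctl \
+   --endpoints=https://127.0.0.1:2379 \
+   --cacert=/etc/kubernetes/pki/etcd/ca.crt \
+   --cert=/etc/kubernetes/pki/etcd/server.crt \
+   --key=/etc/kubernetes/pki/etcd/server.key \
+   endpoint status --write-out=table
+ ```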