Commit 742c0d7

update KEP for PRR.
1 parent d3a4065 commit 742c0d7

File tree

2 files changed

+65
-21
lines changed
  • keps
    • prod-readiness/sig-cloud-provider
    • sig-cloud-provider/2436-controller-manager-leader-migration

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
kep-number: 2436
2+
alpha:
3+
approver: "@deads2k"

keps/sig-cloud-provider/2436-controller-manager-leader-migration/README.md

Lines changed: 62 additions & 21 deletions
@@ -1,4 +1,4 @@
1-
# Cloud Controller Manager Migration
1+
# Controller Manager Leader Migration
22

33
## Table of Contents
44

@@ -9,7 +9,7 @@
99
- [Goals](#goals)
1010
- [Non-Goals](#non-goals)
1111
- [Proposal](#proposal)
12-
- [Implementation Details/Notes/Constraints [optional]](#implementation-detailsnotesconstraints-optional)
12+
- [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
1313
- [Migration Configuration](#migration-configuration)
1414
- [Default LeaderMigrationConfiguration](#default-leadermigrationconfiguration)
1515
- [Component Flags](#component-flags)
@@ -18,41 +18,43 @@
1818
- [Upgrade the Control Plane](#upgrade-the-control-plane)
1919
- [Disable Leader Migration](#disable-leader-migration)
2020
- [Risks and Mitigations](#risks-and-mitigations)
21+
- [Design Details](#design-details)
2122
- [Test Plan](#test-plan)
2223
- [Graduation Criteria](#graduation-criteria)
2324
- [Alpha -> Beta Graduation](#alpha---beta-graduation)
2425
- [Beta -> GA Graduation](#beta---ga-graduation)
2526
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
2627
- [Version Skew Strategy](#version-skew-strategy)
28+
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
29+
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
2730
- [Implementation History](#implementation-history)
31+
- [Drawbacks](#drawbacks)
32+
- [Alternatives](#alternatives)
2833
<!-- /toc -->
2934

3035
## Release Signoff Checklist
3136

32-
**ACTION REQUIRED:** In order to merge code into a release, there must be an issue in [kubernetes/enhancements] referencing this KEP and targeting a release milestone **before [Enhancement Freeze](https://github.com/kubernetes/sig-release/tree/master/releases)
33-
of the targeted release**.
37+
Items marked with (R) are required *prior to targeting to a milestone / release*.
3438

35-
For enhancements that make changes to code or processes/procedures in core Kubernetes i.e., [kubernetes/kubernetes], we require the following Release Signoff checklist to be completed.
36-
37-
Check these off as they are completed for the Release Team to track. These checklist items _must_ be updated for the enhancement to be released.
38-
39-
- [X] kubernetes/enhancements issue in release milestone, which links to KEP (this should be a link to the KEP location in kubernetes/enhancements, not the initial KEP PR)
40-
- [X] KEP approvers have set the KEP status to `implementable`
41-
- [X] Design details are appropriately documented
42-
- [X] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
43-
- [X] Graduation criteria is in place
44-
- [X] "Implementation History" section is up-to-date for milestone
39+
- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
40+
- [X] (R) KEP approvers have approved the KEP status as `implementable`
41+
- [X] (R) Design details are appropriately documented
42+
- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
43+
- [X] (R) Graduation criteria is in place
44+
- [X] (R) Production readiness review completed
45+
- [X] (R) Production readiness review approved
46+
- [ ] "Implementation History" section is up-to-date for milestone
4547
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
46-
- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
47-
48-
**Note:** Any PRs to move a KEP to `implementable` or significant changes once it is marked `implementable` should be approved by each of the KEP approvers. If any of those approvers is no longer appropriate than changes to that list should be approved by the remaining approvers and/or the owning SIG (or SIG-arch for cross cutting KEPs).
48+
- [X] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
4949

50+
<!--
5051
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
52+
-->
5153

5254
[kubernetes.io]: https://kubernetes.io/
53-
[kubernetes/enhancements]: https://github.com/kubernetes/enhancements/issues
54-
[kubernetes/kubernetes]: https://github.com/kubernetes/kubernetes
55-
[kubernetes/website]: https://github.com/kubernetes/website
55+
[kubernetes/enhancements]: https://git.k8s.io/enhancements
56+
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
57+
[kubernetes/website]: https://git.k8s.io/website
5658

5759
## Summary
5860

@@ -139,7 +141,7 @@ will prevent any of the v1.18 CCMs from claiming the lock. When the current hold
139141

140142

141143

142-
### Implementation Details/Notes/Constraints [optional]
144+
### Notes/Constraints/Caveats (Optional)
143145

144146
#### Migration Configuration
145147

@@ -289,6 +291,8 @@ unsetting the `--enable-migration-config` flag.
289291
* Increased apiserver load due to new leader election resource per migration configuration.
290292
* User error could result in cloud controllers not running in any component at all.
291293

294+
## Design Details
295+
292296
### Test Plan
293297

294298
- Unit Testing:
@@ -323,10 +327,47 @@ does not change incompatibly across those versions.
323327

324328
Version skew is handled as long as the leader name is consistent across all control plane nodes during upgrade.
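
The version skew condition above can be sketched as a simple check. This is a minimal illustration in Go, not code from the KEP or from Kubernetes: the function `consistentLeaderName`, its map-based input, and the node and lock names are all hypothetical.

```go
package main

import "fmt"

// consistentLeaderName reports whether every control plane node is configured
// with the same leader name. The input maps a node name to the leader name
// configured on that node (a hypothetical shape chosen for illustration).
// It returns the shared name and true only when at least one node is present
// and all nodes agree.
func consistentLeaderName(byNode map[string]string) (string, bool) {
	name := ""
	seen := false
	for _, leader := range byNode {
		if !seen {
			name, seen = leader, true
			continue
		}
		if leader != name {
			return "", false
		}
	}
	return name, seen
}

func main() {
	// Illustrative node and lock names only.
	nodes := map[string]string{
		"cp-1": "example-migration-lock",
		"cp-2": "example-migration-lock",
	}
	if name, ok := consistentLeaderName(nodes); ok {
		fmt.Println("all nodes share leader name:", name)
	} else {
		fmt.Println("leader names differ; version skew is not safely handled")
	}
}
```
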
325329

330+
## Production Readiness Review Questionnaire
331+
332+
### Feature Enablement and Rollback
333+
334+
###### How can this feature be enabled / disabled in a live cluster?
335+
336+
- [X] Other
337+
- Describe the mechanism: this feature must be explicitly enabled with the `--enable-leader-migration` flag
338+
- Will enabling / disabling the feature require downtime of the control plane? No
339+
- Will enabling / disabling the feature require downtime or re-provisioning of a node? No
340+
341+
###### Does enabling the feature change any default behavior?
342+
343+
No. The user must explicitly add the `--enable-leader-migration` flag to enable this feature. If the user enables this
344+
feature without providing a configuration, the default configuration reflects the default setup and works without further changes.
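
The opt-in behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the controller manager's implementation: `LeaderMigrationConfig` is a simplified stand-in for the real `LeaderMigrationConfiguration` schema, `defaultConfig` and `resolveConfig` are hypothetical helpers, the default values are placeholders, and the standard library `flag` package is used only to keep the example self-contained. Only the flag name `--enable-leader-migration` comes from the text.

```go
package main

import (
	"flag"
	"fmt"
)

// LeaderMigrationConfig is a simplified, hypothetical stand-in for the
// LeaderMigrationConfiguration type; the real schema has more fields.
type LeaderMigrationConfig struct {
	LeaderName  string
	Controllers []string
}

// defaultConfig returns placeholder defaults for illustration. These are not
// the defaults shipped with any Kubernetes release.
func defaultConfig() *LeaderMigrationConfig {
	return &LeaderMigrationConfig{
		LeaderName:  "example-migration-lock",
		Controllers: []string{"example-controller"},
	}
}

// resolveConfig returns nil when migration is not enabled, the explicit
// configuration when one is provided, and the defaults otherwise.
func resolveConfig(enabled bool, explicit *LeaderMigrationConfig) *LeaderMigrationConfig {
	if !enabled {
		return nil
	}
	if explicit != nil {
		return explicit
	}
	return defaultConfig()
}

func main() {
	fs := flag.NewFlagSet("controller-manager", flag.ContinueOnError)
	enabled := fs.Bool("enable-leader-migration", false, "enable leader migration")
	// Simulate a command line that enables the feature without a configuration.
	if err := fs.Parse([]string{"--enable-leader-migration"}); err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if cfg := resolveConfig(*enabled, nil); cfg != nil {
		fmt.Println("leader migration enabled with leader name:", cfg.LeaderName)
	} else {
		fmt.Println("leader migration disabled")
	}
}
```

Without the flag, `resolveConfig` returns nil and nothing about the component's behavior changes, which matches the "No" answer to the question above.
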
345+
346+
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
347+
348+
Yes. The feature is enabled and disabled solely with a flag, so removing the flag rolls back the enablement.
349+
350+
###### Are there any tests for feature enablement/disablement?
351+
352+
Yes. Unit and integration tests cover flag and configuration parsing. E2E tests will cover cases with the feature enabled and
353+
disabled.
354+
326355
## Implementation History
327356

328357
- 07-25-2019 `Summary` and `Motivation` sections were merged signaling SIG acceptance
329358
- 01-21-2019 Implementation details are proposed to move KEP to `implementable` state.
330359
- 09-30-2020 `LeaderMigrationConfiguration` and `ControllerLeaderConfiguration` schemas merged as #94205.
331360
- 11-04-2020 Registration of both types merged as #96133
332361
- 12-28-2020 Parsing and validation merged as #96226
362+
363+
## Drawbacks
364+
365+
A single-node control plane does not need this feature. If downtime is allowed during control plane upgrade, KCM and CCM
366+
can run without any migration mechanism at all.
367+
368+
## Alternatives
369+
370+
Change all controllers so that they can handle a situation where two instances of the same controller are running in
371+
both KCM and CCM. This requires a massive change to all controllers and potentially requires other kinds of
372+
synchronization. It is better for the controller manager to provide a migration mechanism than to rely on each
373+
controller.
