Commit 8049670

Merge pull request kubernetes#3183 from jiahuif-forks/kep/2436/to-ga
KEP-2436 Leader Migration to GA
2 parents 9f5142a + c897463 commit 8049670

3 files changed: +28 -19 lines changed

keps/prod-readiness/sig-cloud-provider/2436.yaml

Lines changed: 2 additions & 0 deletions
@@ -3,3 +3,5 @@ alpha:
   approver: "@deads2k"
 beta:
   approver: "@deads2k"
+stable:
+  approver: "@deads2k"

keps/sig-cloud-provider/2436-controller-manager-leader-migration/README.md

Lines changed: 22 additions & 15 deletions
@@ -48,7 +48,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
 - [X] (R) Graduation criteria is in place
 - [X] (R) Production readiness review completed
 - [X] (R) Production readiness review approved
-- [ ] "Implementation History" section is up-to-date for milestone
+- [X] "Implementation History" section is up-to-date for milestone
 - [X] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
 - [X] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

@@ -162,8 +162,9 @@ type LeaderMigrationConfiguration struct {
 	LeaderName string `json:"leaderName"`

 	// ResourceLock indicates the resource object type that will be used to lock
-	// Must be either "leases" or "endpoints", defaults to 'leases'
-	// No other types (e.g. "endpointsleases" or "configmapsleases") are allowed
+	// Must be "leases", default to "leases". This field is retained only for
+	// compatibility with previous releases.
+	// This field will be removed in stable (v1) API.
 	ResourceLock string

 	// ControllerLeaders contains a list of migrating leader lock configurations
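
Reviewer note: the hunk above narrows `ResourceLock` to the single value "leases". A minimal Go sketch of what defaulting and validation under that rule could look like follows. The trimmed struct, the `resourceLock` JSON tag, and the helper names `setDefaults` and `validate` are assumptions made for illustration; they are not taken from this commit or from the Kubernetes source.

```go
package main

import "fmt"

// LeaderMigrationConfiguration is a trimmed, hypothetical copy of the type
// touched by the hunk above; only the two fields needed here are kept.
type LeaderMigrationConfiguration struct {
	LeaderName   string `json:"leaderName"`
	ResourceLock string `json:"resourceLock"` // tag assumed for illustration
}

// resourceLockLeases is the only value the revised comment allows.
const resourceLockLeases = "leases"

// setDefaults fills in "leases" when the field is left empty.
func setDefaults(cfg *LeaderMigrationConfiguration) {
	if cfg.ResourceLock == "" {
		cfg.ResourceLock = resourceLockLeases
	}
}

// validate rejects every lock type other than "leases".
func validate(cfg *LeaderMigrationConfiguration) error {
	if cfg.ResourceLock != resourceLockLeases {
		return fmt.Errorf("resourceLock must be %q, got %q", resourceLockLeases, cfg.ResourceLock)
	}
	return nil
}

func main() {
	cfg := &LeaderMigrationConfiguration{LeaderName: "cloud-provider-extraction-migration"}
	setDefaults(cfg)
	fmt.Println(cfg.ResourceLock, validate(cfg))

	legacy := &LeaderMigrationConfiguration{ResourceLock: "endpoints"}
	fmt.Println(validate(legacy))
}
```

Under this sketch, a configuration that still carries "endpoints" from an earlier release is rejected rather than silently rewritten, which matches the "Must be" wording of the new comment.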
@@ -235,7 +236,7 @@ The default LeaderMigrationConfiguration can be represented as follows:

 ```yaml
 kind: LeaderMigrationConfiguration
-apiVersion: controllermanager.config.k8s.io/v1alpha1
+apiVersion: controllermanager.config.k8s.io/v1
 leaderName: cloud-provider-extraction-migration
 resourceLock: leases
 controllerLeaders:
@@ -292,9 +293,10 @@ unsetting the `--enable-leader-migration` flag.
   - test resource registration, parsing, and validation against the Schema APIs
   - test interactions with the leader election APIs
 - E2E Testing
-  - In a single-node control plane with leader election setting, test control plane upgrade, assert controller managers
+  - In a replicated control plane, test control plane upgrade, assert controller managers
     become health and ready after upgrade
-  - In a multi-node control plane setting, test control plane upgrade, assert availability throughout the upgrade
+  - In a replicated control plane, test control plane upgrade, assert no controllers
+    become active in both controller managers.

 ### Graduation Criteria
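
Reviewer note: the revised E2E item in the hunk above asserts that no controller becomes active in both controller managers. A minimal Go sketch of that check follows, assuming the test can already collect the list of active controller names from each manager. The helper name `activeInBoth` and the sample controller names are hypothetical, and how the lists are gathered is out of scope here.

```go
package main

import (
	"fmt"
	"sort"
)

// activeInBoth returns the sorted names of controllers that appear in both
// lists, i.e. controllers running in two managers at once.
func activeInBoth(kcmActive, ccmActive []string) []string {
	seen := make(map[string]bool, len(kcmActive))
	for _, name := range kcmActive {
		seen[name] = true
	}
	var both []string
	for _, name := range ccmActive {
		if seen[name] {
			both = append(both, name)
			seen[name] = false // report each overlapping name only once
		}
	}
	sort.Strings(both)
	return both
}

func main() {
	// Illustrative snapshots taken during an upgrade; the names are made up.
	kcm := []string{"deployment", "route"}
	ccm := []string{"route", "service"}
	if overlap := activeInBoth(kcm, ccm); len(overlap) > 0 {
		fmt.Println("controllers active in both managers:", overlap)
	}
}
```

An upgrade test could call such a helper on every poll and fail as soon as the returned slice is non-empty.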

@@ -305,7 +307,10 @@ The default migration configuration is implemented and tested.

 ##### Beta -> GA Graduation

-Leader migration configuration works on all in-tree cloud providers.
+- Leader Migration works on all in-tree cloud providers that require migration.
+- Leader Migration has an automated upgrade test on a replicated control plane, with Leader Migration enabled, of the following cases
+  - Upgrade from KCM only to KCM + CCM
+  - Rollback from KCM + CCM to KCM only

 ### Upgrade / Downgrade Strategy

@@ -350,7 +355,8 @@ disabled.

 ###### How can a rollout or rollback fail? Can it impact already running workloads?

-The rollout may fail if the configuration file does not represent correct controller-to-manager.
+The rollout may fail if the configuration file does not represent correct controller-to-manager assignment
+or configurations mismatch between controller managers.
 This can cause controllers referred in the configuration file to either be unavailable or run in multiple instances.

 The rollback may fail if the leader election of the controller manager is not properly configured.
@@ -383,6 +389,8 @@ N/A. This feature is never used by any user workloads.
 - The `Lease` resource used in the migration can be watched for transition of leadership and timing information.
 - logs and metrics can directly indicate the status of migration.

+Note that this feature is intended for cluster administrators, who should have access to metrics during the upgrade.
+
 ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

 Leader Migration is designed to ensure availability of controller managers during upgrade,
@@ -391,14 +399,11 @@ and this feature will not affect SLOs of controller managers.
 ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

 - [X] Metrics
-  - leader_active
-  - other per-controller availability metrics.
+  - per-controller health checks in both controller managers.
+  - Components exposing the metric: kube-controller-manager, cloud-controller-manager

 ###### Are there any missing metrics that would be useful to have to improve observability of this feature?

-It would help if every controller that the controller manager hosts expose metrics about their availability.
-However, per-controller metrics are out of scope of this KEP.
-
 Status of the migration lease, provided by the API server, can help observe the transition of holders
 if exposed as resource metrics.

@@ -422,7 +427,7 @@ If the service accounts are not granted access to the lease resources, the RBAC

 ###### Will enabling / using this feature result in introducing new API types?

-Type: `controllermanager.config.k8s.io/v1alpha1.LeaderMigrationConfiguration`
+Type: `controllermanager.config.k8s.io/v1.LeaderMigrationConfiguration`
 This resource is only for configuration file parsing. The resource should never reach the API server.

 ###### Will enabling / using this feature result in any new calls to the cloud provider?
@@ -466,7 +471,9 @@ N/A.
 - 12-28-2020 Parsing and validation merged as #96226
 - 03-10-2021 Implementation for alpha state completed, released in 1.21.
 - 03-30-2021 User guide published as kubernetes/website#26970
-- 05-11-2021 KEP updated to target beta.
+- 05-11-2021 KEP updated to target beta.
+- 01-21-2022 KEP updated to target GA.
+- 01-25-2022 Testing and monitoring revised for GA.

 ## Drawbacks

keps/sig-cloud-provider/2436-controller-manager-leader-migration/kep.yaml

Lines changed: 4 additions & 4 deletions
@@ -20,18 +20,18 @@ see-also:
   - "/keps/sig-cloud-provider/20180530-cloud-controller-manager.md"

 # The target maturity stage in the current dev cycle for this KEP.
-stage: beta
+stage: stable

 # The most recent milestone for which work toward delivery of this KEP has been
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
-latest-milestone: "v1.22"
+latest-milestone: "v1.24"

 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
   alpha: "v1.21"
   beta: "v1.22"
-  stable: "v1.23"
+  stable: "v1.24"

 # The following PRR answers are required at alpha release
 # List the feature gate name and the components for which it must be enabled
# List the feature gate name and the components for which it must be enabled
@@ -44,4 +44,4 @@ disable-supported: true

 # The following PRR answers are required at beta release
 metrics:
-  - leader_active
+  - per controller health checks
