Commit 125e35b

Add upgrade/downgrade and version skew strategy sections
1 parent b05f77e commit 125e35b

File tree

2 files changed

+44
-20
lines changed
  • keps/sig-node
    • 1287-in-place-update-pod-resources
    • 2273-kubelet-container-resources-cri-api-changes

keps/sig-node/1287-in-place-update-pod-resources/README.md

Lines changed: 34 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -23,16 +23,18 @@
2323
- [Affected Components](#affected-components)
2424
- [Future Enhancements](#future-enhancements)
2525
- [Risks and Mitigations](#risks-and-mitigations)
26-
- [Test Plan](#test-plan)
27-
- [Unit Tests](#unit-tests)
28-
- [Pod Resize E2E Tests](#pod-resize-e2e-tests)
29-
- [Resource Quota and Limit Ranges](#resource-quota-and-limit-ranges)
30-
- [Resize Policy Tests](#resize-policy-tests)
31-
- [Backward Compatibility and Negative Tests](#backward-compatibility-and-negative-tests)
32-
- [Graduation Criteria](#graduation-criteria)
33-
- [Alpha](#alpha)
34-
- [Beta](#beta)
35-
- [Stable](#stable)
26+
- [Test Plan](#test-plan)
27+
- [Unit Tests](#unit-tests)
28+
- [Pod Resize E2E Tests](#pod-resize-e2e-tests)
29+
- [Resource Quota and Limit Ranges](#resource-quota-and-limit-ranges)
30+
- [Resize Policy Tests](#resize-policy-tests)
31+
- [Backward Compatibility and Negative Tests](#backward-compatibility-and-negative-tests)
32+
- [Graduation Criteria](#graduation-criteria)
33+
- [Alpha](#alpha)
34+
- [Beta](#beta)
35+
- [Stable](#stable)
36+
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
37+
- [Version Skew Strategy](#version-skew-strategy)
3638
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
3739
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
3840
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
@@ -504,14 +506,14 @@ Other components:
504506
keep compatibility, PodResourceAllocation admission controller mutates such
505507
an update by copying non-nil values from the old Pod to current Pod.
506508

507-
## Test Plan
509+
### Test Plan
508510

509-
### Unit Tests
511+
#### Unit Tests
510512

511513
Unit tests will cover the sanity of code changes that implement the feature,
512514
and the policy controls that are introduced as part of this feature.
513515

514-
### Pod Resize E2E Tests
516+
#### Pod Resize E2E Tests
515517

516518
End-to-End tests resize a Pod via PATCH to Pod's Spec.Containers[i].Resources.
517519
The e2e tests use docker as container runtime.
@@ -569,7 +571,7 @@ E2E tests for Guaranteed class Pod with three containers (c1, c2, c3):
569571
1. Increase CPU for c1 & c3, decrease c2 - net CPU increase for Pod.
570572
1. Increase memory for c1 & c3, decrease c2 - net memory increase for Pod.
571573

572-
### Resource Quota and Limit Ranges
574+
#### Resource Quota and Limit Ranges
573575

574576
Setup a namespace with ResourceQuota and a single, valid Pod.
575577
1. Resize the Pod within resource quota - CPU only.
@@ -586,7 +588,7 @@ Setup a namespace with min and max LimitRange and create a single, valid Pod.
586588
1. Increase memory to exceed max value.
587589
1. Decrease memory to go below min value.
588590

589-
### Resize Policy Tests
591+
#### Resize Policy Tests
590592

591593
Setup a guaranteed class Pod with two containers (c1 & c2).
592594
1. No resize policy specified, defaults to RestartNotRequired. Verify that CPU and
@@ -600,7 +602,7 @@ Setup a guaranteed class Pod with two containers (c1 & c2).
600602
1. RestartNotRequired cpu, Restart memory policy for c1. Resize c1 CPU & memory,
601603
verify container is resized with restart.
602604

603-
### Backward Compatibility and Negative Tests
605+
#### Backward Compatibility and Negative Tests
604606

605607
1. Verify that Node is allowed to update only a Pod's ResourcesAllocated field.
606608
1. Verify that only Node account is allowed to update ResourcesAllocated field.
@@ -615,28 +617,40 @@ Setup a guaranteed class Pod with two containers (c1 & c2).
615617

616618
TODO: Identify more cases
617619

618-
## Graduation Criteria
620+
### Graduation Criteria
619621

620-
### Alpha
622+
#### Alpha
621623
- In-Place Pod Resources Update functionality is implemented for running Pods,
622624
- LimitRanger and ResourceQuota handling are added,
623625
- Resize Policies functionality is implemented,
624626
- Unit tests and E2E tests covering basic functionality are added,
625627
- E2E tests covering multiple containers are added.
626628

627-
### Beta
629+
#### Beta
628630
- VPA alpha integration of feature completed and any bugs addressed,
629631
- E2E tests covering Resize Policy, LimitRanger, and ResourceQuota are added,
630632
- Negative tests are identified and added.
631633
- A "/resize" subresource is defined and implemented.
632634
- Pod-scoped resources are handled if that KEP is past alpha
633635

634-
### Stable
636+
#### Stable
635637
- VPA integration of feature moved to beta,
636638
- User feedback (ideally from at least two distinct users) is green,
637639
- No major bugs reported for three months.
638640
- Pod-scoped resources are handled if that KEP is past alpha
639641

642+
### Upgrade / Downgrade Strategy
643+
Scheduler and API server should be updated before Kubelets in that order.
644+
Kubelet and the runtime versions should use the same CRI version in lock-step.
645+
Upgrade involves draining all pods from a node, installing a CRI runtime with this
646+
version of the API, updating to a matching kubelet, and making the node schedulable again.
647+
Downgrade involves doing the above in reverse.
648+
649+
### Version Skew Strategy
650+
Kubelet and the CRI runtime versions are expected to match, so version skew between them is not a concern.
651+
Previous versions of clients that are unaware of the new ResizePolicy fields would set them
652+
to nil. The API server mutates such updates by copying non-nil values from the old Pod to the current Pod.
653+
640654
## Production Readiness Review Questionnaire
641655

642656
<!--

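The Version Skew Strategy text added to the 1287 KEP says that older clients leave the new ResizePolicy fields unset and that the API server copies non-nil values from the old Pod into the update. A minimal Python sketch of that carry-over idea follows. The `Container` type, its field names, and the `carry_over_unset` helper are hypothetical and exist only for illustration; they are not the Kubernetes API types or the admission controller's actual code.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Container:
    # Hypothetical, simplified stand-in for a Pod container spec.
    name: str
    cpu_request: str
    # None models an older client that is unaware of the field.
    resize_policy: Optional[str] = None


def carry_over_unset(old: List[Container], new: List[Container]) -> List[Container]:
    """Return the updated containers, filling any unset resize_policy
    from the matching container in the old Pod when that value is set."""
    old_by_name = {c.name: c for c in old}
    merged = []
    for c in new:
        prev = old_by_name.get(c.name)
        if c.resize_policy is None and prev is not None and prev.resize_policy is not None:
            c = replace(c, resize_policy=prev.resize_policy)
        merged.append(c)
    return merged


# Stored Pod: c1 has a policy, c2 never had one.
old = [Container("c1", "500m", "RestartNotRequired"), Container("c2", "250m")]
# Update from an older client: changes CPU for c1, leaves resize_policy unset.
update = [Container("c1", "750m"), Container("c2", "250m")]
result = carry_over_unset(old, update)
```

In this sketch the update's own values win wherever they are set, so only fields the older client could not have known about are restored from the stored object.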
keps/sig-node/2273-kubelet-container-resources-cri-api-changes/README.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,8 @@
1515
- [Alpha](#alpha)
1616
- [Beta](#beta)
1717
- [Stable](#stable)
18+
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
19+
- [Version Skew Strategy](#version-skew-strategy)
1820
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
1921
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
2022
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
@@ -225,6 +227,14 @@ TBD
225227

226228
* No major bugs reported for three months.
227229

230+
### Upgrade / Downgrade Strategy
231+
Kubelet and the runtime versions should use the same CRI version in lock-step.
232+
Upgrade involves draining all pods from a node, installing a CRI runtime with this
233+
version of the API, updating to a matching kubelet, and making the node schedulable again.
234+
235+
### Version Skew Strategy
236+
Kubelet and the CRI runtime versions are expected to match, so version skew between them is not a concern.
237+
228238
## Production Readiness Review Questionnaire
229239

230240
<!--
