2 changes: 2 additions & 0 deletions keps/prod-readiness/sig-node/4540.yaml
@@ -3,3 +3,5 @@ alpha:
  approver: "@soltysh"
beta:
  approver: "@soltysh"
stable:
  approver: "@soltysh"
21 changes: 9 additions & 12 deletions keps/sig-node/4540-strict-cpu-reservation/README.md
@@ -45,7 +45,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
- [x] (R) KEP approvers have approved the KEP status as `implementable`
- [x] (R) Design details are appropriately documented
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
  - [x] e2e Tests for all Beta API Operations (endpoints)
  - [x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
  - [x] (R) Minimum Two Week Window for GA e2e tests to prove flake free
@@ -129,7 +129,7 @@ When `strict-cpu-reservation` is enabled:

### Risks and Mitigations

The feature is isolated to a specific policy option `strict-cpu-reservation` under `cpuManagerPolicyOptions` and is protected by feature gate `CPUManagerPolicyBetaOptions` before the feature graduates to `Stable` i.e. always enabled.
The feature is isolated to a specific policy option `strict-cpu-reservation` under `cpuManagerPolicyOptions`.

A concern has been raised about the feature's impact on best-effort workloads, i.e. workloads that do not have resource requests.

@@ -313,8 +313,8 @@ No new integration tests for kubelet are planned.

#### GA

- [ ] Allow time for feedback (1 year).
- [ ] Make sure all risks have been addressed.
- [X] Allow time for feedback (two releases).
- [X] Make sure all risks have been addressed.

### Upgrade / Downgrade Strategy

@@ -332,9 +332,6 @@ The `/var/lib/kubelet/cpu_manager_state` needs to be removed when enabling or di

###### How can this feature be enabled / disabled in a live cluster?

- [X] Feature gate (also fill in values in `kep.yaml`)
  - Feature gate name: `CPUManagerPolicyBetaOptions`
  - Components depending on the feature gate: `kubelet`
- [X] Change the kubelet configuration to set a `CPUManager` policy of `static` and a `CPUManager` policy option of `strict-cpu-reservation`
  - Will enabling / disabling the feature require downtime of the control plane? No
  - Will enabling / disabling the feature require downtime or reprovisioning of a node? No -- removing `/var/lib/kubelet/cpu_manager_state` and restarting kubelet are enough.
@@ -346,13 +343,13 @@ Yes. Reserved CPU cores will be strictly used for system daemons and interrupt p

The feature is only enabled when all following conditions are met:
1. The `static` `CPUManager` policy is selected
2. The `CPUManagerPolicyBetaOptions` feature gate is enabled and the `strict-cpu-reservation` policy option is selected
2. The `strict-cpu-reservation` policy option is selected
3. The `reservedSystemCPUs` is not empty
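
As a reading aid, the post-GA conditions (no feature gate) can be modelled with a small Python sketch. This is not kubelet code: the function name is hypothetical, the dictionary keys mirror the `KubeletConfiguration` field names `cpuManagerPolicy`, `cpuManagerPolicyOptions` and `reservedSystemCPUs`, and the CPU range `0-1` is an example value that does not come from the KEP.

```python
def strict_cpu_reservation_active(kubelet_config: dict) -> bool:
    """Return True only when all three conditions from the list above hold.

    Hypothetical helper for illustration; the kubelet's real check lives in
    its own code base and may differ in detail.
    """
    policy = kubelet_config.get("cpuManagerPolicy", "none")
    options = kubelet_config.get("cpuManagerPolicyOptions") or {}
    reserved = (kubelet_config.get("reservedSystemCPUs") or "").strip()
    return (
        policy == "static"                                   # condition 1
        and options.get("strict-cpu-reservation") == "true"  # condition 2
        and reserved != ""                                   # condition 3
    )


# Example configuration expressed as a dict (values are assumptions).
config = {
    "cpuManagerPolicy": "static",
    "cpuManagerPolicyOptions": {"strict-cpu-reservation": "true"},
    "reservedSystemCPUs": "0-1",
}
print(strict_cpu_reservation_active(config))
```

Dropping any one of the three keys from `config` makes the function return `False`, which matches the "all following conditions" wording.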

###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

Yes, the feature can be disabled by:
1. Disable feature gate `CPUManagerPolicyBetaOptions` or remove `strict-cpu-reservation` from the list of `CPUManager` policy options
Yes, the feature can be disabled by the following steps:
1. Remove `strict-cpu-reservation` from the list of `CPUManager` policy options
2. Remove `/var/lib/kubelet/cpu_manager_state` and restart kubelet
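
The two rollback steps could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the kubelet configuration is represented as an already-parsed dict, and the restart command assumes kubelet runs as a systemd unit named `kubelet`, which the KEP does not specify.

```python
import subprocess
from pathlib import Path

# Path taken from the KEP text.
STATE_FILE = Path("/var/lib/kubelet/cpu_manager_state")


def without_strict_reservation(kubelet_config: dict) -> dict:
    """Step 1: return a copy of the config with the policy option removed."""
    updated = dict(kubelet_config)
    options = dict(updated.get("cpuManagerPolicyOptions") or {})
    options.pop("strict-cpu-reservation", None)
    updated["cpuManagerPolicyOptions"] = options
    return updated


def restart_kubelet() -> None:
    # Assumption: kubelet is managed by systemd under the unit name "kubelet".
    subprocess.run(["systemctl", "restart", "kubelet"], check=True)


def reset_cpu_manager_state(state_file: Path = STATE_FILE, restart=restart_kubelet) -> None:
    """Step 2: remove the CPU manager state file, then restart kubelet."""
    state_file.unlink(missing_ok=True)  # requires Python 3.8+
    restart()
```

The `restart` parameter is injectable so the sequence can be exercised on a scratch file without touching a real node; on a node, the updated configuration would have to be written back to the kubelet configuration file before calling `reset_cpu_manager_state()`.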

###### What happens if we reenable the feature if it was previously rolled back?
@@ -361,7 +358,7 @@ The feature will be enabled regardless of whether it is being enabled for the first time.

###### Are there any tests for feature enablement/disablement?

- A specific e2e test will demonstrate that the default behaviour is preserved when the feature gate is disabled, or when the feature is not used (2 separate tests)
- A specific e2e test will demonstrate that the default behaviour is preserved when the feature is not used (2 separate tests)

### Rollout, Upgrade and Rollback Planning

@@ -561,7 +558,7 @@ You can safely disable the feature.
- 2024-03-08: Initial KEP created
- 2024-10-07: KEP gets LGTM and Approval
- 2025-02-03: KEP updated with Beta criteria

- 2025-09-30: KEP updated with GA criteria

## Drawbacks

10 changes: 7 additions & 3 deletions keps/sig-node/4540-strict-cpu-reservation/kep.yaml
@@ -2,6 +2,7 @@ title: Add CPUManager policy option to restrict reservedSystemCPUs to system dae
kep-number: 4540
authors:
- "@jingczhang"
- "@psasnal"
owning-sig: sig-node
participating-sigs: []
status: implementable
@@ -15,12 +16,14 @@ see-also: []
replaces: []

# The target maturity stage in the current dev cycle for this KEP.
stage: beta
# If the purpose of this KEP is to deprecate a user-visible feature
# and Deprecated feature gates are added, they should be deprecated|disabled|removed.
stage: stable

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v1.33"
latest-milestone: "v1.35"

# The milestone at which this feature was, or is targeted to be, at each stage.
milestone:
@@ -31,10 +34,11 @@ milestone:
# The following PRR answers are required at alpha release
# List the feature gate name and the components for which it must be enabled
feature-gates:
- name: "CPUManagerPolicyOptions"
- name: "CPUManagerPolicyAlphaOptions"
- name: "CPUManagerPolicyBetaOptions"
components:
- kubelet

disable-supported: true

# The following PRR answers are required at beta release