Commit 482a5f7

KEP-5067: promote Pod Generation to beta (#5377)
* KEP-5067: promote Pod Generation to beta
* address kep review feedback
* update details about mirror pods
* add conformance to GA criteria
* correct detail about mirror pods
* update prr reviewer
* clarify steps if SLOs are not being met
* address prr feedback
* add metrics to kep.yaml
1 parent 7db96f6 commit 482a5f7

3 files changed: +124 -33 lines changed

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,3 +1,5 @@
11
kep-number: 5067
22
alpha:
33
approver: "johnbelamaric"
4+
beta:
5+
approver: "soltysh"

keps/sig-node/5067-pod-generation/README.md

Lines changed: 117 additions & 31 deletions
@@ -142,20 +142,20 @@ checklist items _must_ be updated for the enhancement to be released.
142142

143143
Items marked with (R) are required *prior to targeting to a milestone / release*.
144144

145-
* [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
146-
* [ ] (R) KEP approvers have approved the KEP status as `implementable`
147-
* [ ] (R) Design details are appropriately documented
148-
* [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
149-
+ [ ] e2e Tests for all Beta API Operations (endpoints)
145+
* [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
146+
* [x] (R) KEP approvers have approved the KEP status as `implementable`
147+
* [x] (R) Design details are appropriately documented
148+
* [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
149+
+ [x] e2e Tests for all Beta API Operations (endpoints)
150150
+ [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
151151
+ [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
152-
* [ ] (R) Graduation criteria is in place
152+
* [x] (R) Graduation criteria is in place
153153
+ [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
154154
* [ ] (R) Production readiness review completed
155155
* [ ] (R) Production readiness review approved
156-
* [ ] "Implementation History" section is up-to-date for milestone
157-
* [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
158-
* [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
156+
* [x] "Implementation History" section is up-to-date for milestone
157+
* [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
158+
* [x] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
159159

160160
<!--
161161
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
@@ -457,12 +457,15 @@ regressions back to decreasing values.
457457

458458
#### Mirror pods
459459

460-
The kubelet currently computes the internal pod UID of a mirror pod using a hash of the podspec,
461-
meaning that any update to the podspec results in the kubelet seeing it as a pod deletion followed
462-
by creation of a new pod. To fully support generation for mirror pods more changes to the kubelet's logic
463-
will be expected. For now, we will not treat mirror pods in any special way. This means that due
464-
to the way that mirror pods are implemented, the generation (and observedGeneration) of a mirror pod
465-
will always be 1.
460+
For this KEP, we will not treat mirror pods in any special way. Due to the way they are currently implemented in the
461+
kubelet and apiserver, this means:
462+
463+
1. If a mirror pod's spec is modified manually by a client via the apiserver, its `metadata.generation` will be bumped accordingly.
464+
1. If a static pod's manifest is updated, the kubelet treats this as a pod deletion followed by a pod creation,
465+
which will reset the `metadata.generation` of the corresponding mirror pod to 1.
466+
1. The kubelet does not currently propagate the mirror pod's `metadata.generation` to the place where
467+
the pod status is updated today, so the `observedGeneration` fields of mirror pods will remain
468+
unpopulated.
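
The three cases above can be sketched as a toy model. This is only an illustration of the behavior described in the list, not kubelet or apiserver code; the `MirrorPod` class and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MirrorPod:
    """Toy stand-in for a mirror pod as seen through the apiserver (hypothetical type)."""
    generation: int = 1
    # Case 3: the kubelet does not propagate generation for mirror pods,
    # so observedGeneration stays unpopulated.
    observed_generation: Optional[int] = None


def client_spec_update(pod: MirrorPod) -> MirrorPod:
    # Case 1: a client edits the mirror pod's spec via the apiserver,
    # which bumps metadata.generation by 1.
    return MirrorPod(generation=pod.generation + 1,
                     observed_generation=pod.observed_generation)


def static_manifest_update(pod: MirrorPod) -> MirrorPod:
    # Case 2: the kubelet treats a manifest change as delete + create,
    # so the old pod is discarded and the new mirror pod starts at generation 1.
    return MirrorPod()
```

For example, two client-side spec edits take the generation to 3, and a later manifest change brings it back to 1, with `observed_generation` unset throughout.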
466469

467470
#### Future enhancements
468471

@@ -517,6 +520,13 @@ extending the production code to implement this enhancement.
517520
Unit tests will be implemented to cover code changes that implement the feature,
518521
in the API server code and the kubelet code.
519522

523+
Core packages touched:
524+
* `pkg/registry/core/pod/strategy.go`: `2025-06-16` - `71.1`
525+
* `pkg/registry/core/pod/util.go`: `2025-06-16` - `74`
526+
* `pkg/apis/core/validation/validation.go`: `2025-06-16` - `84.6`
527+
* `pkg/kubelet`: `2025-06-16` - `71`
528+
* `pkg/kubelet/status`: `2025-06-16` - `86.8`
529+
520530
##### Integration tests
521531

522532
<!--
@@ -552,8 +562,19 @@ E2E tests will be implemented to cover the following cases:
552562
* Verify that newly created pods have a `metadata.generation` set to 1.
553563
* Verify that PodSpec updates (such as tolerations or container images), resize requests, adding ephemeral containers, and binding requests cause the `metadata.generation` to be incremented by 1 for each update.
554564
* Verify that deletion of a pod causes the `metadata.generation` to be incremented by 1.
555-
* Issue ~1000 pod updates (1 every 100ms) and verify that `metadata.generation` and `status.observedGeneration` converge to the final expected value.
565+
* Issue ~500 pod updates (1 every 100ms) and verify that `metadata.generation` and `status.observedGeneration` converge to the final expected value.
556566
* Verify that various conditions each have `observedGeneration` populated.
567+
* Verify that static pods have `metadata.generation` and `observedGeneration` fields set to 1, and that
568+
they never change.
569+
570+
Added tests:
571+
* `pod generation should start at 1 and increment per update`: SIG Node, https://storage.googleapis.com/k8s-triage/index.html?test=Pod%20Generation
572+
* `custom-set generation on new pods and graceful delete`: SIG Node, https://storage.googleapis.com/k8s-triage/index.html?test=Pod%20Generation
573+
* `issue 500 podspec updates and verify generation and observedGeneration eventually converge`: SIG Node, https://storage.googleapis.com/k8s-triage/index.html?test=Pod%20Generation
574+
* `pod rejected by kubelet should have updated generation and observedGeneration`: SIG Node, https://storage.googleapis.com/k8s-triage/index.html?test=Pod%20Generation
575+
* `pod observedGeneration field set in pod conditions`: SIG Node, https://storage.googleapis.com/k8s-triage/index.html?test=Pod%20Generation
576+
* `pod-resize-scheduler-tests`: SIG Node, https://storage.googleapis.com/k8s-triage/index.html?test=pod-resize-scheduler-tests
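
The convergence case can be sketched roughly as follows. This is a minimal illustration, not the actual e2e test (which lives in the Kubernetes Go test suite); `get_pod` is an assumed callable that returns the pod as a JSON-style dict, and the timeout and interval defaults are arbitrary.

```python
import time
from typing import Callable


def expected_generation_after(updates: int) -> int:
    # A new pod starts at generation 1 and each spec update adds 1.
    return 1 + updates


def wait_for_convergence(
    get_pod: Callable[[], dict],
    expected_generation: int,
    timeout_s: float = 60.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until metadata.generation and status.observedGeneration
    both equal expected_generation, or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        pod = get_pod()
        generation = pod.get("metadata", {}).get("generation")
        observed = pod.get("status", {}).get("observedGeneration")
        if generation == expected_generation and observed == expected_generation:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Under the rules listed above, 500 updates to a freshly created pod should leave both fields at `expected_generation_after(500)`, which is 501.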
577+
557578

558579
### Graduation Criteria
559580

@@ -625,16 +646,17 @@ in back-to-back releases.
625646
* `metadata.generation` functionality implemented
626647
* `status.observedGeneration` functionality implemented behind feature flag
627648
* `status.conditions[i].observedGeneration` field added to the API
649+
* `status.conditions[i].observedGeneration` functionality implemented behind feature flag
628650

629651
#### Beta
630652

631653
* `metadata.generation`, `status.observedGeneration`, and `status.conditions[i].observedGeneration` functionality has been implemented and has been running as alpha for at least one release
632-
* `status.conditions[i].observedGeneration` functionality implemented behind feature flag
633654

634655
#### GA
635656

636657
* No major bugs reported for three months.
637-
* User feedback (ideally from at least two distinct users) is green.
658+
* No negative user feedback.
659+
* Promote the [primary e2e tests](https://github.com/kubernetes/kubernetes/blob/08ee8bde594a42bc1a222c9fd25726352a1e6049/test/e2e/node/pods.go#L422-L719) to Conformance.
638660

639661
### Upgrade / Downgrade Strategy
640662

@@ -815,6 +837,11 @@ https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05
815837
Unit tests will be added to cover the code that implements the feature, and will
816838
cover the cases of the feature gate being both enabled and disabled.
817839

840+
The following unit test covers what happens when the feature gate is disabled after
841+
objects have been written with the new field (in this case, the field should persist).
842+
843+
* https://github.com/kubernetes/kubernetes/blob/74210dd399c14582754e933de83a9e44b1d69c69/pkg/api/pod/util_test.go#L1228
844+
818845
### Rollout, Upgrade and Rollback Planning
819846

820847
<!--
@@ -833,13 +860,30 @@ rollout. Similarly, consider large clusters and how enablement/disablement
833860
will rollout across nodes.
834861
-->
835862

863+
A rollout or rollback won't have significant impact on any components, even
864+
if they restart mid-rollout. Already running workloads likewise won't be
865+
significantly impacted.
866+
836867
###### What specific metrics should inform a rollback?
837868

838869
<!--
839870
What signals should users be paying attention to when the feature is young
840871
that might indicate a serious problem?
841872
-->
842873

874+
If users see the `metadata.generation` and `status.observedGeneration` fields
875+
are not being updated or are significantly misaligned, that indicates that
876+
the feature is not working as expected.
877+
878+
Some metrics to look at that could indicate a problem include:
879+
- `kubelet_pod_start_total_duration_seconds`
880+
- `kubelet_pod_status_sync_duration_seconds`
881+
- `kubelet_pod_worker_duration_seconds`
882+
883+
You could also check the [Pod Startup Latency SLI](https://github.com/kubernetes/community/blob/master/sig-scalability/slos/pod_startup_latency.md).
884+
885+
Any of these being significantly elevated could indicate an issue with the feature.
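
One way to make "significantly elevated" concrete is to compare each metric against a baseline captured before the rollout. The sketch below assumes the caller has already collected one summary value per metric (for example a p99 in seconds); the 1.5x `factor` is a hypothetical threshold, not a value from this KEP.

```python
def elevated_metrics(baseline: dict, current: dict, factor: float = 1.5) -> list:
    """Return the names of metrics whose current value exceeds baseline * factor.

    `baseline` and `current` map metric name -> summary value (assumed to be
    comparable, e.g. both p99 latencies in seconds). `factor` is an
    illustrative threshold chosen for this sketch.
    """
    return sorted(
        name
        for name, base in baseline.items()
        if name in current and base > 0 and current[name] > base * factor
    )
```

A non-empty result is only a prompt to investigate; latency can rise for reasons unrelated to this feature.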
886+
843887
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
844888

845889
<!--
@@ -848,12 +892,27 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
848892
are missing a bunch of machinery and tooling and can't do that now.
849893
-->
850894

895+
Testing steps:
896+
897+
1. Create test pod with old version of API server and node; expected outcome: `generation` and `observedGeneration` fields are not populated
898+
1. Upgrade API server
899+
1. Send an update request to the running pod; expected outcome: `generation` is set to 1 and `observedGeneration` fields are not populated
900+
1. Create a new pod; expected outcome: `generation` is set to 1 and `observedGeneration` fields are not populated
901+
1. Create upgraded node
902+
1. Create second test pod on the upgraded node; expected outcome: `generation` and `observedGeneration` fields are set to 1
903+
1. Restart the upgraded node with the feature disabled
904+
1. Send an update request to the second pod; expected outcome: `generation` and `observedGeneration` continue to be updated, so both are set to 2
905+
1. Restart the upgraded node with the feature enabled
906+
1. Send an update request to the second pod; expected outcome: `generation` and `observedGeneration` are set to 3
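
The expected outcomes of the steps above can be captured as a small lookup table for manual verification. This is a sketch, not part of any test suite: step numbers refer to positions in the list above, only `status.observedGeneration` is checked (not the per-condition fields), and `check_step` is a hypothetical helper that takes the pod as a JSON-style dict.

```python
# Step number -> (expected metadata.generation, expected status.observedGeneration).
# None means the field is expected to be unpopulated.
EXPECTED_BY_STEP = {
    1: (None, None),  # old API server and old node
    3: (1, None),     # upgraded API server, update to the running pod
    4: (1, None),     # upgraded API server, new pod on the old node
    6: (1, 1),        # second pod on the upgraded node
    8: (2, 2),        # feature disabled on the node, fields keep updating
    10: (3, 3),       # feature re-enabled on the node
}


def check_step(step: int, pod: dict) -> list:
    """Return a list of human-readable mismatches for the given step."""
    expected_gen, expected_obs = EXPECTED_BY_STEP[step]
    actual_gen = pod.get("metadata", {}).get("generation")
    actual_obs = pod.get("status", {}).get("observedGeneration")
    problems = []
    if actual_gen != expected_gen:
        problems.append(f"generation: expected {expected_gen}, got {actual_gen}")
    if actual_obs != expected_obs:
        problems.append(f"observedGeneration: expected {expected_obs}, got {actual_obs}")
    return problems
```

An empty list means the pod matches the expected outcome for that step.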
907+
851908
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
852909

853910
<!--
854911
Even if applying deprecation policies, they may still surprise some users.
855912
-->
856913

914+
No.
915+
857916
### Monitoring Requirements
858917

859918
<!--
@@ -871,6 +930,9 @@ checking if there are objects with field X set) may be a last resort. Avoid
871930
logs or events for this purpose.
872931
-->
873932

933+
An operator can check that `metadata.generation` is set on pods and that `observedGeneration`
934+
is being updated.
935+
874936
###### How can someone using this feature know that it is working for their instance?
875937

876938
<!--
@@ -882,13 +944,16 @@ and operation of this feature.
882944
Recall that end users cannot usually observe component logs or access metrics.
883945
-->
884946

885-
* [ ] Events
886-
+ Event Reason:
887-
* [ ] API .status
888-
+ Condition name:
889-
+ Other field:
890-
* [ ] Other (treat as last resort)
891-
+ Details:
947+
* [x] API .status
948+
+ Other field: `metadata.generation`, `status.observedGeneration`, `status.conditions[].observedGeneration`
949+
950+
Each pod should have its `metadata.generation` set, starting at 1 and incremented by 1 for each update.
951+
952+
Each pod's `status.observedGeneration` should be populated to reflect the `metadata.generation` that was last
953+
observed by the kubelet.
954+
955+
Each pod's `status.conditions[].observedGeneration` should be populated to reflect the `metadata.generation`
956+
that was last observed by the component owning the corresponding condition.
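
These three checks can be sketched against a pod object in JSON form (for example, the output of `kubectl get pod <name> -o json` parsed into a dict). The function name and the shape of the returned report are assumptions made for this illustration.

```python
def generation_report(pod: dict) -> dict:
    """Summarize generation tracking for one pod given as a JSON-style dict."""
    generation = pod.get("metadata", {}).get("generation")
    status = pod.get("status", {})
    observed = status.get("observedGeneration")
    # Conditions whose observedGeneration differs from metadata.generation.
    # A lag here can be legitimate: each condition records the generation
    # seen by its owning component at the time the condition was set.
    lagging_conditions = [
        c.get("type")
        for c in status.get("conditions", [])
        if c.get("observedGeneration") != generation
    ]
    return {
        "generation": generation,
        "observedGeneration": observed,
        "kubelet_caught_up": generation is not None and observed == generation,
        "lagging_conditions": lagging_conditions,
    }
```

A pod with `kubelet_caught_up` set to false shortly after an update is not necessarily unhealthy; the report is most useful when the mismatch persists.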
892957

893958
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
894959

@@ -908,18 +973,15 @@ These goals will help you determine what you need to measure (SLIs) in the next
908973
question.
909974
-->
910975

976+
We can reuse the [Pod Startup Latency SLI/SLO](https://github.com/kubernetes/community/blob/master/sig-scalability/slos/pod_startup_latency.md) here.
977+
911978
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
912979

913980
<!--
914981
Pick one more of these and delete the rest.
915982
-->
916983

917-
* [ ] Metrics
918-
+ Metric name:
919-
+ [Optional] Aggregation method:
920-
+ Components exposing the metric:
921-
* [ ] Other (treat as last resort)
922-
+ Details:
984+
We can reuse the [Pod Startup Latency SLI/SLO](https://github.com/kubernetes/community/blob/master/sig-scalability/slos/pod_startup_latency.md) here.
923985

924986
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
925987

@@ -928,6 +990,8 @@ Describe the metrics themselves and the reasons why they weren't added (e.g., co
928990
implementation difficulties, etc.).
929991
-->
930992

993+
N/A
994+
931995
### Dependencies
932996

933997
<!--
@@ -951,6 +1015,8 @@ and creating new ones, as well as about cluster-level services (e.g. DNS):
9511015
- Impact of its degraded performance or high-error rates on the feature:
9521016
-->
9531017

1018+
No.
1019+
9541020
### Scalability
9551021

9561022
<!--
@@ -1076,6 +1142,9 @@ details). For now, we leave it here.
10761142

10771143
###### How does this feature react if the API server and/or etcd is unavailable?
10781144

1145+
The feature depends on the API server. If the API server is unavailable, the
1146+
new fields will not be updated.
1147+
10791148
###### What are other known failure modes?
10801149

10811150
<!--
@@ -1094,8 +1163,21 @@ For each of them, fill in the following information by copying the below templat
10941163
- Testing: Are there any tests for failure mode? If not, describe why.
10951164
-->
10961165

1166+
Other failure modes are described under Risks and Mitigations.
1167+
1168+
Detection and mitigation of an infinite status-update loop caused by a misbehaving
1169+
admission webhook is covered in these docs: https://kubernetes.io/docs/concepts/cluster-administration/admission-webhooks-good-practices/#why-good-webhook-design-matters.
1170+
10971171
###### What steps should be taken if SLOs are not being met to determine the problem?
10981172

1173+
One could disable the feature gate and restart the API server. Additionally,
1174+
one could inspect the apiserver and/or kubelet logs for errors.
1175+
1176+
Detection and mitigation of an infinite status-update loop caused by a misbehaving
1177+
admission webhook is covered in [these docs](https://kubernetes.io/docs/concepts/cluster-administration/admission-webhooks-good-practices/#why-good-webhook-design-matters). Specifically,
1178+
the section about [detecting loops caused by competing controllers](https://kubernetes.io/docs/concepts/cluster-administration/admission-webhooks-good-practices/#prevent-loops-competing-controllers)
1179+
can be helpful.
1180+
10991181
## Implementation History
11001182

11011183
<!--
@@ -1109,6 +1191,10 @@ Major milestones might include:
11091191
* when the KEP was retired or superseded
11101192
-->
11111193

1194+
* 2025-01-21: initial KEP draft created
1195+
* 2025-02-12: PR feedback addressed, KEP moved to "implementable" and merged
1196+
* 2025-06-05: proposed promotion to beta
1197+
11121198
## Drawbacks
11131199

11141200
<!--

keps/sig-node/5067-pod-generation/kep.yaml

Lines changed: 5 additions & 2 deletions
@@ -19,12 +19,12 @@ see-also:
1919
replaces:
2020

2121
# The target maturity stage in the current dev cycle for this KEP.
22-
stage: alpha
22+
stage: beta
2323

2424
# The most recent milestone for which work toward delivery of this KEP has been
2525
# done. This can be the current (upcoming) milestone, if it is being actively
2626
# worked on.
27-
latest-milestone: "v1.33"
27+
latest-milestone: "v1.34"
2828

2929
# The milestone at which this feature was, or is targeted to be, at each stage.
3030
milestone:
@@ -44,3 +44,6 @@ disable-supported: true
4444

4545
# The following PRR answers are required at beta release
4646
metrics:
47+
- kubelet_pod_start_total_duration_seconds
48+
- kubelet_pod_status_sync_duration_seconds
49+
- kubelet_pod_worker_duration_seconds
