Items marked with (R) are required *prior to targeting to a milestone / release*.

-- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
-- [ ] (R) KEP approvers have approved the KEP status as `implementable`
-- [ ] (R) Design details are appropriately documented
+- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
+- [x] (R) KEP approvers have approved the KEP status as `implementable`
+- [x] (R) Design details are appropriately documented
 - [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
 - [ ] e2e Tests for all Beta API Operations (endpoints)
 - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
@@ -53,7 +54,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
 - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
 - [ ] (R) Production readiness review completed
 - [ ] (R) Production readiness review approved
-- [ ] "Implementation History" section is up-to-date for milestone
+- [x] "Implementation History" section is up-to-date for milestone
 - [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
 - [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

@@ -333,6 +334,8 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
 -->

 - Existing Pod Lifecycle tests must pass fine even after increasing the relisting frequency.
@@ -341,6 +344,10 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
 - Feature implemented behind a feature flag
 - Existing `node e2e` tests around pod lifecycle must pass

+#### Beta
+- Add E2E Node Conformance presubmit job in CI
+- Add E2E Node Conformance periodic job in CI
+
 ### Upgrade / Downgrade Strategy

 N/A
@@ -379,8 +386,7 @@ If reenabled, kubelet will again start updating container statuses using CRI eve

 ###### Are there any tests for feature enablement/disablement?

-Yes, unit tests for the feature when enabled and disabled will be implemented in both kubelet
-
+This [unit test](https://github.com/kubernetes/kubernetes/blob/ca70940ba8c375bc69091822a9d52bcb7925de3b/pkg/kubelet/pleg/evented_test.go#L47) performs a health check on Evented PLEG.
 ### Rollout, Upgrade and Rollback Planning

 <!--
@@ -409,14 +415,35 @@ that might indicate a serious problem?
 -->

 If users observe inconsistencies in the container statuses reported by the kubelet and the CRI runtime (e.g. using a tool like `crictl`) after enabling this feature, they should consider rolling back the feature.
+
+Apart from that, cluster admins can monitor the state of the Evented PLEG's connection with the CRI runtime using the following metrics:
+
+* `evented_pleg_connection_error_count` - The count of errors encountered while establishing the streaming connection with the CRI runtime.
+* `evented_pleg_connection_success_count` - The count of successful streaming connections with the CRI runtime.
+* `evented_pleg_connection_latency_seconds` - The latency of the streaming connection with the CRI runtime, measured in seconds.
+* `evented_pleg_notifications_received` - The number of notifications received through the streaming connection with the CRI runtime.
+
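As a reviewer-side illustration of how the two connection counters listed above might be combined into a single health signal, here is a minimal Go sketch. It assumes the metrics are scraped in the Prometheus text format and uses the metric names exactly as written in the KEP text; the names exposed by a running kubelet may carry a subsystem prefix, so check the actual `/metrics` output. The helpers `parseCounters` and `connectionErrorRatio`, and the sample values, are hypothetical and not part of the kubelet.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseCounters extracts un-labelled sample values from Prometheus
// text-format output. Hypothetical helper for illustration only.
func parseCounters(text string) map[string]float64 {
	out := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // not a "name value" sample line
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			continue // ignore samples whose value does not parse
		}
		out[fields[0]] = v
	}
	return out
}

// connectionErrorRatio returns errors / (errors + successes), or 0 when
// no connection attempts have been recorded yet.
func connectionErrorRatio(m map[string]float64) float64 {
	errs := m["evented_pleg_connection_error_count"]
	ok := m["evented_pleg_connection_success_count"]
	if errs+ok == 0 {
		return 0
	}
	return errs / (errs + ok)
}

func main() {
	// Made-up sample values, for illustration only.
	sample := `# TYPE evented_pleg_connection_error_count counter
evented_pleg_connection_error_count 1
evented_pleg_connection_success_count 3
`
	ratio := connectionErrorRatio(parseCounters(sample))
	fmt.Printf("connection error ratio: %.2f\n", ratio)
}
```

A ratio that keeps rising would suggest the streaming connection is repeatedly failing to establish, which is one possible signal to consider alongside the status inconsistencies mentioned above. Any alerting threshold is left to the operator; the KEP text does not specify one.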
 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

 <!--
 Describe manual testing that was done and the outcomes.
 Longer term, we may want to require automated upgrade/rollback tests, but we
 are missing a bunch of machinery and tooling and can't do that now.
 -->
-N/A for alpha release. But we will add the tests for beta release.
+
+The following scenarios were tested manually:
+
+Scenario 1: Kubelet Upgrade without Corresponding CRI Runtime Upgrade
+
+Step 1: Kubelet is upgraded but CRI runtime remains unchanged. Kubelet falls back to using the Generic PLEG as the CRI runtime does not emit any CRI events.
+Step 2: Kubelet is downgraded, but the CRI runtime version remains the same. Kubelet continues to work with the existing Generic PLEG.
+Step 3: If the Kubelet is upgraded again, it behaves similarly to Step 1.
+
+Scenario 2: Kubelet and CRI Runtime Upgrade Together
+
+Step 1: Both the Kubelet and CRI runtime are upgraded. Since the CRI runtime emits CRI events, Kubelet uses the Evented PLEG with an increased relisting period for the Generic PLEG.
+Step 2: Kubelet and CRI runtime are downgraded. Kubelet defaults to using the Generic PLEG.
+Step 3: If the Kubelet is upgraded again, it behaves similarly to Scenario 1, Step 1.
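The fallback behaviour in the two scenarios above can be summarised as a small decision rule. The Go sketch below is a simplified model written for this review, not the kubelet's actual code. The function `selectPLEG`, the type `plegMode`, and the two boolean inputs are hypothetical names; the real implementation also involves the feature gate and the relisting period.

```go
package main

import "fmt"

// plegMode names the PLEG variant the kubelet ends up using.
// Hypothetical type for illustration only.
type plegMode string

const (
	genericPLEG plegMode = "generic"
	eventedPLEG plegMode = "evented"
)

// selectPLEG models the fallback rule from the scenarios: the Evented
// PLEG is used only when the kubelet supports it AND the CRI runtime
// emits CRI events; in every other case the Generic PLEG is used.
func selectPLEG(kubeletSupportsEvented, runtimeEmitsEvents bool) plegMode {
	if kubeletSupportsEvented && runtimeEmitsEvents {
		return eventedPLEG
	}
	return genericPLEG
}

func main() {
	steps := []struct {
		name    string
		kubelet bool // kubelet upgraded (supports Evented PLEG)
		runtime bool // CRI runtime upgraded (emits CRI events)
	}{
		{"Scenario 1, Step 1", true, false},
		{"Scenario 1, Step 2", false, false},
		{"Scenario 1, Step 3", true, false},
		{"Scenario 2, Step 1", true, true},
		{"Scenario 2, Step 2", false, false},
		{"Scenario 2, Step 3", true, false},
	}
	for _, s := range steps {
		fmt.Printf("%s: %s PLEG\n", s.name, selectPLEG(s.kubelet, s.runtime))
	}
}
```

Under this model, only Scenario 2, Step 1 selects the Evented PLEG, which matches the outcomes listed in the manual test steps.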
 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

 <!--
@@ -564,6 +591,10 @@ No.

 No.

+###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+No.
+
 ### Troubleshooting

 ###### How does this feature react if the API server and/or etcd is unavailable?
@@ -589,6 +620,8 @@ Disabling this feature in the kubelet will revert to the existing relisting PLEG
 ## Implementation History

 - PR for required CRI changes - https://github.com/kubernetes/kubernetes/pull/110165
+- PR for presubmit Node e2e job - https://github.com/kubernetes/test-infra/pull/28366
+- PR for periodic Node e2e job - https://github.com/kubernetes/test-infra/pull/28592