|
51 | 51 |
|
52 | 52 | Items marked with (R) are required *prior to targeting to a milestone / release*. |
53 | 53 |
|
54 | | -- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) |
55 | | -- [ ] (R) KEP approvers have approved the KEP status as `implementable` |
56 | | -- [ ] (R) Design details are appropriately documented |
57 | | -- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) |
58 | | - - [ ] e2e Tests for all Beta API Operations (endpoints) |
| 54 | +- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) |
| 55 | +- [x] (R) KEP approvers have approved the KEP status as `implementable` |
| 56 | +- [x] (R) Design details are appropriately documented |
| 57 | +- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) |
| 58 | + - [x] e2e Tests for all Beta API Operations (endpoints) |
59 | 59 | - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) |
60 | 60 | - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free |
61 | 61 | - [ ] (R) Graduation criteria is in place |
62 | 62 | - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) |
63 | | -- [ ] (R) Production readiness review completed |
64 | | -- [ ] (R) Production readiness review approved |
65 | | -- [ ] "Implementation History" section is up-to-date for milestone |
66 | | -- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] |
| 63 | +- [x] (R) Production readiness review completed |
| 64 | +- [x] (R) Production readiness review approved |
| 65 | +- [x] "Implementation History" section is up-to-date for milestone |
| 66 | +- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] |
67 | 67 | - [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes |
68 | 68 |
|
69 | 69 | <!-- |
@@ -815,8 +815,8 @@ ensure `ExtendedResourceName`s are handled by the scheduler as described in this |
815 | 815 |
|
816 | 816 | #### Beta |
817 | 817 |
|
818 | | -- Reevaluate where to create the special resource claim, in scheduler or some |
819 | | - other controller, based on feedback from Alpha and the nomination concept. |
| 818 | +- The basic scoring in NodeResourcesFit has to be implemented, and the queueing hints have to work efficiently. |
| 819 | +- Keep the Alpha behavior of creating the special resource claim in the scheduler. |
820 | 820 | - Gather feedback from developers and surveys |
821 | 821 | - 3 examples of vendors making use of the extensions proposed in this KEP |
822 | 822 | - Scalability tests that mirror real-world usage as determined by user feedback |
@@ -996,7 +996,7 @@ Recall that end users cannot usually observe component logs or access metrics. |
996 | 996 | - Details: |
997 | 997 | --> |
998 | 998 | - [x] API .status |
999 | | - - Other field: `.status.extendedResourceClaimStatus` will have a list of resource claims that are created for |
| 999 | + - Other field: Pod's `.status.extendedResourceClaimStatus` will have a list of resource claims that are created for |
1000 | 1000 | DRA extended resources. |
1001 | 1001 |
|
1002 | 1002 | ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? |
@@ -1067,7 +1067,8 @@ Pick one more of these and delete the rest. |
1067 | 1067 | - Type: Counter |
1068 | 1068 | - Labels: `status` ("failure", "success") |
1069 | 1069 | - SLI Usage: Calculate success rate to monitor the reliability of automatic resource claim creation. High failure rates indicate potential issues with extended resource configuration. |
1070 | | - - Because the resource claim is created in the scheduler, we need a different metric from `resourceclaim_controller_creates_total`. |
| 1070 | + - Because the resource claim is created in the scheduler's PreBind phase by making a Kubernetes API call, we need a different metric from `resourceclaim_controller_creates_total`. |
| 1071 | + - The metric is incremented according to the API call outcome, either success or failure. |
1071 | 1072 |
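The SLI usage described above (a counter labelled by `status`, incremented once per API call outcome, from which a success rate is derived) can be sketched as follows. This is only an illustration under stated assumptions, not the scheduler's implementation: the `ClaimCreateCounter` class and its method names are hypothetical stand-ins for a labelled counter metric.

```python
from collections import Counter
from typing import Optional


class ClaimCreateCounter:
    """Hypothetical stand-in for a counter metric with a `status` label."""

    VALID_STATUSES = ("success", "failure")

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def observe(self, api_call_succeeded: bool) -> None:
        # Increment exactly one label value per create attempt,
        # chosen by the outcome of the API call.
        status = "success" if api_call_succeeded else "failure"
        self._counts[status] += 1

    def success_rate(self) -> Optional[float]:
        # Fraction of attempts that succeeded; None if nothing was recorded,
        # so that "no data" is not mistaken for a 0% success rate.
        total = sum(self._counts[s] for s in self.VALID_STATUSES)
        if total == 0:
            return None
        return self._counts["success"] / total
```

In a real cluster the rate would normally be computed by the monitoring system over a time window rather than in process; the sketch only shows the relationship between the two label values and the derived SLI.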
|
1072 | 1073 | ###### Are there any missing metrics that would be useful to have to improve observability of this feature? |
1073 | 1074 |
|
@@ -1156,7 +1157,7 @@ still applies. |
1156 | 1157 | ###### How does this feature react if the API server and/or etcd is unavailable? |
1157 | 1158 |
|
1158 | 1159 | The Kubernetes control plane will be down, so no new Pods get scheduled. kubelet may |
1159 | | -still be able to start or or restart containers if it already received all the relevant |
| 1160 | +still be able to start or restart containers if it already received all the relevant |
1160 | 1161 | updates (Pod, ResourceClaim, etc.). |
1161 | 1162 |
|
1162 | 1163 | ###### What are other known failure modes? |
|