Commit ea8d532

Merge pull request kubernetes#2932 from swatisehgal/update-getallocatable-to-beta
KEP-2403: Update PodResource API GetAllocatableResource 1.23 Beta
2 parents eff73b4 + a6ad3fe commit ea8d532

3 files changed, 25 additions and 5 deletions

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
 kep-number: 2403
 alpha:
   approver: "@johnbelamaric"
+beta:
+  approver: "@johnbelamaric"

keps/sig-node/2403-pod-resources-allocatable-resources/README.md

Lines changed: 16 additions & 1 deletion
@@ -43,7 +43,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
 - [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
 - [X] (R) Graduation criteria is in place
 - [X] (R) Production readiness review completed
-- [X] Production readiness review approved
+- [X] (R) Production readiness review approved
 - [X] "Implementation History" section is up-to-date for milestone
 - ~~ [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] ~~
 - [X] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
@@ -102,6 +102,10 @@ The GRPC Service will expose an additional endpoint:
 - `GetAllocatableResources`, which returns a single AllocatableResourcesResponse, enabling monitor applications to query for the allocatable set of resources available on the node.
 This endpoint will return error if the corresponding feature gate is disabled.

+NOTE:
+
+- `GetAllocatableResources` should only be used to evaluate [allocatable](https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable) resources on a node. If the goal is to evaluate free/unallocated resources, it should be used in conjunction with the List() endpoint. The result obtained from `GetAllocatableResources` remains the same unless the underlying resources exposed to the kubelet change. This happens rarely, but when it does (e.g. CPUs onlined/offlined, devices added/removed), the client is expected to call the `GetAllocatableResources` endpoint again.
+
 The extended interface is shown in proto below:
 ```protobuf
 // PodResources is a service provided by the kubelet that provides information about the
@@ -139,6 +143,7 @@ message ContainerResources {
     string name = 1;
     repeated ContainerDevices devices = 2;
     repeated int64 cpu_ids = 3;
+    repeated ContainerMemory memory = 4;
 }

 // Topology describes hardware topology of the resource
@@ -165,6 +170,8 @@ message ContainerDevices {
 The implementation PR adds a suite of E2E tests which cover both the existing `List` endpoint already implemented in the podresources API and
 the new proposed `GetAllocatableResources` API.

+Add additional tests to prove that unhealthy devices are skipped as part of GetAllocatable and empty NUMA topology is not returned.
+
 ### Graduation Criteria

 #### Alpha
@@ -174,6 +181,13 @@ the new proposed `GetAllocatableResources` API.
 #### Alpha to Beta Graduation
 - [X] The new API is consumed by other public software components (e.g. NFD).
 - [X] No major bugs reported in the previous cycle.
+- [X] Ensure that empty NUMA topology is handled properly.
+- [X] Ensure that unhealthy devices are skipped in GetAllocatable.
+- [X] External clients are using this capability in their solutions
+Topology-aware scheduling is one of the primary use cases of the `GetAllocatableResources` podresources endpoint. As part of this initiative, an exporter populates CRs per node to expose the resources available per NUMA node. The Pod Resources API `List` and `GetAllocatableResources` endpoints are used to obtain the resource allocation of running pods along with the underlying hardware topology (NUMA) information. Users can create custom exporters or use existing ones to expose the NodeResourceTopology information as CRs; the [Topology aware Scheduler](https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/pkg/noderesourcetopology) then uses this information to make NUMA-aware placement decisions, reducing the occurrence of the Topology Affinity Errors highlighted in [this issue](https://github.com/kubernetes/kubernetes/issues/84869).
+Examples of two such exporters are:
+- [Node Feature Discovery](https://github.com/kubernetes-sigs/node-feature-discovery) for exposing resource topology information as part of the initiative here: [Introducing NFD Topology Updater exposing Resource hardware Topology info through CRs](https://github.com/kubernetes-sigs/node-feature-discovery/pull/525).
+- [Resource Topology Exporter](https://github.com/k8stopologyawareschedwg/resource-topology-exporter)

 #### Beta to G.A Graduation
 - [X] Allowing time for feedback (1 year).
@@ -253,6 +267,7 @@ Feature only collects data when requests comes in, data is then garbage collecte
 - 2021-02-02: KEP extracted from [previous iteration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2043-pod-resource-concrete-assigments)
 - 2021-02-04: KEP polished, added feature gate, clarified the graduation criteria.
 - 2021-02-08: KEP updated adding per-specific-endpoint metrics to the podresources API and clarifying failure modes.
+- 2021-09-02: KEP updated to explicitly clarify the behavior of `GetAllocatableResources` and graduate to Beta in 1.23.

 ## Alternatives

keps/sig-node/2403-pod-resources-allocatable-resources/kep.yaml

Lines changed: 7 additions & 4 deletions
@@ -3,13 +3,16 @@ kep-number: 2403
 authors:
   - "@fromanirh"
   - "@alexeyperevalov"
+  - "@swatisehgal"
 owning-sig: sig-node
 participating-sigs: []
 status: implementable
 creation-date: "2021-02-02"
+last-updated: "2021-09-02"
 reviewers:
   - "@derekwaynecarr"
   - "@renaudwastaken"
+  - "@klueska"
 approvers:
   - "@sig-node-leads"
 prr-approvers: []
@@ -19,18 +22,18 @@ see-also:
 replaces: []

 # The target maturity stage in the current dev cycle for this KEP.
-stage: alpha
+stage: beta

 # The most recent milestone for which work toward delivery of this KEP has been
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
-latest-milestone: "v1.21"
+latest-milestone: "v1.23"

 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
   alpha: "v1.21"
-  beta: "v1.22"
-  stable: "v1.23"
+  beta: "v1.23"
+  stable: "v1.24"

 # The following PRR answers are required at alpha release
 # List the feature gate name and the components for which it must be enabled
