Commit f62d58d

Merge pull request #39978 from moshe010/pod-resource-api-dra-doc-upstream
Extend PodResources API for Dynamic Resource Allocation
2 parents c55b7f2 + eaf9199 commit f62d58d

3 files changed

+75
-0
lines changed

content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md

Lines changed: 62 additions & 0 deletions
@@ -213,6 +213,7 @@ for these devices:
213213
service PodResourcesLister {
214214
rpc List(ListPodResourcesRequest) returns (ListPodResourcesResponse) {}
215215
rpc GetAllocatableResources(AllocatableResourcesRequest) returns (AllocatableResourcesResponse) {}
216+
rpc Get(GetPodResourcesRequest) returns (GetPodResourcesResponse) {}
216217
}
217218
```
218219

@@ -223,6 +224,14 @@ id of exclusively allocated CPUs, device id as it was reported by device plugins
223224
the NUMA node where these devices are allocated. Also, for NUMA-based machines, it contains the
224225
information about memory and hugepages reserved for a container.
225226

227+
Starting from Kubernetes v1.27, the `List` endpoint can provide information on resources
228+
of running pods allocated in `ResourceClaims` by the `DynamicResourceAllocation` API. To enable
229+
this feature, the kubelet must be started with the following flags:
230+
231+
```
232+
--feature-gates=DynamicResourceAllocation=true,KubeletPodResourcesDynamicResources=true
233+
```
234+
226235
```gRPC
227236
// ListPodResourcesResponse is the response returned by List function
228237
message ListPodResourcesResponse {
@@ -242,6 +251,7 @@ message ContainerResources {
242251
repeated ContainerDevices devices = 2;
243252
repeated int64 cpu_ids = 3;
244253
repeated ContainerMemory memory = 4;
254+
repeated DynamicResource dynamic_resources = 5;
245255
}
246256
247257
// ContainerMemory contains information about memory and hugepages assigned to a container
@@ -267,6 +277,28 @@ message ContainerDevices {
267277
repeated string device_ids = 2;
268278
TopologyInfo topology = 3;
269279
}
280+
281+
// DynamicResource contains information about the devices assigned to a container by Dynamic Resource Allocation
282+
message DynamicResource {
283+
string class_name = 1;
284+
string claim_name = 2;
285+
string claim_namespace = 3;
286+
repeated ClaimResource claim_resources = 4;
287+
}
288+
289+
// ClaimResource contains per-plugin resource information
290+
message ClaimResource {
291+
repeated CDIDevice cdi_devices = 1 [(gogoproto.customname) = "CDIDevices"];
292+
}
293+
294+
// CDIDevice specifies a CDI device information
295+
message CDIDevice {
296+
// Fully qualified CDI device name
297+
// for example: vendor.com/gpu=gpudevice1
298+
// see more details in the CDI specification:
299+
// https://github.com/container-orchestrated-devices/container-device-interface/blob/main/SPEC.md
300+
string name = 1;
301+
}
270302
```
271303
{{< note >}}
272304
cpu_ids in the `ContainerResources` in the `List` endpoint correspond to exclusive CPUs allocated
@@ -333,6 +365,36 @@ Support for the `PodResourcesLister service` requires `KubeletPodResources`
333365
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to be enabled.
334366
It is enabled by default starting with Kubernetes 1.15 and is v1 since Kubernetes 1.20.
335367

368+
### `Get` gRPC endpoint {#grpc-endpoint-get}
369+
370+
{{< feature-state state="alpha" for_k8s_version="v1.27" >}}
371+
372+
The `Get` endpoint provides information on resources of a running Pod. It exposes information
373+
similar to that described for the `List` endpoint. The `Get` endpoint requires `PodName`
374+
and `PodNamespace` of the running Pod.
375+
376+
```gRPC
377+
// GetPodResourcesRequest contains information about the pod
378+
message GetPodResourcesRequest {
379+
string pod_name = 1;
380+
string pod_namespace = 2;
381+
}
382+
```
383+
384+
To enable this feature, you must start your kubelet services with the following flag:
385+
386+
```
387+
--feature-gates=KubeletPodResourcesGet=true
388+
```
389+
390+
The `Get` endpoint can provide Pod information related to dynamic resources
391+
allocated by the dynamic resource allocation API. To enable this feature, you must
392+
ensure your kubelet services are started with the following flags:
393+
394+
```
395+
--feature-gates=KubeletPodResourcesGet=true,DynamicResourceAllocation=true,KubeletPodResourcesDynamicResources=true
396+
```
397+
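
The nesting of the new messages (`ContainerResources` → `DynamicResource` → `ClaimResource` → `CDIDevice`) can be sketched in Go as follows. This is only an illustration: the structs are hand-written stand-ins that mirror the protobuf fields shown in the diff, not the generated kubelet API types, and `cdiDevicesByClaim` is a hypothetical helper name.

```go
package main

import "fmt"

// Hand-written stand-ins for the protobuf messages in the diff.
// They are assumptions for illustration, not the generated Go bindings.
type CDIDevice struct {
	Name string // fully qualified CDI name, e.g. vendor.com/gpu=gpudevice1
}

type ClaimResource struct {
	CDIDevices []CDIDevice
}

type DynamicResource struct {
	ClassName      string
	ClaimName      string
	ClaimNamespace string
	ClaimResources []ClaimResource
}

type ContainerResources struct {
	Name             string
	DynamicResources []DynamicResource
}

// cdiDevicesByClaim flattens the nested messages into a map keyed by
// "namespace/claim" that lists the CDI device names reported for each claim.
func cdiDevicesByClaim(c ContainerResources) map[string][]string {
	out := make(map[string][]string)
	for _, dr := range c.DynamicResources {
		key := dr.ClaimNamespace + "/" + dr.ClaimName
		for _, cr := range dr.ClaimResources {
			for _, dev := range cr.CDIDevices {
				out[key] = append(out[key], dev.Name)
			}
		}
	}
	return out
}

func main() {
	// Sample data invented for the example.
	c := ContainerResources{
		Name: "app",
		DynamicResources: []DynamicResource{{
			ClassName:      "example-class",
			ClaimName:      "gpu-claim",
			ClaimNamespace: "default",
			ClaimResources: []ClaimResource{{
				CDIDevices: []CDIDevice{{Name: "vendor.com/gpu=gpudevice1"}},
			}},
		}},
	}
	for claim, devices := range cdiDevicesByClaim(c) {
		fmt.Println(claim, devices)
	}
}
```

A monitoring agent would typically obtain the real messages from the `List` or `Get` response and then walk them in the same way.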
336398
## Device plugin integration with the Topology Manager
337399

338400
{{< feature-state for_k8s_version="v1.18" state="beta" >}}

content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -162,6 +162,12 @@ gets scheduled onto one node and then cannot run there, which is bad because
162162
such a pending Pod also blocks all other resources like RAM or CPU that were
163163
set aside for it.
164164
165+
## Monitoring resources
166+
167+
The kubelet provides a gRPC service to enable discovery of dynamic resources of
168+
running Pods. For more information on the gRPC endpoints, see
169+
[resource allocation reporting](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources).
170+
165171
## Limitations
166172
167173
The scheduler plugin must be involved in scheduling Pods which use

content/en/docs/reference/command-line-tools-reference/feature-gates.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -127,8 +127,10 @@ For a reference to old feature gates that are removed, please refer to
127127
| `KubeletInUserNamespace` | `false` | Alpha | 1.22 | |
128128
| `KubeletPodResources` | `false` | Alpha | 1.13 | 1.14 |
129129
| `KubeletPodResources` | `true` | Beta | 1.15 | |
130+
| `KubeletPodResourcesGet` | `false` | Alpha | 1.27 | |
130131
| `KubeletPodResourcesGetAllocatable` | `false` | Alpha | 1.21 | 1.22 |
131132
| `KubeletPodResourcesGetAllocatable` | `true` | Beta | 1.23 | |
133+
| `KubeletPodResourcesDynamicResources` | `false` | Alpha | 1.27 | |
132134
| `KubeletTracing` | `false` | Alpha | 1.25 | |
133135
| `LegacyServiceAccountTokenTracking` | `false` | Alpha | 1.26 | 1.26 |
134136
| `LegacyServiceAccountTokenTracking` | `true` | Beta | 1.27 | |
@@ -610,9 +612,14 @@ Each feature gate is designed for enabling/disabling a specific feature:
610612
- `KubeletPodResources`: Enable the kubelet's pod resources gRPC endpoint. See
611613
[Support Device Monitoring](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/606-compute-device-assignment/README.md)
612614
for more details.
615+
- `KubeletPodResourcesGet`: Enable the `Get` gRPC endpoint on the kubelet for Pod resources.
616+
This API augments the [resource allocation reporting](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources).
613617
- `KubeletPodResourcesGetAllocatable`: Enable the kubelet's pod resources
614618
`GetAllocatableResources` functionality. This API augments the
615619
[resource allocation reporting](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources)
620+
- `KubeletPodResourcesDynamicResources`: Extend the kubelet's pod resources gRPC endpoint
621+
to include resources allocated in `ResourceClaims` via the `DynamicResourceAllocation` API.
622+
See [resource allocation reporting](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources) for more details.
616623
with information about the allocatable resources, enabling clients to properly
617624
track the free compute resources on a node.
618625
- `KubeletTracing`: Add support for distributed tracing in the kubelet.
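
The gates added in this change are meant to be combined in a single `--feature-gates` value. A minimal Go sketch of assembling that value is shown below; `featureGatesFlag` is a hypothetical helper, and the names are sorted only to keep the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// featureGatesFlag renders a set of gates as one --feature-gates argument.
// Hypothetical helper for illustration; it is not part of any Kubernetes tool.
func featureGatesFlag(gates map[string]bool) string {
	names := make([]string, 0, len(gates))
	for name := range gates {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order for readable, diffable output

	pairs := make([]string, 0, len(names))
	for _, name := range names {
		pairs = append(pairs, fmt.Sprintf("%s=%t", name, gates[name]))
	}
	return "--feature-gates=" + strings.Join(pairs, ",")
}

func main() {
	// Gates needed for the Get endpoint to report dynamic resources.
	fmt.Println(featureGatesFlag(map[string]bool{
		"KubeletPodResourcesGet":              true,
		"DynamicResourceAllocation":           true,
		"KubeletPodResourcesDynamicResources": true,
	}))
}
```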
