Commit 03d880c

DRA: remove "classic DRA" (DRAControlPlaneController)
This matches kubernetes/kubernetes#128003
1 parent 0ec5a94 commit 03d880c

2 files changed: +17, −88 lines

content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md

Lines changed: 14 additions & 88 deletions
@@ -9,14 +9,8 @@ weight: 65
 
 <!-- overview -->
 
-Core Dynamic Resource Allocation with structured parameters:
-
 {{< feature-state feature_gate_name="DynamicResourceAllocation" >}}
 
-Dynamic Resource Allocation with control plane controller:
-
-{{< feature-state feature_gate_name="DRAControlPlaneController" >}}
-
 Dynamic resource allocation is an API for requesting and sharing resources
 between pods and containers inside a pod. It is a generalization of the
 persistent volumes API for generic resources. Typically those resources
@@ -28,8 +22,10 @@ resources handled by Kubernetes via _structured parameters_ (introduced in Kubernetes
 Different kinds of resources support arbitrary parameters for defining requirements and
 initialization.
 
-When a driver provides a _control plane controller_, the driver itself
-handles allocation in cooperation with the Kubernetes scheduler.
+Kubernetes v1.26 through v1.31 included an (alpha) implementation of _classic DRA_,
+which is no longer supported. This documentation, which is for Kubernetes
+v{{< skew currentVersion >}}, explains the current approach to dynamic resource
+allocation within Kubernetes.
 
 ## {{% heading "prerequisites" %}}
 
@@ -65,25 +61,14 @@ DeviceClass
 when installing a resource driver. Each request to allocate a device
 in a ResourceClaim must reference exactly one DeviceClass.
 
-PodSchedulingContext
-: Used internally by the control plane and resource drivers
-to coordinate pod scheduling when ResourceClaims need to be allocated
-for a Pod and those ResourceClaims use a control plane controller.
-
 ResourceSlice
-: Used with structured parameters to publish information about resources
+: Used by DRA drivers to publish information about resources
 that are available in the cluster.
 
-The developer of a resource driver decides whether they want to handle
-allocation themselves with a control plane controller or instead rely on allocation
-through Kubernetes with structured parameters. A
-custom controller provides more flexibility, but cluster autoscaling is not
-going to work reliably for node-local resources. Structured parameters enable
-cluster autoscaling, but might not satisfy all use-cases.
-
-When a driver uses structured parameters, all parameters that select devices
-are defined in the ResourceClaim and DeviceClass with in-tree types. Configuration
-parameters can be embedded there as arbitrary JSON objects.
+All parameters that select devices are defined in the ResourceClaim and
+DeviceClass with in-tree types. Configuration parameters can be embedded there.
+What configuration parameters are valid depends on the DRA driver; Kubernetes
+only passes them through without interpreting them.
 
 The `core/v1` `PodSpec` defines ResourceClaims that are needed for a Pod in a
 `resourceClaims` field. Entries in that list reference either a ResourceClaim
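
As background for this hunk, a minimal DeviceClass may help illustrate the kinds involved. This is an invented sketch, assuming the `resource.k8s.io/v1beta1` API and a hypothetical `gpu.example.com` driver; it is not taken from any real driver:

```yaml
# Hypothetical DeviceClass; the class name, driver name and CEL selector
# are illustrative only.
apiVersion: resource.k8s.io/v1beta1
kind: DeviceClass
metadata:
  name: example.com-gpu
spec:
  selectors:
    - cel:
        expression: device.driver == "gpu.example.com"
```

A ResourceClaim would then reference such a class by name in its device requests, and a Pod would reference the claim via `resourceClaims`.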
@@ -151,51 +136,7 @@ spec:
 
 ## Scheduling
 
-### With control plane controller
-
-In contrast to native resources (CPU, RAM) and extended resources (managed by a
-device plugin, advertised by kubelet), without structured parameters
-the scheduler has no knowledge of what
-dynamic resources are available in a cluster or how they could be split up to
-satisfy the requirements of a specific ResourceClaim. Resource drivers are
-responsible for that. They mark ResourceClaims as "allocated" once resources
-for it are reserved. This also then tells the scheduler where in the cluster a
-ResourceClaim is available.
-
-When a pod gets scheduled, the scheduler checks all ResourceClaims needed by a Pod and
-creates a PodScheduling object where it informs the resource drivers
-responsible for those ResourceClaims about nodes that the scheduler considers
-suitable for the Pod. The resource drivers respond by excluding nodes that
-don't have enough of the driver's resources left. Once the scheduler has that
-information, it selects one node and stores that choice in the PodScheduling
-object. The resource drivers then allocate their ResourceClaims so that the
-resources will be available on that node. Once that is complete, the Pod
-gets scheduled.
-
-As part of this process, ResourceClaims also get reserved for the
-Pod. Currently ResourceClaims can either be used exclusively by a single Pod or
-an unlimited number of Pods.
-
-One key feature is that Pods do not get scheduled to a node unless all of
-their resources are allocated and reserved. This avoids the scenario where a Pod
-gets scheduled onto one node and then cannot run there, which is bad because
-such a pending Pod also blocks all other resources like RAM or CPU that were
-set aside for it.
-
-{{< note >}}
-
-Scheduling of pods which use ResourceClaims is going to be slower because of
-the additional communication that is required. Beware that this may also impact
-pods that don't use ResourceClaims because only one pod at a time gets
-scheduled, blocking API calls are made while handling a pod with
-ResourceClaims, and thus scheduling the next pod gets delayed.
-
-{{< /note >}}
-
-### With structured parameters
-
-When a driver uses structured parameters, the scheduler takes over the
-responsibility of allocating resources to a ResourceClaim whenever a pod needs
+The scheduler is responsible for allocating resources to a ResourceClaim whenever a pod needs
 them. It does so by retrieving the full list of available resources from
 ResourceSlice objects, tracking which of those resources have already been
 allocated to existing ResourceClaims, and then selecting from those resources
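
The allocation flow this hunk describes can be illustrated outside Kubernetes. The following toy sketch (plain Python; all data structures and names are invented, this is not kube-scheduler code) shows the core idea: every candidate device is known up front from the published slices, so the scheduler can pick a free one without any driver round-trip.

```python
# Toy model of structured-parameters allocation: the scheduler sees all
# devices published in ResourceSlices, tracks which are already allocated,
# and selects a free device itself. Names and shapes are illustrative only.

def allocate(resource_slices, allocated, device_class):
    """Return (node, device) for the first free device of the given class."""
    for slice_ in resource_slices:
        for device in slice_["devices"]:
            key = (slice_["nodeName"], device["name"])
            if key in allocated:
                continue  # already bound to another ResourceClaim
            if device["class"] == device_class:
                # Node selection and device allocation happen in one step.
                return key
    return None  # no suitable device left anywhere in the cluster

slices = [
    {"nodeName": "node-1", "devices": [{"name": "gpu-0", "class": "gpu"}]},
    {"nodeName": "node-2", "devices": [{"name": "gpu-0", "class": "gpu"},
                                       {"name": "gpu-1", "class": "gpu"}]},
]
allocated = {("node-1", "gpu-0")}

print(allocate(slices, allocated, "gpu"))  # → ('node-2', 'gpu-0')
```

The real scheduler additionally evaluates CEL selectors and records its choice in the ResourceClaim status, but the shape of the decision is the same: a pure lookup over published inventory.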
@@ -235,14 +176,9 @@ later.
 Such a situation can also arise when support for dynamic resource allocation
 was not enabled in the scheduler at the time when the Pod got scheduled
 (version skew, configuration, feature gate, etc.). kube-controller-manager
-detects this and tries to make the Pod runnable by triggering allocation and/or
-reserving the required ResourceClaims.
-
-{{< note >}}
-
-This only works with resource drivers that don't use structured parameters.
-
-{{< /note >}}
+detects this and tries to make the Pod runnable by reserving the required
+ResourceClaims. However, this only works if those were allocated by
+the scheduler for some other pod.
 
 It is better to avoid bypassing the scheduler because a Pod that is assigned to a node
 blocks normal resources (RAM, CPU) that then cannot be used for other Pods
@@ -273,10 +209,6 @@ are enabled. For details on that, see the `--feature-gates` and `--runtime-config`
 [kube-apiserver parameters](/docs/reference/command-line-tools-reference/kube-apiserver/).
 kube-scheduler, kube-controller-manager and kubelet also need the feature gate.
 
-When a resource driver uses a control plane controller, then the
-`DRAControlPlaneController` feature gate has to be enabled in addition to
-`DynamicResourceAllocation`.
-
 A quick check whether a Kubernetes cluster supports the feature is to list
 DeviceClass objects with:
 
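As a sketch of the flags this hunk refers to, enabling the feature on a cluster built from component flags might look as follows. Treat the values, in particular the `resource.k8s.io` API version, as placeholders that depend on the Kubernetes release:

```shell
# Illustrative flags only; the exact resource.k8s.io version to enable
# via --runtime-config depends on the release you are running.
kube-apiserver \
  --feature-gates=DynamicResourceAllocation=true \
  --runtime-config=resource.k8s.io/v1beta1=true
# kube-scheduler, kube-controller-manager and kubelet need
#   --feature-gates=DynamicResourceAllocation=true
# as well.
```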
@@ -297,11 +229,6 @@ If not supported, this error is printed instead:
 error: the server doesn't have a resource type "deviceclasses"
 ```
 
-A control plane controller is supported when it is possible to create a
-ResourceClaim where the `spec.controller` field is set. When the
-`DRAControlPlaneController` feature is disabled, that field automatically
-gets cleared when storing the ResourceClaim.
-
 The default configuration of kube-scheduler enables the "DynamicResources"
 plugin if and only if the feature gate is enabled and when using
 the v1 configuration API. Custom configurations may have to be modified to
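
For a custom scheduler configuration, explicitly enabling the plugin could be sketched as below. This assumes the v1 configuration API; whether the entry is needed at all depends on how the custom configuration was derived:

```yaml
# Hypothetical KubeSchedulerConfiguration fragment; profile name and the
# use of multiPoint enablement are illustrative.
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
    plugins:
      multiPoint:
        enabled:
          - name: DynamicResources
```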
@@ -314,5 +241,4 @@ be installed. Please refer to the driver's documentation for details.
 
 - For more information on the design, see the
 [Dynamic Resource Allocation with Structured Parameters](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4381-dra-structured-parameters)
-and the
-[Dynamic Resource Allocation with Control Plane Controller](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3063-dynamic-resource-allocation/README.md) KEPs.
+KEP.

content/en/docs/reference/command-line-tools-reference/feature-gates/dra-control-plane-controller.md

Lines changed: 3 additions & 0 deletions
@@ -9,6 +9,9 @@ stages:
   - stage: alpha
     defaultValue: false
     fromVersion: "1.26"
+    toVersion: "1.31"
+
+removed: true
 ---
 Enables support for resources with custom parameters and a lifecycle
 that is independent of a Pod. Allocation of resources is handled
