Commit e61fc51

DRA: review feedback
1 parent d028f77 commit e61fc51

File tree

2 files changed

+35
-32
lines changed
  • keps/sig-node

keps/sig-node/3063-dynamic-resource-allocation/README.md

Lines changed: 5 additions & 5 deletions
@@ -769,11 +769,11 @@ to simulate the effect of allocating claims as part of scheduling and of
 creating or removing nodes.

 This is not possible with opaque parameters as described in this KEP. If a DRA
-driver developer wants to support Cluster Autoscaler, they have to use semantic
-parameters. Semantic parameters are an extension of this KEP that is defined in
-[KEP #4381](https://github.com/kubernetes/enhancements/issues/4381).
+driver developer wants to support Cluster Autoscaler, they have to use
+structured parameters as defined in [KEP
+#4381](https://github.com/kubernetes/enhancements/issues/4381).

-Semantic parameters are not necessary for network-attached resources because
+Structured parameters are not necessary for network-attached resources because
 adding or removing nodes doesn't change their availability and thus Cluster
 Autoscaler does not need to understand their parameters.

@@ -934,7 +934,7 @@ For beta:

 - In normal scenarios, scheduling pods with claims must not block scheduling of
 other pods by doing blocking API calls
-- Implement integration with Cluster Autoscaler through semantic parameters
+- Implement integration with Cluster Autoscaler through structured parameters
 - Gather feedback from developers and surveys
 - Positive acknowledgment from 3 would-be implementors of a resource driver,
 from a diversity of companies or projects

keps/sig-node/4381-dra-structured-parameters/README.md

Lines changed: 30 additions & 27 deletions
@@ -297,9 +297,9 @@ address limitations of the current approach for the following use cases:

 - *Device initialization*: When starting a workload that uses
 an accelerator like an FPGA, I’d like to have the accelerator
-reconfigured or reprogrammed for the workload before the workload
-itself starts. For security reasons, workloads should not be able to
-reconfigure devices directly.
+reconfigured or reprogrammed without having to deploy my application
+with full hardware access and/or root privileges. Running applications
+with fewer privileges is better for the overall security of the cluster.

 *Limitation*: Currently, it’s impossible to specify the desired
 device properties that are required for reconfiguring devices.
@@ -315,28 +315,22 @@ address limitations of the current approach for the following use cases:

 *Limitation*: Post-stop actions are not supported.

-- *Partial allocation*: When deploying a container I’d like to be able
-to use part of the shareable device inside a container and other
-containers should be able to use other free resources on the same
-device.
-
-*Limitation*: For example, newer generations of NVIDIA GPUs have a mode of
-operation called MIG, that allow them to be sub-divided into a set of
-mini-GPUs (called MIG devices) with varying amounts of memory and compute
-resources provided by each. From a hardware-standpoint, configuring a GPU
-into a set of MIG devices is highly-dynamic and creating a MIG device
-tailored to the resource needs of a particular application is well
-supported. However, with the current device plugin API, the only way to make
-use of this feature is to pre-partition a GPU into a set of MIG devices and
-advertise them to the kubelet in the same way a full / static GPU is
-advertised. The user must then pick from this set of pre-partitioned MIG
-devices instead of having one created for them on the fly based on their
-particular resource constraints. Without the ability to create MIG devices
-dynamically (i.e. at the time they are requested) the set of pre-defined MIG
-devices must be carefully tuned to ensure that GPU resources do not go unused
-because some of the pre-partioned devices are in low-demand. It also puts
-the burden on the user to pick a particular MIG device type, rather than
-declaring the resource constraints more abstractly.
+- *Partial allocation*: When workloads use only a portion of the device
+capabilities, devices can be partitioned (e.g. using NVIDIA MIG or SR-IOV) to
+better match workload needs. Sharing the devices in this way can greatly
+increase hardware utilization and reduce costs.
+
+- *Limitation*: Currently there's no API to request partial device
+allocation. With the current device plugin API, devices need to be
+pre-partitioned and advertised in the same way full / static devices
+are. The user must then select a pre-partitioned device instead of having one
+created for them on the fly based on their particular resource
+constraints. Without the ability to create devices dynamically (i.e. at the
+time they are requested) the set of pre-defined devices must be carefully
+tuned to ensure that device resources do not go unused because some of the
+pre-partitioned devices are in low demand. It also puts the burden on the user
+to pick a particular device type, rather than declaring the resource
+constraints more abstractly.

 - *Optional allocation*: When deploying a workload I’d like to specify
 soft(optional) device requirements. If a device exists and it’s
@@ -873,8 +867,8 @@ parameters differently for reporting resource availability.

 This is the result of an attack against the resource driver, either from a
 container which uses a resource exposed by the driver, a compromised kubelet
-which interacts with the plugin, or through a successful attack against the
-node which led to root access.
+which interacts with the plugin, or due to the resource driver running on a node
+with a compromised root account.

 The resource driver plugin only needs read access to objects described in this
 KEP, so compromising it does not interfere with dynamic resource allocation for
@@ -1161,6 +1155,15 @@ resources.

 ### API

+```
+<<[UNRESOLVED @pohly @johnbelamaric]>>
+Before 1.31, we need to re-evaluate the API, including, but not limited to:
+- Do we really need a separate ResourceClaim?
+- Does "Device" instead of "Resource" make the API easier to understand?
+- Avoid separate parameter objects if and when possible.
+<<[/UNRESOLVED]>>
+```
+
 The PodSpec gets extended. To minimize the changes in core/v1, all new types
 get defined in a new resource group. This makes it possible to revise those
 more experimental parts of the API in the future. The new fields in the
