Commit 948ca17 (1 parent: 9130a91)

5018-dra-adminaccess

Signed-off-by: Rita Zhang <[email protected]>

File tree: 4 files changed, +856 −66 lines


keps/sig-node/4381-dra-structured-parameters/README.md

Lines changed: 7 additions & 61 deletions
````diff
@@ -1455,29 +1455,8 @@ type DeviceRequest struct {
 }
 ```
 
-Admin access to devices is a privileged operation because it grants users
-access to devices that are in use by other users. Drivers might also remove
-other restrictions when preparing the device.
-
-In Kubernetes 1.31, an example validating admission policy [was
-provided](https://github.com/kubernetes/kubernetes/blob/4aeaf1e99e82da8334c0d6dddd848a194cd44b4f/test/e2e/dra/test-driver/deploy/example/admin-access-policy.yaml#L1-L11)
-which restricts access to this option. It is the responsibility of cluster
-admins to ensure that such a policy is installed if the cluster shouldn't allow
-unrestricted access.
-
-Long term, a Kubernetes cluster should disable usage of this field by default
-and only allow it for users with additional privileges. More time is needed to
-figure out how that should work, therefore the field is placed behind a
-separate `DRAAdminAccess` feature gate which remains in alpha. A separate
-KEP will be created to push this forward.
-
-The `DRAAdminAccess` feature gate controls whether users can set the field to
-true when requesting devices. That is checked in the apiserver. In addition,
-the scheduler refuses to allocate claims with admin access when the feature is
-turned off and somehow the field was set (for example, set in 1.31 when it
-was available unconditionally, or set while the feature gate was enabled).
-A similar check in the kube-controller-manager prevents creating a
-ResourceClaim when the ResourceClaimTemplate has admin access enabled.
+For more details about `AdminAccess`, please refer to
+[KEP #5018: DRA AdminAccess](https://kep.k8s.io/5018).
 
 ```yaml
 const (
````
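The feature-gate handling described in the removed text (the apiserver rejecting or clearing the field when `DRAAdminAccess` is off) follows a common Kubernetes pattern for gated API fields. A minimal sketch of that pattern; the type is trimmed down and the function name and shape are hypothetical, not the actual kube-apiserver strategy code:

```go
package main

import "fmt"

// DeviceRequest is a trimmed-down stand-in for the API type shown above;
// only the fields needed for this example are included.
type DeviceRequest struct {
	Name        string
	AdminAccess *bool
}

// dropDisabledAdminAccess clears the gated field on incoming objects when
// the feature gate is off, unless the stored object already used it
// (updates must not silently drop existing data). Hypothetical helper.
func dropDisabledAdminAccess(gateEnabled bool, req *DeviceRequest, inUseByOldObject bool) {
	if gateEnabled || inUseByOldObject {
		return // keep the field: gate is on, or an update must preserve prior use
	}
	req.AdminAccess = nil
}

func main() {
	on := true
	req := DeviceRequest{Name: "req-0", AdminAccess: &on}
	dropDisabledAdminAccess(false, &req, false)
	fmt.Println(req.AdminAccess == nil) // true: gate off and field not previously used
}
```

The same idea explains the scheduler and kube-controller-manager checks: each component independently refuses to act on a set field while the gate is off, so a value smuggled in earlier cannot take effect.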
````diff
@@ -1870,22 +1849,17 @@ type DeviceRequestAllocationResult struct {
 	// +required
 	Device string
 
-	// AdminAccess is a copy of the AdminAccess value in the
-	// request which caused this device to be allocated.
-	//
-	// New allocations are required to have this set when the DRAAdminAccess
-	// feature gate is enabled. Old allocations made
-	// by Kubernetes 1.31 do not have it yet. Clients which want to
-	// support Kubernetes 1.31 need to look up the request and retrieve
-	// the value from there if this field is not set.
+	// AdminAccess indicates that this device was allocated for
+	// administrative access. See the corresponding request field
+	// for a definition of mode.
 	//
 	// This is an alpha field and requires enabling the DRAAdminAccess
 	// feature gate. Admin access is disabled if this field is unset or
 	// set to false, otherwise it is enabled.
 	//
 	// +optional
 	// +featureGate=DRAAdminAccess
-	AdminAccess *bool
+	AdminAccess *bool `json:"adminAccess" protobuf:"bytes,5,name=adminAccess"`
 }
 
 // DeviceAllocationConfiguration gets embedded in an AllocationResult.
````
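The field's documented semantics (unset or false both mean "disabled") are what clients have to implement when reading the optional `*bool`. A minimal sketch, with a trimmed-down stand-in type and a hypothetical helper name:

```go
package main

import "fmt"

// DeviceRequestAllocationResult is a trimmed-down stand-in for the API
// type in the diff above; only the fields used here are included.
type DeviceRequestAllocationResult struct {
	Device      string
	AdminAccess *bool // nil and false both mean "no admin access"
}

// hasAdminAccess treats an unset (nil) pointer the same as false,
// matching the field's documented semantics. Hypothetical helper.
func hasAdminAccess(r DeviceRequestAllocationResult) bool {
	return r.AdminAccess != nil && *r.AdminAccess
}

func main() {
	yes := true
	fmt.Println(hasAdminAccess(DeviceRequestAllocationResult{Device: "gpu-0"}))                    // false: unset
	fmt.Println(hasAdminAccess(DeviceRequestAllocationResult{Device: "gpu-0", AdminAccess: &yes})) // true
}
```

Using a pointer rather than a plain `bool` is what lets the apiserver distinguish "not set" from "explicitly false", which matters for the feature-gate checks.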
````diff
@@ -2103,10 +2077,6 @@ per claim is limited to `AllocationResultsMaxSize = 32`. The quota mechanism
 uses that as the worst-case upper bound, so `allocationMode: all` is treated
 like `allocationMode: exactCount` with `count: 32`.
 
-Requests asking for "admin access" contribute to the quota. In practice,
-namespaces where such access is allowed will typically not have quotas
-configured.
-
 ### kube-controller-manager
 
 The code that creates a ResourceClaim from a ResourceClaimTemplate started
````
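The quota rule in the context above ("all" is charged as if it were `exactCount` with `count: 32`) can be sketched as follows; the function is illustrative, not the actual kube-apiserver quota code:

```go
package main

import "fmt"

// AllocationResultsMaxSize mirrors the per-claim limit named in the text.
const AllocationResultsMaxSize = 32

// worstCaseDeviceCount charges allocationMode "all" at the worst-case
// upper bound and "exactCount" at its requested count. Hypothetical
// helper illustrating the accounting rule, not real quota code.
func worstCaseDeviceCount(allocationMode string, count int64) int64 {
	if allocationMode == "all" {
		return AllocationResultsMaxSize
	}
	return count // allocationMode: exactCount
}

func main() {
	fmt.Println(worstCaseDeviceCount("all", 0))        // 32
	fmt.Println(worstCaseDeviceCount("exactCount", 4)) // 4
}
```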
````diff
@@ -2785,8 +2755,7 @@ skew are less likely to occur.
 
 ### Feature Enablement and Rollback
 
-The initial answer in this section is for the core DRA. The second answer is
-marked with DRAAdminAccess and applies to that sub-feature.
+The answer in this section is for the core DRA.
 
 ###### How can this feature be enabled / disabled in a live cluster?
 
````
````diff
@@ -2797,42 +2766,22 @@ marked with DRAAdminAccess and applies to that sub-feature.
     - kubelet
     - kube-scheduler
     - kube-controller-manager
-- [X] Feature gate
-  - Feature gate name: DRAAdminAccess
-  - Components depending on the feature gate:
-    - kube-apiserver
-
-
 
 ###### Does enabling the feature change any default behavior?
 
 No.
 
-DRAAdminAccess: no.
-
 ###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
 
 Yes. Applications that were already deployed and are running will continue to
 work, but they will stop working when containers get restarted because those
 restarted containers won't have the additional resources.
 
-DRAAdminAccess: Workloads which were deployed with admin access will continue
-to run with it. They need to be deleted to remove usage of the feature.
-If they were not running, then the feature gate checks in kube-scheduler will prevent
-scheduling and in kube-controller-manager will prevent creating the ResourceClaim from
-a ResourceClaimTemplate. In both cases, usage of the feature is prevented.
-
 ###### What happens if we reenable the feature if it was previously rolled back?
 
 Pods might have been scheduled without handling resources. Those Pods must be
 deleted to ensure that the re-created Pods will get scheduled properly.
 
-DRAAdminAccess: Workloads which were deployed with admin access enabled are not
-affected by a rollback. If the pods were already running, they keep running. If
-the pods were kept as unschedulable because the scheduler refused to allocate
-claims, they might now get scheduled.
-
-
 ###### Are there any tests for feature enablement/disablement?
 
 <!--
````
````diff
@@ -2852,9 +2801,6 @@ Tests for apiserver will cover disabling the feature. This primarily matters
 for the extended PodSpec: the new fields must be preserved during updates even
 when the feature is disabled.
 
-DRAAdminAccess: Tests for apiserver will cover disabling the feature. A test
-that the DaemonSet controller tolerates keeping pods as pending is needed.
-
 ### Rollout, Upgrade and Rollback Planning
 
 ###### How can a rollout or rollback fail? Can it impact already running workloads?
````

keps/sig-node/4381-dra-structured-parameters/kep.yaml

Lines changed: 0 additions & 5 deletions
````diff
@@ -41,11 +41,6 @@ feature-gates:
       - kube-controller-manager
       - kube-scheduler
       - kubelet
-  - name: DRAAdminAccess
-    components:
-      - kube-apiserver
-      - kube-controller-manager
-      - kube-scheduler
 disable-supported: true
 
 # The following PRR answers are required at beta release
````
