tags, and then generate with `hack/update-toc.sh`.
@@ -147,18 +77,18 @@ checklist items _must_ be updated for the enhancement to be released.
 
 Items marked with (R) are required *prior to targeting to a milestone / release*.
 
-- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
-- [ ] (R) KEP approvers have approved the KEP status as `implementable`
-- [ ] (R) Design details are appropriately documented
+- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
+- [x] (R) KEP approvers have approved the KEP status as `implementable`
+- [x] (R) Design details are appropriately documented
 - [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
 - [ ] e2e Tests for all Beta API Operations (endpoints)
 - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
 - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
 - [ ] (R) Graduation criteria is in place
 - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
-- [ ] (R) Production readiness review completed
-- [ ] (R) Production readiness review approved
-- [ ] "Implementation History" section is up-to-date for milestone
+- [x] (R) Production readiness review completed
+- [x] (R) Production readiness review approved
+- [x] "Implementation History" section is up-to-date for milestone
 - [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
 - [ ] Supporting documentation, e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
 
@@ -173,25 +103,6 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
 
 ## Summary
 
-<!--
-This section is incredibly important for producing high-quality, user-focused
-documentation such as release notes or a development roadmap. It should be
-possible to collect this information before implementation begins, in order to
-avoid requiring implementors to split their attention between writing release
-notes and implementing the feature itself. KEP editors and SIG Docs
-should help to ensure that the tone and content of the `Summary` section is
-useful for a wide audience.
-
-A good summary is probably at least a paragraph in length.
-
-Both in this section and below, follow the guidelines of the [documentation
-style guide]. In particular, wrap lines to a reasonable length, to make it
-easier for reviewers to cite specific portions, and to minimize diff churn on
 This KEP seeks to provide a way to choose the correct behavior for how container runtimes (containerd and CRI-O) apply `SupplementalGroups` to the first container process. It describes the work needed in Kubernetes and connected projects to give customers a clear migration path, including detection and safe upgrade, if any of their workflows depend on this arguably erroneous behavior.
 As described above, the way supplemental groups are attached to the first container process is complicated and not compliant with the OCI image spec.
 
 Moreover, this raises security concerns. When a cluster enforces a security policy for pods that restricts the values of `RunAsGroup` and `SupplementalGroups`, the enforcement has limited effect: cluster users can bypass it simply by using a custom image. Most cluster administrators would not expect such a bypass, because it makes the enforcement almost useless. The bypass also grants unexpected file access permissions, which is a security concern in some use cases. For example, `hostPath` volumes could be severely affected because UIDs/GIDs determine access to the files and directories in those volumes.
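
To make the bypass concrete, here is a minimal Python sketch; it is a model for illustration, not the code of any container runtime. It assumes a custom image whose `/etc/group` lists the container user as a member of an extra group, and it models two policy values: `Merge` (an assumed name for the current behavior, where image-defined memberships are merged in) and `Strict` (only groups declared in the pod spec are attached). All function names, the user name, and the GIDs are hypothetical.

```python
def image_groups_for_user(etc_group, user):
    """Return the GIDs of groups in /etc/group-style text that list `user` as a member."""
    gids = set()
    for line in etc_group.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 4:
            continue  # skip malformed entries
        _name, _passwd, gid, members = fields
        if user in [m for m in members.split(",") if m]:
            gids.add(int(gid))
    return gids


def effective_groups(policy, run_as_group, supplemental_groups, etc_group, user):
    """Model the groups (primary plus supplementary) of the first container process.

    `policy` is "Merge" or "Strict"; both names are assumptions of this sketch.
    """
    declared = {run_as_group, *supplemental_groups}
    if policy == "Strict":
        return declared
    if policy == "Merge":
        return declared | image_groups_for_user(etc_group, user)
    raise ValueError(f"unknown policy: {policy!r}")


# Hypothetical custom image that adds user "alice" to an extra group (GID 50000).
ETC_GROUP = """\
root:x:0:
alice:x:1000:
group-in-image:x:50000:alice
"""

merged = effective_groups("Merge", 3000, [4000], ETC_GROUP, "alice")
strict = effective_groups("Strict", 3000, [4000], ETC_GROUP, "alice")
print(sorted(merged))  # [3000, 4000, 50000]
print(sorted(strict))  # [3000, 4000]
```

In this model, a policy that only validates `RunAsGroup` and `SupplementalGroups` in the pod spec never sees GID 50000, yet under `Merge` the process still receives it.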
@@ -266,36 +168,17 @@ Thus, this KEP proposes to offer a new API field named `SupplementalGroupsPolicy
 
 ### Goals
 
-<!--
-List the specific goals of the KEP. What is it trying to achieve? How will we
-know that this has succeeded?
--->
-
 - To provide a new API field to control exactly which groups the container process belongs to
 - Ensure there are clear steps documented for end users to detect if their workload is affected
 - (Optional) provide helper APIs and/or tooling to simplify the detection
 
 ### Non-Goals
 
-<!--
-What is out of scope for this KEP? Listing non-goals helps to focus discussion
-and make progress.
--->
-
 - To provide a cluster-wide control method.
 - To change the default behavior (a potentially breaking change)
 
 ## Proposal
 
-<!--
-This is where we get down to the specifics of what the proposal actually is.
-This should have enough detail that reviewers can understand exactly what
-you're proposing, but should not include things like API designs or
-implementation. What is the desired outcome and how do we measure success?.
-The "Design Details" section below is for the real
-nitty-gritty.
--->
-
 This KEP proposes changes at both the Kubernetes API and CRI levels.
 
 ### Kubernetes API
@@ -351,14 +234,6 @@ message ContainerUser {
 
 ### User Stories (Optional)
 
-<!--
-Detail the things that people will be able to do if this KEP is implemented.
-Include as much detail as possible so that people can understand the "how" of
-the system. The goal here is to make this feel real for users without getting
-bogged down.
--->
-
-
 #### Story 1: Deploy a Security Policy to enforce `SupplementalGroupsPolicy` field
 
 Assume a multi-tenant Kubernetes cluster with `hostPath` volumes in the situations below:
@@ -394,42 +269,16 @@ Please note that a security policy without `supplementalGroupsPolicy` would lead
 
 ### Notes/Constraints/Caveats (Optional)
 
-<!--
-What are the caveats to the proposal?
-What are some important details that didn't come across above?
-Go in to as much detail as necessary here.
-This might be a good place to talk about core concepts and how they relate.
--->
-
 The proposal affects CRI implementations (e.g., containerd, cri-o, gVisor).
 
 ### Risks and Mitigations
 
-<!--
-What are the risks of this proposal, and how do we mitigate? Think broadly.
-For example, consider both security and how this will impact the larger
-Kubernetes ecosystem.
-
-How will security be reviewed, and by whom?
-
-How will UX be reviewed, and by whom?
-
-Consider including folks who also work outside the SIG or subproject.
--->
-
 - How to track the support status in CRI implementations of this proposal?
 - This feature is mainly implemented inside each CRI implementation.
 - How to feature-gate this feature in CRI implementations?
 
 ## Design Details
 
-<!--
-This section should contain enough information that the specifics of your
-change are understandable. This may include API specs (though not always
-required) or even code snippets. If there's any ambiguity about HOW your
-proposal will be implemented, this is the place to discuss them.
--->
-
 ### Kubernetes API
 
 #### SupplementalGroupsPolicy in PodSecurityContext
@@ -648,34 +497,6 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
 
 ### Graduation Criteria
 
-<!--
-**Note:** *Not required until targeted at a release.*
-
-Define graduation milestones.
-
-These may be defined in terms of API maturity, [feature gate] graduations, or as
-something else. The KEP should keep this high-level with a focus on what
-signals will be looked at to determine graduation.
-
-Consider the following in developing the graduation criteria for this enhancement:
-Below are some examples to consider, in addition to the aforementioned [maturity levels][maturity-levels].
--->
-
 Because this KEP's core implementation (i.e., `SupplementalGroupsPolicy` handling) lies inside CRI implementations (e.g., containerd, cri-o), the graduation criteria include the container runtimes' support status for the updated CRI.
 
 #### Alpha
@@ -700,33 +521,8 @@ Because this KEP's core implementation(i.e. `SupplementalGroupsPolicy` handling)
 
 ### Upgrade / Downgrade Strategy
 
-<!--
-If applicable, how will the component be upgraded and downgraded? Make sure
-this is in the test plan.
-
-Consider the following in developing an upgrade/downgrade strategy for this
-enhancement:
-- What changes (in invocations, configurations, API use, etc.) is an existing
-cluster required to make on upgrade, in order to maintain previous behavior?
-- What changes (in invocations, configurations, API use, etc.) is an existing
-cluster required to make on upgrade, in order to make use of the enhancement?
--->
-
 ### Version Skew Strategy
 
-<!--
-If applicable, how will the component handle version skew with other
-components? What are the guarantees? Make sure this is in the test plan.
-
-Consider the following in developing a version skew strategy for this
-enhancement:
-- Does this enhancement involve coordinating behavior in the control plane and
-in the kubelet? How does an n-2 kubelet without this feature available behave
-when this feature is used?
-- Will any other components on the node change? For example, changes to CSI,
-CRI or CNI may require updating that component before the kubelet.
--->
-
 - CRI must support this feature, especially when using `SupplementalGroupsPolicy=Strict`.
 - kubelet must be at least the version of control-plane components.
 
@@ -1062,6 +858,8 @@ Are there any tests that were run/should be run to understand performance charac
 and validate the declared limits?
 -->
 
+No.
+
 ### Troubleshooting
 
 <!--
@@ -1107,6 +905,8 @@ Major milestones might include:
 - when the KEP was retired or superseded
 -->
 
+- 2023-02-10: Initial KEP published.
+
 ## Drawbacks
 
 <!--
@@ -1117,12 +917,6 @@ N/A
 
 ## Alternatives
 
-<!--
-What other approaches did you consider, and why did you rule them out? These do
-not need to be as detailed as the proposal, but should include enough
-information to express the idea and why it was not acceptable.
--->
-
 ### Introducing `RuntimeClass`
 
 As described in the [Motivation](#motivation) section, cluster administrators would need to deploy a custom low-level container runtime (e.g., [pfnet-research/strict-supplementalgroups-container-runtime](https://github.com/pfnet-research/strict-supplementalgroups-container-runtime)) that modifies the OCI container runtime spec (`config.json`) produced by CRI implementations (e.g., containerd, cri-o). A custom `RuntimeClass` would be introduced for it.
@@ -1137,10 +931,4 @@ We could just fix CRI implementations directly without introducing new APIs. Th
 
 ## Infrastructure Needed (Optional)
 
-<!--
-Use this section if you need things from the project/SIG. Examples include a
-new subproject, repos requested, or GitHub details. Listing these here allows a
-SIG to get the process for these resources started right away.