keps/sig-node/3619-supplemental-groups-policy/README.md
Lines changed: 96 additions & 29 deletions
@@ -672,8 +672,44 @@ Because this KEP's core implementation(i.e. `SupplementalGroupsPolicy` handling)
672
672
673
673
### Version Skew Strategy
674
674
675
-
- CRI must support this feature, especially when using `SupplementalGroupsPolicy=Strict`.
676
-
- kubelet must be at least the version of control-plane components.
675
+
Existing pods will still work as intended, because the new field is absent from them
676
+
(i.e. no `SupplementalGroupsPolicy` fields in existing Pods' spec).
677
+
678
+
An upgrade does not change any current behavior. Note, however, that if you plan to use the `Strict` SupplementalGroupsPolicy after the upgrade,
679
+
we assume the CRI runtimes in your cluster also support this feature (see ["Dependencies"](#dependencies) section).
680
+
If there are some nodes whose CRI runtime does NOT support this feature,
681
+
- the creation of pods with the `Strict` policy will be rejected if the feature level of the upgraded version is beta or above,
682
+
- the `Strict` policy will silently fall back to `Merge` if the feature level of the upgraded version is alpha.
683
+
Please see the matrix below for more details.
684
+
685
+
A downgrade is not affected if the functionality has not been used yet. If the functionality, especially the `Strict` SupplementalGroupsPolicy, is already in use, caution is needed:
686
+
- running containers will continue to run with their effective policy as long as they are not recreated.
687
+
- However, when the containers in such pods are recreated on the node, the behavior depends on the downgraded version, the feature gate value after the downgrade, and the CRI runtime's support status (see the matrix below).
688
+
689
+
The matrix below summarizes what happens for each combination of target (upgraded/downgraded) kubelet version, target feature gate value, and target CRI runtime support status:
690
+
691
+
| Target<br />kubelet version | Target<br/>Feature Gate | Target<br/>CRI runtime<br />supports the feature? | Pod's policy | Effective Policy | Rejected By Kubelet? | `.containerStatuses.user` reported? |
|---|---|---|---|---|---|---|
| <1.31<br/>(does not know the field) | N/A | Yes/No | `Strict` | `Merge`<br />(fallback silently) | NO | NO |
| | | | `Merge`<br />/(not set) | `Merge` | NO | NO |
| 1.31 or 1.32<br/>(Alpha) | `True` | YES | `Strict` | `Strict` | NO | YES |
| | | | `Merge`<br />/(not set) | `Merge` | NO | YES |
| | | NO | `Strict` | `Merge`<br />(fallback silently) | NO | NO |
| | | | `Merge`<br />/(not set) | `Merge` | NO | NO |
| | `False` | YES | `Strict`<br />(set when the feature was on) | `Strict` | NO | NO |
| | | | `Merge`<br />/(not set) | `Merge` | NO | NO |
| | | NO | `Strict`<br />(set when the feature was on) | `Merge`<br />(fallback silently) | NO | NO |
| | | | `Merge`<br />/(not set) | `Merge` | NO | NO |
| >=1.33<br />(Beta or above) | `True`<br />(default) | YES | `Strict` | `Strict` | NO | YES |
| | | | `Merge`<br />/(not set) | `Merge` | NO | YES |
| | | NO | `Strict` | - | __REJECTED__(*) | NO |
| | | | `Merge`<br />/(not set) | `Merge` | NO | NO |
| | `False` | YES | `Strict`<br />(set when the feature was on) | `Strict` | NO | NO |
| | | | `Merge`<br />/(not set) | `Merge` | NO | NO |
| | | NO | `Strict`<br />(set when the feature was on) | - | __REJECTED__(*) | NO |
| | | | `Merge`<br />/(not set) | `Merge` | NO | NO |
711
+
712
+
_(*): See ["What specific metrics should inform a rollback?"](#what-specific-metrics-should-inform-a-rollback) for details_
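
The matrix above can also be read as a small decision function. The following Python sketch only illustrates the rows of the matrix and is not kubelet code: `Outcome` and `resolve_policy` are hypothetical names introduced here, and the kubelet version is simplified to its minor number (for example `33` for v1.33).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    effective_policy: Optional[str]  # None when the pod is rejected
    rejected: bool                   # "Rejected By Kubelet?" column
    user_reported: bool              # ".containerStatuses.user reported?" column


def resolve_policy(kubelet_minor: int, feature_gate: bool,
                   cri_supported: bool, pod_policy: Optional[str]) -> Outcome:
    """Hypothetical helper mirroring the version skew matrix."""
    policy = pod_policy or "Merge"  # an unset policy behaves like Merge

    # <1.31: the kubelet does not know the field at all.
    if kubelet_minor < 31:
        return Outcome("Merge", False, False)

    if policy == "Strict" and not cri_supported:
        if kubelet_minor >= 33:
            # Beta or above: rejected regardless of the feature gate value.
            return Outcome(None, True, False)
        # Alpha (1.31 or 1.32): silent fallback to Merge.
        return Outcome("Merge", False, False)

    # The user is reported only when the gate is on and the runtime supports it.
    return Outcome(policy, False, feature_gate and cri_supported)
```

For instance, under these assumptions `resolve_policy(33, True, False, "Strict")` corresponds to the `__REJECTED__` row for a beta kubelet whose CRI runtime lacks support.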
677
713
678
714
## Production Readiness Review Questionnaire
679
715
@@ -749,11 +785,18 @@ feature.
749
785
NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
750
786
-->
751
787
752
-
Yes. It can be disabled after enabled. However, users should pay attention that gids of container processes in pods with `Strict` policy would change. It means the action might break the application in permission. We plan to provide a way for users to detect which pods are affected.
788
+
Yes. It can be disabled after being enabled.
789
+
When disabled, you cannot create pods with the `SupplementalGroupsPolicy` field, and `.status.containerStatuses[*].user` will not be reported in the pod status.
790
+
Please note that for pods already created with the `Strict` policy, the policy remains enforced on their containers even after the feature is disabled.
791
+
792
+
See ["Version Skew Strategy"](#version-skew-strategy) for more complex cases (including upgrading/downgrading).
753
793
754
794
###### What happens if we reenable the feature if it was previously rolled back?
755
795
756
-
Just the policy `Stcict` is reenabled. Users should pay attention that gids of containers in pods with `Stcict` policy would change. It means that the action might break the application in permission. We plan to provide a way for users to detect which pods are affected.
796
+
The `SupplementalGroupsPolicy` field in pod spec and `.status.containerStatuses[*].user` in pod status will be available again.
797
+
As described in the section above, for pods previously created with the `Strict` policy, the policy remains enforced on their containers after the feature is re-enabled.
798
+
799
+
See ["Version Skew Strategy"](#version-skew-strategy) for more complex cases (including upgrading/downgrading).
757
800
758
801
###### Are there any tests for feature enablement/disablement?
759
802
@@ -790,23 +833,10 @@ rollout. Similarly, consider large clusters and how enablement/disablement
790
833
will rollout across nodes.
791
834
-->
792
835
793
-
A rollout may fail when at least one of the following components are too old because this KEP introduces the new Kubernetes API field:
794
-
795
-
| Component |`supplementalGroupsPolicy` value that will cause an error |
As long as you do not use the `SupplementalGroupsPolicy` field, rollout or rollback will be safe. There is also no impact on already running workloads because the feature is backward compatible.
800
837
801
-
802
-
For example, an error will be returned like this if kube-apiserver is too old:
803
-
```console
804
-
$ kubectl apply -f supplementalgroupspolicy.yaml
805
-
Error from server (BadRequest): error when creating "supplementalgroupspolicy.yaml": Pod in version "v1" cannot be handled as a Pod:
806
-
strict decoding error: unknown field "spec.securityContext.supplementalGroupsPolicy"
807
-
```
808
-
809
-
No impact on already running workloads.
838
+
However, if pods with the `SupplementalGroupsPolicy` field exist at rollout or rollback time, caution is needed.
839
+
Please see the matrix in ["Version Skew Strategy"](#version-skew-strategy) section for details.
810
840
811
841
###### What specific metrics should inform a rollback?
812
842
@@ -815,19 +845,41 @@ What signals should users be paying attention to when the feature is young
815
845
that might indicate a serious problem?
816
846
-->
817
847
818
-
Look for an event saying indicating SupplementalGroupsPolicy is not supported by the runtime.
848
+
As long as you do not use the `SupplementalGroupsPolicy` field, rollout or rollback will be safe as described in the above section.
849
+
850
+
However, if pods with the `SupplementalGroupsPolicy` field exist at rollout or rollback time, pod creation may be rejected when
851
+
- the feature level of the rolled-out or rolled-back version is beta or above, and
852
+
- pods with `Strict` policy (set when the feature gate was on previously) are scheduled to the nodes whose CRI runtime does NOT support this feature.
853
+
854
+
In that case, please look for an event indicating that SupplementalGroupsPolicy is not supported by the node, and treat it as the rollback signal.
"message": "Error: SupplementalGroupsPolicy is not supported in this node.",
826
863
...
827
864
}
828
865
...
829
866
```
830
867
868
+
The following kubelet metrics are also useful to check:
869
+
870
+
- `kubelet_running_pods`: Shows the actual number of pods running
871
+
- `kubelet_desired_pods`: The number of pods the kubelet is trying to run
872
+
873
+
If these metrics are different, it means there are desired pods that can't be set to running.
874
+
If that is the case, checking the pod events to see if they are failing for SupplementalGroupsPolicy reasons
875
+
(like the errors shown above) is advised, in which case it is recommended to roll back.
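
The comparison of the two metrics can be sketched as follows. This is a minimal Python illustration that assumes you already have the kubelet's metrics in Prometheus text exposition format without timestamps. `sum_samples`, `pods_not_running`, and the `SAMPLE` text are hypothetical and made up for this example; they are not real kubelet output.

```python
from collections import defaultdict


def sum_samples(metrics_text: str) -> dict:
    """Sum all samples per metric name, ignoring labels and comment lines."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Assumption: "<name>{labels} <value>" with no trailing timestamp.
        series, value = line.rsplit(None, 1)
        name = series.split("{", 1)[0]
        totals[name] += float(value)
    return dict(totals)


def pods_not_running(metrics_text: str) -> float:
    """Return desired minus running pods; a positive gap deserves a look."""
    totals = sum_samples(metrics_text)
    desired = totals.get("kubelet_desired_pods", 0.0)
    running = totals.get("kubelet_running_pods", 0.0)
    return desired - running


# Made-up sample input for illustration only.
SAMPLE = """\
# TYPE kubelet_desired_pods gauge
kubelet_desired_pods{static=""} 12
kubelet_desired_pods{static="true"} 3
# TYPE kubelet_running_pods gauge
kubelet_running_pods 13
"""
```

A positive result from `pods_not_running` only says that some desired pods are not running; the pod events still need to be checked to confirm that SupplementalGroupsPolicy is the cause.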
876
+
877
+
Although this KEP does NOT include kube-scheduler integration that makes the scheduler place pods requiring
878
+
the feature (`Strict` policy) on the nodes which support this feature, you can use node labels and
879
+
pod's `nodeSelector`/`nodeAffinity` to mitigate pod rejection or error events. Please see
880
+
["Are there any missing metrics that would be useful to have to improve observability of this feature?"](#are-there-any-missing-metrics-that-would-be-useful-to-have-to-improve-observability-of-this-feature)
881
+
section below for details.
882
+
831
883
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
832
884
833
885
<!--
@@ -912,6 +964,8 @@ question.
912
964
-->
913
965
914
966
- `supplementalGroupsPolicy=Strict`: 100% of pods were scheduled into a node with the feature supported.
967
+
Although this KEP does NOT include scheduler integration, please see
968
+
["Are there any missing metrics that would be useful to have to improve observability of this feature?"](#are-there-any-missing-metrics-that-would-be-useful-to-have-to-improve-observability-of-this-feature) section for this.
915
969
916
970
- `supplementalGroupsPolicy=Merge`: 100% of pods were scheduled into a node with or without the feature supported.
917
971
@@ -937,14 +991,23 @@ Describe the metrics themselves and the reasons why they weren't added (e.g., co
937
991
implementation difficulties, etc.).
938
992
-->
939
993
940
-
Potentially, kube-scheduler could be implemented to avoid scheduling a pod with `supplementalGroupsPolicy: Strict`
941
-
to a node running CRI runtime which does not supported this feature.
994
+
Potentially, kube-scheduler could implement a rule to avoid scheduling a pod with `supplementalGroupsPolicy: Strict`
995
+
to a node not supporting this feature.
942
996
943
-
In this way, the Event metric described above would not happen, and users would instead see `Pending`pods
944
-
as an error metric.
997
+
However, this is not covered by this KEP, because a more generic mechanism would be preferable in Kubernetes so that the scheduler can schedule pods which require node feature X
998
+
to the nodes which support node feature X.
945
999
946
-
However, this is not planned to be implemented in kube-scheduler, as it seems overengineering.
947
-
Users may use `nodeSelector`, `nodeAffinity`, etc. to workaround this.
1000
+
As of v1.33, Kubernetes does not offer such a generic mechanism, but cluster admins can maintain node labels and use `nodeSelector`/`nodeAffinity` in pods instead.
1001
+
1002
+
There are several ways to automate this:
1003
+
1004
+
- By Mutating Webhook:
1005
+
  - for nodes, a webhook that transforms the `Node.Status.Features.SupplementalGroupsPolicy` field into a node label (say `supplementalgroupspolicy-supported: "true" | "false"`),
1006
+
  - for pods, a webhook that adds `.spec.nodeSelector: { "supplementalgroupspolicy-supported": "true" }` when the pod specifies the `Strict` policy.
1007
+
- By Mutating Admission Policy:
1008
+
  - although Mutating Admission Policy is still alpha as of v1.32, you can write an equivalent policy to do this.
1009
+
1010
+
If you manage the node labels and pods' `nodeSelector`/`nodeAffinity` appropriately, the error events or pod rejections are not expected to happen. Instead, you will need to watch for `Pending` pods to check whether the cluster has a sufficient number of nodes supporting SupplementalGroupsPolicy.
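
The two mutations described above can be sketched as plain functions over the JSON form of a Node and a Pod. This Python sketch only shows the labeling and selector logic, not a working admission webhook (no HTTP server, no AdmissionReview handling). The function names are hypothetical, and the label key is the example name used above, not a Kubernetes convention.

```python
import copy

# Example label key taken from the text above; not a standard label.
LABEL = "supplementalgroupspolicy-supported"


def node_label_value(node: dict) -> str:
    """Derive the label value from .status.features.supplementalGroupsPolicy."""
    features = (node.get("status") or {}).get("features") or {}
    return "true" if features.get("supplementalGroupsPolicy") is True else "false"


def add_node_selector(pod: dict) -> dict:
    """Return a copy of the pod with a nodeSelector added for Strict policy."""
    security_context = (pod.get("spec") or {}).get("securityContext") or {}
    if security_context.get("supplementalGroupsPolicy") != "Strict":
        return pod  # Merge or unset: any node is fine, leave the pod untouched

    mutated = copy.deepcopy(pod)
    selector = mutated["spec"].get("nodeSelector") or {}
    selector[LABEL] = "true"
    mutated["spec"]["nodeSelector"] = selector
    return mutated
```

Working on a deep copy keeps the incoming object unchanged, which makes it easier to compute a patch from the difference if this logic were wrapped in a real webhook.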
948
1011
949
1012
### Dependencies
950
1013
@@ -995,7 +1058,11 @@ This may affect scalability.
995
1058
996
1059
To evaluate this risk, users may run
997
1060
`kubectl get nodes -o json | jq '[.items[].status.features]'`
998
-
to see how many nodes support `supplementalGroupsPolicy: true`.
1061
+
to see how many nodes support `supplementalGroupsPolicy: true` before using `Strict` policy.
1062
+
1063
+
To mitigate this risk, you can also manage node labels and pods' `nodeSelector`/`nodeAffinity` to
1064
+
ensure pods with the `Strict` policy are scheduled to the nodes which support the SupplementalGroupsPolicy feature.
1065
+
Please see ["Are there any missing metrics that would be useful to have to improve observability of this feature?"](#are-there-any-missing-metrics-that-would-be-useful-to-have-to-improve-observability-of-this-feature) section.
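
As a complement to the `jq` command above, the count of supporting nodes can be computed from the same `kubectl get nodes -o json` output. This is a minimal Python sketch; `count_supporting_nodes` is a hypothetical helper, and it assumes the node list JSON has already been captured into a string.

```python
import json


def count_supporting_nodes(node_list_json: str) -> tuple:
    """Return (supported, total) from `kubectl get nodes -o json` output."""
    items = json.loads(node_list_json).get("items", [])
    supported = 0
    for node in items:
        features = (node.get("status") or {}).get("features") or {}
        # Only an explicit `true` counts; a missing field means "not supported".
        if features.get("supplementalGroupsPolicy") is True:
            supported += 1
    return supported, len(items)
```

If the first number is much smaller than the second, pods with the `Strict` policy are more likely to land on a node that cannot honor it unless node labels and `nodeSelector`/`nodeAffinity` are in place.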
999
1066
1000
1067
###### Will enabling / using this feature result in any new API calls?