Commit 73b7b14

Merge pull request #46921 from everpeace/blog-KEP-3619-SupplementalGroupsPolicy
blog post for KEP-3619: Fine-grained SupplementalGroups control
2 parents 440faa5 + fee94ce commit 73b7b14

3 files changed: +194 -0 lines changed

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
apiVersion: v1
kind: Pod
metadata:
  name: implicit-groups
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    supplementalGroups: [4000]
  containers:
  - name: ctr
    image: registry.k8s.io/e2e-test-images/agnhost:2.45
    command: [ "sh", "-c", "sleep 1h" ]
    securityContext:
      allowPrivilegeEscalation: false
Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,163 @@
---
layout: blog
title: 'Kubernetes 1.31: Fine-grained SupplementalGroups control'
date: 2024-08-22
slug: fine-grained-supplementalgroups-control
author: >
  Shingo Omura (Woven By Toyota)

---

This blog post discusses a new feature in Kubernetes 1.31 that improves the handling of supplementary groups in containers within Pods.

## Motivation: Implicit group memberships defined in `/etc/group` in the container image

Although this behavior may not be well known among Kubernetes cluster users and administrators, Kubernetes, by default, _merges_ group information from the Pod with information defined in `/etc/group` in the container image.

Let's look at an example. The Pod below specifies `runAsUser=1000`, `runAsGroup=3000`, and `supplementalGroups=4000` in the Pod's security context.

{{% code_sample file="implicit-groups.yaml" %}}

What is the result of the `id` command in the `ctr` container?

```console
# Create the Pod:
$ kubectl apply -f https://k8s.io/blog/2024-08-22-Fine-grained-SupplementalGroups-control/implicit-groups.yaml

# Verify that the Pod's Container is running:
$ kubectl get pod implicit-groups

# Check the id command:
$ kubectl exec implicit-groups -- id
```

The output should be similar to this:

```none
uid=1000 gid=3000 groups=3000,4000,50000
```

Where does group ID `50000` in the supplementary groups (the `groups` field) come from, even though `50000` is not defined anywhere in the Pod's manifest? The answer is the `/etc/group` file in the container image.

Checking the contents of `/etc/group` in the container image should show the following:

```console
$ kubectl exec implicit-groups -- cat /etc/group
...
user-defined-in-image:x:1000:
group-defined-in-image:x:50000:user-defined-in-image
```

Aha! The container's primary user `1000` belongs to the group `50000` in the last entry.

Thus, the group membership defined in `/etc/group` in the container image for the container's primary user is _implicitly_ merged with the information from the Pod. Please note that this was a design decision that the current CRI implementations inherited from Docker, and the community never really reconsidered it until now.
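
The lookup described above can be sketched in a few lines of Python. This is only a simplified model, not the code that any CRI runtime actually runs: the helper name `implicit_groups` is hypothetical, and it assumes that uid `1000` maps to the user name `user-defined-in-image` (that mapping would come from `/etc/passwd` in the image, which is not shown in this post).

```python
def implicit_groups(etc_group: str, user_name: str) -> list[int]:
    """Return the gids of all /etc/group entries that list user_name as a member.

    Hypothetical helper for illustration; each entry has the form
    name:password:gid:member1,member2,...
    """
    gids = []
    for line in etc_group.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 4:
            continue  # skip malformed entries
        members = [m for m in fields[3].split(",") if m]
        if user_name in members:
            gids.append(int(fields[2]))
    return gids


# The two entries shown in the /etc/group output above.
ETC_GROUP = """\
user-defined-in-image:x:1000:
group-defined-in-image:x:50000:user-defined-in-image
"""

# Assumption: uid 1000 is named "user-defined-in-image" in /etc/passwd.
print(implicit_groups(ETC_GROUP, "user-defined-in-image"))  # [50000]
```

The first entry has an empty member list, so only gid `50000` is picked up, which matches the extra group in the `id` output.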

### What's wrong with it?

The _implicitly_ merged group information from `/etc/group` in the container image may cause concerns, particularly when accessing volumes (see [kubernetes/kubernetes#112879](https://issue.k8s.io/112879) for details), because file permissions are controlled by uid/gid in Linux. Even worse, the implicit gids from `/etc/group` cannot be detected or validated by any policy engine, because the manifest contains no trace of the implicit group information. This can also be a concern for Kubernetes security.

## Fine-grained SupplementalGroups control in a Pod: `SupplementalGroupsPolicy`

To tackle the above problem, Kubernetes 1.31 introduces a new field, `supplementalGroupsPolicy`, in the Pod's `.spec.securityContext`.

This field provides a way to control how supplementary groups are calculated for the container processes in a Pod. The available policies are:

* _Merge_: The group membership defined in `/etc/group` for the container's primary user will be merged. If no policy is specified, this policy is applied (that is, the existing behavior is kept for backward compatibility).

* _Strict_: Only the group IDs specified in the `fsGroup`, `supplementalGroups`, or `runAsGroup` fields are attached as the supplementary groups of the container processes. This means no group membership defined in `/etc/group` for the container's primary user will be merged.
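
The difference between the two policies can be modeled as a small Python function. This is a sketch under stated assumptions rather than the runtime's implementation: the function name `supplementary_groups` and its parameters are hypothetical, and `image_groups` stands for the gids that `/etc/group` in the image assigns to the container's primary user.

```python
def supplementary_groups(policy, run_as_group, supplemental_groups,
                         fs_group=None, image_groups=()):
    """Model which supplementary groups a container process ends up with.

    Hypothetical helper for illustration; policy is None, "Merge" or "Strict".
    """
    # Groups that are visible in the Pod manifest.
    explicit = [run_as_group, *supplemental_groups]
    if fs_group is not None:
        explicit.append(fs_group)

    if policy is None or policy == "Merge":
        # Default behavior: groups from the image are merged in.
        candidates = explicit + list(image_groups)
    elif policy == "Strict":
        # Only groups declared in the manifest are attached.
        candidates = explicit
    else:
        raise ValueError(f"unknown policy: {policy!r}")

    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(candidates))


print(supplementary_groups(None, 3000, [4000], image_groups=[50000]))
# [3000, 4000, 50000]
print(supplementary_groups("Strict", 3000, [4000], image_groups=[50000]))
# [3000, 4000]
```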

Let's see how the `Strict` policy works.

{{% code_sample file="strict-supplementalgroups-policy.yaml" %}}

```console
# Create the Pod:
$ kubectl apply -f https://k8s.io/blog/2024-08-22-Fine-grained-SupplementalGroups-control/strict-supplementalgroups-policy.yaml

# Verify that the Pod's Container is running:
$ kubectl get pod strict-supplementalgroups-policy

# Check the process identity:
$ kubectl exec -it strict-supplementalgroups-policy -- id
```

The output should be similar to this:

```none
uid=1000 gid=3000 groups=3000,4000
```

You can see that the `Strict` policy excludes group `50000` from `groups`!

Thus, ensuring `supplementalGroupsPolicy: Strict` (enforced by some policy mechanism) helps prevent implicit supplementary groups in a Pod.

{{<note>}}
Actually, this is not enough, because a container with sufficient privileges or capabilities can change its process identity. Please see the following section for details.
{{</note>}}
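
As a rough illustration of what such a policy check looks at, the sketch below inspects a Pod manifest that has already been parsed into a Python dictionary. The helper `requires_strict_policy` is hypothetical; a real cluster would typically enforce this through an admission-time policy engine rather than a script like this.

```python
def requires_strict_policy(pod: dict) -> list[str]:
    """Return a list of violations; an empty list means the Pod passes.

    Hypothetical check for illustration only.
    """
    security_context = pod.get("spec", {}).get("securityContext") or {}
    policy = security_context.get("supplementalGroupsPolicy")
    if policy != "Strict":
        # An unset field behaves like "Merge", so it is rejected as well.
        return [f"supplementalGroupsPolicy must be Strict, got {policy!r}"]
    return []


# Pod-level security contexts mirroring the two manifests in this post.
implicit_pod = {"spec": {"securityContext": {
    "runAsUser": 1000, "runAsGroup": 3000, "supplementalGroups": [4000]}}}
strict_pod = {"spec": {"securityContext": {
    "runAsUser": 1000, "runAsGroup": 3000, "supplementalGroups": [4000],
    "supplementalGroupsPolicy": "Strict"}}}

print(requires_strict_policy(implicit_pod))
print(requires_strict_policy(strict_pod))
```

The first Pod is flagged because the unset field falls back to `Merge`; the second one passes.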

## Attached process identity in Pod status

This feature also exposes the process identity attached to the first container process of the container
via the `.status.containerStatuses[].user.linux` field. This is helpful for checking whether implicit group IDs are attached.

```yaml
...
status:
  containerStatuses:
  - name: ctr
    user:
      linux:
        gid: 3000
        supplementalGroups:
        - 3000
        - 4000
        uid: 1000
...
```
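
One way to use this status field is to compare the reported groups with the groups declared in the manifest. The Python sketch below does this for a Pod object parsed into a dictionary. The helper name `undeclared_groups` is hypothetical, it only considers the Pod-level security context, and the sample status (including gid `50000`) is made up for illustration.

```python
def undeclared_groups(pod: dict) -> dict[str, list[int]]:
    """Map container name to gids reported in status but absent from the spec.

    Hypothetical helper; container-level securityContext overrides are ignored.
    """
    security_context = pod.get("spec", {}).get("securityContext") or {}
    declared = set(security_context.get("supplementalGroups") or [])
    for key in ("runAsGroup", "fsGroup"):
        if security_context.get(key) is not None:
            declared.add(security_context[key])

    result = {}
    for status in pod.get("status", {}).get("containerStatuses") or []:
        linux = (status.get("user") or {}).get("linux") or {}
        extra = [gid for gid in linux.get("supplementalGroups") or []
                 if gid not in declared]
        if extra:
            result[status["name"]] = extra
    return result


# Illustrative Pod object; the status values are assumed, not captured output.
POD = {
    "spec": {"securityContext": {
        "runAsUser": 1000, "runAsGroup": 3000, "supplementalGroups": [4000]}},
    "status": {"containerStatuses": [{
        "name": "ctr",
        "user": {"linux": {
            "uid": 1000, "gid": 3000,
            "supplementalGroups": [3000, 4000, 50000]}},
    }]},
}

print(undeclared_groups(POD))  # {'ctr': [50000]}
```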

{{<note>}}
Please note that the values in the `status.containerStatuses[].user.linux` field are the process identity
_initially attached_ to the first container process in the container. If the container has sufficient privilege
to call system calls related to process identity (e.g. [`setuid(2)`](https://man7.org/linux/man-pages/man2/setuid.2.html), [`setgid(2)`](https://man7.org/linux/man-pages/man2/setgid.2.html), or [`setgroups(2)`](https://man7.org/linux/man-pages/man2/setgroups.2.html)), the container process can change its identity. Thus, the _actual_ process identity will be dynamic.
{{</note>}}

## Feature availability

To enable the `supplementalGroupsPolicy` field, the following components have to be used:

- Kubernetes: v1.31 or later, with the `SupplementalGroupsPolicy` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) enabled. As of v1.31, the gate is marked as alpha.
- CRI runtime:
  - containerd: v2.0 or later
  - CRI-O: v1.31 or later

You can see if the feature is supported in the Node's `.status.features.supplementalGroupsPolicy` field.

```yaml
apiVersion: v1
kind: Node
...
status:
  features:
    supplementalGroupsPolicy: true
```
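
If you have the Node objects available as parsed JSON or YAML, filtering for nodes that report support is straightforward. The sketch below assumes a list of Node dictionaries; the helper `nodes_supporting_policy` and the node names are hypothetical.

```python
def nodes_supporting_policy(nodes: list[dict]) -> list[str]:
    """Return names of nodes whose status reports supplementalGroupsPolicy support.

    Hypothetical helper for illustration.
    """
    supported = []
    for node in nodes:
        features = node.get("status", {}).get("features") or {}
        # A missing field is treated as "not supported".
        if features.get("supplementalGroupsPolicy") is True:
            supported.append(node["metadata"]["name"])
    return supported


# Made-up Node objects for illustration.
NODES = [
    {"metadata": {"name": "node-a"},
     "status": {"features": {"supplementalGroupsPolicy": True}}},
    {"metadata": {"name": "node-b"}, "status": {}},
]

print(nodes_supporting_policy(NODES))  # ['node-a']
```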

## What's next?

Kubernetes SIG Node hopes, and expects, that the feature will be promoted to beta and eventually
to general availability (GA) in future releases of Kubernetes, so that users no longer need to enable
the feature gate manually.

The `Merge` policy is applied when `supplementalGroupsPolicy` is not specified, for backward compatibility.

## How can I learn more?

<!-- https://github.com/kubernetes/website/pull/46920 -->
Please check out the [documentation](/docs/tasks/configure-pod-container/security-context/)
for further details of `supplementalGroupsPolicy`.

## How to get involved?

This feature is driven by the SIG Node community. Please join us to connect with
the community and share your ideas and feedback around the above feature and
beyond. We look forward to hearing from you!
@@ -0,0 +1,16 @@
apiVersion: v1
kind: Pod
metadata:
  name: strict-supplementalgroups-policy
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    supplementalGroups: [4000]
    supplementalGroupsPolicy: Strict
  containers:
  - name: ctr
    image: registry.k8s.io/e2e-test-images/agnhost:2.45
    command: [ "sh", "-c", "sleep 1h" ]
    securityContext:
      allowPrivilegeEscalation: false
