@@ -7,7 +7,7 @@ min-kubernetes-server-version: v1.25
---

<!-- overview -->
- {{< feature-state for_k8s_version="v1.25" state="alpha" >}}
+ {{< feature-state for_k8s_version="v1.30" state="beta" >}}

This page explains how user namespaces are used in Kubernetes pods. A user
namespace isolates the user running inside the container from the one
@@ -46,7 +46,26 @@ tmpfs, Secrets use a tmpfs, etc.)
Some popular filesystems that support idmap mounts in Linux 6.3 are: btrfs,
ext4, xfs, fat, tmpfs, overlayfs.

- In addition, support is needed in the
+ In addition, the container runtime and its underlying OCI runtime must support
+ user namespaces. The following OCI runtimes offer support:
+
+ * [crun](https://github.com/containers/crun) version 1.9 or greater (version 1.13+ is recommended).
+
+ <!-- ideally, update this if a newer minor release of runc comes out, whether or not it includes the idmap support -->
+ {{< note >}}
+ Many OCI runtimes do not include the support needed for using user namespaces in
+ Linux pods. If you use a managed Kubernetes, or have downloaded it from packages
+ and set it up, it's likely that nodes in your cluster use a runtime that doesn't
+ include this support. For example, the most widely used OCI runtime is `runc`,
+ and version `1.1.z` of runc doesn't support all the features needed by the
+ Kubernetes implementation of user namespaces.
+
+ If there is a newer release of runc than 1.1 available for use, check its
+ documentation and release notes for compatibility (look for idmap mounts support
+ in particular, because that is the missing feature).
+ {{< /note >}}
+
+ You also need a CRI
{{< glossary_tooltip text="container runtime" term_id="container-runtime" >}}
to use this feature with Kubernetes pods:

@@ -137,20 +156,67 @@ use, see `man 7 user_namespaces`.

## Set up a node to support user namespaces

- It is recommended that the host's files and host's processes use UIDs/GIDs in
- the range of 0-65535.
+ By default, the kubelet assigns pods UIDs/GIDs above the range 0-65535, based on
+ the assumption that the host's files and processes use UIDs/GIDs within this
+ range, which is standard for most Linux distributions. This approach prevents
+ any overlap between the UIDs/GIDs of the host and those of the pods.
+
+ Avoiding the overlap is important to mitigate the impact of vulnerabilities such
+ as [CVE-2021-25741][CVE-2021-25741], where a pod can potentially read arbitrary
+ files in the host. If the UIDs/GIDs of the pod and the host don't overlap, what
+ a pod can do is limited: the pod UID/GID won't match the host's file
+ owner/group.
+
+ The kubelet can use a custom range for user IDs and group IDs for pods. To
+ configure a custom range, the node needs to have:
+
+ * A user `kubelet` in the system (you cannot use any other username here)
+ * The binary `getsubids` installed (part of [shadow-utils][shadow-utils]) and
+ in the `PATH` for the kubelet binary.
+ * A configuration of subordinate UIDs/GIDs for the `kubelet` user (see
+ [`man 5 subuid`](https://man7.org/linux/man-pages/man5/subuid.5.html) and
+ [`man 5 subgid`](https://man7.org/linux/man-pages/man5/subgid.5.html)).
+
+ This setting only gathers the UID/GID range configuration and does not change
+ the user executing the `kubelet`.
+
+ You must follow some constraints for the subordinate ID range that you assign
+ to the `kubelet` user:
+
+ * The subordinate user ID that starts the UID range for Pods **must** be a
+ multiple of 65536 and must also be greater than or equal to 65536. In other
+ words, you cannot use any ID from the range 0-65535 for Pods; the kubelet
+ imposes this restriction to make it difficult to create an accidentally insecure
+ configuration.
+
+ * The subordinate ID count must be a multiple of 65536.
+
+ * The subordinate ID count must be at least `65536 x <maxPods>` where `<maxPods>`
+ is the maximum number of pods that can run on the node.
+
+ * You must assign the same range for both user IDs and for group IDs. It doesn't
+ matter if other users have user ID ranges that don't align with the group ID
+ ranges.
+
+ * None of the assigned ranges should overlap with any other assignment.
+
+ * The subordinate configuration must be only one line. In other words, you can't
+ have multiple ranges.

- The kubelet will assign UIDs/GIDs higher than that to pods. Therefore, to
- guarantee as much isolation as possible, the UIDs/GIDs used by the host's files
- and host's processes should be in the range 0-65535.
+ For example, you could define `/etc/subuid` and `/etc/subgid` to both have
+ this entry for the `kubelet` user:

- Note that this recommendation is important to mitigate the impact of CVEs like
- [CVE-2021-25741][CVE-2021-25741], where a pod can potentially read arbitrary
- files in the hosts. If the UIDs/GIDs of the pod and the host don't overlap, it
- is limited what a pod would be able to do: the pod UID/GID won't match the
- host's file owner/group.
+ ```
+ # The format is
+ # name:firstID:count of IDs
+ # where
+ # - firstID is 65536 (the minimum value possible)
+ # - count of IDs is 110 (default limit for number of pods) * 65536
+ kubelet:65536:7208960
+ ```

[CVE-2021-25741]: https://github.com/kubernetes/kubernetes/issues/104980
+ [shadow-utils]: https://github.com/shadow-maint/shadow

## Integration with Pod security admission checks

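The subordinate ID constraints added in the last hunk of this diff are easy to get wrong by hand, so a small check may help when reviewing a node's configuration. The following Python sketch is illustrative only: the helper names (`required_count`, `check_entry`, `same_range`) are hypothetical, it assumes an entry in the `name:firstID:count` format shown in the diff, and it does not reproduce the kubelet's own validation, which reads the range through `getsubids`.

```python
# Illustrative check of one subordinate ID entry against the constraints
# listed in the diff. All helper names here are hypothetical; the kubelet
# performs its own validation and this sketch does not replace it.

IDS_PER_POD = 65536  # IDs budgeted per pod; also the required alignment unit


def required_count(max_pods):
    """Minimum subordinate ID count for a node running up to max_pods pods."""
    return IDS_PER_POD * max_pods


def check_entry(line, max_pods=110):
    """Return the list of violated constraints for one name:firstID:count line."""
    name, first, count = line.strip().split(":")
    first_id, id_count = int(first), int(count)
    problems = []
    if name != "kubelet":
        problems.append("entry must belong to the kubelet user")
    if first_id < IDS_PER_POD:
        problems.append("first ID must be at least 65536")
    if first_id % IDS_PER_POD != 0:
        problems.append("first ID must be a multiple of 65536")
    if id_count % IDS_PER_POD != 0:
        problems.append("ID count must be a multiple of 65536")
    if id_count < required_count(max_pods):
        problems.append("ID count must be at least 65536 x maxPods")
    return problems


def same_range(subuid_line, subgid_line):
    """The kubelet user needs the same range in /etc/subuid and /etc/subgid."""
    return subuid_line.strip() == subgid_line.strip()


if __name__ == "__main__":
    entry = "kubelet:65536:7208960"
    print(required_count(110))       # minimum count for the default of 110 pods
    print(check_entry(entry))        # an empty list means no violations found
    print(same_range(entry, entry))  # compare the subuid and subgid lines
```

The check for overlap with other users' ranges and the single-line requirement are left out, since both need the full contents of `/etc/subuid` and `/etc/subgid` rather than a single entry.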