keps/sig-storage/1710-selinux-relabeling/README.md
9 additions & 4 deletions
@@ -196,8 +196,11 @@ This KEP changes behavior of Kubernetes when two pods with different SELinux con
196
196
Suppose PodA with SELinux context X is running, PodB with SELinux context Y is about to start on the same node, and both use the same volume.
197
197
198
198
* *Before this KEP*: PodA suddenly starts getting "permission denied" errors when accessing files on the volume, because the container runtime re-labeled all files on it with label Y when starting pod B. PodB will start just fine and can access the volume.
199
-
* *As proposed in this KEP*: PodB won't even start, because the volume is already mounted with `-o context=X`. When kubelet tries to mount the same volume with `-o context=Y`, this mount fails. Pod B will be `ContainerCreating` until Pod A is deleted and its volumes unmounted.
200
-
* The exact error message will depend on the CSI driver. If it uses `/bin/mount`, it will likely show a generic message like `mount: wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error`. `/bin/mount` / kernel is not able to tell which mount option is wrong.
199
+
* *As proposed in this KEP*: PodB won't even start, because the volume is already mounted with `-o context=X`.
200
+
Since kubelet tracks SELinux contexts of all mounts it manages, it will see that a new pod wants to use an already mounted volume with a different context, and it will fail with a message like `volume X is already used by pod Y with another SELinux context`.
201
+
Note that this check does not cover mounts of the volume made by something other than kubelet.
202
+
In that case, kubelet will pass `-o context=X` to the CSI driver, the driver will pass it to the kernel, and the kernel will fail with a generic `mount: wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error`.
203
+
`/bin/mount` / kernel is not able to tell which mount option is wrong.
201
204
202
205
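The kubelet-side check described above can be illustrated with a minimal Go sketch. It is not kubelet code: `mountTracker`, `mountRecord` and `checkAndRecord` are hypothetical names, and the tracker only covers mounts it recorded itself, which mirrors the limitation noted for mounts made outside kubelet.

```go
package main

import "fmt"

// mountRecord is a hypothetical record of one volume mounted by the tracker's owner.
type mountRecord struct {
	pod     string // pod that first caused the volume to be mounted
	context string // SELinux context used as `-o context=...`
}

// mountTracker is a hypothetical, simplified stand-in for the state kubelet
// keeps about the mounts it manages. Mounts made by anything else are unknown to it.
type mountTracker struct {
	mounts map[string]mountRecord // keyed by volume name
}

func newMountTracker() *mountTracker {
	return &mountTracker{mounts: make(map[string]mountRecord)}
}

// checkAndRecord fails when the volume is already mounted with a different
// SELinux context. Otherwise it records the mount (if new) and returns nil.
func (t *mountTracker) checkAndRecord(volume, pod, context string) error {
	if rec, ok := t.mounts[volume]; ok {
		if rec.context != context {
			return fmt.Errorf("volume %s is already used by pod %s with another SELinux context", volume, rec.pod)
		}
		return nil // same context, the existing mount can be shared
	}
	t.mounts[volume] = mountRecord{pod: pod, context: context}
	return nil
}

func main() {
	t := newMountTracker()
	fmt.Println(t.checkAndRecord("vol1", "PodA", "X")) // PodA mounts the volume first
	fmt.Println(t.checkAndRecord("vol1", "PodB", "Y")) // PodB asks for a different context
}
```

In this sketch the second call returns an error naming PodA, so the conflict is reported before any mount is attempted and the user never sees the generic `mount` failure.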
A special case of the previous example is when two pods with different SELinux contexts use the same volume, but different subpaths of it.
203
206
The container runtime then re-labels only these subpaths and as long as the subpaths are different, both pods can run today.
@@ -320,10 +323,12 @@ Apart from the obvious API change and behavior described above, kubelet + volume
320
323
321
324
* Kubelet's VolumeManager needs to track which SELinux label a volume should get in its global mount (to call `MountDevice()` with the right mount options).
322
325
* It must call `UnmountDevice()` even when another pod wants to re-use the mounted volume, if that pod has a different SELinux context.
323
-
* While tracking SELinux labels of volumes, it can emit metrics suggested below.
324
-
* After kubelet restart, kubelet must reconstruct the original SELinux label it used to SetUp (MountDevice) each volume.
326
+
* After kubelet restart, kubelet must reconstruct the original SELinux label it used for `SetUp` and `MountDevice` of each volume.
325
327
* Volume reconstruction must be updated to get the SELinux label from the mount (in-tree volume plugins) or from a stored JSON file (CSI).
326
328
This label must be updated in VolumeManager's ActualStateOfWorld after reconstruction.
329
+
* The reconciler must also check the SELinux context used to mount a volume (both mounted devices and volumes) before deciding which operation to take on the volume (`MountVolume` or `UnmountVolume`/`UnmountDevice` or nothing).
330
+
It must report a proper error message stating that a Pod can't start because its volume is used by another Pod with a different SELinux context.
331
+
* This is a good point to capture any metrics proposed below.
327
332
* Volume plugins will get the SELinux context as a new parameter of `MountDevice` and `SetUp`/`SetupAt` calls (resp. as a new field in `DeviceMounterArgs` / `MounterArgs`).
328
333
* Each volume plugin can choose to use the mount option `-o context=` (e.g. when `CSIDriver.SELinuxRelabelPolicy` is `true`) or ignore it (e.g. in-tree volume plugins for shared filesystems or when `CSIDriver.SELinuxRelabelPolicy` is `false` or `nil`).
329
334
* Each volume plugin then returns `SupportsSELinux` from the `GetAttributes()` call, depending on whether it wants the container runtime to relabel the volume (`true`) or not (`false`; the volume was already mounted with the right label or it does not support SELinux at all).
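
The per-plugin choice in the last two bullets could look roughly like the following Go sketch. It is only an illustration under assumptions: `mountDecision`, `decideMount` and the `fsHasLabels` parameter are hypothetical and not part of any Kubernetes API. `relabelPolicy` stands in for `CSIDriver.SELinuxRelabelPolicy`, with `nil` meaning unset.

```go
package main

import "fmt"

// mountDecision is a hypothetical summary of how a volume plugin
// reacts to the SELinux context it receives.
type mountDecision struct {
	mountOptions    []string // extra options the plugin adds to the mount
	supportsSELinux bool     // value the plugin would return from GetAttributes()
}

// decideMount sketches the choice described in the list above.
// relabelPolicy mirrors CSIDriver.SELinuxRelabelPolicy (nil means unset).
// fsHasLabels is an assumed input saying whether the volume supports SELinux at all.
func decideMount(relabelPolicy *bool, seLinuxContext string, fsHasLabels bool) mountDecision {
	if !fsHasLabels {
		// No SELinux support: ignore the context, nothing for the runtime to relabel.
		return mountDecision{supportsSELinux: false}
	}
	if relabelPolicy != nil && *relabelPolicy && seLinuxContext != "" {
		// Mount with the right label up front, so the runtime must not relabel.
		return mountDecision{
			mountOptions:    []string{"context=" + seLinuxContext},
			supportsSELinux: false,
		}
	}
	// Context ignored: leave relabeling to the container runtime.
	return mountDecision{supportsSELinux: true}
}

func main() {
	enabled := true
	fmt.Printf("%+v\n", decideMount(&enabled, "X", true))
	fmt.Printf("%+v\n", decideMount(nil, "X", true))
	fmt.Printf("%+v\n", decideMount(&enabled, "X", false))
}
```

Keeping the decision in one pure function makes the relationship explicit: whenever the plugin adds `context=` itself, it reports `SupportsSELinux` as `false`, so the volume is never both mounted with a context and relabeled by the runtime.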