Commit 45b62d8

Merge pull request #48515 from jsafrane/selinux-1.32
Document SELinuxChangePolicy and SELinuxMount
2 parents 3876340 + 8e17234 commit 45b62d8

3 files changed

+73
-6
lines changed

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
1+
---
2+
title: SELinuxChangePolicy
3+
content_type: feature_gate
4+
_build:
5+
list: never
6+
render: false
7+
8+
stages:
9+
- stage: alpha
10+
defaultValue: false
11+
fromVersion: "1.32"
12+
---
13+
Enables `spec.securityContext.seLinuxChangePolicy` field.
14+
This field can be used to opt-out from applying the SELinux label to the pod
15+
volumes using mount options. This is required when a single volume that supports
16+
mounting with SELinux mount option is shared between Pods that have different
17+
SELinux labels, such as privileged and unprivileged Pods.
18+
19+
Enabling the `SELinuxChangePolicy` feature gate requires the feature gate `SELinuxMountReadWriteOncePod` to
20+
be enabled.

content/en/docs/reference/command-line-tools-reference/feature-gates/selinux-mount.md

Lines changed: 2 additions & 2 deletions
@@ -16,5 +16,5 @@ recursively.
1616
It widens the performance improvements behind the `SELinuxMountReadWriteOncePod`
1717
feature gate by extending the implementation to all volumes.
1818

19-
Enabling the `SELinuxMount` feature gate requires the feature gate `SELinuxMountReadWriteOncePod` to
20-
be enabled.
19+
Enabling the `SELinuxMount` feature gate requires the feature gates `SELinuxMountReadWriteOncePod`
20+
and `SELinuxChangePolicy` to be enabled.

content/en/docs/tasks/configure-pod-container/security-context.md

Lines changed: 51 additions & 4 deletions
@@ -677,8 +677,8 @@ To assign SELinux labels, the SELinux security module must be loaded on the host
677677
Kubernetes v1.27 introduced an early limited form of this behavior that was only applicable
678678
to volumes (and PersistentVolumeClaims) using the `ReadWriteOncePod` access mode.
679679

680-
As an alpha feature, you can enable the `SELinuxMount`
681-
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to widen that
680+
As an alpha feature, you can enable the `SELinuxMount` and `SELinuxChangePolicy`
681+
[feature gates](/docs/reference/command-line-tools-reference/feature-gates/) to widen that
682682
performance improvement to other kinds of PersistentVolumeClaims, as explained in detail
683683
below.
684684
{{< /note >}}
@@ -694,7 +694,9 @@ To benefit from this speedup, all these conditions must be met:
694694
and `SELinuxMountReadWriteOncePod` must be enabled.
695695
* Pod must use PersistentVolumeClaim with applicable `accessModes` and [feature gates](/docs/reference/command-line-tools-reference/feature-gates/):
696696
* Either the volume has `accessModes: ["ReadWriteOncePod"]`, and feature gate `SELinuxMountReadWriteOncePod` is enabled.
697-
* Or the volume can use any other access modes and both feature gates `SELinuxMountReadWriteOncePod` and `SELinuxMount` must be enabled.
697+
* Or the volume can use any other access modes and all three feature gates
698+
`SELinuxMountReadWriteOncePod`, `SELinuxChangePolicy` and `SELinuxMount` must be enabled
699+
and the Pod has `spec.securityContext.seLinuxChangePolicy` either nil (default) or `MountOption`.
698700
* Pod (or all its Containers that use the PersistentVolumeClaim) must
699701
have `seLinuxOptions` set.
700702
* The corresponding PersistentVolume must be either:
@@ -706,7 +708,52 @@ To benefit from this speedup, all these conditions must be met:
706708
For any other volume types, SELinux relabelling happens another way: the container
707709
runtime recursively changes the SELinux label for all inodes (files and directories)
708710
in the volume.
709-
The more files and directories in the volume, the longer that relabelling takes.
711+
712+
{{< feature-state feature_gate_name="SELinuxChangePolicy" >}}
713+
Pods that want to opt out of relabeling using mount options can set
714+
`spec.securityContext.seLinuxChangePolicy` to `Recursive`. This is required
715+
when multiple pods share a single volume on the same node, but they run with
716+
different SELinux labels that allow simultaneous access to the volume. For example, a privileged pod
717+
running with label `spc_t` and an unprivileged pod running with the default label `container_file_t`.
718+
With unset `spec.securityContext.seLinuxChangePolicy` (or with the default value `MountOption`),
719+
only one of such pods is able to run on a node; the other one stays in ContainerCreating with the error
720+
`conflicting SELinux labels of volume <name of the volume>: <label of the running pod> and <label of the pod that can't start>`.
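
As an illustration of the opt-out described above, here is a minimal sketch of a Pod manifest that sets `spec.securityContext.seLinuxChangePolicy` to `Recursive`. The Pod name, image, SELinux level, volume name, and claim name are hypothetical placeholders and are not part of this change.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: selinux-recursive-demo        # hypothetical name
spec:
  securityContext:
    # Opt out of mounting the volume with an SELinux mount option;
    # the container runtime relabels the volume recursively instead.
    seLinuxChangePolicy: Recursive
    seLinuxOptions:
      level: "s0:c123,c456"           # example level, adjust to your policy
  containers:
  - name: app
    image: busybox:1.36               # placeholder image
    command: ["sleep", "3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  volumes:
  - name: shared-data
    persistentVolumeClaim:
      claimName: shared-pvc           # hypothetical claim shared with another Pod
```

Leaving the field unset, or setting it to `MountOption`, keeps the mount-option behavior.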
721+
722+
#### SELinuxWarningController
723+
To make it easier to identify Pods that are affected by the change in SELinux volume relabeling,
724+
a new controller called `SELinuxWarningController` has been introduced in kube-controller-manager.
725+
It is disabled by default and can be enabled by either setting the `--controllers=*,selinux-warning-controller`
726+
[command line flag](/docs/reference/command-line-tools-reference/kube-controller-manager/),
727+
or by setting `genericControllerManagerConfiguration.controllers`
728+
[field in KubeControllerManagerConfiguration](/docs/reference/config-api/kube-controller-manager-config.v1alpha1/#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration).
729+
This controller requires the `SELinuxChangePolicy` feature gate to be enabled.
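
The command line flag approach could look roughly like the following fragment of a kube-controller-manager static Pod manifest. This is a sketch that assumes the controller manager runs as a static Pod; the surrounding manifest layout is an assumption, and only the two flags shown come from the text above and the standard `--feature-gates` flag.

```yaml
# Fragment only: other fields and flags of the real manifest are omitted.
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    # Keep the default controllers ("*") and add the SELinux warning controller.
    - --controllers=*,selinux-warning-controller
    # The controller requires this feature gate.
    - --feature-gates=SELinuxChangePolicy=true
```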
730+
731+
When enabled, the controller observes running Pods and when it detects that two Pods use the same volume
732+
with different SELinux labels:
733+
1. It emits an event to both of the Pods. `kubectl describe pod <pod-name>` then shows
734+
`SELinuxLabel "<label on the pod>" conflicts with pod <the other pod name> that uses the same volume as this pod
735+
with SELinuxLabel "<the other pod label>". If both pods land on the same node, only one of them may access the volume`.
736+
2. It raises the `selinux_warning_controller_selinux_volume_conflict` metric. The metric has both pod
737+
names + namespaces as labels to identify the affected pods easily.
738+
739+
A cluster admin can use this information to identify pods affected by the planned change and
740+
proactively opt out Pods from the optimization (i.e. set `spec.securityContext.seLinuxChangePolicy: Recursive`).
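
One possible way to surface the metric to a cluster admin is a Prometheus alerting rule. The sketch below assumes that Prometheus scrapes kube-controller-manager and that the metric behaves like a gauge that is non-zero while a conflict exists; the group name, alert name, duration, and severity are hypothetical.

```yaml
groups:
- name: selinux-volume-conflicts          # hypothetical group name
  rules:
  - alert: SELinuxVolumeConflict          # hypothetical alert name
    # Fires while the controller reports at least one conflicting Pod pair.
    expr: selinux_warning_controller_selinux_volume_conflict > 0
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: Pods share a volume with conflicting SELinux labels
```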
741+
742+
#### Feature gates
743+
744+
The following feature gates control the behavior of SELinux volume relabeling:
745+
746+
* `SELinuxMountReadWriteOncePod`: enables the optimization for volumes with `accessModes: ["ReadWriteOncePod"]`.
747+
This is a very safe feature gate to enable, because two pods can never share a single volume with
748+
this access mode. This feature gate is enabled by default since v1.28.
749+
* `SELinuxChangePolicy`: enables `spec.securityContext.seLinuxChangePolicy` field in Pod and related SELinuxWarningController
750+
in kube-controller-manager. This feature can be used before enabling `SELinuxMount` to check Pods running on a cluster,
751+
and to pro-actively opt-out Pods from the optimization.
752+
This feature gate requires `SELinuxMountReadWriteOncePod` enabled. It is alpha and disabled by default in 1.32.
753+
* `SELinuxMount`: enables the optimization for all eligible volumes. Since it can break existing workloads, we recommend
754+
enabling the `SELinuxChangePolicy` feature gate and SELinuxWarningController first to check the impact of the change.
755+
This feature gate requires `SELinuxMountReadWriteOncePod` and `SELinuxChangePolicy` enabled. It is alpha and disabled
756+
by default in 1.32.
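
As a sketch of how the three gates relate, a kubelet configuration file that enables all of them might look like this. It assumes the kubelet is configured through a `KubeletConfiguration` file; the same gates would also need to be enabled on the other control plane components that use them, which is not shown here.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  SELinuxMountReadWriteOncePod: true   # prerequisite for the other two gates
  SELinuxChangePolicy: true            # enables the seLinuxChangePolicy field
  SELinuxMount: true                   # requires both gates above
```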
710757

711758
## Managing access to the `/proc` filesystem {#proc-access}
712759
