|
| 1 | +#Support "no new privileges" in Kubernetes |
| 2 | + |
| 3 | +##Description |
| 4 | + |
| 5 | +In Linux, the `execve` system call can grant a newly executed program more privileges than its parent process has, for example through setuid binaries or file capabilities. To close this avenue for privilege escalation, Linux kernel v3.5 added a flag named `no_new_privs` that prevents such new privileges from being granted. |
| 6 | + |
| 7 | +`no_new_privs` is inherited across `fork`, `clone` and `execve` and cannot be unset once it is set. With `no_new_privs` set, `execve` promises not to grant the privilege to do anything that could not have been done without the `execve` call. |
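
The flag described above is set per thread through the `prctl` system call. The following is a minimal, Linux-only sketch of setting it and reading it back from Go. The helper name `enableNoNewPrivs` is hypothetical, and the numeric constants are the `PR_SET_NO_NEW_PRIVS` and `PR_GET_NO_NEW_PRIVS` values from `<linux/prctl.h>`; this is an illustration of the kernel interface, not code taken from Docker or Kubernetes.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
)

const (
	prSetNoNewPrivs = 38 // PR_SET_NO_NEW_PRIVS in <linux/prctl.h>
	prGetNoNewPrivs = 39 // PR_GET_NO_NEW_PRIVS in <linux/prctl.h>
)

// enableNoNewPrivs (hypothetical helper) sets no_new_privs on the calling
// OS thread and reports whether the kernel now shows the flag as set.
func enableNoNewPrivs() (bool, error) {
	// The flag is a per-thread attribute, so pin the goroutine to one
	// OS thread for the duration of the set/get pair.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	if _, _, errno := syscall.RawSyscall6(syscall.SYS_PRCTL, prSetNoNewPrivs, 1, 0, 0, 0, 0); errno != 0 {
		return false, errno
	}
	r, _, errno := syscall.RawSyscall6(syscall.SYS_PRCTL, prGetNoNewPrivs, 0, 0, 0, 0, 0)
	if errno != 0 {
		return false, errno
	}
	return r == 1, nil
}

func main() {
	set, err := enableNoNewPrivs()
	if err != nil {
		fmt.Println("prctl failed:", err)
		return
	}
	fmt.Println("no_new_privs set:", set)
}
```

Because the flag cannot be cleared, any program started later from that thread via `execve` runs with it as well.
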
| 8 | + |
| 9 | +Docker has supported the `no_new_privs` option since 1.11. The Docker community tracked this work in [this ticket](https://github.com/docker/docker/issues/20329). |
| 10 | + |
| 11 | +We want to support creating containers with `no_new_privs` enabled in Kubernetes, which will make Kubernetes clusters more secure. This proposal is tracked in [this ticket](https://github.com/kubernetes/kubernetes/issues/38417) in the Kubernetes community. |
| 12 | + |
| 13 | + |
| 14 | +##Current implementation |
| 15 | + |
| 16 | +###Support in Docker |
| 17 | + |
| 18 | +Since Docker 1.11, users can pass `--security-opt` to enable `no_new_privs` when creating containers, e.g. `docker run --security-opt=no-new-privileges busybox`. |
| 19 | + |
| 20 | +For programmatic clients, Docker provides an object named `ContainerCreateConfig`, defined in package `github.com/docker/engine-api/types`, to configure container creation parameters. This object contains a string array, `HostConfig.SecurityOpt`, that holds the security options. Clients can use this field to pass security option arguments when creating new containers. |
| 21 | + |
| 22 | +###SecurityContext in Kubernetes |
| 23 | + |
| 24 | +Kubernetes defines `v1.SecurityContext` for `v1.Container` and `v1.PodSecurityContext` for `v1.PodSpec`. These `SecurityContext` objects define the security-related options for Kubernetes containers, e.g. SELinux options. |
| 25 | + |
| 26 | +When creating a container, kubelet parses the security context object and formats the security option strings for Docker. These strings are then inserted into `ContainerCreateConfig.HostConfig.SecurityOpt` and passed to Docker. The Kubernetes runtimes currently use different methods to parse and format the security option strings: |
| 27 | +* method `#getSecurityOpts` in `docker_manager_xxx.go` for the Docker runtime |
| 28 | +* method `#getContainerSecurityOpts` in `docker_container.go` for CRI |
| 29 | + |
| 30 | + |
| 31 | +##Proposal to support "no new privileges" |
| 32 | + |
| 33 | +To support the "no new privileges" option in Kubernetes, we propose the following changes: |
| 34 | + |
| 35 | +###Changes of SecurityContext objects |
| 36 | + |
| 37 | +Add a new field named `noNewPrivileges` to both the `v1.SecurityContext` definition and the `v1.PodSecurityContext` definition. `noNewPrivileges` is of bool type and defaults to `false`. Setting it to `true` means that the user wants the container (or pod) to be created with the `no-new-privileges` option enabled. |
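
The shape of the proposed field can be sketched as follows. The `SecurityContext` type here is a simplified local stand-in, not the real `v1.SecurityContext`, and the `omitempty` JSON tag and the `parseSecurityContext` helper are assumptions made for illustration. The sketch only shows that an omitted `noNewPrivileges` field decodes to the `false` default.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SecurityContext is a simplified stand-in for v1.SecurityContext.
// Only the proposed field is shown; the JSON tag is an assumption.
type SecurityContext struct {
	NoNewPrivileges bool `json:"noNewPrivileges,omitempty"`
}

// parseSecurityContext (hypothetical helper) decodes a JSON security
// context. A missing noNewPrivileges key leaves the field at false.
func parseSecurityContext(data string) (SecurityContext, error) {
	var sc SecurityContext
	err := json.Unmarshal([]byte(data), &sc)
	return sc, err
}

func main() {
	for _, in := range []string{`{}`, `{"noNewPrivileges": true}`} {
		sc, err := parseSecurityContext(in)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%s -> NoNewPrivileges=%v\n", in, sc.NoNewPrivileges)
	}
}
```
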
| 38 | + |
| 39 | +Changing the security context API objects also requires updating the corresponding Kubernetes documentation; a separate PR will be submitted to track this. |
| 40 | + |
| 41 | +###Changes of docker runtime |
| 42 | + |
| 43 | +When parsing the new `SecurityContext` object, kubelet has to handle the `noNewPrivileges` field from the security context objects. The `#getSecurityOpts` method in `docker_manager_xxx.go` needs to be changed so that, when `noNewPrivileges` is `true`, it adds the `no-new-privileges` option to `ContainerCreateConfig.HostConfig.SecurityOpt`. |
| 44 | + |
| 45 | +###Changes of CRI runtime |
| 46 | + |
| 47 | +When parsing the new `SecurityContext` object, kubelet has to handle the `noNewPrivileges` field from the security context objects. The `#getContainerSecurityOpts` method in `docker_container.go` needs to be changed so that, when `noNewPrivileges` is `true`, it adds the `no-new-privileges` option to `ContainerCreateConfig.HostConfig.SecurityOpt`. |
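
Both runtime changes come down to the same step: appending one string to the security option list when the flag is set. A minimal sketch of that step follows, assuming a hypothetical helper named `withNoNewPrivileges`. The real `#getSecurityOpts` and `#getContainerSecurityOpts` methods have their own signatures, which are not reproduced here, and the duplicate check is an added precaution rather than something the proposal specifies.

```go
package main

import "fmt"

// noNewPrivilegesOpt is the option string Docker accepts via --security-opt.
const noNewPrivilegesOpt = "no-new-privileges"

// withNoNewPrivileges (hypothetical helper) returns a copy of securityOpts
// that contains the no-new-privileges option when noNewPrivileges is true.
// The input slice is never modified.
func withNoNewPrivileges(securityOpts []string, noNewPrivileges bool) []string {
	out := append([]string(nil), securityOpts...)
	if !noNewPrivileges {
		return out
	}
	for _, opt := range out {
		if opt == noNewPrivilegesOpt {
			return out // already present, avoid a duplicate entry
		}
	}
	return append(out, noNewPrivilegesOpt)
}

func main() {
	existing := []string{"label=disable"}
	fmt.Println(withNoNewPrivileges(existing, false))
	fmt.Println(withNoNewPrivileges(existing, true))
}
```

The resulting slice is what would be assigned to `HostConfig.SecurityOpt` before the create request is sent to Docker.
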
| 48 | + |
| 49 | +###Changes of kubectl |
| 50 | + |
| 51 | +This is an additional proposal for kubectl. To improve the kubectl user experience, we can add a new flag named `--security-opt` to kubectl. This flag lets users create a pod with security options configured when using the `kubectl run` command. For example, if a user runs `kubectl run busybox --image=busybox --security-opt=no-new-privileges -- top`, Kubernetes creates a pod with `noNewPrivileges` enabled. |
| 52 | + |
| 53 | +If the proposed kubectl change is accepted, the patch can be submitted as a separate PR. |
| 54 | + |