Commit bc4ed83

engine/security/rootless: split to multiple pages
This commit only splits the page. The content will be updated in subsequent commits.

Signed-off-by: Akihiro Suda <[email protected]>
1 parent 3cd14d8 commit bc4ed83

3 files changed

+336
-323
lines changed

Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
1+
---
2+
description: Run the Docker daemon as a non-root user (Rootless mode)
3+
keywords: security, namespaces, rootless
4+
title: Rootless mode
5+
weight: 10
6+
---
7+
8+
Rootless mode allows running the Docker daemon and containers as a non-root
9+
user to mitigate potential vulnerabilities in the daemon and
10+
the container runtime.
11+
12+
Rootless mode does not require root privileges even during the installation of
13+
the Docker daemon, as long as the [prerequisites](#prerequisites) are met.
14+
15+
## How it works
16+
17+
Rootless mode executes the Docker daemon and containers inside a user namespace.
18+
This is very similar to [`userns-remap` mode](../userns-remap.md), except that
19+
with `userns-remap` mode, the daemon itself is running with root privileges,
20+
whereas in rootless mode, both the daemon and the container are running without
21+
root privileges.
22+
23+
Rootless mode does not use binaries with `SETUID` bits or file capabilities,
24+
except `newuidmap` and `newgidmap`, which are needed to allow multiple
25+
UIDs/GIDs to be used in the user namespace.
26+
27+
28+
## Prerequisites
29+
30+
- You must install `newuidmap` and `newgidmap` on the host. These commands
31+
are provided by the `uidmap` package on most distributions.
32+
33+
- `/etc/subuid` and `/etc/subgid` should contain at least 65,536 subordinate
34+
UIDs/GIDs for the user. In the following example, the user `testuser` has
35+
65,536 subordinate UIDs/GIDs (231072-296607).
36+
37+
```console
38+
$ id -u
39+
1001
40+
$ whoami
41+
testuser
42+
$ grep ^$(whoami): /etc/subuid
43+
testuser:231072:65536
44+
$ grep ^$(whoami): /etc/subgid
45+
testuser:231072:65536
46+
```
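The subordinate ID requirement above can be checked with a short script. This is a minimal sketch and not part of Docker: the function names `subid_count` and `has_enough_subids` are hypothetical, and the sketch assumes entries are keyed by login name, so entries keyed by numeric UID are not matched.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with Docker).

# Sum the subordinate ID counts allotted to a user in an
# /etc/subuid-style file. Each entry has the form  name:start:count
subid_count() {
  local user=$1 file=$2
  awk -F: -v u="$user" '$1 == u { total += $3 } END { print total + 0 }' "$file"
}

# Succeed only if the user has at least 65,536 subordinate IDs in the file.
has_enough_subids() {
  local user=$1 file=$2
  [ "$(subid_count "$user" "$file")" -ge 65536 ]
}
```

For example, `has_enough_subids "$(whoami)" /etc/subuid && has_enough_subids "$(whoami)" /etc/subgid` succeeds only when both files grant the current user enough IDs.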
47+
48+
## Install
49+
50+
> [!NOTE]
51+
>
52+
> If the system-wide Docker daemon is already running, consider disabling it:
53+
>```console
54+
>$ sudo systemctl disable --now docker.service docker.socket
55+
>$ sudo rm /var/run/docker.sock
56+
>```
57+
> If you choose not to shut down the `docker` service and socket, you will need to use the `--force`
58+
> parameter in the next section. There are no known issues with doing so, but until you shut down and
59+
> disable the system-wide daemon, you are still running rootful Docker.
60+
61+
{{< tabs >}}
62+
{{< tab name="With packages (RPM/DEB)" >}}
63+
64+
If you installed Docker 20.10 or later with [RPM/DEB packages](/engine/install), you should have `dockerd-rootless-setuptool.sh` in `/usr/bin`.
65+
66+
Run `dockerd-rootless-setuptool.sh install` as a non-root user to set up the daemon:
67+
68+
```console
69+
$ dockerd-rootless-setuptool.sh install
70+
[INFO] Creating /home/testuser/.config/systemd/user/docker.service
71+
...
72+
[INFO] Installed docker.service successfully.
73+
[INFO] To control docker.service, run: `systemctl --user (start|stop|restart) docker.service`
74+
[INFO] To run docker.service on system startup, run: `sudo loginctl enable-linger testuser`
75+
76+
[INFO] Make sure the following environment variables are set (or add them to ~/.bashrc):
77+
78+
export PATH=/usr/bin:$PATH
79+
export DOCKER_HOST=unix:///run/user/1001/docker.sock
80+
```
81+
82+
If `dockerd-rootless-setuptool.sh` is not present, you may need to install the `docker-ce-rootless-extras` package manually, for example:
83+
84+
```console
85+
$ sudo apt-get install -y docker-ce-rootless-extras
86+
```
87+
88+
{{< /tab >}}
89+
{{< tab name="Without packages" >}}
90+
91+
If you do not have permission to run package managers like `apt-get` and `dnf`,
92+
consider using the installation script available at [https://get.docker.com/rootless](https://get.docker.com/rootless).
93+
Static packages are not available for `s390x`, so this installation method is not supported on `s390x`.
94+
95+
```console
96+
$ curl -fsSL https://get.docker.com/rootless | sh
97+
...
98+
[INFO] Creating /home/testuser/.config/systemd/user/docker.service
99+
...
100+
[INFO] Installed docker.service successfully.
101+
[INFO] To control docker.service, run: `systemctl --user (start|stop|restart) docker.service`
102+
[INFO] To run docker.service on system startup, run: `sudo loginctl enable-linger testuser`
103+
104+
[INFO] Make sure the following environment variables are set (or add them to ~/.bashrc):
105+
106+
export PATH=/home/testuser/bin:$PATH
107+
export DOCKER_HOST=unix:///run/user/1001/docker.sock
108+
```
109+
110+
The binaries are installed in `~/bin`.
111+
112+
{{< /tab >}}
113+
{{< /tabs >}}
114+
115+
If you encounter an error, see [Troubleshooting](./troubleshoot.md).
Lines changed: 182 additions & 0 deletions
@@ -0,0 +1,182 @@
1+
---
2+
description: Tips for Rootless mode
3+
keywords: security, namespaces, rootless
4+
title: Tips
5+
weight: 20
6+
---
7+
8+
## Usage
9+
10+
### Daemon
11+
12+
{{< tabs >}}
13+
{{< tab name="With systemd (Highly recommended)" >}}
14+
15+
The systemd unit file is installed as `~/.config/systemd/user/docker.service`.
16+
17+
Use `systemctl --user` to manage the lifecycle of the daemon:
18+
19+
```console
20+
$ systemctl --user start docker
21+
```
22+
23+
To launch the daemon on system startup, enable the systemd service and lingering:
24+
25+
```console
26+
$ systemctl --user enable docker
27+
$ sudo loginctl enable-linger $(whoami)
28+
```
29+
30+
Starting Rootless Docker as a system-wide systemd service (`/etc/systemd/system/docker.service`)
31+
is not supported, even with the `User=` directive.
32+
33+
{{< /tab >}}
34+
{{< tab name="Without systemd" >}}
35+
36+
To run the daemon directly without systemd, you need to run `dockerd-rootless.sh` instead of `dockerd`.
37+
38+
The following environment variables must be set:
39+
- `$HOME`: the home directory
40+
- `$XDG_RUNTIME_DIR`: an ephemeral directory that is only accessible by the expected user, e.g., `~/.docker/run`.
41+
The directory should be removed on every host shutdown.
42+
The directory can be on tmpfs, but it should not be under `/tmp`.
43+
Placing this directory under `/tmp` might expose it to a TOCTOU (time-of-check to time-of-use) attack.
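The `/tmp` restriction can be enforced before launching `dockerd-rootless.sh`. The following is a minimal sketch under stated assumptions: `is_safe_runtime_dir` is a hypothetical helper, and the check is purely lexical, so it does not resolve symlinks or verify permissions.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Docker).

# Reject a candidate $XDG_RUNTIME_DIR that is /tmp itself or lies under /tmp.
# This only inspects the path string; symlinks are not resolved.
is_safe_runtime_dir() {
  case "$1" in
    /tmp|/tmp/*) return 1 ;;
    *) return 0 ;;
  esac
}
```

One possible use is to run `is_safe_runtime_dir "$HOME/.docker/run"` first, then create the directory with `mkdir -p -m 700` and export it as `XDG_RUNTIME_DIR` before starting the daemon.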
44+
45+
{{< /tab >}}
46+
{{< /tabs >}}
47+
48+
Remarks about directory paths:
49+
50+
- The socket path is set to `$XDG_RUNTIME_DIR/docker.sock` by default.
51+
`$XDG_RUNTIME_DIR` is typically set to `/run/user/$UID`.
52+
- The data dir is set to `~/.local/share/docker` by default.
53+
The data dir should not be on NFS.
54+
- The daemon config dir is set to `~/.config/docker` by default.
55+
This directory is different from `~/.docker`, which is used by the client.
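The default socket location described above can be derived in a script. This is a minimal sketch: the function names are hypothetical, and the fallback to `/run/user/<UID>` when `$XDG_RUNTIME_DIR` is unset is an assumption based on the "typically" remark above, not guaranteed behavior.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with Docker).

# Print the default rootless socket path.
# An explicit runtime dir may be passed as $1; otherwise $XDG_RUNTIME_DIR is
# used, falling back to /run/user/<UID> (an assumption, see the lead-in).
rootless_socket_path() {
  local runtime_dir=${1:-${XDG_RUNTIME_DIR:-/run/user/$(id -u)}}
  printf '%s/docker.sock\n' "$runtime_dir"
}

# Print a value suitable for $DOCKER_HOST.
rootless_docker_host() {
  printf 'unix://%s\n' "$(rootless_socket_path "$@")"
}
```

With this in place, `export DOCKER_HOST=$(rootless_docker_host)` sets the client variable without hard-coding the UID.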
56+
57+
### Client
58+
59+
You need to specify either the socket path or the CLI context explicitly.
60+
61+
To specify the socket path using `$DOCKER_HOST`:
62+
63+
```console
64+
$ export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock
65+
$ docker run -d -p 8080:80 nginx
66+
```
67+
68+
To specify the CLI context using `docker context`:
69+
70+
```console
71+
$ docker context use rootless
72+
rootless
73+
Current context is now "rootless"
74+
$ docker run -d -p 8080:80 nginx
75+
```
76+
77+
## Best practices
78+
79+
### Rootless Docker in Docker
80+
81+
To run Rootless Docker inside "rootful" Docker, use the `docker:<version>-dind-rootless`
82+
image instead of `docker:<version>-dind`.
83+
84+
```console
85+
$ docker run -d --name dind-rootless --privileged docker:25.0-dind-rootless
86+
```
87+
88+
The `docker:<version>-dind-rootless` image runs as a non-root user (UID 1000).
89+
However, `--privileged` is required for disabling seccomp, AppArmor, and mount
90+
masks.
91+
92+
### Expose Docker API socket through TCP
93+
94+
To expose the Docker API socket through TCP, you need to launch `dockerd-rootless.sh`
95+
with `DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS="-p 0.0.0.0:2376:2376/tcp"`.
96+
97+
```console
98+
$ DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS="-p 0.0.0.0:2376:2376/tcp" \
99+
dockerd-rootless.sh \
100+
-H tcp://0.0.0.0:2376 \
101+
--tlsverify --tlscacert=ca.pem --tlscert=cert.pem --tlskey=key.pem
102+
```
103+
104+
### Expose Docker API socket through SSH
105+
106+
To expose the Docker API socket through SSH, you need to make sure `$DOCKER_HOST`
107+
is set on the remote host.
108+
109+
```console
110+
$ ssh -l <REMOTEUSER> <REMOTEHOST> 'echo $DOCKER_HOST'
111+
unix:///run/user/1001/docker.sock
112+
$ docker -H ssh://<REMOTEUSER>@<REMOTEHOST> run ...
113+
```
114+
115+
### Routing ping packets
116+
117+
On some distributions, `ping` does not work by default.
118+
119+
Add `net.ipv4.ping_group_range = 0 2147483647` to `/etc/sysctl.conf` (or
120+
`/etc/sysctl.d`) and run `sudo sysctl --system` to allow using `ping`.
121+
122+
### Exposing privileged ports
123+
124+
To expose privileged ports (< 1024), set `CAP_NET_BIND_SERVICE` on the `rootlesskit` binary and restart the daemon.
125+
126+
```console
127+
$ sudo setcap cap_net_bind_service=ep $(which rootlesskit)
128+
$ systemctl --user restart docker
129+
```
130+
131+
Alternatively, add `net.ipv4.ip_unprivileged_port_start=0` to `/etc/sysctl.conf` (or
132+
`/etc/sysctl.d`) and run `sudo sysctl --system`.
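To decide whether either step is needed for a given port, compare the port with the kernel's unprivileged port threshold. The following is a minimal sketch: both function names are hypothetical, and the fallback of 1024 when the sysctl file cannot be read is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with Docker).

# Print the first port that unprivileged processes may bind.
# Falls back to 1024 if the sysctl file is not readable (assumption).
current_unprivileged_port_start() {
  cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024
}

# Succeed if binding the port requires extra privileges,
# that is, if the port is below the threshold given as $2 (default 1024).
port_needs_privilege() {
  local port=$1 start=${2:-1024}
  [ "$port" -lt "$start" ]
}
```

For example, `port_needs_privilege 80 "$(current_unprivileged_port_start)"` succeeds on a host with the default threshold and fails once `net.ipv4.ip_unprivileged_port_start=0` is applied.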
133+
134+
### Limiting resources
135+
136+
Limiting resources with cgroup-related `docker run` flags such as `--cpus`, `--memory`, `--pids-limit`
137+
is supported only when running with cgroup v2 and systemd.
138+
See [Changing cgroup version](/manuals/engine/containers/runmetrics.md) to enable cgroup v2.
139+
140+
If `docker info` shows `none` as `Cgroup Driver`, the conditions are not satisfied.
141+
When these conditions are not satisfied, rootless mode ignores the cgroup-related `docker run` flags.
142+
See [Limiting resources without cgroup](#limiting-resources-without-cgroup) for workarounds.
143+
144+
If `docker info` shows `systemd` as `Cgroup Driver`, the conditions are satisfied.
145+
However, typically, only `memory` and `pids` controllers are delegated to non-root users by default.
146+
147+
```console
148+
$ cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers
149+
memory pids
150+
```
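To see which controllers are not yet delegated, the contents of `cgroup.controllers` can be compared with the desired set. This is a minimal sketch: `missing_controllers` is a hypothetical helper, and the default wanted list simply mirrors the controllers named in this section.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Docker).

# Print the controllers from the wanted list ($2, space-separated) that do
# not appear in the delegated list ($1, space-separated).
missing_controllers() {
  local delegated=" $1 " wanted=${2:-"cpu cpuset io memory pids"} missing="" c
  for c in $wanted; do
    case "$delegated" in
      *" $c "*) ;;                               # already delegated
      *) missing="${missing:+$missing }$c" ;;    # collect the missing one
    esac
  done
  printf '%s\n' "$missing"
}
```

Passing the output of the `cat` command above as the first argument prints the controllers that still need delegation, or an empty line when none are missing.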
151+
152+
To allow delegation of all controllers, you need to change the systemd configuration as follows:
153+
154+
```console
155+
# mkdir -p /etc/systemd/system/user@.service.d
156+
# cat > /etc/systemd/system/user@.service.d/delegate.conf << EOF
157+
[Service]
158+
Delegate=cpu cpuset io memory pids
159+
EOF
160+
# systemctl daemon-reload
161+
```
162+
163+
> [!NOTE]
164+
>
165+
> Delegating `cpuset` requires systemd 244 or later.
166+
167+
#### Limiting resources without cgroup
168+
169+
Even when cgroup is not available, you can still use the traditional `ulimit` and [`cpulimit`](https://github.com/opsengine/cpulimit),
170+
though they work at process granularity rather than container granularity,
171+
and the container process can disable them at will.
172+
173+
For example:
174+
175+
- To limit CPU usage to 0.5 cores (similar to `docker run --cpus 0.5`):
176+
`docker run <IMAGE> cpulimit --limit=50 --include-children <COMMAND>`
177+
- To limit the maximum virtual memory size (VSZ) to 64 MiB (similar to `docker run --memory 64m`):
178+
`docker run <IMAGE> sh -c "ulimit -v 65536; <COMMAND>"`
179+
180+
- To limit the maximum number of processes to 100 per namespaced UID 2000
181+
(similar to `docker run --pids-limit=100`):
182+
`docker run --user 2000 --ulimit nproc=100 <IMAGE> <COMMAND>`
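The values in the first two examples follow from simple unit conversions: `cpulimit --limit` takes a percentage of one core, and `ulimit -v` takes KiB. The following is a minimal sketch with hypothetical function names that reproduces the arithmetic.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with Docker).

# Convert a `--cpus`-style core count (for example 0.5) into the
# percentage expected by `cpulimit --limit`, rounded to an integer.
cpus_to_cpulimit() {
  awk -v cpus="$1" 'BEGIN { printf "%.0f\n", cpus * 100 }'
}

# Convert a memory size in MiB into the KiB value expected by `ulimit -v`.
mib_to_ulimit_kib() {
  echo $(( $1 * 1024 ))
}
```

Here `cpus_to_cpulimit 0.5` yields the `50` used above, and `mib_to_ulimit_kib 64` yields `65536`.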
