Rootless mode allows running BuildKit daemon as a non-root user.
- Using the overlayfs snapshotter requires kernel >= 5.11 or Ubuntu kernel. On kernel >= 4.18, the fuse-overlayfs snapshotter is used instead of overlayfs. On kernel < 4.18, the native snapshotter is used.
- Network mode is always set to network.host.
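The kernel thresholds above can be sketched as a small shell helper. This is only an illustration: `pick_snapshotter` is a hypothetical name, not a BuildKit command, and it looks at the version number alone, so it does not detect the "Ubuntu kernel" exception mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of BuildKit): map a kernel release string
# such as "5.15.0-91-generic" to the snapshotter named in the list above.
# Assumption: only major.minor is compared; Ubuntu kernels older than 5.11
# are NOT special-cased here.
pick_snapshotter() {
  release=$1
  major=${release%%.*}          # text before the first dot
  rest=${release#*.}            # text after the first dot
  minor=${rest%%[!0-9]*}        # leading digits of the remainder
  if [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 11 ]; }; then
    echo overlayfs
  elif [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 18 ]; }; then
    echo fuse-overlayfs
  else
    echo native
  fi
}

# Report the expected snapshotter for the running kernel.
pick_snapshotter "$(uname -r)"
```

The result can be compared with the snapshotter that buildkitd actually selects, or passed explicitly through --oci-worker-snapshotter.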
RootlessKit needs to be installed.
$ rootlesskit buildkitd
$ buildctl --addr unix:///run/user/$UID/buildkit/buildkitd.sock build ...
To isolate BuildKit daemon's network namespace from the host (recommended):
$ rootlesskit --net=slirp4netns --copy-up=/etc --disable-host-loopback buildkitd
RootlessKit needs to be installed.
Run containerd in rootless mode using rootlesskit, following containerd's documentation.
$ containerd-rootless.sh
Then let buildkitd join the same namespace as containerd.
$ containerd-rootless-setuptool.sh nsenter -- buildkitd --oci-worker=false --containerd-worker=true --containerd-worker-snapshotter=native
$ docker run \
--name buildkitd \
-d \
--security-opt seccomp=unconfined \
--security-opt apparmor=unconfined \
--device /dev/fuse \
moby/buildkit:rootless --oci-worker-no-process-sandbox
$ buildctl --addr docker-container://buildkitd build ...
If you don't mind using --privileged (almost safe for rootless), the docker run flags can be shortened as follows:
$ docker run --name buildkitd -d --privileged moby/buildkit:rootless
Adding --device /dev/fuse to the docker run arguments is required only if you want to use the fuse-overlayfs snapshotter.
By adding --oci-worker-no-process-sandbox to the buildkitd arguments, BuildKit can be executed in a container without adding --privileged to docker run arguments.
However, you still need to pass --security-opt seccomp=unconfined --security-opt apparmor=unconfined to docker run.
Note that --oci-worker-no-process-sandbox allows build executor containers to kill (and potentially ptrace depending on the seccomp configuration) an arbitrary process in the BuildKit daemon container.
To allow running rootless buildkitd without --oci-worker-no-process-sandbox, run docker run with --security-opt systempaths=unconfined. (For Kubernetes, set securityContext.procMount to Unmasked.)
The --security-opt systempaths=unconfined flag disables the masks for the /proc mount in the container and potentially allows reading and writing dangerous kernel files, but it is safe when you are running buildkitd as non-root.
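The three ways of starting the container described above can be compared side by side with a dry-run helper that only prints the command. `buildkitd_docker_cmd` and its mode names are hypothetical. The `unmasked-proc` variant is an assumption: the text above does not say whether the seccomp and apparmor options are still needed together with systempaths=unconfined, so the sketch keeps them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints a docker run command, does not run it.
buildkitd_docker_cmd() {
  mode=$1
  common="--name buildkitd -d"
  unconfined="--security-opt seccomp=unconfined --security-opt apparmor=unconfined"
  case "$mode" in
    privileged)
      # Shortest form, at the cost of --privileged.
      echo "docker run $common --privileged moby/buildkit:rootless"
      ;;
    no-process-sandbox)
      # No --privileged, but build containers share the daemon's PID namespace.
      echo "docker run $common $unconfined --device /dev/fuse moby/buildkit:rootless --oci-worker-no-process-sandbox"
      ;;
    unmasked-proc)
      # ASSUMPTION: seccomp/apparmor options are kept alongside systempaths=unconfined.
      echo "docker run $common $unconfined --security-opt systempaths=unconfined --device /dev/fuse moby/buildkit:rootless"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1
      ;;
  esac
}

buildkitd_docker_cmd no-process-sandbox
```

Printing rather than executing keeps the sketch safe to run on a machine without Docker; the output can be reviewed and pasted into a terminal.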
The moby/buildkit:rootless image has the following UID/GID configuration:
| Actual ID (shown in the host and the BuildKit daemon container) | Mapped ID (shown in build executor containers) |
|---|---|
| 1000 | 0 |
| 100000 | 1 |
| ... | ... |
| 165535 | 65536 |
$ docker exec buildkitd id
uid=1000(user) gid=1000(user)
$ docker exec buildkitd ps aux
PID USER TIME COMMAND
1 user 0:00 rootlesskit buildkitd --addr tcp://0.0.0.0:1234
13 user 0:00 /proc/self/exe buildkitd --addr tcp://0.0.0.0:1234
21 user 0:00 buildkitd --addr tcp://0.0.0.0:1234
29 user 0:00 ps aux
$ docker exec buildkitd cat /etc/subuid
user:100000:65536
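The mapping in the table follows from the UID of the daemon user and the /etc/subuid entry shown above. A minimal sketch of that arithmetic, with `mapped_id` as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: translate an actual ID (host / daemon container view)
# to the ID seen inside build executor containers.
#   $1 = actual ID, $2 = UID of the daemon user, $3 = /etc/subuid entry
mapped_id() {
  actual=$1
  owner=$2
  entry=$3
  rest=${entry#*:}        # "100000:65536"
  start=${rest%%:*}       # first subordinate ID
  count=${rest#*:}        # number of subordinate IDs
  if [ "$actual" -eq "$owner" ]; then
    echo 0                # the daemon user itself becomes root
  elif [ "$actual" -ge "$start" ] && [ "$actual" -lt $((start + count)) ]; then
    echo $((actual - start + 1))
  else
    echo "ID $actual is not mapped" >&2
    return 1
  fi
}

mapped_id 1000   1000 user:100000:65536   # prints 0
mapped_id 100000 1000 user:100000:65536   # prints 1
mapped_id 165535 1000 user:100000:65536   # prints 65536
```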
To change the UID/GID configuration, you need to modify and build the BuildKit image manually.
$ vi Dockerfile
$ make images
$ docker run ... moby/buildkit:local-rootless ...
Try running buildkitd with --oci-worker-snapshotter=fuse-overlayfs:
$ rootlesskit buildkitd --oci-worker-snapshotter=fuse-overlayfs
Try running buildkitd with --oci-worker-snapshotter=native:
$ rootlesskit buildkitd --oci-worker-snapshotter=native
See https://rootlesscontaine.rs/getting-started/common/subuid/
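Before following the link, it can help to check whether the current user has a subordinate ID range at all. The sketch below is an assumption-laden illustration: `check_subuid` is a hypothetical helper, and the threshold of 65536 IDs is taken as the commonly required minimum rather than from this document.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the user has a subuid entry with >= 65536 IDs.
#   $1 = user name, $2 = subuid file (defaults to /etc/subuid)
check_subuid() {
  user=$1
  file=${2:-/etc/subuid}
  [ -r "$file" ] || return 2
  # Entries have the form name:start:count.
  awk -F: -v u="$user" '
    $1 == u && $3 >= 65536 { found = 1 }
    END { exit !found }' "$file"
}

if check_subuid "$(id -un)"; then
  echo "subuid entry found"
else
  echo "no usable subuid entry for $(id -un)"
fi
```

The same check applies to /etc/subgid by passing that path as the second argument.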
Make sure to mount an emptyDir volume on /home/user/.local/share/buildkit.
Error fork/exec /proc/self/exe: no space left on device with level=warning msg="/proc/sys/user/max_user_namespaces needs to be set to non-zero."
Run sysctl -w user.max_user_namespaces=N (N=positive integer, like 63359) on the host nodes.
See ../examples/kubernetes/sysctl-userns.privileged.yaml.
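Whether a node is affected can be checked by reading the value named in the warning. A minimal sketch, where `userns_enabled` is a hypothetical helper and the optional path argument exists only to make it testable:

```shell
#!/bin/sh
# Hypothetical helper: succeed if max_user_namespaces is a positive integer.
#   $1 = file to read (defaults to /proc/sys/user/max_user_namespaces)
userns_enabled() {
  file=${1:-/proc/sys/user/max_user_namespaces}
  [ -r "$file" ] || return 2      # file missing: cannot tell
  value=$(cat "$file")
  [ "$value" -gt 0 ]
}

if userns_enabled; then
  echo "user namespaces are enabled"
else
  echo "set user.max_user_namespaces to a positive integer on this node"
fi
```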
This error is known to happen when BuildKit is executed in a container without the --oci-worker-no-process-sandbox flag.
Make sure that --oci-worker-no-process-sandbox is specified (see the description of this flag above).
Using Ubuntu kernel is recommended.
Make sure to have an emptyDir volume below:
spec:
  containers:
    - name: buildkitd
      volumeMounts:
        # Dockerfile has `VOLUME /home/user/.local/share/buildkit` by default too,
        # but the default VOLUME does not work with rootless on Google's Container-Optimized OS
        # as it is mounted with `nosuid,nodev`.
        # https://github.com/moby/buildkit/issues/879#issuecomment-1240347038
        - mountPath: /home/user/.local/share/buildkit
          name: buildkitd
  volumes:
    - name: buildkitd
      emptyDir: {}
See also the example manifests.
user.max_user_namespaces needs to be set to a positive integer through the API settings:
[settings.kernel.sysctl]
"user.max_user_namespaces" = "16384"
See ../examples/eksctl/bottlerocket.yaml for an example to configure a Node Group in EKS.
Old distributions
Add kernel.unprivileged_userns_clone=1 to /etc/sysctl.conf (or a file under /etc/sysctl.d) and run sudo sysctl -p (or sudo sysctl --system, which also loads files under /etc/sysctl.d).
This step is not needed for Debian GNU/Linux 11 and later.
Add user.max_user_namespaces=28633 to /etc/sysctl.conf (or a file under /etc/sysctl.d) and run sudo sysctl -p (or sudo sysctl --system, which also loads files under /etc/sysctl.d).
This step is not needed for RHEL/CentOS 8 and later.
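The two sysctl edits above can be made repeatable with a small helper that appends a setting only when the key is not already present. This is a sketch, not an official procedure: `ensure_sysctl_line` is a hypothetical name, and the target file is whatever path you pass in.

```shell
#!/bin/sh
# Hypothetical helper: append "key=value" to a sysctl config file unless a
# line for that key already exists (an existing value is left untouched).
#   $1 = key, $2 = value, $3 = config file
ensure_sysctl_line() {
  key=$1
  value=$2
  file=$3
  # Look for a line that, ignoring whitespace, starts with "key=".
  if [ -f "$file" ] && awk -v k="$key" '
      { line = $0; gsub(/[[:space:]]/, "", line) }
      index(line, k "=") == 1 { found = 1 }
      END { exit !found }' "$file"; then
    return 0
  fi
  printf '%s=%s\n' "$key" "$value" >> "$file"
}

# Example (run as root, then reload the settings with sysctl):
#   ensure_sysctl_line kernel.unprivileged_userns_clone 1 /etc/sysctl.conf
#   ensure_sysctl_line user.max_user_namespaces 28633 /etc/sysctl.conf
```

Because the helper is idempotent, it can be run from provisioning scripts without producing duplicate lines.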
You may have to disable SELinux, or run BuildKit with --oci-worker-snapshotter=fuse-overlayfs.