fix(envd): detect cgroup v2 filesystem before initializing Cgroup2Manager #2263
arkamar merged 9 commits into e2b-dev:main
On cgroup v1 systems, /sys/fs/cgroup is a tmpfs where mkdir and writeFile succeed silently, causing Cgroup2Manager to initialize with fds pointing to regular directories instead of cgroup v2 entries. The kernel then rejects clone3(CLONE_INTO_CGROUP) with EBADF, breaking all fork/exec operations (process start, socat). Add a statfs check for CGROUP2_SUPER_MAGIC before proceeding, so the manager correctly falls back to NoopManager on cgroup v1.
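The check described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper name `isCgroup2` is hypothetical, and it uses the standard library's `syscall.Statfs` so the snippet stays self-contained. Only the magic value `0x63677270` and the name `cgroup2SuperMagic` come from this PR.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// cgroup2SuperMagic mirrors CGROUP2_SUPER_MAGIC (0x63677270), the value named in this PR.
const cgroup2SuperMagic = 0x63677270

// isCgroup2 is a hypothetical helper: it reports whether path is on a
// cgroup v2 filesystem by comparing the statfs type with the magic number.
func isCgroup2(path string) (bool, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return false, fmt.Errorf("statfs %s: %w", path, err)
	}
	// Statfs_t.Type has a different integer width per architecture, so widen it.
	return int64(st.Type) == cgroup2SuperMagic, nil
}

func main() {
	ok, err := isCgroup2("/sys/fs/cgroup")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// On a cgroup v1 host the mount is a tmpfs, so this is expected to print false.
	fmt.Println("cgroup v2:", ok)
}
```

Checking the filesystem type, rather than whether `mkdir` succeeds, is what distinguishes a real cgroup2 mount from a tmpfs that accepts the same writes.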
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b24d02ead6
When rootPath points to a not-yet-created subdirectory (e.g. /sys/fs/cgroup/envd), Statfs returns ENOENT causing a silent fallback to NoopManager. Walk up to the nearest existing ancestor before calling Statfs so custom --cgroup-root paths are validated correctly against the underlying filesystem type.
- Replace magic number 0x63677270 with named const cgroup2SuperMagic
- Add filepath.Clean before ancestor walk for symlink safety
- Add TestNewCgroup2Manager_NonCgroup2FS to verify non-cgroup2 rejection
    checkPath := filepath.Clean(config.rootPath)
    for {
        if _, err := os.Stat(checkPath); err == nil {
            break
        }
        parent := filepath.Dir(checkPath)
        if parent == checkPath {
            break
        }
        checkPath = parent
    }
If the cgroup root path doesn't exist, this walks up to a parent that could be a different filesystem entirely (sysfs, rootfs), making the statfs check meaningless. The cgroup root should exist, and NewCgroup2Manager() should just error if it doesn't.
Thanks for the review, fixed in 162bbda49
The walk-up loop could traverse past the cgroup mount into a different filesystem (sysfs, rootfs), making the statfs check meaningless. The rootPath is the cgroup mount point itself (default /sys/fs/cgroup) and must exist. Sub-cgroup directories are created later by MkdirAll in createCgroup, so there is no need to walk up to an ancestor. If the cgroup root doesn't exist, error out immediately.
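The resulting behavior (validate the root itself, fail fast if it is missing, and let the caller fall back) might look something like the sketch below. The `Manager` interface, the lowercase types, and `newManager` are illustrative assumptions and do not reflect envd's real definitions. The sketch uses `syscall.Statfs` from the standard library to stay self-contained.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"syscall"
)

const cgroup2SuperMagic = 0x63677270

var errNotCgroup2 = errors.New("not a cgroup v2 filesystem")

// Manager is a hypothetical stand-in for the real manager interface.
type Manager interface{ Name() string }

type noopManager struct{}

func (noopManager) Name() string { return "noop" }

type cgroup2Manager struct{ root string }

func (cgroup2Manager) Name() string { return "cgroup2" }

// newCgroup2Manager checks rootPath directly, with no ancestor walk.
// A missing root or a non-cgroup2 filesystem is an error.
func newCgroup2Manager(rootPath string) (Manager, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(rootPath, &st); err != nil {
		return nil, fmt.Errorf("cgroup root %q: %w", rootPath, err)
	}
	if int64(st.Type) != cgroup2SuperMagic {
		return nil, fmt.Errorf("cgroup root %q: %w", rootPath, errNotCgroup2)
	}
	return cgroup2Manager{root: rootPath}, nil
}

// newManager falls back to the no-op manager when cgroup v2 is unavailable.
func newManager(rootPath string) Manager {
	m, err := newCgroup2Manager(rootPath)
	if err != nil {
		log.Printf("cgroup v2 unavailable, using noop manager: %v", err)
		return noopManager{}
	}
	return m
}

func main() {
	fmt.Println("manager:", newManager("/sys/fs/cgroup").Name())
}
```

Because the error is returned instead of masked by an ancestor walk, a mistyped or missing root path surfaces in the log rather than being validated against an unrelated filesystem.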
Co-authored-by: Petr Vaněk <arkamar@atlas.cz>
Summary
On cgroup v1 hosts, `Cgroup2Manager` silently "succeeds" initialization because `/sys/fs/cgroup` is a tmpfs where `mkdir` and `writeFile` both work without error. The resulting file descriptors point to regular directories, not cgroup v2 entries.
When Go's `clone3` syscall uses these fds with `CLONE_INTO_CGROUP`, the kernel returns `EBADF`, causing all `fork/exec` operations to fail, including process start and socat port forwarding.
This PR adds a `statfs` check for `CGROUP2_SUPER_MAGIC` (`0x63677270`) before creating cgroup directories, so the manager correctly returns an error and falls back to `NoopManager` on cgroup v1 systems.
Root Cause
- `/sys/fs/cgroup` on cgroup v1 is a tmpfs, not cgroup2fs
- `os.MkdirAll("/sys/fs/cgroup/ptys")` succeeds (creates a regular directory)
- `os.WriteFile("cpu.weight", "200")` succeeds (creates a regular file)
- `unix.Open()` returns a valid fd, but to a regular directory
- Because `UseCgroupFD: true` is set, Go uses `clone3(CLONE_INTO_CGROUP)`
Symptoms
fork/exec /bin/bash: bad file descriptor
fork/exec /usr/bin/socat: bad file descriptor
All child process creation fails; filesystem RPCs (which don't fork) work fine.
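The failure mode can be reproduced in isolation with a sketch like the one below, which is Linux-only and needs Go 1.20 or later for `SysProcAttr.UseCgroupFD`. The helper `startInDir` and the use of a temporary directory as a fake cgroup are assumptions for illustration. On kernels that support `CLONE_INTO_CGROUP`, the start is expected to fail with the same "bad file descriptor" error shown above. Other environments may report a different error.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// startInDir is a hypothetical helper: it opens dir and passes the fd as
// the target cgroup, which makes Go spawn the child via clone3(CLONE_INTO_CGROUP).
func startInDir(dir, name string, args ...string) error {
	fd, err := syscall.Open(dir, syscall.O_RDONLY|syscall.O_DIRECTORY|syscall.O_CLOEXEC, 0)
	if err != nil {
		return fmt.Errorf("open %s: %w", dir, err)
	}
	defer syscall.Close(fd)

	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{
		UseCgroupFD: true, // request CLONE_INTO_CGROUP
		CgroupFD:    fd,   // fd of a regular directory, not a cgroup v2 entry
	}
	if err := cmd.Start(); err != nil {
		return err
	}
	return cmd.Wait()
}

func main() {
	// A plain temporary directory plays the role of the tmpfs-backed "cgroup".
	dir, err := os.MkdirTemp("", "fake-cgroup")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer os.RemoveAll(dir)

	err = startInDir(dir, "/bin/true")
	fmt.Println("start error:", err)
}
```

The open call succeeds because the directory is valid. Only the kernel's check at clone time reveals that the fd does not refer to a cgroup, which is why the manager appeared to initialize correctly.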