-
Notifications
You must be signed in to change notification settings - Fork 2.3k
libct: setns process recreate cmd object before calling the first start #5066
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Hmm, the code to create a copy of the cmd is kinda fragile, and I guess what Alan suggested is a better way, i.e.
One more alternative is to check whether CLONE_INTO_CGROUP is available, but I'm afraid the check is going to be too expensive, and caching its result is not very reliable either. |
070b6ed to
4cdc0d6
Compare
Hello! Something like this 4cdc0d6 ? I have made this as a Container method. |
|
@everzakov Have you verified whether this issue actually affects In fact, runc doesn’t call Regarding the Go PR (728642), I think the change might be overly strict for users and could potentially be a breaking change. |
|
In fact, I changed the p.cmd.SysProcAttr.CgroupFD = 100 //int(fd.Fd())lifubang@acmcoder /opt/ubuntu $ sudo ./runc exec test echo hello
hellolifubang@acmcoder /opt/ubuntu $ sudo ./runc --debug exec test echo hello
...
DEBU[0000]libcontainer/process_linux.go:359 libcontainer.(*setnsProcess).prepareCgroupFD() using CLONE_INTO_CGROUP "/sys/fs/cgroup/user.slice/user-1000.slice/test"
... libcontainer.(*setnsProcess).startWithCgroupFD() exec with CLONE_INTO_CGROUP failed: fork/exec /proc/self/fd/6: bad file descriptor; retrying without**
...
hello |
Actually there are some calls cmd.Wait() - https://github.com/opencontainers/runc/blob/v1.4.0/libcontainer/process_linux.go#L152 and https://github.com/opencontainers/runc/blob/v1.4.0/libcontainer/process_linux.go#L524 .
Yeah, the main problem to find this fault in runtime is that all cmd.Wait calls' errors are ignored. Also, it is often preceded by process KILL. So, this error can be found only in tests. You can try to run TestEnter test with your change. The error should be the same as from issue.
I have proposed another go pr 728660 . In my PR it will be possible to call the same os/exec Cmd object if the first try was unsuccessful. Also, Cmd.Wait does not return the error. However, i think 728642 will be merged. |
|
You're right — we shouldn’t reuse the Just a discuss, a simple fix would be to move the retry logic into |
|
Additionally, adding a test case to ensure the patch works correctly would be appreciated. |
4cdc0d6 to
1df1ec2
Compare
Hello! Well, i'm not very familiar with cgroups library. I think the test should be where we pass the non cgroup path and the check fails in the kernel. However, this path should exist . There are a lot funcs to clean and check prefixes . In Create we pass nil paths in cgroup manager so the default /sys/fs/cgroup pass will be used. I think it will be impossible to create non cgroup child directory in /sys/fs/cgroup tree. But we can use factory Load method to get a container . In this method cgroup paths are passed that's why we can set our non cgroup path. So, the idea is to patch the dumped container state, load it with changed cgroup paths and turn off the cgroup function to check directory cgroup magic. The first call will fail because we will pass non cgroup dir. The second call doesn't return an error, and the set ns pid will be added successfuly into the "cgroup". The test logs: |
1df1ec2 to
f428344
Compare
|
@everzakov thanks a lot for catching this and for the PR! There is one thing I'm not so sure about, why is this not failing on CI? The wait commands you linked seem to either return the error as a param or just make the function return an error. What am I missing? Also, how did you find the error? I was reviewing at it for a long time and it seems correct. But this is quite security-sensitive code. Do you mind splitting it in more commits, so it is obvious what is done at each step? I mean, check that the way we create the command is mostly c&p from another place in one commit, then maybe remove the cmd of the functions signature in another commit, etc. This way, we can view in detail that each step is correct. It's tricky code, I'd like to reduce the chances of doing a mistake. |
Hello, i have tried to run the new release tests with the old Linux kernels (before 5.7 where CLONE_INTO_CGROUP is proposed). In my case, it was the Ubuntu 20.04. TestEnter tests are failed with the error in the issue (exec_test.go:216: unexpected error: read |0: file already closed). I have found that this is related to the golang issue (we can not reuse the same cmd object to retry cmd start) golang/go#76746 . The syntetic test TestCmdRetry is about to force retry of cmd in the new Linux kernels and fails with the same error. Or you can set the wrong cgroup fd and run tests ( #5066 (comment) ).
I can split these changes into three commits. The first commit will have only the test. I think the CI will fail. The second commit will have the changes to recreate Cmd object. The third commit will be with functions' signature changes. PS. Now this issue is only about test fails. However, https://go-review.googlesource.com/c/go/+/728642 is merged that's why it will be runtime error with the new golang version on old Linux kernels. If golang/go#77075 will be solved then we can again change the code to clone Cmd object :) |
|
SGTM, thanks! |
… cmd.Start method if clone3 do not support CLONE_INTO_CGROUP flag TestCmdRetry is a syntetic test to force retry of cmd.Start method on the new Linux kernels (after 5.7). It can be done if cgroup fd points to the file from non-cgroup tree. In runc there a lot of prefix checks after joining pathes. Also, when container is created from factory, fs2 cgroup manager will choose the path from cgroup tree by default. However, when container is loaded from state.json its path is not checked that's why it is possible to pass the wrong path. Also, we should turn on TestMode in opencontainers/cgroups library. Signed-off-by: Efim Verzakov <efimverzakov@gmail.com>
f428344 to
4724c2a
Compare
In Golang the same exec.Cmd object can not be used for retry Start method. Wait method will return the error even if the second call is successful. More info: golang/go#76746 . This commit adds additional method to recreate cmd Object after the first fail. Signed-off-by: Efim Verzakov <efimverzakov@gmail.com>
Signed-off-by: Efim Verzakov <efimverzakov@gmail.com>
Hello! |
kolyshkin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are thinking about introducing cmd.Clone() method; in the meantime we can do something like this:
diff --git a/libcontainer/process_linux.go b/libcontainer/process_linux.go
index b3487fd0..68474087 100644
--- a/libcontainer/process_linux.go
+++ b/libcontainer/process_linux.go
@@ -380,11 +380,14 @@ func (p *setnsProcess) startWithCgroupFD() error {
defer fd.Close()
}
+ cmdCopy := *p.cmd
err = p.startWithCPUAffinity()
if err != nil && p.cmd.SysProcAttr.UseCgroupFD {
logrus.Debugf("exec with CLONE_INTO_CGROUP failed: %v; retrying without", err)
// SysProcAttr.CgroupFD is never used when UseCgroupFD is unset.
- p.cmd.SysProcAttr.UseCgroupFD = false
+ cmdCopy.SysProcAttr.UseCgroupFD = false
+ // Must not reuse exec.Cmd (https://go-review.googlesource.com/c/go/+/728642).
+ p.cmd = &cmdCopy
err = p.startWithCPUAffinity()
}
Tested to work fine with the test case from this PR, go HEAD (1.26-to-be). It works.
It's a bit hacky because we're also copying private fields which might not work in the future. The more correct way is to copy all public fields explicitly (which is less hacky but still not future proof as there may be more public fields added).
As for this patch, it's correct in theory but practically
- it's quite a big refactoring
- it makes things more complex than they were
These are the two reasons why I hesitate to LGTM it.
| ok(t, err) | ||
|
|
||
| // Dump our "pseudo" cgroup path as dirPath | ||
| state.CgroupPaths[""] = pseudoPath |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumes (and will work only with) cgroup v2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have checked the cgroup version in the call of cgroups.IsCgroup2UnifiedMode. As for cgroup v1 it is deprecated :) #4955 .
|
Opened #5091 as a much simpler alternative to this. As to testing, I tried Go 1.26 with current CI and a few tests fail before the fix so I guess this is sufficient. PTAL @everzakov |
| // once. | ||
| p.closeClonedExes() | ||
| var safeExe *os.File | ||
| if exeseal.IsSelfExeCloned() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect this commit (" libct: add Container method to recreate cmd Object in setnsProcess") to only move cmd.* assignments to a function that we then just call. And keep all the sealing of the binary untouched.
Is not possible to do something like that, that is super straight-forward?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I have kept the sealing untouched. I have deleted only exePath (it is not used in this function). In createCmdObject function I use the created safeExe if runc binary is cloned. I can move createCmdObject after newParentProcess function that's why there won't be any changes in newParentProcess function header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I can keep the function signatures and should create cmd without cgroup fd. In Setns Process Cmd retry it will be ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant is we create a function that does the cmd and the commit is just moving cmd.X = asd lines from one place to the other. That will be quite clear we are not missing anything.
|
@kolyshkin does that work with this PR applied too: https://go-review.googlesource.com/c/go/+/728642 ? I'm skeptical about copying the private fields (specially pointers like that). If we can just move the cmd.X assignments we do to a function and call it, at least in my head, sounds very simple and obviously correct to not do it. |
Yeah, go version go1.26rc2 (in tests matrix) has this commit. Also, it has https://go-review.googlesource.com/c/go/+/734200 where atomic.Bool is changed to bool. |
In short, it works and the method is kind of approved by the upstream. See #5091 for more details, and let's move the further discussion about the copy approach to #5091. |
Currently there is an golang issue golang/go#76746 .
If the cmd.Start is called twice (the first call was unsuccessful) then
Cmd.Wait can return an error (because goroutines' pipes will be closed in the first try).
However, the second cmd.Start can return nil.
After https://go-review.googlesource.com/c/go/+/728642 is merged, then
the second call will return an error. So we need to recreate the Cmd object.
We can not simply copy it
https://go-review.googlesource.com/c/go/+/728642/comments/a837fe92_a5fc5f06 .
Fixes #5060