14 changes: 12 additions & 2 deletions libpod/container_internal_common.go
@@ -1569,8 +1569,18 @@ func (c *Container) restore(ctx context.Context, options ContainerCheckpointOpti

 	// Let's try to stat() CRIU's inventory file. If it does not exist, it makes
 	// no sense to try a restore. This is a minimal check if a checkpoint exists.
-	if err := fileutils.Exists(filepath.Join(c.CheckpointPath(), "inventory.img")); errors.Is(err, fs.ErrNotExist) {
-		return nil, 0, fmt.Errorf("a complete checkpoint for this container cannot be found, cannot restore: %w", err)
+	// For gVisor, the checkpoint image is called "checkpoint.img".
+	criuInventoryPath := filepath.Join(c.CheckpointPath(), "inventory.img")
+	gvisorCheckpointPath := filepath.Join(c.CheckpointPath(), "checkpoint.img")
+
+	if err := fileutils.Exists(criuInventoryPath); errors.Is(err, fs.ErrNotExist) {
+		if err2 := fileutils.Exists(gvisorCheckpointPath); errors.Is(err2, fs.ErrNotExist) {
+			return nil, 0, fmt.Errorf("a complete checkpoint for this container cannot be found (checked %q and %q): %w", criuInventoryPath, gvisorCheckpointPath, err)
+		} else if err2 != nil {
+			return nil, 0, fmt.Errorf("error checking checkpoint file %q: %w", gvisorCheckpointPath, err2)
+		}
+	} else if err != nil {
+		return nil, 0, fmt.Errorf("error checking inventory file %q: %w", criuInventoryPath, err)
 	}

 	if err := crutils.CRCreateFileWithLabel(c.bundlePath(), "restore.log", c.MountLabel()); err != nil {
15 changes: 14 additions & 1 deletion pkg/checkpoint/crutils/checkpoint_restore_utils.go
@@ -216,9 +216,22 @@ func CRRuntimeSupportsCheckpointRestore(runtimePath string) bool {
 	if err := cmd.Start(); err != nil {
 		return false
 	}
-	if err := cmd.Wait(); err == nil {
+
+	err := cmd.Wait()
+	if err == nil {
+		// Exited with status 0 means definitely supported.
 		return true
 	}
+
+	if exitErr, ok := err.(*exec.ExitError); ok {
Member

Feels like we should only do this if we know the runtime is runsc; 128 could be a legitimate error in other OCI runtimes. I imagine we could get that from the output of --help but it might also be sufficient to just check that the filename we're executing is runsc?
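
A minimal sketch of the filename check suggested above, assuming the base name of the runtime path is a good enough signal. The helpers isRunsc and helpExitMeansSupported are hypothetical names, not Podman functions; the value 128 is taken from the diff under review.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// isRunsc reports whether runtimePath looks like the gVisor runtime, judged
// only by its file name. Assumption: a renamed or wrapped binary will not match.
func isRunsc(runtimePath string) bool {
	return filepath.Base(runtimePath) == "runsc"
}

// helpExitMeansSupported interprets the exit code of "<runtime> checkpoint --help".
// Exit code 0 always counts as supported; 128 counts only when the runtime is runsc.
func helpExitMeansSupported(runtimePath string, exitCode int) bool {
	if exitCode == 0 {
		return true
	}
	return exitCode == 128 && isRunsc(runtimePath)
}

func main() {
	fmt.Println(helpExitMeansSupported("/usr/bin/runsc", 128)) // true
	fmt.Println(helpExitMeansSupported("/usr/bin/crun", 128))  // false
	fmt.Println(helpExitMeansSupported("/usr/bin/crun", 0))    // true
}
```

Gating on the name keeps 128 from being read as "supported" for other OCI runtimes, at the cost of missing a runsc binary installed under a different name.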

Member

I think the --help check exists because some runtimes (at least crun) have a build option to enable or disable checkpoint support, hence the runtime check via --help, IIRC.

That said, did you ask them why they exit with 128 on --help? That seems really weird. In the applications I have tested I have never seen this behavior, at least when --help is actually implemented.

Contributor Author

Yes, crun has an optional dependency on libcriu and uses dlopen to load the library at runtime. When built without this dependency, crun checkpoint --help exits with a non-zero code.

> That said, did you ask them why they exit with 128 on --help? That seems really weird. In the applications I have tested I have never seen this behavior, at least when --help is actually implemented.

gVisor returns subcommands.ExitUsageError (128) when --help is used:

It seems to be specific to Google's subcommands package.

Member

But then that isn't even calling --help. It actually errors out on incorrect usage because it doesn't seem to respect --help at all, which means this check isn't really doing anything useful.

In that case we might as well hard-code checkpoint support for the runsc name, I would say.

FYI, this should be fixed by google/gvisor#12331.

Contributor Author

@nixprime Thank you!

Member

New code should use errors.Is()/errors.As(), since errors can be wrapped, in which case a type assertion won't work.

+		status := exitErr.ExitCode()
+		// "runsc checkpoint --help" exits with 128 (subcommands.ExitUsageError)
+		// and 127 is used for "command not found".
+		if status == 128 {
+			return true
+		}
+	}
+
 	return false
 }
