Use skopeo for hermetic bootstrap images#1717
Use skopeo for hermetic bootstrap images#1717praiskup merged 1 commit intorpm-software-management:mainfrom
Conversation
eeb59fe to
e98c1e6
Compare
There was a problem hiding this comment.
Code Review
The pull request successfully transitions from podman pull|save to skopeo for handling hermetic bootstrap images, which is a more robust approach for managing image digests and offline transports. However, there is a critical mismatch between how skopeo sync --scoped prepares the directory structure and how the configuration logic attempts to locate it. Switching to skopeo copy with a predictable destination directory (e.g., bootstrap/) would resolve this and maintain consistency with the previous implementation.
| pull_cmd = ["skopeo", "sync", "--src", "docker", "--dest", "dir", | ||
| "--scoped", "--preserve-digests"] | ||
| if "architecture" in bootstrap_data: | ||
| pull_cmd += ["--override-arch", bootstrap_data["architecture"]] | ||
| if "pull_digest" in bootstrap_data: | ||
| image_specification += "@" + bootstrap_data["pull_digest"] | ||
| if "architecture" in bootstrap_data: | ||
| pull_cmd += ["--arch", bootstrap_data["architecture"]] | ||
| pull_cmd += [image_specification] | ||
| pull_cmd += [image_specification, outputdir] | ||
| log.info("Pulling like: %s", ' '.join(pull_cmd)) |
There was a problem hiding this comment.
The use of skopeo sync --scoped creates a nested directory hierarchy based on the registry and repository (e.g., docker.io/library/fedora), which does not match the flat path construction in mockbuild/config.py. Since this script is intended to fetch a specific bootstrap image for a build, skopeo copy is more appropriate as it allows specifying a predictable destination directory. Additionally, subprocess.check_call is preferred over check_output when the output is not being captured.
| pull_cmd = ["skopeo", "sync", "--src", "docker", "--dest", "dir", | |
| "--scoped", "--preserve-digests"] | |
| if "architecture" in bootstrap_data: | |
| pull_cmd += ["--override-arch", bootstrap_data["architecture"]] | |
| if "pull_digest" in bootstrap_data: | |
| image_specification += "@" + bootstrap_data["pull_digest"] | |
| if "architecture" in bootstrap_data: | |
| pull_cmd += ["--arch", bootstrap_data["architecture"]] | |
| pull_cmd += [image_specification] | |
| pull_cmd += [image_specification, outputdir] | |
| log.info("Pulling like: %s", ' '.join(pull_cmd)) | |
| bootstrap_dir = os.path.join(outputdir, "bootstrap") | |
| pull_cmd = ["skopeo", "copy"] | |
| if "architecture" in bootstrap_data: | |
| pull_cmd += ["--override-arch", bootstrap_data["architecture"]] | |
| if "pull_digest" in bootstrap_data: | |
| image_specification += "@" + bootstrap_data["pull_digest"] | |
| pull_cmd += ["docker://" + image_specification, "dir:" + bootstrap_dir] | |
| log.info("Pulling like: %s", ' '.join(pull_cmd)) | |
| subprocess.check_call(pull_cmd) |
There was a problem hiding this comment.
Should work in every case. docker.io/library/fedora is still the part of boostrap_image string, so nested directories will be part of final path?
mock/py/mockbuild/config.py
Outdated
| image_pullspec = f"{data['config']['bootstrap_image']}@{data['bootstrap']['pull_digest']}" | ||
| bootstrap_dir = os.path.join(final_offline_repo, image_pullspec) | ||
| config_opts["bootstrap_image"] = f"dir:{bootstrap_dir}" |
There was a problem hiding this comment.
The current path construction for bootstrap_dir assumes a flat directory named after the pullspec, which conflicts with the hierarchical structure created by skopeo sync --scoped. To ensure the build can correctly locate the offline image, it is better to use a fixed directory name like bootstrap, matching the suggested change in mock-hermetic-repo.py.
| image_pullspec = f"{data['config']['bootstrap_image']}@{data['bootstrap']['pull_digest']}" | |
| bootstrap_dir = os.path.join(final_offline_repo, image_pullspec) | |
| config_opts["bootstrap_image"] = f"dir:{bootstrap_dir}" | |
| # Use the offline image for bootstrapping. | |
| bootstrap_dir = os.path.join(final_offline_repo, "bootstrap") | |
| config_opts["bootstrap_image"] = f"dir:{bootstrap_dir}" |
|
CI failure: |
| """ | ||
| pull_cmd = ["podman", "pull"] | ||
| pull_cmd = ["skopeo", "sync", "--src", "docker", "--dest", "dir", | ||
| "--scoped", "--preserve-digests"] |
There was a problem hiding this comment.
As discussed before, this worked for me correctly - but I didn't even use --scoped and --preserve-digests 🤷 please document why we need those.
| Store the tarball into the same directory where the RPMs are | ||
| """ | ||
| pull_cmd = ["podman", "pull"] | ||
| pull_cmd = ["skopeo", "sync", "--src", "docker", "--dest", "dir", |
There was a problem hiding this comment.
We probably need skopeo in Recommends.
|
|
#1717 (comment) - interesting - works for me locally, I'll try to debug it more. |
ouch, missed a file to commit |
|
Would you mind squashing the PR? |
| # also it matches with full pullspec from config's bootstrap_image | ||
| # --preserve-digests - it is not needed for most images, but some could end with | ||
| # modified digests (semantically identical copy, but modified digest, | ||
| # e.g. embedded docker reference for v1 images), so better safe than sorry |
086e46b to
22b6859
Compare
649b22c
into
rpm-software-management:main
Better replacement for #1716