feat: Add script to get all upstream images by mvlassis · Pull Request #1418 · canonical/bundle-kubeflow

mvlassis · 2026-03-19T12:51:03Z

Closes #1417.

This PR adds a directory under scripts with a new Python script to gather all images used in a specified release of upstream Kubeflow.

The script is heavily based on https://github.com/kubeflow/manifests/blob/0837fb9cf3ec73f51cbddff656a160cb258eaad5/tests/trivy_scan.py. I tried to introduce as few changes as possible:

Add a new function to temporarily clone the kubeflow/manifests repository and operate from there
Change the default location of the script to be in a directory where the script is called from
Remove the vulnerability scanning part which is not relevant in this case
Add a new --skip argument to skip specific working groups. This is useful since we probably want to skip controllers that we do not deploy as charmed operators.

I would probably do more reformatting of the script, but I suggest we keep the changes as minimal as possible to not deviate much.

I also added a README.md with instructions on how to use the script.

dariofaccin

I left some improvements. Note that upstream CI is utterly broken even on master and has probably never worked/won't work on any tag.

dariofaccin · 2026-03-19T16:36:24Z

scripts/get_upstream_images/README.md

+From inside this directory:
+
+```shell
+python3 extract_images.py


Suggested change

python3 extract_images.py

python3 get_upstream_images.py

suggestion(blocking): probably a typo but the script name doesn't match

dariofaccin · 2026-03-19T16:43:43Z

scripts/get_upstream_images/get_upstream_images.py

+        try:
+            clone_cmd = ["git", "clone", "--depth", "1"]
+            if version != "latest":
+                tag = f"v{version}"


suggestion(blocking): do not hardcode the v prefix. I tried with the 26.03-rc.1 tag and the script failed:

python3 get_upstream_images.py 26.03-rc.1 Cloning repository: https://github.com/kubeflow/manifests.git ERROR: Failed to clone repository. Details: Command '['git', 'clone', '--depth', '1', '--branch', 'v26.03-rc.1', 'https://github.com/kubeflow/manifests.git', 'manifests']' returned non-zero exit status 128.

I would also suggest not to limit the script to tags, but also branches. The difference is negligible but it's more flexible.

dariofaccin · 2026-03-19T16:55:56Z

scripts/get_upstream_images/get_upstream_images.py

+
+def validate_semantic_version(version):
+    # Validates a semantic version string (e.g., "0.1.2" or "latest").
+    regex = r"^[0-9]+\.[0-9]+\.[0-9]+$"


suggestion(blocking): this regex doesn't properly validate a semantic version. For instance, it failed validating both 1.11.0-rc.1 and 26.03-rc.1. I put the regex into a validator and also 26.03 wouldn't match.
Starting from the suggested SemVer regex, I built one which allows to have only major.minor groups and also the prefix v:

^v?(?P<major>0|[1-9]\d*)\.(?P<minor>|[0-9]\d*)\.?(?P<patch>|[0-9]\d*)(?:-(?P<prerelease>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+(?P<buildmetadata>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$

It's fairly complex and it can be checked here.

dariofaccin · 2026-03-19T17:06:05Z

scripts/get_upstream_images/get_upstream_images.py

+def extract_images(version, skip_list=None):
+    if skip_list is None:
+        skip_list = []
+    version = validate_semantic_version(version)


suggestion(blocking): I would run the semver validation even before cloning the repo.
You could actually leverage the type keyword to perform the validation directly from the parser. If you raise an argparse.ArgumentTypeError you get also this nice output:

usage: get_upstream_images.py [-h] [--skip [SKIP ...]] [version] get_upstream_images.py: error: argument version: Invalid semantic version: 'invalid'

dariofaccin · 2026-03-19T17:07:45Z

scripts/get_upstream_images/get_upstream_images.py

+    "version",
+    nargs="?",
+    type=str,
+    default="latest",


suggestion(blocking): I would default to master.

Uhm, we can probably just make this argument required. The user should definitely know which images they want to fetch

deusebio

The script looks fine, but Dario provided sensible comments, especially to make it compatible with branches.

I'm only concerned that this may require also effort to make sure that it is in sync with upstream (or with a given branch). I'm actually wondering if we could:

Fetch the file from upstream (even just with getting the raw data from the given branch)
Import the custom mappings (e.g. wg_dirs) from the file to use them or as sanity check that we are still "compatible"

deusebio · 2026-03-19T17:34:19Z

scripts/get_upstream_images/get_upstream_images.py

+    "version",
+    nargs="?",
+    type=str,
+    default="latest",


Uhm, we can probably just make this argument required. The user should definitely know which images they want to fetch

mvlassis added 2 commits March 19, 2026 14:46

feat: Add script to get all upstream images

1a189b8

Supress output from git clone

b5996ea

mvlassis requested review from dariofaccin, deusebio and misohu and removed request for misohu March 19, 2026 12:59

dariofaccin requested changes Mar 19, 2026

View reviewed changes

deusebio requested changes Mar 19, 2026

View reviewed changes

Merge branch 'main' into kf-8550-gather-upstream-images

d20de01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Add script to get all upstream images#1418

feat: Add script to get all upstream images#1418
mvlassis wants to merge 3 commits intomainfrom
kf-8550-gather-upstream-images

mvlassis commented Mar 19, 2026 •

edited

Loading

Uh oh!

dariofaccin left a comment

Uh oh!

dariofaccin Mar 19, 2026

Uh oh!

dariofaccin Mar 19, 2026

Uh oh!

dariofaccin Mar 19, 2026

Uh oh!

dariofaccin Mar 19, 2026

Uh oh!

dariofaccin Mar 19, 2026

Uh oh!

deusebio Mar 19, 2026

Uh oh!

deusebio left a comment

Uh oh!

deusebio Mar 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

mvlassis commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dariofaccin left a comment

Choose a reason for hiding this comment

Uh oh!

dariofaccin Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

dariofaccin Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

dariofaccin Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

dariofaccin Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

dariofaccin Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

deusebio Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

deusebio left a comment

Choose a reason for hiding this comment

Uh oh!

deusebio Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

mvlassis commented Mar 19, 2026 •

edited

Loading