Conversation
dariofaccin
left a comment
There was a problem hiding this comment.
I left some improvements. Note that upstream CI is utterly broken even on master and has probably never worked/won't work on any tag.
| From inside this directory: | ||
|
|
||
| ```shell | ||
| python3 extract_images.py |
There was a problem hiding this comment.
| python3 extract_images.py | |
| python3 get_upstream_images.py |
suggestion(blocking): probably a typo but the script name doesn't match
| try: | ||
| clone_cmd = ["git", "clone", "--depth", "1"] | ||
| if version != "latest": | ||
| tag = f"v{version}" |
There was a problem hiding this comment.
suggestion(blocking): do not hardcode the v prefix. I tried with the 26.03-rc.1 tag and the script failed:
python3 get_upstream_images.py 26.03-rc.1
Cloning repository: https://github.com/kubeflow/manifests.git
ERROR: Failed to clone repository. Details: Command '['git', 'clone', '--depth', '1', '--branch', 'v26.03-rc.1', 'https://github.com/kubeflow/manifests.git', 'manifests']' returned non-zero exit status 128.
I would also suggest not to limit the script to tags, but also branches. The difference is negligible but it's more flexible.
|
|
||
| def validate_semantic_version(version): | ||
| # Validates a semantic version string (e.g., "0.1.2" or "latest"). | ||
| regex = r"^[0-9]+\.[0-9]+\.[0-9]+$" |
There was a problem hiding this comment.
suggestion(blocking): this regex doesn't properly validate a semantic version. For instance, it failed validating both 1.11.0-rc.1 and 26.03-rc.1. I put the regex into a validator and also 26.03 wouldn't match.
Starting from the suggested SemVer regex, I built one which allows to have only major.minor groups and also the prefix v:
^v?(?P<major>0|[1-9]\d*)\.(?P<minor>|[0-9]\d*)\.?(?P<patch>|[0-9]\d*)(?:-(?P<prerelease>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+(?P<buildmetadata>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
It's fairly complex and it can be checked here.
| def extract_images(version, skip_list=None): | ||
| if skip_list is None: | ||
| skip_list = [] | ||
| version = validate_semantic_version(version) |
There was a problem hiding this comment.
suggestion(blocking): I would run the semver validation even before cloning the repo.
You could actually leverage the type keyword to perform the validation directly from the parser. If you raise an argparse.ArgumentTypeError you get also this nice output:
usage: get_upstream_images.py [-h] [--skip [SKIP ...]] [version]
get_upstream_images.py: error: argument version: Invalid semantic version: 'invalid'
| "version", | ||
| nargs="?", | ||
| type=str, | ||
| default="latest", |
There was a problem hiding this comment.
suggestion(blocking): I would default to master.
There was a problem hiding this comment.
Uhm, we can probably just make this argument required. The user should definitely know which images they want to fetch
deusebio
left a comment
There was a problem hiding this comment.
The script looks fine, but Dario provided sensible comments, especially to make it compatible with branches.
I'm only concerned that this may require also effort to make sure that it is in sync with upstream (or with a given branch). I'm actually wondering if we could:
- Fetch the file from upstream (even just with getting the raw data from the given branch)
- Import the custom mappings (e.g.
wg_dirs) from the file to use them or as sanity check that we are still "compatible"
| "version", | ||
| nargs="?", | ||
| type=str, | ||
| default="latest", |
There was a problem hiding this comment.
Uhm, we can probably just make this argument required. The user should definitely know which images they want to fetch
Closes #1417.
This PR adds a directory under
scriptswith a new Python script to gather all images used in a specified release of upstream Kubeflow.The script is heavily based on https://github.com/kubeflow/manifests/blob/0837fb9cf3ec73f51cbddff656a160cb258eaad5/tests/trivy_scan.py. I tried to introduce as few changes as possible:
kubeflow/manifestsrepository and operate from there--skipargument to skip specific working groups. This is useful since we probably want to skip controllers that we do not deploy as charmed operators.I would probably do more reformatting of the script, but I suggest we keep the changes as minimal as possible to not deviate much.
I also added a
README.mdwith instructions on how to use the script.