-
Notifications
You must be signed in to change notification settings - Fork 0
ROX-30064: Script to generate catalog-template.yaml #137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
ROX-30064: Script to generate catalog-template.yaml #137
Conversation
This comment was marked as off-topic.
This comment was marked as off-topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scratched the surface a bit. Got enough comments for the first round of review.
Please expect more review rounds and more comments.
Co-authored-by: Misha Sugakov <[email protected]>
…https://github.com/stackrox/operator-index into akurlov/add-catalog-generation-from-folder-structure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments on inter-diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple more thoughts for today. Will make another pass after you update the PR.
cmd/generate-catalog/generate.go
Outdated
validForChannel := channel.YStreamVersion != nil && | ||
entry.Version.Major() == channel.YStreamVersion.Major() && | ||
entry.Version.Minor() <= channel.YStreamVersion.Minor() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This must be consistent with what addReplaces
does (ref. versionsWithoutReplaces
). Right now it is not if you consider what happens when we'd introduce 5.0.0
. Do you have ideas how to make these two parts more consistent and at least make sure that when we change one we don't forget the other?
I at this point don't have a suggestion. Need to think about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't able to find a nice solution, probably will come with some idea tomorrow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add 5.0.0 to versionsWithoutReplaces
that means that we want to introduce yet another non versioned channel (like stable
and latest
). So it's hard to implement a nice solution unless we add a new configuration to bundles.yaml (e.g.:
...
pivot_channels:
- name: latest
start_version: 3.62.0
- name: stable
start_version: 4.0.0
- name: new
start_version: 5.0.0
...
I think it's fine to keep the same deprecation messages for deprecated non versioned (pivotal) channels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe there are plans to do that: introduce another rolling channel for 5.0.0+. stable
will remain and will get new versions.
- image: registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:7fd7595e6a61352088f9a3a345be03a6c0b9caa0bbc5ddd8c61ba1d38b2c3b8e | ||
version: 3.62.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This remains stuck in my head and I keep thinking we need it. Right now, we have no validation that version
is really the correct one for the image
. We can check that by pulling the image's manifest and inspecting its labels. One of them should be version
.
I think, we need such a check in CI. If you agree, could you please create a follow-up ticket?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it will protect from one scenario:
if all versions and images are correct but shuffled (e.g. 4.8.2 image tagged with version 4.8.1 and vise versa). This scenario will not break output catalogs though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it's still worth to introduce such check? It will significantly increase script execution time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about the case when I do
- image: registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:<wrong-digest>
version: <correct-version>
If this image really exists, I'm not sure if there's any subsequent check that catches that.
Also, and may be even worse is the situation when we'd test wrong image before the release:
- image: quay.io/rhacs-eng/release-operator-bundle@sha256:<wrong-digest>
version: <correct-version>
Here I may test the wrong thing and conclude it's ready to be released whereas the right thing may not be ready.
Yes, the check will delay the execution but it does not have to be invoked as part of the same generation command, or at least not by default. It can be set to only run in CI.
Besides, we don't need to pull image blobs. Pulling just the manifests should be enough. If we parallelize calls, may be it won't take more than a couple of minutes to gather all images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If digest is wrong and image doesn't exist then valid-catalogs
will fail with:
registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:94a00394bc5a8ef503fb59db0a7d0ae9e1110866e8aee8ba40cd864cea69ea1a: not found
If image is not from registry.redhat.io/advanced-cluster-security
then our CI check-operator-images-are-only-released-images
will fail.
If image exists and it's from registry.redhat.io/advanced-cluster-security
but its image version is not presented in the catalog then valid-catalogs
will fail. I tested it locally with removing 4.8.2 version and using 4.8.2 image for 4.8.1 and I got this error:
package "rhacs-operator", bundle "rhacs-operator.v4.8.2" not found in any channel entries
There is no 4.8.2 version in the bundles.yaml but I used 4.8.2 image for 4.8.1 version
I can add a TODO, I just want to highlight that pulling images and checking them will only protect us from shuffling versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@porridge and I today briefly discussed https://issues.redhat.com/browse/ROX-30604 which is about the same thing. We should discuss it when all three of us are online.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just want to highlight that pulling images and checking them will only protect us from shuffling versions.
The code currently uses images and versions independently. There's no actual problem (currently) in case versions and images are shuffled.
It seems opm
will fail in a couple of cases such as referencing the same image twice or the test you did with 4.8.1 and 4.8.2.
opm
error messages aren't the most readable ones though.
If we don't implement the check we're at mercy of opm
and have to hope it will keep failing where we need it to fail.
I'm getting more inclined to agree that we shouldn't implement the check, but like I wrote above let's discuss it together.
Co-authored-by: Misha Sugakov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick incomplete check.
cmd/generate-catalog/generate.go
Outdated
if err != nil { | ||
return fmt.Errorf("failed to read bundle list file: %v", err) | ||
} | ||
versions := getAllVersions(input.Images) | ||
versions := getAllVersions(config.Images) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't understand your reply.
If it is more convenient for some functions to work with []*semver.Version
instead of []InputBundleImage
(and I agree it is), provide them with []*semver.Version
.
My concern is that the line versions := getAllVersions(config.Images)
is quite low-level in this high-level generateCatalogTemplateFile()
function so that it makes me feel uncomfortable about it.
My suggestion is to push it lower in the stack: either to declare Versions []*semver.Version
attribute as part of the Input
struct and populate that in the readInputFile()
call, or declare func (i *Input) Versions() []*semver.Version { ... }
to construct that slice dynamically.
Whichever way, the line versions := getAllVersions(config.Images)
will disappear from generateCatalogTemplateFile()
and that's what I advocate for.
- image: registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:7fd7595e6a61352088f9a3a345be03a6c0b9caa0bbc5ddd8c61ba1d38b2c3b8e | ||
version: 3.62.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about the case when I do
- image: registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:<wrong-digest>
version: <correct-version>
If this image really exists, I'm not sure if there's any subsequent check that catches that.
Also, and may be even worse is the situation when we'd test wrong image before the release:
- image: quay.io/rhacs-eng/release-operator-bundle@sha256:<wrong-digest>
version: <correct-version>
Here I may test the wrong thing and conclude it's ready to be released whereas the right thing may not be ready.
Yes, the check will delay the execution but it does not have to be invoked as part of the same generation command, or at least not by default. It can be set to only run in CI.
Besides, we don't need to pull image blobs. Pulling just the manifests should be enough. If we parallelize calls, may be it won't take more than a couple of minutes to gather all images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made another incomplete pass. I still skip unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some things I spotted in the inter-diff
// generatePackageWithIcon creates a new "olm.package" object with an operator icon. | ||
func generatePackageWithIcon() (Package, error) { | ||
data, err := os.ReadFile(iconFile) | ||
if err != nil { | ||
return Package{}, fmt.Errorf("failed to read %s: %v", iconFile, err) | ||
} | ||
iconBase64 := base64.StdEncoding.EncodeToString(data) | ||
|
||
return Package{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is not what I meant in #137 (comment)
I meant that you may have a
func newPackage(iconBase64) {
// doing this and only this
return Package{
Schema: olmPackageSchema,
Name: rhacsOperator,
DefaultChannel: stableChannelName,
Icon: Icon{
Base64data: iconBase64,
MediaType: "image/png",
},
}
}
Reading icon file in types.go
feels wrong.
I suggest to keep generatePackageWithIcon()
in generate.go
but introduce newPackage()
in types.go
.
for i, channel := range channels { | ||
for _, entry := range channelEntries { | ||
if channelShouldHaveEntry(channel, entry) { | ||
channels[i].Entries = append(channels[i].Entries, entry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a mutation of the original channel list.
I can't spot it. Please show it to me.
I see a mutation of .Entries
in blah.Entries
; blah
stays the same, doesn't it?
func (e *ChannelEntry) addSkipRange(version, previousYStreamVersion *semver.Version) { | ||
skipRangeFrom := previousYStreamVersion | ||
skipRangeTo := version | ||
e.SkipRange = fmt.Sprintf(">= %s < %s", skipRangeFrom, skipRangeTo) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably there's no need to alias variables any more
func (e *ChannelEntry) addSkipRange(version, previousYStreamVersion *semver.Version) { | |
skipRangeFrom := previousYStreamVersion | |
skipRangeTo := version | |
e.SkipRange = fmt.Sprintf(">= %s < %s", skipRangeFrom, skipRangeTo) | |
} | |
func (e *ChannelEntry) addSkipRange(skipRangeFrom, skipRangeTo *semver.Version) { | |
e.SkipRange = fmt.Sprintf(">= %s < %s", skipRangeFrom, skipRangeTo) | |
} |
IMPORTANT: I swapped the order of arguments compared to the original one. You also need to change the order in the calls to addSkipRange()
, I'll make a suggestion.
version: version, | ||
} | ||
entry.addReplaces(version, previousEntryVersion) | ||
entry.addSkipRange(version, previousYStreamVersion) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
entry.addSkipRange(version, previousYStreamVersion) | |
entry.addSkipRange(previousYStreamVersion, version) |
func TestValidateImageReferences(t *testing.T) { | ||
tests := []struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can suggest a couple more cases. Just providing the references to test:
bare-image-name-without-registry@sha256:6cdcf20771f9c46640b466f804190d00eaf2e59caee6d420436e78b283d177bf
example.com/image:my-fancy-tag-v1@sha256:6cdcf20771f9c46640b466f804190d00eaf2e59caee6d420436e78b283d177bf
example.com/image@md5:9241e37fcf7f3f88c5e944bd46b0a268
example.com/image@sha256:0123456789
if currentVersion.Major() != nextVersion.Major() { | ||
if currentVersion.Major()+1 != nextVersion.Major() || nextVersion.Minor() != 0 || nextVersion.Patch() != 0 { | ||
return fmt.Errorf("invalid version %s: a new major version should start from %d.0.0", nextVersion, currentVersion.Major()+1) | ||
} | ||
} | ||
|
||
if currentVersion.Major() == nextVersion.Major() && currentVersion.Minor() != nextVersion.Minor() { | ||
if currentVersion.Minor()+1 != nextVersion.Minor() || nextVersion.Patch() != 0 { | ||
return fmt.Errorf("invalid version %s: a new minor version should be %s", currentVersion, nextVersion) | ||
} | ||
} | ||
|
||
if currentVersion.Major() == nextVersion.Major() && currentVersion.Minor() == nextVersion.Minor() { | ||
if nextVersion.Patch() != currentVersion.Patch()+1 { | ||
return fmt.Errorf("invalid version %s: is not followed by %s", currentVersion, nextVersion) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's quite a bit of repetition and error messages don't seem the most optimal to me.
How about something like this?
if currentVersion.Major() != nextVersion.Major() { | |
if currentVersion.Major()+1 != nextVersion.Major() || nextVersion.Minor() != 0 || nextVersion.Patch() != 0 { | |
return fmt.Errorf("invalid version %s: a new major version should start from %d.0.0", nextVersion, currentVersion.Major()+1) | |
} | |
} | |
if currentVersion.Major() == nextVersion.Major() && currentVersion.Minor() != nextVersion.Minor() { | |
if currentVersion.Minor()+1 != nextVersion.Minor() || nextVersion.Patch() != 0 { | |
return fmt.Errorf("invalid version %s: a new minor version should be %s", currentVersion, nextVersion) | |
} | |
} | |
if currentVersion.Major() == nextVersion.Major() && currentVersion.Minor() == nextVersion.Minor() { | |
if nextVersion.Patch() != currentVersion.Patch()+1 { | |
return fmt.Errorf("invalid version %s: is not followed by %s", currentVersion, nextVersion) | |
} | |
} | |
if currentVersion.Major() != nextVersion.Major() { | |
expectedNextVersion = semver.New(currentVersion.Major()+1, 0, 0, "", "") | |
} | |
if currentVersion.Major() == nextVersion.Major() && currentVersion.Minor() != nextVersion.Minor() { | |
expectedNextVersion = semver.New(currentVersion.Major(), currentVersion.Minor()+1, 0, "", "") | |
} | |
if currentVersion.Major() == nextVersion.Major() && currentVersion.Minor() == nextVersion.Minor() { | |
expectedNextVersion = semver.New(currentVersion.Major(), currentVersion.Minor(), currentVersion.Patch()+1, "", "") | |
} | |
if expectedNextVersion.Major() != nextVersion.Major() || expectedNextVersion.Minor() != nextVersion.Minor() || expectedNextVersion.Patch() != nextVersion.Patch() { | |
return fmt.Errorf("unexpected version sequence [%s, %s]: %s should be followed by %s", currentVersion, nextVersion, currentVersion, expectedNextVersion) | |
} |
if err = hasGapInVersions(versions); err != nil { | ||
return Configuration{}, fmt.Errorf("version gap detected: %v", err) | ||
} | ||
if err = validateBrokenVersions(brokens, versions); err != nil { | ||
return Configuration{}, fmt.Errorf("invalid broken versions: %v", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both hasGapInVersions
and validateBrokenVersions
seem to already produce meaningful error messages. I suggest just passing them through.
if err = hasGapInVersions(versions); err != nil { | |
return Configuration{}, fmt.Errorf("version gap detected: %v", err) | |
} | |
if err = validateBrokenVersions(brokens, versions); err != nil { | |
return Configuration{}, fmt.Errorf("invalid broken versions: %v", err) | |
} | |
if err = hasGapInVersions(versions); err != nil { | |
return Configuration{}, err | |
} | |
if err = validateBrokenVersions(brokens, versions); err != nil { | |
return Configuration{}, err | |
} |
|
||
for _, tt := range tests { | ||
t.Run(tt.name, func(t *testing.T) { | ||
err := validateImageReferences(tt.images) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll likely have less test setup if you'd tested validateImageReference()
instead of validateImageReferences()
. Feel free to resolve this comment as no-op.
version: 4.3.6 | ||
- image: registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:1e8115bdc0ed2d01d7576524e9796606244963a1447eae55872674c0113fb9c4 | ||
version : 4.3.7 | ||
- image: registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:30f9384244819da013358325cec364bb210b3b33b4cf9680698531fe50562aac | ||
version : 4.3.8 | ||
- image: registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:94fcee9ac22671bd18be77381214665d9289151703a9ed78c29cee02b92612f4 | ||
version : 4.4.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the deal about the whitespace between version
and its colon :
- version :
instead of version:
?
- image: registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:7fd7595e6a61352088f9a3a345be03a6c0b9caa0bbc5ddd8c61ba1d38b2c3b8e | ||
version: 3.62.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just want to highlight that pulling images and checking them will only protect us from shuffling versions.
The code currently uses images and versions independently. There's no actual problem (currently) in case versions and images are shuffled.
It seems opm
will fail in a couple of cases such as referencing the same image twice or the test you did with 4.8.1 and 4.8.2.
opm
error messages aren't the most readable ones though.
If we don't implement the check we're at mercy of opm
and have to hope it will keep failing where we need it to fail.
I'm getting more inclined to agree that we shouldn't implement the check, but like I wrote above let's discuss it together.
catalog-template.yaml
from a newbundles.yaml
file.bundles.yaml
has a lists operator bundle images with versions, theoldest_supported_version
and a list ofbroken_versions
.oldest_supported_version
specifies what is the lowest supported version. Any version or channel beforeoldest_supported_version
will be marked as deprecated.broken_versions
a list of versions which should be skipped. For each broken version X.Y.Z the script adds "skips" for all versions > X.Y.Z and < X.Y+2.0checks
which validates thatcatalog-template.yaml
is up-to-date withbundles.yaml
cmd
folder is changedcatalog-template.yaml changes:
schema: olm.bundle
deprecation references for all version <oldest_supported_version
. So not only channels are deprecated.rhacs-3.63
channelskips
is only added for broken version + 2 minor versionlatest
channel has all 3.X.X versionsstable
channel has all 4.X.X versionsTesting this PR catalogs on OCP 4.19:
image:
quay.io/rhacs-eng/stackrox-operator-index@sha256:9345ef4e16b463205ab9dcf4446c5a34babc8c497ad9cbbeae327fb44b2f356d
commit: ca7b4bd
Now not only channels but versions are also deprecated.
Test locally: