validate fail for stranded channel entries #1750
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@           Coverage Diff           @@
##           master    #1750   +/-   ##
=======================================
  Coverage   55.26%   55.27%
=======================================
  Files         136      136
  Lines       15974    15976    +2
=======================================
+ Hits         8828     8830    +2
  Misses       5991     5991
  Partials     1155     1155
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: grokspawn
alpha/declcfg/declcfg_to_model.go
        if slices.Contains(entry.Skips, entry.Replaces) {
            return nil, fmt.Errorf("invalid package %q, channel %q: entry %q has identical replaces and skips: %q", c.Package, c.Name, entry.Name, entry.Replaces)
        }
    }
👍 Makes sense to me. My only concern is:
Did we check how many existing catalogs fail in this scenario?
We might need to create a script to validate them. What do we do if we find FBC catalogs that fail?
But maybe that needs to be handled outside of this PR.
/lgtm
We do see one instance in the operatorhubio catalog:
./operatorhubio/latest
FATA[0002] invalid package "grafana-operator", channel "v5": entry "grafana-operator.v5.10.0" has identical replaces and skips: "grafana-operator.v5.9.2"
let's
/hold
this until we can talk to some impacted folks and determine if this is a big enough problem to have to solve NOW.
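The narrow check under discussion can be reproduced in isolation. The following is a minimal sketch, not the code in alpha/declcfg/declcfg_to_model.go: the ChannelEntry type and the checkSkippedReplaces helper are hypothetical names, the guard on an empty replaces is an addition, and the skips list given to the grafana-operator entry is assumed from the error message above rather than taken from the catalog.

```go
package main

import (
	"fmt"
	"slices"
)

// ChannelEntry is a hypothetical stand-in for one olm.channel entry.
type ChannelEntry struct {
	Name     string
	Replaces string
	Skips    []string
}

// checkSkippedReplaces returns an error for the first entry whose replaces
// target also appears in its own skips list.
func checkSkippedReplaces(pkg, channel string, entries []ChannelEntry) error {
	for _, entry := range entries {
		if entry.Replaces == "" {
			continue // nothing to compare against
		}
		if slices.Contains(entry.Skips, entry.Replaces) {
			return fmt.Errorf("invalid package %q, channel %q: entry %q has identical replaces and skips: %q",
				pkg, channel, entry.Name, entry.Replaces)
		}
	}
	return nil
}

func main() {
	// Assumed shape of the operatorhubio entry, inferred from the FATA line.
	entries := []ChannelEntry{
		{
			Name:     "grafana-operator.v5.10.0",
			Replaces: "grafana-operator.v5.9.2",
			Skips:    []string{"grafana-operator.v5.9.2"},
		},
	}
	fmt.Println(checkSkippedReplaces("grafana-operator", "v5", entries))
}
```

Because this version only compares an entry against itself, it says nothing about whether other bundles lower in the channel remain reachable.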
This validation check seems to be very narrowly tailored to "can't both skip and replace the same thing in one entry", which is good! However, I think it very slightly misses the point and the broader problem.
This is totally fine in any OLMv1 context, but I'd argue that since it comes with migration side-effects for OLMv0, it's never OK. In general, we should not have these kinds of surprises, and I think it's reasonable to enforce the most restrictive case here (because it's easier to grow more permissive than more restrictive).
That's a specific flavor of this more general issue. But I'd argue that it is also resolved by preventing the more general issue.
    "name": "clusterwide-alpha",
    "entries": [
      {"name": "etcdoperator.v0.9.0"},
      {"name": "etcdoperator.v0.9.2-clusterwide", "replaces": "etcdoperator.v0.9.0", "skips": ["etcdoperator.v0.6.1","etcdoperator.v0.9.0"], "skipRange": ">=0.9.0 <=0.9.1"},
Is this change related to the model validation change somehow? It seems unrelated to me at first glance.
This removes a skips from the slice where it duplicates the replaces edge.
It was needed for the previous commit, and I haven't yet checked to see if the existing catalogs impact is different with the new commit.
Checked the catalogs impact and it looks the same as before. HOWEVER, we no longer need this test change, because somehow it's OK for 0.9.2-clusterwide to skip AND replace v0.9.0 ...?
9acb503
to
7607718
I've tested this PR using a synthetic catalog with the following structure to explicitly trigger the
📦 Catalog Structure
All files were placed under a directory named
Signed-off-by: grokspawn <[email protected]>
7607718
to
5bd337c
Hey @bandrade I'll need to update the PR description, because the new commit changed the functionality: it no longer merely refuses skipped-replaces, but actually considers whether a skipped replaces strands bundles across the replaces chain. The original example essentially ignores ALL lower bundle versions, so the new check does not identify it as a failure. For example, this modification to your channel.yaml results in a failure:
schema: olm.channel
name: stable-v1
package: test-operator
entries:
  - name: test-operator-v1.0.0 # stranded because of skip on v1.1.4
  - name: test-operator-v1.1.0 # stranded because of skip on v1.1.4
    replaces: test-operator-v1.0.0
  - name: test-operator-v1.1.2
    replaces: test-operator-v1.1.0
  - name: test-operator-v1.1.4
    replaces: test-operator-v1.1.2
    skips:
      - test-operator-v1.1.2
  - name: test-operator-v1.2.0
  - name: test-operator-v1.2.1
    replaces: test-operator-v1.1.4
    skips:
      - test-operator-v1.1.4
      - test-operator-v1.2.0
  - name: test-operator-v1.3.0
    replaces: test-operator-v1.2.1
    skips:
      - test-operator-v1.2.1
  - name: test-operator-v1.4.0
    replaces: test-operator-v1.3.0
    skips:
      - test-operator-v1.3.0
results in the message
/hold cancel
There is a coverage gap in this PR, in that if there are no non-skipped edges below a skipped edge, then it cannot identify stranded edges.
Thanks for the clarification and updated logic — I just reproduced the new stranded bundle detection using the modified I created a synthetic catalog with the following
/lgtm
/label qe-verified
/label qe-approved
3f87bea
into
operator-framework:master
It looks like thanks to this change, the |
Assessment of existing, known catalogs was tabulated here: https://docs.google.com/spreadsheets/d/1ngHlFDOflLkpzf7Fd3_AAet_wqx65vjV8PkpHPonO2w/ ("[old] Summary" tab). For community-operators, @mantomas it appears that the infinispan-operator has unhealthy graph edges which are identified with this change. operatorhubio catalog has failures with grafana-operator contributions. I'd suggest touching base with the maintainers there to get them to make updates.
Description of the change:
opm validate fails when an edge is stranded because the replaces chain from the head is broken by skipped edges.
Motivation for the change:
Due to OLMv0 graph mechanics, any skips edge will cause OLMv0 to ignore the bundle version when considering upgrades (since v0 discards graph contribution from skipped bundle versions).
Since the purpose of a replaces edge is to enable upgrade mobility across a graph, allowing the bundle version to be ignored (due to the skips entry) is an error, and potentially results in stranding.
For example, take input olm.channel:
Using a new version of opm which can optionally display OLMv0 graph semantics, skipped objects are limned in red and ignored edges are red dashed arrows to help visualize the stranded edges.
Reviewer Checklist
/docs