@chiragkyal
Contributor

@chiragkyal chiragkyal commented Oct 28, 2025

This PR introduces a new plugin for managing day-2 operators in OpenShift clusters using Operator Lifecycle Manager (OLM).

Commands

Core Operations:

/olm:search - Search and discover operators across all catalog sources
/olm:install - Install operators with auto-channel discovery and manual/automatic approval modes
/olm:list - View all installed operators with health status
/olm:status - Get detailed operator health, available updates, and troubleshooting information
/olm:uninstall - Safely uninstall operators with orphaned CR detection and optional CRD/namespace cleanup

Update Management:

/olm:upgrade - Update operators to latest version or switch channels
/olm:approve - Approve pending InstallPlans for manual approval workflows

Administration:

/olm:diagnose - Diagnose and fix common OLM issues (orphaned CRDs, stuck namespaces, failed installations)
/olm:catalog - Manage catalog sources (list, add, remove, refresh, status)

Key Features

  • Smart Defaults: Auto-discovers channels and catalog sources when not specified
  • Comprehensive Verification: Monitors installation progress and validates CSV/pod status
  • Health Monitoring: Aggregates data from Subscriptions, CSVs, InstallPlans, Deployments, and Pods
  • Safe Uninstallation: Multiple confirmations with clear warnings for destructive operations
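
As an illustration of the health-monitoring roll-up described above, here is a minimal Python sketch. It assumes the resources have already been fetched and parsed into plain dicts; the function name `summarize_operator_health` and the three verdict strings are hypothetical and are not taken from the plugin itself. Pods are left out for brevity. The status fields it reads (`status.phase` on the CSV, `status.state` on the Subscription, `spec.approved` on an InstallPlan, `status.availableReplicas` on a Deployment) are the standard ones.

```python
from typing import Any, Dict, List


def summarize_operator_health(
    subscription: Dict[str, Any],
    csv: Dict[str, Any],
    install_plans: List[Dict[str, Any]],
    deployments: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Roll several OLM resource states into one verdict (hypothetical helper)."""
    issues: List[str] = []

    # A CSV that has not reached "Succeeded" means the operator is not fully installed.
    csv_phase = csv.get("status", {}).get("phase", "Unknown")
    if csv_phase != "Succeeded":
        issues.append(f"CSV phase is {csv_phase}")

    # Every operator Deployment should have all requested replicas available.
    for dep in deployments:
        want = dep.get("spec", {}).get("replicas", 1)
        have = dep.get("status", {}).get("availableReplicas", 0)
        if have < want:
            name = dep.get("metadata", {}).get("name", "<unnamed>")
            issues.append(f"Deployment {name} has {have}/{want} replicas available")

    # Unapproved InstallPlans indicate a pending manual approval.
    pending = [
        ip.get("metadata", {}).get("name", "<unnamed>")
        for ip in install_plans
        if not ip.get("spec", {}).get("approved", False)
    ]
    sub_state = subscription.get("status", {}).get("state", "Unknown")
    update_available = sub_state == "UpgradePending" or bool(pending)

    if issues:
        verdict = "degraded"
    elif update_available:
        verdict = "healthy-update-available"
    else:
        verdict = "healthy"
    return {"verdict": verdict, "issues": issues, "pending_install_plans": pending}
```

A caller would feed this the parsed output of the relevant `oc get ... -o json` queries and print the verdict together with any listed issues.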

Example Workflow

/olm:search cert-manager
/olm:install openshift-cert-manager-operator
/olm:approve openshift-cert-manager-operator
/olm:status openshift-cert-manager-operator
/olm:upgrade openshift-cert-manager-operator
/olm:uninstall openshift-cert-manager-operator

Troubleshooting:

/olm:diagnose --cluster
/olm:diagnose problematic-operator --fix

@openshift-ci openshift-ci bot requested review from bentito and brandisher October 28, 2025 13:37
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 28, 2025
@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 28, 2025
@openshift-ci

openshift-ci bot commented Oct 28, 2025

Hi @chiragkyal. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 28, 2025
@stbenjam
Member

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 29, 2025
@grokspawn

To be clear, this works with OLMv0, not v1. IMHO it's fine for it to use the plain plugin path /olm, but it would be nice to enhance docs/files to clarify this.
OLMv1 has been GA with a limited featureset since OCP4.18, and has other features in tech-preview, behind feature gates. We're presently working on compatibility paths for catalog content, and the longer-term plan is for all operators to move to OLMv1 for $reasons. 😄

@chiragkyal
Contributor Author

To be clear, this works with OLMv0, not v1. IMHO it's fine for it to use the plain plugin path /olm, but it would be nice to enhance docs/files to clarify this. OLMv1 has been GA with a limited featureset since OCP4.18, and has other features in tech-preview, behind feature gates. We're presently working on compatibility paths for catalog content, and the longer-term plan is for all operators to move to OLMv1 for $reasons. 😄

@grokspawn Thanks a lot for your valuable feedback. I am aware that this plugin currently works only with OLMv0, and I plan to extend it in a follow-up PR to include v1 as well. How about adding a flag --v1=true against each of the commands to support it? I think having a common plugin /olm that can work for both v0 and v1 would be more convenient than having two separate plugins from UX point of view. WDYT?

@chiragkyal
Contributor Author

/cc @stbenjam

@openshift-ci openshift-ci bot requested a review from stbenjam October 29, 2025 11:45
@stbenjam
Member

stbenjam commented Oct 29, 2025

Is it possible to just have one OLM plugin that does v0 and v1?

We've got 3 contending PRs at the moment:

@stbenjam
Member

Looks like #54 is compatible and just needs rebasing after this one lands. And if there's anything novel in #76 they can incorporate it into this plugin.

This version looks good to me. I'm not super familiar with OLM, but the structure looks good. It would be good for someone with OLM experience to do the final LGTM.

/approve

@openshift-ci

openshift-ci bot commented Oct 29, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chiragkyal, stbenjam

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 29, 2025
@grokspawn

@jianzhangbjz @chiragkyal @stbenjam

It looks to me like we have some substantial overlap, and not so much duplication here.
#54 provides explicit v0 & v1 support in a healthy debugging-focused way
#70 provides implicit v0 day 2+ operations
#76 provides explicit v1 day 2+ operations

I think all three are viable, but we probably need to figure out how to reasonably combine them. Because v0 and v1 coexist on the same cluster (with limitations!) it's going to be necessary for folks to explicitly request an interaction with one side or the other.

(example limitation: installing the same thing twice for each of v0 + v1 can be bad. v0 first, with v1 later == happy, as v1 will refuse to break a single-owner rule. v1 first, with v0 later == boom, as v0 pretends k8s-native multi-tenancy is real)
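
The ordering rule in the example above can be written down as a small pre-flight check. This is only a sketch of the rule as stated in this comment, not of real OLM behavior in every case; the function name `preflight_install` and its return strings are hypothetical.

```python
from typing import Optional

VALID_VERSIONS = ("v0", "v1")


def preflight_install(existing_owner: Optional[str], requested: str) -> str:
    """Decide what to do before installing a package that may already be owned.

    existing_owner: which OLM generation already manages the package, or None.
    requested: which OLM generation the user asked to install with.
    """
    if requested not in VALID_VERSIONS:
        raise ValueError(f"unknown OLM version: {requested!r}")
    if existing_owner is not None and existing_owner not in VALID_VERSIONS:
        raise ValueError(f"unknown OLM version: {existing_owner!r}")

    if existing_owner is None:
        return "proceed"
    if existing_owner == requested:
        return "already-installed"
    if existing_owner == "v0" and requested == "v1":
        # v1 enforces the single-owner rule, so it refuses on its own.
        return "v1-will-refuse"
    # v1 first, v0 later: v0 does not check ownership, so the tool must stop it.
    return "block"
```

The asymmetric case is the last one: since v0 will not refuse by itself, a unified plugin would have to detect the existing v1 installation and stop before creating the Subscription.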

@grokspawn

How about adding a flag --v1=true against each of the commands to support it? I think having a common plugin /olm that can work for both v0 and v1 would be more convenient than having two separate plugins from UX point of view. WDYT?

I think some kind of flag will be necessary due to the need to be explicit about which we want to manipulate. We want very much to be in the business of moving people over from v0 sensibilities to v1 ones, so defaulting to 0 isn't my preferred solution. I'm totally onboard with the idea of unifying the plugins to provide a single logical OLM entrypoint.

I think my flag preference would be something between what Jian has (bare number) and what you propose (gnu-style, with a v0 default). I'm thinking something like

  • requires explicit selection (but maybe not per-command). Like maybe there's a /olm set-context [v0|v1] command with sticky effect, and all commands will barf out the current context selection in their preamble.
  • avoids a default. We've striven for support of the existing catalog content since we GA'd last Dec, and we hope to soon hit a tipping-point where enough content is supported in v1 that we can start to get it part of the default thinking of admins.
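
To make the two points above concrete, here is a rough Python sketch of a sticky, default-free context selection. Everything in it (`OlmSession`, `ContextNotSet`, the preamble format) is hypothetical and only models the idea; it says nothing about how, or whether, this can be wired into the plugin's command flow.

```python
from typing import Optional

VALID_VERSIONS = ("v0", "v1")


class ContextNotSet(RuntimeError):
    """Raised when a command runs before any OLM version was selected."""


class OlmSession:
    """Holds the explicitly selected OLM version for subsequent commands."""

    def __init__(self) -> None:
        # Deliberately no default: the user must choose v0 or v1.
        self._version: Optional[str] = None

    def set_context(self, version: str) -> None:
        if version not in VALID_VERSIONS:
            raise ValueError(f"expected one of {VALID_VERSIONS}, got {version!r}")
        self._version = version

    def current(self) -> str:
        if self._version is None:
            raise ContextNotSet("no OLM context selected; run set-context v0|v1 first")
        return self._version

    def preamble(self, command: str) -> str:
        # Every command echoes the active context before doing anything else.
        return f"[OLM context: {self.current()}] {command}"
```

With this shape, `OlmSession().preamble("list")` raises `ContextNotSet`, while after `set_context("v1")` the same call returns `[OLM context: v1] list`.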

WDYT?

@chiragkyal
Contributor Author

I think some kind of flag will be necessary due to the need to be explicit about which we want to manipulate. We want very much to be in the business of moving people over from v0 sensibilities to v1 ones, so defaulting to 0 isn't my preferred solution. I'm totally onboard with the idea of unifying the plugins to provide a single logical OLM entrypoint.

Thanks for your feedback. I will try to merge #76 into this common plugin with an explicit selection, in a follow-up PR.

I think my flag preference would be something between what Jian has (bare number) and what you propose (gnu-style, with a v0 default). I'm thinking something like

  • requires explicit selection (but maybe not per-command). Like maybe there's a /olm set-context [v0|v1] command with sticky effect, and all commands will barf out the current context selection in their preamble.

I like the idea of a one-time selection using something like /olm set-context [v0|v1]; however, I am not yet sure whether this is feasible in an agentic flow. This is something I will explore next.

  • avoids a default. We've striven for support of the existing catalog content since we GA'd last Dec, and we hope to soon hit a tipping-point where enough content is supported in v1 that we can start to get it part of the default thinking of admins.

Noted. I will make sure this suggestion is incorporated in my next PR.

@grokspawn

IMHO, I think first goals are to merge #54, then to rebase this on top of it to combine the capabilities (though v0/v1 selection by flag for some commands is a less-than-optimal outcome).
Then we can talk about how to smoosh this functionality with #76. I have 76 pretty much where I want it based on feature parity with this PR, so I'll kinda park it for now while we figure out the best way forward.

@chiragkyal
Contributor Author

IMHO, I think first goals are to merge #54, then to rebase this on top of it to combine the capabilities (though v0/v1 selection by flag for some commands is a less-than-optimal outcome). Then we can talk about how to smoosh this functionality with #76. I have 76 pretty much where I want it based on feature parity with this PR, so I'll kinda park it for now while we figure out the best way forward.

@grokspawn That sounds like the right approach to me.
@stbenjam Let's merge #54 first, then I will rebase this PR on top of it. Later, we will decide how to integrate with #76.

@stbenjam
Member

#54 is in, you'll need to rebase and run make update again.

@openshift-merge-robot
Contributor

PR needs rebase.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 29, 2025