MON-4479: use TLS profile to configure CMO server#2841
MON-4479: use TLS profile to configure CMO server#2841simonpasquier wants to merge 3 commits intoopenshift:mainfrom
Conversation
3d4df6a to
bf0d422
Compare
|
@simonpasquier: This pull request references MON-4479 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughMain builds a wired operator client, reads the APIServer TLS config, passes TLS settings into server.NewServer, calls Server.Prepare(ctx) before Run, and injects the APIServer config plus a cancel callback into the operator which watches APIServer/config and related resources to trigger graceful restart on changes. Changes
Sequence Diagram(s)sequenceDiagram
actor Main as Main
participant Client as OperatorClient
participant APIServer as OpenShiftAPIServer
participant Server as OperatorServer
participant Operator as CMOOperator
participant Watcher as Informers/Watchers
Main->>Client: newClient(ctx, restConfig, ...)
Client->>APIServer: create clients & recorder
Main->>Client: GetAPIServerConfig(ctx)
APIServer-->>Client: APIServer object
Client-->>Main: *client.Client / APIServerConfig
Main->>Server: NewServer(..., minTLSVersion, tlsCiphers)
Main->>Server: Prepare(ctx)
Server-->>Main: prepared srv stored
Main->>Operator: cmo.New(client, apiServerConfig, cancel)
Operator->>Watcher: start informers/watch(APIServer, ConfigMaps, Secrets, ...)
Main->>Operator: Run(ctx)
Watcher->>Operator: APIServer change event
Operator->>Operator: compare via APIServerConfig.Equal()
alt config changed
Operator->>Main: call cancel()
Main->>Operator: graceful restart/shutdown
end
Main->>Server: Run(ctx)
Server->>APIServer: serve with configured TLS settings
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)level=error msg="Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: inconsistent vendoring in :\n\tgithub.com/Jeffail/gabs/v2@v2.6.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/alecthomas/units@v0.0.0-20240927000941-0f3dac36c52b: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/blang/semver/v4@v4.0.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/ghodss/yaml@v1.0.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/go-openapi/strfmt@v0.24.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/google/uuid@v1.6.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/imdario/mergo@v0.3.16: is explicitly ... [truncated 21195 characters] ... les.txt\n\tsigs.k8s.io/apiserver-network-proxy/konnectivity-client@v0.31.2: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/kube-storage-version-migrator@v0.0.6-0.20230721195810-5c8923c5ff96: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/randfill@v1.0.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/structured-merge-diff/v6@v6.3.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/onsi/ginkgo/v2: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\n\tTo ignore the vendor directory, use -mod=readonly or -mod=mod.\n\tTo sync the vendor directory, run:\n\t\tgo mod vendor\n" Comment |
|
@simonpasquier: This pull request references MON-4479 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@simonpasquier: This pull request references MON-4479 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/server/server.go (1)
52-65:⚠️ Potential issue | 🔴 CriticalCritical: TLS configuration parameters are not assigned to the struct.
The
minTLSVersionandcipherSuitesparameters are accepted byNewServerbut never stored in the returnedServerstruct. This means the TLS configuration will be empty whenPrepare()usess.minTLSVersionands.cipherSuites.🐛 Proposed fix
func NewServer(name string, config *rest.Config, kubeConfig, certFile, keyFile string, minTLSVersion string, cipherSuites []string) (*Server, error) { kubeClient, err := kubernetes.NewForConfig(config) if err != nil { return nil, err } return &Server{ - name: name, - kubeClient: kubeClient, - kubeConfig: kubeConfig, - certFile: certFile, - keyFile: keyFile, + name: name, + kubeClient: kubeClient, + kubeConfig: kubeConfig, + certFile: certFile, + keyFile: keyFile, + minTLSVersion: minTLSVersion, + cipherSuites: cipherSuites, }, nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/server/server.go` around lines 52 - 65, NewServer currently accepts minTLSVersion and cipherSuites but does not assign them to the Server it returns, so Prepare() will see empty s.minTLSVersion and s.cipherSuites; update NewServer to set the Server fields (e.g., minTLSVersion: minTLSVersion, cipherSuites: cipherSuites) when constructing the &Server{}, and if those fields do not exist on the Server struct add them (matching types used by Prepare()), ensuring Prepare() uses the populated s.minTLSVersion and s.cipherSuites.
🧹 Nitpick comments (2)
pkg/client/client.go (1)
1967-1973: Consider consolidating duplicate accessor methods.There are now two methods returning the same field:
EventRecorder()(Line 279)EventsRecorder()(Line 1971)Both return
c.eventRecorder. This creates potential confusion for maintainers. Consider using a single consistent name or documenting why both are needed.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/client/client.go` around lines 1967 - 1973, There are duplicate accessors returning the same field c.eventRecorder (EventRecorder and EventsRecorder); consolidate to a single canonical method name (pick one, e.g., EventRecorder) by removing the redundant method (or make it a thin deprecated wrapper that calls the canonical method), update all call sites to use the canonical name, and add a small comment on the remaining method explaining its purpose; ensure the unique symbols EventRecorder, EventsRecorder, and c.eventRecorder are used to locate and update the code and callers.pkg/server/server.go (1)
135-142: Verify Prepare() is called before Run().
Run()now assumess.srvis initialized by a prior call toPrepare(). IfRun()is called withoutPrepare(), this will panic with a nil pointer dereference ons.srv.PrepareRun().Consider adding a guard or documenting this requirement clearly.
🛡️ Proposed defensive check
func (s *Server) Run(ctx context.Context) error { + if s.srv == nil { + return fmt.Errorf("server not prepared: call Prepare() before Run()") + } if err := s.srv.PrepareRun().RunWithContext(ctx); ctx.Err() == nil { return err } klog.Info("server exited") return nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/server/server.go` around lines 135 - 142, The Run method assumes s.srv is initialized and will panic if Prepare() was not called; add a defensive guard at the start of Server.Run that verifies s.srv is non-nil (or call s.Prepare() if that is the intended behavior) before invoking s.srv.PrepareRun(), and return a clear error if initialization is missing; update Server.Run to check s.srv (and/or s.srv.PrepareRun() availability) and return a descriptive error instead of dereferencing a nil pointer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/operator/main.go`:
- Around line 270-286: The Server returned by NewServer is missing assignment of
the TLS fields so Prepare() sees empty values; update the struct literal
returned by NewServer (the &Server{...} in server.NewServer) to include
minTLSVersion: minTLSVersion and cipherSuites: cipherSuites (matching the
function parameters) so the Server struct's fields are populated before
Prepare() is called.
In `@pkg/manifests/apiserver_config.go`:
- Around line 42-47: APIServerConfig.Equal currently calls methods on the
receiver and on the other pointer without nil checks and can panic; update
APIServerConfig.Equal to handle nil safely by returning true if both receiver
and other are nil, false if only one is nil, and otherwise comparing
c.MinTLSVersion() and slices.Equal(c.TLSCiphers(), other.TLSCiphers()) as
before; reference the Equal method on APIServerConfig and the helper methods
MinTLSVersion and TLSCiphers when locating the change.
In `@pkg/operator/operator.go`:
- Around line 632-640: The handler closes o.done directly which can panic if
called twice; add a sync.Once field (e.g., doneOnce) to the Operator struct and
replace close(o.done) with doneOnce.Do(func() { close(o.done) }) inside the
event handling path (the switch handling *configv1.APIServer in the method that
closes o.done) so the channel is closed exactly once.
---
Outside diff comments:
In `@pkg/server/server.go`:
- Around line 52-65: NewServer currently accepts minTLSVersion and cipherSuites
but does not assign them to the Server it returns, so Prepare() will see empty
s.minTLSVersion and s.cipherSuites; update NewServer to set the Server fields
(e.g., minTLSVersion: minTLSVersion, cipherSuites: cipherSuites) when
constructing the &Server{}, and if those fields do not exist on the Server
struct add them (matching types used by Prepare()), ensuring Prepare() uses the
populated s.minTLSVersion and s.cipherSuites.
---
Nitpick comments:
In `@pkg/client/client.go`:
- Around line 1967-1973: There are duplicate accessors returning the same field
c.eventRecorder (EventRecorder and EventsRecorder); consolidate to a single
canonical method name (pick one, e.g., EventRecorder) by removing the redundant
method (or make it a thin deprecated wrapper that calls the canonical method),
update all call sites to use the canonical name, and add a small comment on the
remaining method explaining its purpose; ensure the unique symbols
EventRecorder, EventsRecorder, and c.eventRecorder are used to locate and update
the code and callers.
In `@pkg/server/server.go`:
- Around line 135-142: The Run method assumes s.srv is initialized and will
panic if Prepare() was not called; add a defensive guard at the start of
Server.Run that verifies s.srv is non-nil (or call s.Prepare() if that is the
intended behavior) before invoking s.srv.PrepareRun(), and return a clear error
if initialization is missing; update Server.Run to check s.srv (and/or
s.srv.PrepareRun() availability) and return a descriptive error instead of
dereferencing a nil pointer.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4184a2ff-26a7-4dbc-bdcf-0a7acbd40b4a
📒 Files selected for processing (5)
cmd/operator/main.gopkg/client/client.gopkg/manifests/apiserver_config.gopkg/operator/operator.gopkg/server/server.go
6bbd685 to
d482ba8
Compare
|
@simonpasquier: This pull request references MON-4479 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/operator/operator.go (1)
351-355:⚠️ Potential issue | 🟠 MajorHandle the initial
APIServerobject too.CMO snapshots the TLS profile in
cmd/operator/main.gobefore this informer starts, but this handler only reacts to later updates. If the profile changes in that startup window, the process can keep the stale TLS settings until anotherAPIServerevent arrives.🔧 Suggested fix
_, err = informer.AddEventHandler(cache.ResourceEventHandlerFuncs{ + AddFunc: o.handleEvent, UpdateFunc: func(_, newObj interface{}) { o.handleEvent(newObj) }, })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/operator.go` around lines 351 - 355, The handler currently only registers an UpdateFunc calling o.handleEvent, so the initial APIServer snapshot can be missed; ensure the initial APIServer is processed at startup by either adding an AddFunc to the cache.ResourceEventHandlerFuncs that calls o.handleEvent for created objects or, after the informer is synced, retrieve the current APIServer from the informer's store/lister and call o.handleEvent on that object (use the same informer variable and o.handleEvent to locate where to invoke this). Ensure this runs once after informer sync so startup TLS profile changes are handled.
🧹 Nitpick comments (1)
pkg/client/client.go (1)
1971-1973: Avoid a second accessor for the same recorder.
EventRecorder()already exposesc.eventRecorder, soEventsRecorder()just adds another public spelling for the same dependency. Reusing the existing method keeps the client surface smaller and avoids singular/plural drift.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/client/client.go` around lines 1971 - 1973, Remove the duplicate accessor EventsRecorder and consolidate on the existing EventRecorder method: delete the EventsRecorder function from type Client, then update any callers that reference EventsRecorder to call EventRecorder instead (preserve the same return type events.Recorder). Ensure imports/tests/usage compile after replacing EventsRecorder with EventRecorder and run tests to verify.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@pkg/operator/operator.go`:
- Around line 351-355: The handler currently only registers an UpdateFunc
calling o.handleEvent, so the initial APIServer snapshot can be missed; ensure
the initial APIServer is processed at startup by either adding an AddFunc to the
cache.ResourceEventHandlerFuncs that calls o.handleEvent for created objects or,
after the informer is synced, retrieve the current APIServer from the informer's
store/lister and call o.handleEvent on that object (use the same informer
variable and o.handleEvent to locate where to invoke this). Ensure this runs
once after informer sync so startup TLS profile changes are handled.
---
Nitpick comments:
In `@pkg/client/client.go`:
- Around line 1971-1973: Remove the duplicate accessor EventsRecorder and
consolidate on the existing EventRecorder method: delete the EventsRecorder
function from type Client, then update any callers that reference EventsRecorder
to call EventRecorder instead (preserve the same return type events.Recorder).
Ensure imports/tests/usage compile after replacing EventsRecorder with
EventRecorder and run tests to verify.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ff0f74b4-31f1-4bab-b122-c24040f7042e
📒 Files selected for processing (5)
cmd/operator/main.gopkg/client/client.gopkg/manifests/apiserver_config.gopkg/operator/operator.gopkg/server/server.go
d482ba8 to
12ff858
Compare
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
12ff858 to
abb8516
Compare
|
@simonpasquier: This pull request references MON-4479 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/operator/operator.go (1)
345-355:⚠️ Potential issue | 🟠 MajorHandle the APIServer informer’s initial add event too.
cmd/operator/main.goreadsAPIServer/clusterbefore informers start, so a TLS profile change in that window only shows up during this informer's initial LIST. With onlyUpdateFuncregistered here, CMO misses that drift and can keep serving with stale TLS settings until the next edit.🛠️ Suggested fix
_, err = informer.AddEventHandler(cache.ResourceEventHandlerFuncs{ + AddFunc: func(obj interface{}) { + o.handleEvent(obj) + }, UpdateFunc: func(_, newObj interface{}) { o.handleEvent(newObj) }, })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/operator.go` around lines 345 - 355, The APIServer informer only registers UpdateFunc, so the initial LIST (add) events are ignored and o.handleEvent never runs for the existing APIServer object; update the informer registration (the cache.NewSharedIndexInformer block where informer.AddEventHandler is called) to also provide an AddFunc in the cache.ResourceEventHandlerFuncs that calls o.handleEvent with the newObj (same as UpdateFunc) so the initial APIServer/cluster state is processed immediately when the informer starts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@pkg/operator/operator.go`:
- Around line 345-355: The APIServer informer only registers UpdateFunc, so the
initial LIST (add) events are ignored and o.handleEvent never runs for the
existing APIServer object; update the informer registration (the
cache.NewSharedIndexInformer block where informer.AddEventHandler is called) to
also provide an AddFunc in the cache.ResourceEventHandlerFuncs that calls
o.handleEvent with the newObj (same as UpdateFunc) so the initial
APIServer/cluster state is processed immediately when the informer starts.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: de251b38-0a3e-4b4c-9800-a81e4dd3330c
📒 Files selected for processing (5)
cmd/operator/main.gopkg/client/client.gopkg/manifests/apiserver_config.gopkg/operator/operator.gopkg/server/server.go
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
abb8516 to
304662e
Compare
|
@simonpasquier: This pull request references MON-4479 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest-required |
|
/test e2e-hypershift-conformance |
1 similar comment
|
/test e2e-hypershift-conformance |
|
/test e2e-hypershift-conformance |
|
/test e2e-hypershift-conformance/test e2e-hypershift-conformance |
|
/test e2e-hypershift-conformance |
machine424
left a comment
There was a problem hiding this comment.
thanks for this!
lgtm, I have some questions/suggestions.
we had an e2e test that we had to turn into unit because MCO started to restart nodes to propagate TLS changes (the change becomes too disruptive). https://issues.redhat.com/browse/MON-3960
It'd be great to have a test for this though to make sure CMO behaves well after a change/restart, maybe that will come for free as part of openshift/origin#30801? (not a blocker of this PR)
| return 1 | ||
| } | ||
|
|
||
| if err := srv.Prepare(ctx); err != nil { |
There was a problem hiding this comment.
a comment on whether this as separated would be helpful.
Also, maybe Prepare could just return the server?
| ) | ||
|
|
||
| _, err = informer.AddEventHandler(cache.ResourceEventHandlerFuncs{ | ||
| AddFunc: func(obj interface{}) { |
There was a problem hiding this comment.
we should probably do the same for the other configmaps/secrets in case they get deleted by mistake, recreated and never updated. (and the informers not restarted in the meanwhile, not sure if they do a relist tough...)
There was a problem hiding this comment.
Correct. In practice I suppose that secrets/configmaps will have revision changes after recreation so it might be more of a theoretical issue.
cmd/operator/main.go
Outdated
| } | ||
| apiServerConfigAdapter := manifests.NewAPIServerConfig(apiServerConfig) | ||
| klog.Infof( | ||
| "TLS configuration: minimum version=%q, ciphers=[%s]", |
There was a problem hiding this comment.
maybe
| "TLS configuration: minimum version=%q, ciphers=[%s]", | |
| "TLS settings from API server configuration: minimum version=%q, ciphers=[%s]", |
to be more explicit?
pkg/operator/operator.go
Outdated
| // Trigger a restart of the process to read the new TLS | ||
| // configuration. | ||
| klog.Info("Detected changes to the TLS profile configuration, stopping the process") | ||
| o.stop() |
There was a problem hiding this comment.
IIUC this will cancel the root ctx
so it'd not only stop the server (the PR title misled me).I assume it's simpler to just restart the whole container in case of just the server?
There was a problem hiding this comment.
Correct it will make the process exit. This is what is currently recommended as it's simple and updating the TLS profile shouldn't be a frequent operation.
yeah, I'd defer that to a generic origin test rather than something particular to CMO. |
sure, I assume you gave this a try locally :) /lgtm |
|
@simonpasquier: This pull request references MON-4479 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: machine424, simonpasquier The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
cmd/operator/main.go (1)
92-126: LGTM - Clean helper function for client initialization.The
newClienthelper encapsulates client creation with proper error handling and event recorder setup. The fallback behavior when controller reference is unavailable is appropriate.One minor inconsistency: the kubernetes client error (line 100) is wrapped with context, but the config client error (line 117) is returned unwrapped. Consider wrapping for consistency:
🔧 Optional: Wrap config client error for consistency
configClient, err := configv1client.NewForConfig(config) if err != nil { - return nil, err + return nil, fmt.Errorf("creating openshift config clientset client: %w", err) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/operator/main.go` around lines 92 - 126, In newClient, the error returned from configv1client.NewForConfig is not wrapped for context; update the error return when creating configClient (configv1client.NewForConfig) to wrap the original error with fmt.Errorf and a descriptive message (e.g., "creating openshift config client: %w") so it matches the earlier kubernetes client error handling and aids debugging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/operator/operator.go`:
- Line 421: The hardcoded 10*time.Minute used when creating configInformers
should be made consistent with the existing resyncPeriod constant or explicitly
documented if the shorter interval is intentional: replace the literal
10*time.Minute in the configInformers :=
configv1informers.NewSharedInformerFactory(...) call with the resyncPeriod
constant used by kubeInformersOperatorNS, or if you intentionally want faster
feature-gate/ClusterVersion detection, add a concise comment next to the
configInformers creation (referencing configInformers, resyncPeriod, and
kubeInformersOperatorNS) explaining why a shorter resync is required.
---
Nitpick comments:
In `@cmd/operator/main.go`:
- Around line 92-126: In newClient, the error returned from
configv1client.NewForConfig is not wrapped for context; update the error return
when creating configClient (configv1client.NewForConfig) to wrap the original
error with fmt.Errorf and a descriptive message (e.g., "creating openshift
config client: %w") so it matches the earlier kubernetes client error handling
and aids debugging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3475a490-c6ee-49ed-a804-ff308f4b8655
📒 Files selected for processing (2)
cmd/operator/main.gopkg/operator/operator.go
| o.informerFactories = append(o.informerFactories, kubeInformersOperatorNS) | ||
|
|
||
| configInformers := configv1informers.NewSharedInformerFactory(configClient, 10*time.Minute) | ||
| configInformers := configv1informers.NewSharedInformerFactory(c.OpenShiftConfigClientset(), 10*time.Minute) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if different resync periods are used intentionally across the codebase
rg -n "NewSharedInformerFactory" --type=go -A2 | head -30Repository: openshift/cluster-monitoring-operator
Length of output: 3123
🏁 Script executed:
# First, examine the file around line 421
sed -n '415,430p' pkg/operator/operator.goRepository: openshift/cluster-monitoring-operator
Length of output: 766
🏁 Script executed:
# Search for resyncPeriod constant definition in non-vendor code
rg -n "resyncPeriod\s*:?=" --type=go --type-not=go -- 'vendor' -vRepository: openshift/cluster-monitoring-operator
Length of output: 123
🏁 Script executed:
# Find all NewSharedInformerFactory usages in non-vendor code with context
rg -n "NewSharedInformerFactory" --type=go -B2 -A2 -- 'vendor' -vRepository: openshift/cluster-monitoring-operator
Length of output: 8754
🏁 Script executed:
# Search for resyncPeriod definition in non-vendor code
rg -n "resyncPeriod\s*:?=" --type=go | grep -v vendorRepository: openshift/cluster-monitoring-operator
Length of output: 217
🏁 Script executed:
# Also check around the context for resyncPeriod
sed -n '400,440p' pkg/operator/operator.goRepository: openshift/cluster-monitoring-operator
Length of output: 1648
🏁 Script executed:
# Search for all NewSharedInformerFactory usages in non-vendor code
rg -n "NewSharedInformerFactory" --type=go -B1 -A1 | grep -v vendorRepository: openshift/cluster-monitoring-operator
Length of output: 560
🏁 Script executed:
# Check extended context around configInformers for any comments
sed -n '405,445p' pkg/operator/operator.goRepository: openshift/cluster-monitoring-operator
Length of output: 1555
🏁 Script executed:
# Search for any references to configInformers or featuregates that might explain the timing
rg -n "10.*time.Minute|config.*Informer" pkg/operator/operator.go -B2 -A2Repository: openshift/cluster-monitoring-operator
Length of output: 657
The hardcoded 10-minute resync period for configInformers differs from the 15-minute resyncPeriod constant used by other informers, and this difference lacks documentation.
The configInformers instance is specifically used for watching feature gates and cluster versions (which can trigger container restarts). If this shorter period is intentional for faster feature gate detection, add a comment to explain it. Otherwise, use the resyncPeriod constant for consistency with kubeInformersOperatorNS.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/operator/operator.go` at line 421, The hardcoded 10*time.Minute used when
creating configInformers should be made consistent with the existing
resyncPeriod constant or explicitly documented if the shorter interval is
intentional: replace the literal 10*time.Minute in the configInformers :=
configv1informers.NewSharedInformerFactory(...) call with the resyncPeriod
constant used by kubeInformersOperatorNS, or if you intentionally want faster
feature-gate/ClusterVersion detection, add a concise comment next to the
configInformers creation (referencing configInformers, resyncPeriod, and
kubeInformersOperatorNS) explaining why a shorter resync is required.
|
@simonpasquier: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/hold As discussed with @machine424, the e2e verification is hard (if not impossible) to do with our test framework (too disruptive), I'll let @juzhao verify that whenever the TLS profile is updated, CMO reconfigures itself as expected. |
|
/skip |
This PR implements both reading the TLS configuration from the APIServer configuration and stopping the process when a change is detected. It ensures that CMO always uses the up-to-date configuration.
Summary by CodeRabbit
Refactor
Chores