ESO-182: Updates to use different TLS Secret name and disables cluster resource reconcile based on user config #75

bharath-b-rh · 2025-10-08T11:20:27Z

PR has following the changes:

Updates the TLS secret name used by the webhook component when the cert-manager config is enabled. This is avoid conflict with the cert-controller generated TLS secret, where earlier both Secret names were same.
Updates the external-secrets core controller args to disable cluster custom resources reconciliation when operatingNamespace is enabled. This will restrict operand from processing any other custom resources created outside of the operatingNamespace, including the cluster resources.
Updates resource request config for the operator and removes resource limit config.
- Limit config is removed based on the suggestion here
- Request config was updated based on the observation on a cluster having just ESO as additional operator apart from the core payload using derailed/k9s.
Updates operator metrics server to make use of system CA certificates along with openshift service CA.
Updates Makefile with AI suggested changes and few improvements.
Updates the operand delete option in CSV, so that when user deletes the operator from console a pop-up will be used to ask whether to delete operand custom resources.

openshift-ci-robot · 2025-10-08T11:20:32Z

@bharath-b-rh: This pull request references ESO-155 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.21.0" version, but no target version was set.

In response to this:

PR has following the changes:

Updates the TLS secret name used by the webhook component when the cert-manager config is enabled. This is avoid conflict with the cert-controller generated TLS secret, where earlier both Secret names were same.

Updates the external-secrets core controller args to disable cluster custom resources reconciliation when operatingNamespace is enabled. This will restrict operand from processing any other custom resources created outside of the operatingNamespace, including the cluster resources.

Updates resource request config for the operator and removes resource limit config.

Limit config is removed based on the suggestion here

Request config was updated based on the observation on a cluster having just ESO as additional operator apart from the core payload using derailed/k9s.

Updates operator metrics server to make use of system CA certificates along with openshift service CA.

Updates Makefile with AI suggested changes and few improvements.

Updates the operand delete option in CSV, so that when user deletes the operator from console a pop-up will be used to ask whether to delete operand custom resources.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

coderabbitai · 2025-10-08T11:20:50Z

Walkthrough

Rename Makefile docker-* targets to image-* and switch image/bundle/catalog commands to use $(CONTAINER_TOOL); raise operator CPU/memory requests and remove limits; initialize TLS cert pool from system store with fallback; gate secret-volume diff by volume name; add webhook TLS secret handling and namespace-driven reconciler flags.

Changes

Cohort / File(s)	Summary
Build & tooling `Makefile`	Renamed public targets (`docker-build` → `image-build`, `docker-push` → `image-push`); replaced direct `docker` usage with `$(CONTAINER_TOOL)` for image, bundle and catalog build/push and buildx steps; updated help text, temp outputs path, `go install` messaging (`-mod=vendor`), and improved `govulncheck` messaging/filtering/guards.
Manifests / manager config `bundle/manifests/external-secrets-operator.clusterserviceversion.yaml`, `config/manager/manager.yaml`	Removed container resource limits; increased requests from cpu:`10m`→`100m` and memory:`64Mi`→`1Gi`; updated CSV `createdAt` timestamp.
TLS cert pool initialization `cmd/external-secrets-operator/main.go`	Initialize TLS metrics cert pool via `x509.SystemCertPool()` with logged fallback to a new empty pool; OpenShift CA handling and TLS config retained.
Deployment secret-volume diff logic `pkg/controller/common/utils.go`	In deploymentSpecModified, inner secret-volume comparisons now only run when `desiredVolume.Name == fetchedVolume.Name`; `Secret.Items` and `Secret.SecretName` comparisons moved inside this name-match branch.
Webhook / cert-manager & controller helpers `pkg/controller/external_secrets/constants.go`, `pkg/controller/external_secrets/certificate.go`, `pkg/controller/external_secrets/deployments.go`	Added unexported `certmanagerTLSSecretWebhook` constant; set `certificate.Spec.SecretName` for webhook certificate asset; added `updateWebhookVolumeConfig` and `updateSecretVolumeConfig` helpers to ensure Secret-backed `certs` volumes (applied to webhook and bitwarden); container args updated to toggle cluster reconciler flags based on `operatingNamespace`.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title "ESO-182: Updates to use different TLS Secret name and disables cluster resource reconcile based on user config" is clearly related to the changeset. It highlights two significant behavioral changes that are verifiable in the raw summary: the TLS secret naming changes for the webhook component (evident in constants.go, certificate.go, and deployments.go) and the cluster resource reconciliation logic updates (evident in deployments.go). While the title does not cover all aspects of the changeset such as resource configuration, metrics server updates, and Makefile changes, the instructions acknowledge that titles don't need to cover every detail. The title is specific, clear, and accurately describes real parts of the changeset.
Description Check	✅ Passed	The description is clearly related to the changeset with specific details about the changes. The raw summary confirms most of the described changes: TLS secret naming updates (certificate.go, constants.go, deployments.go), cluster resource reconciliation logic (deployments.go), resource request and limit configuration updates (manager.yaml and CSV), metrics server system CA certificate integration (main.go), and Makefile updates. The description provides sufficient context and specificity relevant to the actual changes in the pull request, and the check criteria explicitly states that level of detail is not important as long as the description relates to the changeset.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

openshift-ci · 2025-10-08T11:21:24Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bharath-b-rh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [bharath-b-rh]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

bharath-b-rh · 2025-10-08T11:23:49Z

Resource request tests

Use below command to check the CPU and memory consumption by the operator.

docker run --rm -it -v $KUBECONFIG:/root/.kube/config quay.io/derailed/k9s -n external-secrets-operator

Below snapshot shows the consumption when it's ESO operator in cluster apart from core payload. i.e there aren't many deployment, serviceaccount, service, clusterrole, clusterrolebindings existing like in a large cluster, which would cause operator resource usage spike when filtering for cache and reconcile events.

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

pkg/controller/common/utils.go (1)

213-222: Fix the volume comparison logic.

The broadened condition on Line 213 now allows detection of SecretName differences when Items is nil, which is good. However, the inner loop logic has a critical flaw: it compares a desired volume against all fetched volumes without matching by volume name first.

Current logic:

for each desiredVolume:
  if desiredVolume.Secret != nil:
    for each fetchedVolume:
      compare desiredVolume.Secret vs fetchedVolume.Secret

This incorrectly compares unrelated volumes and returns true on any mismatch, causing false positives.

Apply this diff to fix the logic by matching volumes by name before comparing their secret properties:

-	for _, desiredVolume := range desired.Spec.Template.Spec.Volumes {
-		if desiredVolume.Secret != nil {
-			for _, fetchedVolume := range fetched.Spec.Template.Spec.Volumes {
-				if !reflect.DeepEqual(desiredVolume.Secret.Items, fetchedVolume.Secret.Items) {
-					return true
-				}
-				if desiredVolume.Secret.SecretName != fetchedVolume.Secret.SecretName {
-					return true
-				}
-			}
-		}
-	}
+	for _, desiredVolume := range desired.Spec.Template.Spec.Volumes {
+		if desiredVolume.Secret != nil {
+			// Find the matching volume by name in the fetched deployment
+			found := false
+			for _, fetchedVolume := range fetched.Spec.Template.Spec.Volumes {
+				if desiredVolume.Name == fetchedVolume.Name {
+					found = true
+					if fetchedVolume.Secret == nil {
+						return true
+					}
+					if !reflect.DeepEqual(desiredVolume.Secret.Items, fetchedVolume.Secret.Items) {
+						return true
+					}
+					if desiredVolume.Secret.SecretName != fetchedVolume.Secret.SecretName {
+						return true
+					}
+					break
+				}
+			}
+			if !found {
+				return true
+			}
+		}
+	}

🧹 Nitpick comments (1)

Makefile (1)

201-204: Consider using CONTAINER_TOOL variable consistently.

The docker-buildx target still uses hard-coded docker commands instead of $(CONTAINER_TOOL). While buildx is Docker-specific, this inconsistency might confuse users.

Consider adding a comment explaining why docker is hard-coded here (buildx is Docker-specific), or conditionally skip this target when CONTAINER_TOOL != docker.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between e3fe1cf and ca5a522.

📒 Files selected for processing (9)

Makefile (7 hunks)
bundle/manifests/external-secrets-operator.clusterserviceversion.yaml (2 hunks)
cmd/external-secrets-operator/main.go (1 hunks)
config/manager/manager.yaml (1 hunks)
config/manifests/bases/external-secrets-operator.clusterserviceversion.yaml (1 hunks)
pkg/controller/common/utils.go (1 hunks)
pkg/controller/external_secrets/certificate.go (1 hunks)
pkg/controller/external_secrets/constants.go (1 hunks)
pkg/controller/external_secrets/deployments.go (3 hunks)

🧰 Additional context used

🪛 checkmake (0.2.2)

Makefile

[warning] 429-429: Target body for "govulnscan" exceeds allowed length of 5 (16).

(maxbodylength)

🔇 Additional comments (15)

config/manager/manager.yaml (1)

123-124: LGTM! Resource adjustments align with observations and OpenShift conventions.

The significant increase in resource requests (CPU: 10m→100m, memory: 64Mi→1Gi) and removal of resource limits follow OpenShift best practices. These values are based on actual cluster observations.

Consider monitoring resource usage in test environments to ensure these values remain appropriate under various workloads.

cmd/external-secrets-operator/main.go (1)

157-161: LGTM! System certificate pool integration is correct.

The addition of x509.SystemCertPool() properly leverages the system trust store before appending the OpenShift service CA certificate. The error handling with os.Exit(1) is appropriate for this initialization phase.

config/manifests/bases/external-secrets-operator.clusterserviceversion.yaml (1)

8-8: LGTM! Operand delete confirmation enabled.

Changing console.openshift.io/disable-operand-delete to "false" enables the OpenShift console to prompt users for confirmation when deleting operand custom resources, providing better user experience and preventing accidental deletions.

bundle/manifests/external-secrets-operator.clusterserviceversion.yaml (3)

221-221: LGTM! Consistent with base manifest.

This change aligns with the corresponding update in config/manifests/bases/external-secrets-operator.clusterserviceversion.yaml, enabling operand deletion confirmation in the OpenShift console.

223-223: Timestamp update is expected.

The createdAt timestamp update is a standard bundle regeneration artifact.

750-751: LGTM! Resource configuration consistent with manager manifest.

These resource request values match the updates in config/manager/manager.yaml and follow OpenShift conventions.

Makefile (5)

182-188: LGTM! Container-tool-agnostic target naming.

Renaming docker-build→image-build and docker-push→image-push makes the Makefile more portable and aligns with the CONTAINER_TOOL variable usage.

296-307: LGTM! Improved go-install-tool with vendor support.

The addition of GOFLAGS="-mod=vendor" ensures vendored dependencies are used during tool installation, and the messaging change from "Downloading" to "Installing" is more accurate.

422-445: Excellent govulnscan improvements.

The enhanced vulnerability scanning workflow with:

Clear progress messages

Robust filtering of known vulnerabilities

Better error detection and reporting

Informative success/failure output

significantly improves the developer experience.

Note: The static analysis warning about target body length (16 > 5 lines) is a pedantic style guideline. The current implementation prioritizes clarity and robustness over brevity, which is appropriate for a critical security scanning task.

355-355: LGTM! Consistent CONTAINER_TOOL usage.

Bundle build now uses $(CONTAINER_TOOL) for container-tool-agnostic operation.

395-395: LGTM! Consistent CONTAINER_TOOL usage.

Catalog build now uses --container-tool $(CONTAINER_TOOL) parameter.

pkg/controller/external_secrets/constants.go (1)

48-54: LGTM!

The new constants are well-named and appropriately scoped. The comment on certmanagerTLSSecretWebhook clearly explains why a different secret name is needed to avoid conflicts with the cert-controller-generated TLS secret.

pkg/controller/external_secrets/certificate.go (1)

85-88: LGTM!

The conditional logic correctly applies the new TLS secret name specifically for the webhook certificate asset, ensuring the webhook uses certmanagerTLSSecretWebhook instead of the default secret name that would conflict with the cert-controller.

pkg/controller/external_secrets/deployments.go (2)

126-126: LGTM!

The call to updateWebhookVolumeConfig is correctly placed after the container spec update and only applies to the webhook deployment.

305-330: LGTM!

The namespace-based cluster reconciler flag logic correctly implements the PR objective:

When operatingNamespace is configured, the controller is restricted to that namespace and cluster-scoped reconciliation is disabled (both flags set to false)

When operatingNamespace is empty, cluster-scoped reconciliation is enabled (both flags set to true)

The explanatory comment clearly documents this behavior.

Makefile

coderabbitai · 2025-10-08T11:24:40Z

pkg/controller/external_secrets/deployments.go

+func updateWebhookVolumeConfig(deployment *appsv1.Deployment, esc *operatorv1alpha1.ExternalSecretsConfig) {
+	if isCertManagerConfigEnabled(esc) {
+		const certsVolumeName = "certs"
+		if deployment.Spec.Template.Spec.Volumes == nil {
+			deployment.Spec.Template.Spec.Volumes = append(deployment.Spec.Template.Spec.Volumes, corev1.Volume{
+				Name: certsVolumeName,
+			})
+		}
+		for i := range deployment.Spec.Template.Spec.Volumes {
+			if deployment.Spec.Template.Spec.Volumes[i].Name == certsVolumeName {
+				if deployment.Spec.Template.Spec.Volumes[i].Secret == nil {
+					deployment.Spec.Template.Spec.Volumes[i].Secret = &corev1.SecretVolumeSource{}
+				}
+				deployment.Spec.Template.Spec.Volumes[i].Secret.SecretName = certmanagerTLSSecretWebhook
+			}
+		}
+	}
+}


⚠️ Potential issue | 🟠 Major

Fix the volume creation logic.

The function only appends a certs volume when deployment.Spec.Template.Spec.Volumes is nil (lines 408-412). If Volumes is already non-nil but doesn't contain a certs entry, the subsequent loop (lines 413-420) won't find it, and no volume will be configured.

Apply this diff to ensure the certs volume is created when it doesn't exist:

func updateWebhookVolumeConfig(deployment *appsv1.Deployment, esc *operatorv1alpha1.ExternalSecretsConfig) { if isCertManagerConfigEnabled(esc) { const certsVolumeName = "certs" - if deployment.Spec.Template.Spec.Volumes == nil { - deployment.Spec.Template.Spec.Volumes = append(deployment.Spec.Template.Spec.Volumes, corev1.Volume{ - Name: certsVolumeName, - }) - } + + // Find if certs volume already exists + certsVolumeExists := false for i := range deployment.Spec.Template.Spec.Volumes { if deployment.Spec.Template.Spec.Volumes[i].Name == certsVolumeName { + certsVolumeExists = true if deployment.Spec.Template.Spec.Volumes[i].Secret == nil { deployment.Spec.Template.Spec.Volumes[i].Secret = &corev1.SecretVolumeSource{} } deployment.Spec.Template.Spec.Volumes[i].Secret.SecretName = certmanagerTLSSecretWebhook + break } } + + // If certs volume doesn't exist, create it + if !certsVolumeExists { + deployment.Spec.Template.Spec.Volumes = append(deployment.Spec.Template.Spec.Volumes, corev1.Volume{ + Name: certsVolumeName, + VolumeSource: corev1.VolumeSource{ + Secret: &corev1.SecretVolumeSource{ + SecretName: certmanagerTLSSecretWebhook, + }, + }, + }) + } } }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

func updateWebhookVolumeConfig(deployment *appsv1.Deployment, esc *operatorv1alpha1.ExternalSecretsConfig) {

if isCertManagerConfigEnabled(esc) {

const certsVolumeName = "certs"

if deployment.Spec.Template.Spec.Volumes == nil {

deployment.Spec.Template.Spec.Volumes = append(deployment.Spec.Template.Spec.Volumes, corev1.Volume{

Name: certsVolumeName,

})

}

for i := range deployment.Spec.Template.Spec.Volumes {

if deployment.Spec.Template.Spec.Volumes[i].Name == certsVolumeName {

if deployment.Spec.Template.Spec.Volumes[i].Secret == nil {

deployment.Spec.Template.Spec.Volumes[i].Secret = &corev1.SecretVolumeSource{}

}

deployment.Spec.Template.Spec.Volumes[i].Secret.SecretName = certmanagerTLSSecretWebhook

}

}

}

}

func updateWebhookVolumeConfig(deployment *appsv1.Deployment, esc *operatorv1alpha1.ExternalSecretsConfig) {

if isCertManagerConfigEnabled(esc) {

const certsVolumeName = "certs"

// Find if certs volume already exists

certsVolumeExists := false

for i := range deployment.Spec.Template.Spec.Volumes {

if deployment.Spec.Template.Spec.Volumes[i].Name == certsVolumeName {

certsVolumeExists = true

if deployment.Spec.Template.Spec.Volumes[i].Secret == nil {

deployment.Spec.Template.Spec.Volumes[i].Secret = &corev1.SecretVolumeSource{}

}

deployment.Spec.Template.Spec.Volumes[i].Secret.SecretName = certmanagerTLSSecretWebhook

break

}

}

// If certs volume doesn't exist, create it

if !certsVolumeExists {

deployment.Spec.Template.Spec.Volumes = append(deployment.Spec.Template.Spec.Volumes, corev1.Volume{

Name: certsVolumeName,

VolumeSource: corev1.VolumeSource{

Secret: &corev1.SecretVolumeSource{

SecretName: certmanagerTLSSecretWebhook,

},

},

})

}

}

}

🤖 Prompt for AI Agents

In pkg/controller/external_secrets/deployments.go around lines 405 to 422, the code only appends the "certs" volume when deployment.Spec.Template.Spec.Volumes is nil so if Volumes is non-nil but missing a "certs" entry it never adds one; change the logic to scan existing volumes for a "certs" entry, and if not found append a corev1.Volume{Name: certsVolumeName} to the slice before the loop, then in the loop ensure the Secret field is initialized (if nil) and set Secret.SecretName = certmanagerTLSSecretWebhook.

This is not required, since certs is only the volume expected.

@bharath-b-rh Thank you for the clarification! If the webhook deployment asset structure ensures that Volumes is either nil or already contains the certs volume, then the current implementation is indeed sufficient. I appreciate you confirming that the logic works correctly for your use case.

This is not required, since certs is only the volume expected.

Then we can just access deployment.Spec.Template.Spec.Volumes[0] directly right? Why the loop and nil check? To cover for any mutating webhooks outside the operator's control?

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

===> Then we can just access deployment.Spec.Template.Spec.Volumes[0] directly right? Why the loop and nil check? To cover for any mutating webhooks outside the operator's control?

openshift-ci-robot · 2025-10-08T11:28:07Z

@bharath-b-rh: This pull request references ESO-182 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

In response to this:

PR has following the changes:

Updates the TLS secret name used by the webhook component when the cert-manager config is enabled. This is avoid conflict with the cert-controller generated TLS secret, where earlier both Secret names were same.

Updates the external-secrets core controller args to disable cluster custom resources reconciliation when operatingNamespace is enabled. This will restrict operand from processing any other custom resources created outside of the operatingNamespace, including the cluster resources.

Updates resource request config for the operator and removes resource limit config.

Limit config is removed based on the suggestion here

Request config was updated based on the observation on a cluster having just ESO as additional operator apart from the core payload using derailed/k9s.

Updates operator metrics server to make use of system CA certificates along with openshift service CA.

Updates Makefile with AI suggested changes and few improvements.

Updates the operand delete option in CSV, so that when user deletes the operator from console a pop-up will be used to ask whether to delete operand custom resources.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

bharath-b-rh · 2025-10-08T11:35:14Z

/label px-approved
/label docs-approved
/cherrypick release-1.0

openshift-cherrypick-robot · 2025-10-08T11:35:17Z

@bharath-b-rh: once the present PR merges, I will cherry-pick it on top of release-1.0 in a new PR and assign it to you.

In response to this:

/label px-approved
/label docs-approved
/cherrypick release-1.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between ca5a522 and 5527bde.

📒 Files selected for processing (1)

Makefile (7 hunks)

🧰 Additional context used

🪛 checkmake (0.2.2)

Makefile

[warning] 429-429: Target body for "govulnscan" exceeds allowed length of 5 (16).

(maxbodylength)

Makefile

bharath-b-rh · 2025-10-08T11:36:29Z

Makefile

-	$(CONTAINER_TOOL) buildx use external-secrets-operator-builder
-	- $(CONTAINER_TOOL) buildx build --push --platform=$(PLATFORMS) --tag ${IMG} -f Dockerfile.cross .
-	- $(CONTAINER_TOOL) buildx rm external-secrets-operator-builder
+	- docker buildx create --name external-secrets-operator-builder


The docker is used instead of $(CONTAINER_TOOL) because, by default $(CONTAINER_TOOL) is set to podman, and buildx is supported by docker. Hence just here docker is hardcoded.

bharath-b-rh · 2025-10-08T11:57:30Z

Please refer comments [1] and [2] for test results.

mytreya-rh

[claude-generated] Critical Issues Identified:

1. Volume Comparison Logic Bug (pkg/controller/common/utils.go:212-221)

The loop logic in deploymentSpecModified appears incorrect. You're iterating over desired volumes and for each desiredVolume.Secret, you check ALL fetchedVolume.Secret.Items without matching volumes by name first. This means you're comparing wrong volumes and could cause false positives in detecting deployment changes.

Recommendation: Match volumes by name first, then compare Secret.Items only for matching volumes.

2. SystemCertPool Error Handling (cmd/external-secrets-operator/main.go:157-160)

If x509.SystemCertPool() fails, the application exits immediately. Consider:

Is SystemCertPool() failure expected on certain platforms/containers?
Should there be a fallback to x509.NewCertPool() if system pool is unavailable?
This could break existing deployments if system cert pool becomes unavailable

3. Resource Request Increases (config/manager/manager.yaml:122-123)

CPU request increased from 10m to 100m (10x) and memory from 64Mi to 1Gi (16x). This is significant and needs justification:

What workload patterns drove these increases?
Has this been tested in resource-constrained environments?
Will this impact the operator's ability to run on smaller clusters?

4. Resource Limits Removal (config/manager/manager.yaml)

While OpenShift conventions recommend removing limits, verify this won't cause resource contention. Consider memory usage patterns under load and whether cluster resource quotas are in place.

5. Operand Deletion Enabled (config/manifests/bases/external-secrets-operator.clusterserviceversion.yaml:8)

Setting disable-operand-delete: false allows users to delete operand CRs from console. Ensure:

This is intentional and documented
Users understand implications
Safeguards exist for data loss prevention
No orphaned resources remain

6. updateWebhookVolumeConfig Logic (pkg/controller/external_secrets/deployments.go:405-421)

The function assumes the volume exists and is initialized. Edge cases:

If deployment.Spec.Template.Spec.Volumes is nil, appending fails
If no volume with name "certs" exists, the loop completes without creating it
Could result in nil pointer dereference or silent failure

Recommendation: Ensure the volume exists before trying to update it.

mytreya-rh

Inline code review - critical issues

mytreya-rh · 2025-10-09T09:30:41Z

pkg/controller/common/utils.go

 					return true
 				}
 				if desiredVolume.Secret.SecretName != fetchedVolume.Secret.SecretName {
 					return true


[claude-generated] Critical: This loop logic is incorrect. You're comparing all desired volumes against all fetched volumes without matching by name first. This will cause false positives.

Fix:

for _, desiredVolume := range desired.Spec.Template.Spec.Volumes { if desiredVolume.Secret != nil { for _, fetchedVolume := range fetched.Spec.Template.Spec.Volumes { if desiredVolume.Name == fetchedVolume.Name { // Match by name first! if !reflect.DeepEqual(desiredVolume.Secret.Items, fetchedVolume.Secret.Items) { return true } if !reflect.DeepEqual(desiredVolume.Secret.SecretName, fetchedVolume.Secret.SecretName) { return true } } } } }

So here, we are relying on the fact that in any of the deployments there will only be one volume that will be mounted from secrets.
i feel this being common code, its quite a bit about how the deployment assets are structured.
We can add in the Name comparison.
Also i have 2 more questions:

Why don't we just use controllerutil.CreateOrUpdate and avoid our custom matching?

We are starting of with specific CreateOrApplyDeploymentFromAsset to a common facility HasObjectChanged, but there again we have a big switch case for each type. Why not just call deploymentSpecModified from CreateOrApplyDeploymentFromAsset? If we call into a common function that accepts a higher abstraction (i.e. Object vs Deployment), i think common function is expected to work at that higher level of abstraction instead of inspecting the types again unless its some kind of a framework (OpenClosedPrinciple).
Similar comment i had earlier on restructuring deployment struct in createOrApplyDeployment but i couldn't locate it now :)

We are having our own helper functions because, we are using the static manifests to derive the desired objects, which will not have the default values populated in each of these objects. And the direct comparison at the spec will always fail and cause update to happen. Instead we are specifically looking for the desired fields.

mytreya-rh · 2025-10-09T09:30:41Z

cmd/external-secrets-operator/main.go

+			if err != nil {
+				setupLog.Error(err, "failed to load system certificate pool")
+				os.Exit(1)
+			}


[claude-generated] Critical: Hard exit on SystemCertPool() failure could break deployments. Consider adding a fallback:

certPool, err := x509.SystemCertPool() if err != nil { setupLog.Info("system cert pool unavailable, using empty pool", "error", err) certPool = x509.NewCertPool() }

like claude mentioned above, is it really a fatal error if SystemCertPool() cannot be fetched that we need to do a hard exit?

I feel yes, atleast in RHEL which is our base image, the system CA certificates will always be present, and if at all failure occurs it would be a genuine failure and we don't want proceed I think.

mytreya-rh · 2025-10-09T09:30:41Z

config/manager/manager.yaml

-            cpu: 10m
-            memory: 64Mi
+            cpu: 100m
+            memory: 1Gi


[claude-generated] Critical: CPU request increased from 10m to 100m (10x) and memory from 64Mi to 1Gi (16x). Please provide:

Performance test results showing why 1Gi memory is needed

CPU utilization metrics from production/staging

Impact analysis on resource-constrained clusters

inline with above claude comment, curious to know 1Gi memory need? Most of it is cache or is there any other significant use of memory in this operator?

It was decided based on the observation added in this comment. With just one additional day-2 operator, the memory reached 512Mib when the operator started. And I think it's safe to keep 1Gi considering the large clusters.

i have not seen the memory go beyond 62MiB for the operator POD.. i started a k top pod in while loop and then installed the operator and created the ExternalSecretsConfig CR on my GCP cluster.

while true; do kubectl -n external-secrets-operator top pod; done No resources found in external-secrets-operator namespace. No resources found in external-secrets-operator namespace. // Installed operator from console error: metrics not available yet error: metrics not available yet // Operator came up NAME CPU(cores) MEMORY(bytes) external-secrets-operator-controller-manager-5685dc9cf7-kghkn 34m 62Mi NAME CPU(cores) MEMORY(bytes) external-secrets-operator-controller-manager-5685dc9cf7-kghkn 34m 62Mi ... ... external-secrets-operator-controller-manager-5685dc9cf7-kghkn 1m 57Mi NAME CPU(cores) MEMORY(bytes) external-secrets-operator-controller-manager-5685dc9cf7-kghkn 1m 57Mi ... ... // Created ExternalSecretsConfig external-secrets-operator-controller-manager-5685dc9cf7-kghkn 9m 58Mi NAME CPU(cores) MEMORY(bytes) external-secrets-operator-controller-manager-5685dc9cf7-kghkn 9m 58Mi

Also, we already had limit as 128Mi which went through QA testing.
In addition, i tested with 50Mi limit which as expected crashed with OOM.

k -n external-secrets-operator get po NAME READY STATUS RESTARTS AGE external-secrets-operator-controller-manager-85b4888895-4pqjc 0/1 CrashLoopBackOff 2 (20s ago) 77s k -n external-secrets-operator get po -o yaml | grep -A2 -B2 OOM exitCode: 137 finishedAt: "2025-10-10T11:41:08Z" reason: OOMKilled startedAt: "2025-10-10T11:40:50Z" name: manager

But even with 64Mi limit there are no OOMKilled.
Thus i feel 128Mi could be a safe resource request allowing for easier POD scheduling than to have it as 1Gi, unless i am missing something here.

mytreya-rh · 2025-10-09T09:30:41Z

config/manifests/bases/external-secrets-operator.clusterserviceversion.yaml

    capabilities: Basic Install
    categories: Security
-    console.openshift.io/disable-operand-delete: "true"
+    console.openshift.io/disable-operand-delete: "false"


[claude-generated] Security: Enabling operand deletion (disable-operand-delete: "false") is a significant change. Ensure:

This is documented

Users understand implications

Safeguards exist for data loss prevention

Highlighted in release notes

pkg/controller/external_secrets/deployments.go

mytreya-rh · 2025-10-09T09:30:41Z

pkg/controller/external_secrets/deployments.go

[claude-generated] Good: The logic for disabling cluster-scoped reconciliation when operatingNamespace is set looks correct. Consider:

Add e2e tests to verify cluster resources are truly ignored when operatingNamespace is set

Update API docs to explain this behavior

What happens if cluster resources exist and then operatingNamespace is enabled? Need migration path.

Add e2e tests to verify cluster resources are truly ignored when operatingNamespace is set

Please refer comment for the tests. Could we add e2e test case in our suite in a follow up PR?

What happens if cluster resources exist and then operatingNamespace is enabled? Need migration path.

The resources outside of the operatingNamespace will not be reconciled after namespace is configured. And the same applies for other way around too, which is checked in verification

…amespace is configured Signed-off-by: Bharath B <[email protected]>

…nabled Signed-off-by: Bharath B <[email protected]>

Signed-off-by: Bharath B <[email protected]>

mytreya-rh · 2025-10-13T14:59:50Z

/lgtm

bharath-b-rh · 2025-10-14T09:56:59Z

/cherrypick release-1.0

openshift-cherrypick-robot · 2025-10-14T09:57:02Z

@bharath-b-rh: once the present PR merges, I will cherry-pick it on top of release-1.0 in a new PR and assign it to you.

In response to this:

/cherrypick release-1.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Signed-off-by: Bharath B <[email protected]>

mytreya-rh · 2025-10-16T04:48:59Z

/lgtm

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between ca2c73f and d82cfc0.

📒 Files selected for processing (4)

bundle/manifests/external-secrets-operator.clusterserviceversion.yaml (2 hunks)
cmd/external-secrets-operator/main.go (1 hunks)
pkg/controller/common/utils.go (1 hunks)
pkg/controller/external_secrets/constants.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

pkg/controller/common/utils.go
bundle/manifests/external-secrets-operator.clusterserviceversion.yaml

🔇 Additional comments (1)

cmd/external-secrets-operator/main.go (1)

159-163: LGTM! Fallback implementation addresses previous review concern.

The code now implements the suggested fallback approach: attempting to load the system certificate pool, logging a non-fatal info message on failure, and creating an empty pool to continue. Since the OpenShift CA certificate is appended afterward (Line 170), the metrics server will still function correctly even if the system pool is unavailable.

coderabbitai · 2025-10-18T15:30:08Z

pkg/controller/external_secrets/constants.go

+	// certmanagerTLSSecretWebhook is the TLS secret created by cert-manager for the webhook component. A different
+	// name is used to avoiding clash with the secret created by the inbuilt cert-controller component.


⚠️ Potential issue | 🟡 Minor

Fix grammar in comment.

The comment contains a grammatical error.

Apply this diff:

- // certmanagerTLSSecretWebhook is the TLS secret created by cert-manager for the webhook component. A different - // name is used to avoiding clash with the secret created by the inbuilt cert-controller component. + // certmanagerTLSSecretWebhook is the TLS secret created by cert-manager for the webhook component. A different + // name is used to avoid clash with the secret created by the inbuilt cert-controller component.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// certmanagerTLSSecretWebhook is the TLS secret created by cert-manager for the webhook component. A different

// name is used to avoiding clash with the secret created by the inbuilt cert-controller component.

// certmanagerTLSSecretWebhook is the TLS secret created by cert-manager for the webhook component. A different

// name is used to avoid clash with the secret created by the inbuilt cert-controller component.

🤖 Prompt for AI Agents

In pkg/controller/external_secrets/constants.go around lines 52 to 53, the comment has a grammatical error; update the sentence "A different name is used to avoiding clash with the secret created by the inbuilt cert-controller component." to a correct form such as "A different name is used to avoid a clash with the secret created by the inbuilt cert-controller component." (or "to avoid clashing with the secret...") so the comment reads clearly and grammatically correct.

openshift-ci · 2025-10-18T15:48:09Z

@bharath-b-rh: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

emmajiafan · 2025-10-20T06:20:18Z

/label qe-approved
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/release-openshift-origin-installer-launch-aws-modern/1980063282576232448

openshift-ci-robot · 2025-10-20T06:20:24Z

@bharath-b-rh: This pull request references ESO-182 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

In response to this:

PR has following the changes:

Updates the TLS secret name used by the webhook component when the cert-manager config is enabled. This is avoid conflict with the cert-controller generated TLS secret, where earlier both Secret names were same.

Updates the external-secrets core controller args to disable cluster custom resources reconciliation when operatingNamespace is enabled. This will restrict operand from processing any other custom resources created outside of the operatingNamespace, including the cluster resources.

Updates resource request config for the operator and removes resource limit config.

Limit config is removed based on the suggestion here

Request config was updated based on the observation on a cluster having just ESO as additional operator apart from the core payload using derailed/k9s.

Updates operator metrics server to make use of system CA certificates along with openshift service CA.

Updates Makefile with AI suggested changes and few improvements.

Updates the operand delete option in CSV, so that when user deletes the operator from console a pop-up will be used to ask whether to delete operand custom resources.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

mytreya-rh · 2025-10-20T06:53:43Z

/lgtm

openshift-cherrypick-robot · 2025-10-20T07:00:12Z

@bharath-b-rh: #75 failed to apply on top of branch "release-1.0":

Applying: ESO-155: Sets cluster resource reconcile arg to false when operatingNamespace is configured
Applying: ESO-155: Updates to use different TLS Secret name when cert-manager enabled
Applying: ESO-155: Updates resources config for operator pod
Using index info to reconstruct a base tree...
M	bundle/manifests/external-secrets-operator.clusterserviceversion.yaml
Falling back to patching base and 3-way merge...
Auto-merging bundle/manifests/external-secrets-operator.clusterserviceversion.yaml
CONFLICT (content): Merge conflict in bundle/manifests/external-secrets-operator.clusterserviceversion.yaml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0003 ESO-155: Updates resources config for operator pod

In response to this:

/cherrypick release-1.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

bharath-b-rh · 2025-10-20T07:59:21Z

/cherrypick release-1.0

openshift-cherrypick-robot · 2025-10-20T08:00:02Z

@bharath-b-rh: #75 failed to apply on top of branch "release-1.0":

Applying: ESO-155: Sets cluster resource reconcile arg to false when operatingNamespace is configured
Applying: ESO-155: Updates to use different TLS Secret name when cert-manager enabled
Applying: ESO-155: Updates resources config for operator pod
Using index info to reconstruct a base tree...
M	bundle/manifests/external-secrets-operator.clusterserviceversion.yaml
Falling back to patching base and 3-way merge...
Auto-merging bundle/manifests/external-secrets-operator.clusterserviceversion.yaml
CONFLICT (content): Merge conflict in bundle/manifests/external-secrets-operator.clusterserviceversion.yaml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0003 ESO-155: Updates resources config for operator pod

In response to this:

/cherrypick release-1.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 8, 2025

openshift-ci bot requested review from TrilokGeer and swghosh October 8, 2025 11:21

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 8, 2025

coderabbitai bot reviewed Oct 8, 2025

View reviewed changes

bharath-b-rh changed the title ~~ESO-155: Updates to use different TLS Secret name and disables cluster resource reconcile based on user config~~ ESO-182: Updates to use different TLS Secret name and disables cluster resource reconcile based on user config Oct 8, 2025

bharath-b-rh force-pushed the eso-155-3 branch from ca5a522 to 5527bde Compare October 8, 2025 11:32

openshift-ci bot added px-approved Signifies that Product Support has signed off on this PR docs-approved Signifies that Docs has signed off on this PR labels Oct 8, 2025

coderabbitai bot reviewed Oct 8, 2025

View reviewed changes

Makefile Show resolved Hide resolved

Makefile Show resolved Hide resolved

bharath-b-rh commented Oct 8, 2025

View reviewed changes

mytreya-rh reviewed Oct 9, 2025

View reviewed changes

bharath-b-rh force-pushed the eso-155-3 branch from cba3253 to 9703a95 Compare October 9, 2025 15:32

bharath-b-rh added 5 commits October 13, 2025 10:32

ESO-155: Sets cluster resource reconcile arg to false when operatingN…

0714206

…amespace is configured Signed-off-by: Bharath B <[email protected]>

ESO-155: Updates to use different TLS Secret name when cert-manager e…

a8c7f31

…nabled Signed-off-by: Bharath B <[email protected]>

ESO-155: Updates resources config for operator pod

94f8165

Signed-off-by: Bharath B <[email protected]>

ESO-155: Updates metrics server TLS opts to also use system CA certs

bc10341

Signed-off-by: Bharath B <[email protected]>

ESO-155: AI suggestions Makefile improvements

84b4046

Signed-off-by: Bharath B <[email protected]>

bharath-b-rh force-pushed the eso-155-3 branch from 9703a95 to 84b4046 Compare October 13, 2025 05:11

openshift-ci bot assigned mytreya-rh Oct 13, 2025

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 13, 2025

openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 14, 2025

ESO-155: fixes loop iteration to go through all volumes

ca2c73f

Signed-off-by: Bharath B <[email protected]>

bharath-b-rh force-pushed the eso-155-3 branch from c887dab to ca2c73f Compare October 14, 2025 10:55

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 16, 2025

ESO-155: Rebase with main

d82cfc0

openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 18, 2025

coderabbitai bot reviewed Oct 18, 2025

View reviewed changes

openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Oct 20, 2025

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 20, 2025

openshift-merge-bot bot merged commit 05a312b into openshift:main Oct 20, 2025
9 checks passed

bharath-b-rh deleted the eso-155-3 branch October 20, 2025 07:57

bharath-b-rh mentioned this pull request Oct 20, 2025

[release-1.0] ESO-182: Updates to use different TLS Secret name and disables cluster resource reconcile based on user config #81

Merged

		// certmanagerTLSSecretWebhook is the TLS secret created by cert-manager for the webhook component. A different
		// name is used to avoiding clash with the secret created by the inbuilt cert-controller component.

ESO-182: Updates to use different TLS Secret name and disables cluster resource reconcile based on user config #75

ESO-182: Updates to use different TLS Secret name and disables cluster resource reconcile based on user config #75

Uh oh!

Conversation

bharath-b-rh commented Oct 8, 2025

Uh oh!

openshift-ci-robot commented Oct 8, 2025 • edited by openshift-ci bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot commented Oct 8, 2025 • edited by openshift-ci bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Pre-merge checks and finishing touches

Uh oh!

openshift-ci bot commented Oct 8, 2025

Uh oh!

bharath-b-rh commented Oct 8, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 8, 2025

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

openshift-ci-robot commented Oct 8, 2025 • edited by openshift-ci bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bharath-b-rh commented Oct 8, 2025

Uh oh!

openshift-cherrypick-robot commented Oct 8, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bharath-b-rh commented Oct 8, 2025

Uh oh!

mytreya-rh left a comment

Choose a reason for hiding this comment

1. Volume Comparison Logic Bug (pkg/controller/common/utils.go:212-221)

2. SystemCertPool Error Handling (cmd/external-secrets-operator/main.go:157-160)

3. Resource Request Increases (config/manager/manager.yaml:122-123)

4. Resource Limits Removal (config/manager/manager.yaml)

5. Operand Deletion Enabled (config/manifests/bases/external-secrets-operator.clusterserviceversion.yaml:8)

6. updateWebhookVolumeConfig Logic (pkg/controller/external_secrets/deployments.go:405-421)

Uh oh!

mytreya-rh left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

openshift-ci-robot commented Oct 8, 2025 •

edited by openshift-ci bot

Loading

coderabbitai bot commented Oct 8, 2025 •

edited by openshift-ci bot

Loading

coderabbitai bot Oct 8, 2025 •

edited

Loading

openshift-ci-robot commented Oct 8, 2025 •

edited by openshift-ci bot

Loading

bharath-b-rh Oct 9, 2025 •

edited

Loading

emmajiafan commented Oct 20, 2025 •

edited

Loading

openshift-ci-robot commented Oct 20, 2025 •

edited by openshift-ci bot

Loading