feat(config): add webhook as kustomize component#122

Open
AvineshTripathi wants to merge 2 commits into kubernetes-sigs:main from AvineshTripathi:feat/webhook-config

Conversation

@AvineshTripathi
Contributor

@AvineshTripathi AvineshTripathi commented Feb 7, 2026

Description

This PR converts the webhook config into a Kustomize component, like metrics and cert-manager, and enables it. It also removes the controller's dependency on the ServiceMonitor.

NOTE: Webhooks require TLS, so installing the cert-manager CRDs is mandatory and ENABLE_TLS must be set to true.
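
For orientation, an overlay usually pulls in a Kustomize component through its `components` field. The sketch below is illustrative only: the file location and all directory paths are assumptions, not the PR's actual layout. The `nrr-` prefix and the `nrr-system` namespace come from the hardcoded values discussed in this PR.

```yaml
# Hypothetical config/default/kustomization.yaml (paths are assumptions).
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Prefix and namespace as implied by names such as nrr-system/nrr-serving-cert.
namespace: nrr-system
namePrefix: nrr-

resources:
- ../crd
- ../rbac
- ../manager

# Optional pieces are opted into as components. Webhooks need TLS, so the
# webhook component is listed together with the cert-manager one.
components:
- ../certmanager
- ../webhook
```

Each directory listed under `components` would hold a `kustomization.yaml` of `kind: Component`, as in the component file quoted later in this review.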

Related Issue

Testing

Checklist

  • make test passes
  • make lint passes

Signed-off-by: AvineshTripathi <avineshtripathi1@gmail.com>
@netlify

netlify bot commented Feb 7, 2026

Deploy Preview for node-readiness-controller canceled.

Name Link
🔨 Latest commit 6cdc0a9
🔍 Latest deploy log https://app.netlify.com/projects/node-readiness-controller/deploys/69901014db85fa00088f7385

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: AvineshTripathi
Once this PR has been reviewed and has the lgtm label, please assign dchen1107 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 7, 2026
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 7, 2026
@k8s-ci-robot
Contributor

Hi @AvineshTripathi. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 7, 2026
   name: validating-webhook-configuration
   annotations:
-    cert-manager.io/inject-ca-from: $(CERTIFICATE_NAMESPACE)/$(CERTIFICATE_NAME)
+    cert-manager.io/inject-ca-from: nrr-system/nrr-serving-cert
Contributor Author

There is a reason these values are hardcoded here: they were previously populated using vars. However, vars is now deprecated (kubernetes-sigs/kustomize#5046), and we should move to using replacements. Running Kustomize also emits the following deprecation warning.

# Warning: 'vars' is deprecated. Please use 'replacements' instead. [EXPERIMENTAL] Run 'kustomize edit fix' to update your Kustomization automatically.

If we switch to replacements, we run into a dependency issue. Both the webhook and metrics services need these replacements, but they are individual components and may or may not be deployed together. Because of this, we cannot keep the replacements in config/default. Placing them in individual components also does not work, as it fails to populate the nrr- prefix in the DNS names and annotations.

So I thought a better solution would be to hardcode the values. Open to suggestions.

Other places:
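
For readers unfamiliar with the `replacements` field mentioned above, here is a minimal sketch of what an entry feeding the `cert-manager.io/inject-ca-from` annotation could look like. It is not taken from this PR: the Certificate name `serving-cert` comes from the review discussion, and where such a block would live is exactly the open question raised in the comment. The sketch makes no claim to solve the `nrr-` prefix problem described there.

```yaml
# Illustrative only: copies the Certificate's namespace and name into the
# two halves of the "<namespace>/<name>" annotation value.
replacements:
- source:
    kind: Certificate
    group: cert-manager.io
    version: v1
    name: serving-cert
    fieldPath: .metadata.namespace
  targets:
  - select:
      kind: ValidatingWebhookConfiguration
    fieldPaths:
    - .metadata.annotations.[cert-manager.io/inject-ca-from]
    options:
      delimiter: '/'
      index: 0        # first segment: namespace
      create: true
- source:
    kind: Certificate
    group: cert-manager.io
    version: v1
    name: serving-cert
    fieldPath: .metadata.name
  targets:
  - select:
      kind: ValidatingWebhookConfiguration
    fieldPaths:
    - .metadata.annotations.[cert-manager.io/inject-ca-from]
    options:
      delimiter: '/'
      index: 1        # second segment: certificate name
      create: true
```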

@ajaysundark
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 7, 2026
@ajaysundark ajaysundark requested review from Priyankasaggu11929 and ajaysundark and removed request for dchen1107 and haircommander February 7, 2026 23:59
@ajaysundark
Contributor

@Priyankasaggu11929 was interested in testing the validation webhook. Could you find time for this review?

@Priyankasaggu11929
Member

@Priyankasaggu11929 was interested in testing the validation webhook. Could you find time for this review?

Yes, let me test it over the coming week and get back to you.

Comment on lines -11 to -24
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: metrics-certs
  namespace: system
spec:
  commonName: nrr-metrics
  dnsNames:
  - $(SERVICE_NAME).$(SERVICE_NAMESPACE).svc
  - $(SERVICE_NAME).$(SERVICE_NAMESPACE).svc.cluster.local
  issuerRef:
    kind: Issuer
    name: selfsigned-issuer
  secretName: metrics-server-cert
Member

(confirming as I read the changes)

Is this change essentially a cleanup? That is, it moves certificate creation under the component that needs the certificate, right?

  • For Prometheus, the certificate (metrics-cert) is now defined in config/prometheus/tls/certificate.yaml
  • Likewise, for the webhook, the certificate (serving-cert) is defined in config/webhook/certificate.yaml

Contributor Author

that is correct!

  name: metrics-certs
  namespace: system
spec:
  commonName: metrics
Member

@Priyankasaggu11929 Priyankasaggu11929 Feb 10, 2026

Just wondering, is commonName still used for anything now?
(Because IIUC, hostname verification is done against the SAN entries below.)

If it is there only for completeness' sake, let's add a commonName to the webhook certificate as well?
Maybe it's a nit, but then at least both our certificates will have consistent metadata.
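
To make the suggestion concrete, a webhook Certificate carrying both a commonName and SAN entries might look like the sketch below. The dnsNames, issuer, and secret name are taken from snippets quoted elsewhere in this review. The `commonName` value is a placeholder assumption, and the file is not claimed to match config/webhook/certificate.yaml.

```yaml
# Illustrative sketch of a webhook serving certificate.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: serving-cert
  namespace: system
spec:
  # Placeholder value (assumption). Hostname checks use the SAN entries
  # below, so this field would be informational only.
  commonName: webhook
  dnsNames:
  - nrr-webhook-service.nrr-system.svc
  - nrr-webhook-service.nrr-system.svc.cluster.local
  issuerRef:
    kind: Issuer
    name: selfsigned-issuer
  secretName: webhook-server-certs
```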

target:
kind: Deployment
# Configure ServiceMonitor for TLS
- path: monitor_tls_patch.yaml
Member

Should we delete the monitor_tls_patch.yaml file if it is no longer in use?

target:
kind: Service

replacements:
Member

@Priyankasaggu11929 Priyankasaggu11929 Feb 10, 2026

Was there any benefit to not hardcoding these values earlier?

And just to confirm - this replacements section is not required now because you've hardcoded the dnsNames in the new config/prometheus/tls/certificate.yaml?

  dnsNames:
  - nrr-controller-manager-metrics-service.nrr-system.svc
  - nrr-controller-manager-metrics-service.nrr-system.svc.cluster.local

Contributor Author

Even earlier, the dnsNames field was not being populated correctly. Prometheus seems to have tolerated that, but in the case of a webhook, the API server does not allow a mismatch in the DNS names.

Since I was already changing the webhook certificate, I corrected this one as well.

Comment on lines +8 to +10
dnsNames:
- nrr-controller-manager-metrics-service.nrr-system.svc
- nrr-controller-manager-metrics-service.nrr-system.svc.cluster.local
Member

@Priyankasaggu11929 Priyankasaggu11929 Feb 10, 2026

Nit:

I see in the webhook certificate below, the dnsNames entries are shorter:

  dnsNames:
  - nrr-webhook-service.nrr-system.svc
  - nrr-webhook-service.nrr-system.svc.cluster.local

should we trim the service name for metrics as well? to follow a pattern?
(remove the controller-manager bit?)

apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
resources:
- monitor.yaml
Member

Delete the monitor.yaml file as well if not in use?

   name: webhook-service
   namespace: system
-  path: /validate-nodereadiness-io-v1alpha1-nodereadinessrule
+  path: /validate-readiness-node-x-k8s-io-v1alpha1-nodereadinessrule
Member

@Priyankasaggu11929 Priyankasaggu11929 Feb 10, 2026

why are we correcting the path here with the patch and not directly correcting the marker itself (which IMO is the source of truth) -

// +kubebuilder:webhook:path=/validate-nodereadiness-io-v1alpha1-nodereadinessrule,mutating=false,failurePolicy=fail,sideEffects=None,groups=readiness.node.x-k8s.io,resources=nodereadinessrules,verbs=create;update,versions=v1alpha1,name=vnodereadinessrule.kb.io,admissionReviewVersions=v1

I tried correcting the marker and removing this patch - it works.

So, I suggest we drop patching the path here and only keep this patch to add the required cert-manager annotation?

More so because, if we don't update the marker, any subsequent make manifests and make deploy will revert to the old path defined in the marker, and that will break the webhook.
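
As a rough illustration of why the marker is the source of truth: once its `path=` is corrected, the generated manifest would be expected to look something like the sketch below. Every field is derived from the marker quoted above, with the path swapped for the corrected one. The exact output of `make manifests` may differ in ordering and extra fields.

```yaml
# Sketch of the generated webhook configuration after fixing the marker.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: validating-webhook-configuration
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    service:
      name: webhook-service
      namespace: system
      # Corrected path; no Kustomize patch is needed to rewrite it.
      path: /validate-readiness-node-x-k8s-io-v1alpha1-nodereadinessrule
  failurePolicy: Fail
  name: vnodereadinessrule.kb.io
  rules:
  - apiGroups:
    - readiness.node.x-k8s.io
    apiVersions:
    - v1alpha1
    operations:
    - CREATE
    - UPDATE
    resources:
    - nodereadinessrules
  sideEffects: None
```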

Contributor Author

Thanks, @Priyankasaggu11929. I totally agree on updating the marker itself

Comment on lines -1 to -27
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller-manager
  namespace: system
spec:
  template:
    spec:
      containers:
      - name: manager
        args:
        - --leader-elect
        - --health-probe-bind-address=:8081
        - --enable-webhook=true
        ports:
        - containerPort: 9443
          name: webhook-server
          protocol: TCP
        volumeMounts:
        - mountPath: /tmp/k8s-webhook-server/serving-certs
          name: cert
          readOnly: true
      volumes:
      - name: cert
        secret:
          defaultMode: 420
          secretName: webhook-server-certs
Member

Again, only for confirmation -

This change is just moving the patch to config/webhook/manager_webhook_patch.yaml and using JSON Patch format?

Contributor Author

@AvineshTripathi AvineshTripathi Feb 10, 2026

yes, that is correct
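
For comparison, a JSON Patch (RFC 6902) version of the removed strategic-merge patch could look roughly like this. It is a hypothetical reconstruction, not the contents of config/webhook/manager_webhook_patch.yaml. It assumes the manager is the first container, that the base Deployment already has an `args` list, and that it defines no `ports`, `volumeMounts`, or `volumes` of its own (an `add` on an existing key would replace it).

```yaml
# Hypothetical JSON Patch equivalent of the deleted Deployment patch.
# Append the webhook flag to the existing args list.
- op: add
  path: /spec/template/spec/containers/0/args/-
  value: --enable-webhook=true
# Expose the webhook server port.
- op: add
  path: /spec/template/spec/containers/0/ports
  value:
  - containerPort: 9443
    name: webhook-server
    protocol: TCP
# Mount the serving certificate into the manager container.
- op: add
  path: /spec/template/spec/containers/0/volumeMounts
  value:
  - mountPath: /tmp/k8s-webhook-server/serving-certs
    name: cert
    readOnly: true
# Back the mount with the cert-manager-issued secret.
- op: add
  path: /spec/template/spec/volumes
  value:
  - name: cert
    secret:
      defaultMode: 420
      secretName: webhook-server-certs
```

A component would reference such a file from its `patches` list with a `target` of `kind: Deployment`, similar to the `target:` and `- path:` entries quoted earlier in this review.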

@Priyankasaggu11929
Member

One thing (maybe out of scope for this PR; it can be handled in follow-ups):

How would we manage scheduling the cert-manager deployments on a tainted worker node (and, in fact, all other components too)?

The other PR, #117, only handles injecting matching tolerations for DaemonSets.

And I don't think we can manually insert matching tolerations up front in the kustomization component YAML we provide?
(Maybe we later create a MAP policy/policy binding scoped to just these components?)
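
To show what the manual approach would involve, a user-side overlay could add tolerations to Deployments with a Kustomize patch like the one below. This is only a sketch of the option being questioned, not a proposal from the thread. The taint key is a hypothetical placeholder, since the real key depends on each cluster's readiness rules, which is why it is hard to ship up front.

```yaml
# Hypothetical overlay snippet: adds a toleration to every Deployment it builds.
patches:
- target:
    kind: Deployment
  patch: |-
    - op: add
      path: /spec/template/spec/tolerations
      value:
      - key: example.com/node-not-ready   # placeholder taint key (assumption)
        operator: Exists
        effect: NoSchedule
```

Note that `add` on `/spec/template/spec/tolerations` replaces any tolerations a Deployment already declares, so a real overlay would need to account for that.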

Signed-off-by: AvineshTripathi <avineshtripathi1@gmail.com>

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

4 participants