ehearne-redhat

What type of PR is this?

/kind bug

What this PR does / why we need it:

Provides an admission plugin for kube-descheduler instances so that an error is logged when `metadata.name` is not `cluster`.

Which issue(s) this PR is related to:

Fixes https://issues.redhat.com/browse/OCPBUGS-62726

Special notes for your reviewer:

This is currently a WIP PR.

Does this PR introduce a user-facing change?

Action Required: Error logging for kube descheduler instantiation can now be found using `oc logs <descheduler-operator-pod-name>`.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

None.

@openshift-ci-robot openshift-ci-robot added the backports/unvalidated-commits Indicates that not all commits come to merged upstream PRs. label Oct 20, 2025
@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. labels Oct 20, 2025
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Oct 20, 2025
@openshift-ci-robot

@ehearne-redhat: This pull request references Jira Issue OCPBUGS-63132, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot

@ehearne-redhat: the contents of this pull request could not be automatically validated.

The following commits could not be validated and must be approved by a top-level approver:

Comment /validate-backports to re-evaluate validity of the upstream PRs, for example when they are merged upstream.

openshift-ci bot commented Oct 20, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ehearne-redhat
Once this PR has been reviewed and has the lgtm label, please assign bertinatto for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ehearne-redhat ehearne-redhat changed the title from "[WIP] OCPBUGS-63132: add descheduler validation plugin" to "[WIP] OCPBUGS-62726: add descheduler validation plugin" Oct 20, 2025
@openshift-ci-robot openshift-ci-robot added jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. and removed jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Oct 20, 2025
@openshift-ci-robot

@ehearne-redhat: This pull request references Jira Issue OCPBUGS-62726, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@ehearne-redhat
Author

ehearne-redhat commented Oct 20, 2025

/jira refresh

@openshift-ci-robot

@ehearne-redhat: This pull request references Jira Issue OCPBUGS-62726, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

@ehearne-redhat
Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Oct 20, 2025
@openshift-ci-robot

@ehearne-redhat: This pull request references Jira Issue OCPBUGS-62726, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @kasturinarra

@openshift-ci openshift-ci bot requested a review from kasturinarra October 20, 2025 09:45
Signed-off-by: Evan Hearne <[email protected]>
@ehearne-redhat
Author

/retest

@everettraven everettraven left a comment

I've got a handful of comments.

Another thought I've had is that maybe this makes sense as a ValidatingAdmissionPolicy controlled by the cluster-operator instead of adding another validation plugin into the Kubernetes API server.

Why is this necessary?

Author

It is necessary because, without it, Go cannot resolve the correct version. It tries to fetch v0.30.0 or something similar, if I remember correctly.

Author

[Error - 15:41:26] Request workspace/executeCommand failed.
  Message: err: exit status 1: stderr: go: finding module for package github.com/openshift/cluster-kube-descheduler-operator/pkg/apis/descheduler/v1
go: k8s.io/kube-openapi@v0.30.0: reading k8s.io/kube-openapi/go.mod at revision v0.30.0: unknown revision v0.30.0

"k8s.io/kubernetes/openshift-kube-apiserver/admission/customresourcevalidation"
)

const PluginName = "config.openshift.io/ValidateDescheduler"

KubeDescheduler is in the operator.openshift.io API group. It might make sense to have this reflect that as well.

Reference: https://github.com/openshift/cluster-kube-descheduler-operator/blob/ccddc666b5b9607d29aa4f8614267d38e4ba6633/pkg/apis/descheduler/v1/register.go#L16

plugins.Register(PluginName, func(config io.Reader) (admission.Interface, error) {
return customresourcevalidation.NewValidator(
map[schema.GroupResource]bool{
configv1.Resource("deschedulers"): true,

This won't resolve correctly for a couple of reasons:

  1. The resource would be kubedeschedulers not deschedulers.
  2. The resource is part of the operator.openshift.io group, not config.openshift.io.

I think instead you probably want:

Suggested change
- configv1.Resource("deschedulers"): true,
+ deschedulerv1.Resource("kubedeschedulers"): true,

configv1.Resource("deschedulers"): true,
},
map[schema.GroupVersionKind]customresourcevalidation.ObjectValidator{
configv1.GroupVersion.WithKind("Descheduler"): deschedulerV1{},

Similarly to my comment above, you probably want:

Suggested change
- configv1.GroupVersion.WithKind("Descheduler"): deschedulerV1{},
+ deschedulerv1.SchemeGroupVersion.WithKind("KubeDescheduler"): deschedulerV1{},

Comment on lines +45 to +46
field.NotSupported(field.NewPath("kind"), fmt.Sprintf("%T", uncastObj), []string{"Descheduler"}),
field.NotSupported(field.NewPath("apiVersion"), fmt.Sprintf("%T", uncastObj), []string{"config.openshift.io/v1"}))

Suggested change
- field.NotSupported(field.NewPath("kind"), fmt.Sprintf("%T", uncastObj), []string{"Descheduler"}),
- field.NotSupported(field.NewPath("apiVersion"), fmt.Sprintf("%T", uncastObj), []string{"config.openshift.io/v1"}))
+ field.NotSupported(field.NewPath("kind"), fmt.Sprintf("%T", uncastObj), []string{"KubeDescheduler"}),
+ field.NotSupported(field.NewPath("apiVersion"), fmt.Sprintf("%T", uncastObj), []string{"operator.openshift.io/v1"}))

Comment on lines +54 to +64
func validateDeschedulerSpec(spec deschedulerv1.KubeDeschedulerSpec) field.ErrorList {
	allErrs := field.ErrorList{}

	if name := spec.Policy.Name; len(name) > 0 {
		for _, msg := range validation.NameIsDNSSubdomain(spec.Policy.Name, false) {
			allErrs = append(allErrs, field.Invalid(field.NewPath("spec.Policy.name"), name, msg))
		}
	}

	return allErrs
}

Why do we need to validate the .spec field of the KubeDescheduler resource? If I understand correctly, we are only wanting to enforce the naming of the object to resolve this bug.

Enforcing additional validations on the .spec field of this resource adds complexity to what we need to consider here, specifically how we handle potentially breaking changes to user expectations.

Author

There isn't a specific reason to validate it, and there is no .spec field in the v1 package: https://pkg.go.dev/github.com/openshift/cluster-kube-descheduler-operator/pkg/apis/descheduler/v1

I will remove this. :)

Comment on lines +57 to +61
if name := spec.Policy.Name; len(name) > 0 {
	for _, msg := range validation.NameIsDNSSubdomain(spec.Policy.Name, false) {
		allErrs = append(allErrs, field.Invalid(field.NewPath("spec.Policy.name"), name, msg))
	}
}

I don't see where the .spec.policy field exists for a kubedeschedulers resource

Comment on lines +78 to +107
func (deschedulerV1) ValidateUpdate(_ context.Context, uncastObj runtime.Object, uncastOldObj runtime.Object) field.ErrorList {
	obj, allErrs := toDeschedulerV1(uncastObj)
	if len(allErrs) > 0 {
		return allErrs
	}
	oldObj, allErrs := toDeschedulerV1(uncastOldObj)
	if len(allErrs) > 0 {
		return allErrs
	}

	allErrs = append(allErrs, validation.ValidateObjectMetaUpdate(&obj.ObjectMeta, &oldObj.ObjectMeta, field.NewPath("metadata"))...)
	allErrs = append(allErrs, validateDeschedulerSpec(obj.Spec)...)

	return allErrs
}

func (deschedulerV1) ValidateStatusUpdate(_ context.Context, uncastObj runtime.Object, uncastOldObj runtime.Object) field.ErrorList {
	obj, errs := toDeschedulerV1(uncastObj)
	if len(errs) > 0 {
		return errs
	}
	oldObj, errs := toDeschedulerV1(uncastOldObj)
	if len(errs) > 0 {
		return errs
	}

	errs = append(errs, validation.ValidateObjectMetaUpdate(&obj.ObjectMeta, &oldObj.ObjectMeta, field.NewPath("metadata"))...)
	errs = append(errs, validateDeschedulerSpec(obj.Spec)...)
	return errs
}

I'm not sure we need to validate these operations. I think we would only care to validate that the .metadata.name matches cluster on creates.

I believe by default, Kubernetes will enforce immutability on things like custom resource .metadata.name updates.
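
A minimal sketch of what a create-only name check could look like, assuming nothing beyond the standard library. The helper `validateCreateName` and the error wording are hypothetical and are not taken from the PR; a real plugin would return a `field.ErrorList` instead of a plain `error`.

```go
package main

import "fmt"

// singletonName is the only object name the check accepts.
const singletonName = "cluster"

// validateCreateName is a hypothetical helper (not part of the PR).
// It rejects any name other than "cluster" and is meant to run on create only.
func validateCreateName(name string) error {
	if name != singletonName {
		return fmt.Errorf("metadata.name: invalid value %q: must be %q", name, singletonName)
	}
	return nil
}

func main() {
	// Exercise the check with one valid and one invalid name.
	for _, name := range []string{"cluster", "my-descheduler"} {
		if err := validateCreateName(name); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", name)
	}
}
```

Update and status-update paths are deliberately left out of the sketch, since the name cannot change after creation.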

@ehearne-redhat
Author

Closing this PR, as CEL validation (suggested by @everettraven) can be used for this problem, as seen here: https://github.com/openshift/api/blob/286504b695bcef7a455f512918b0ee5e5fe6d651/operator/v1/types_olm.go#L22
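
For illustration, a CEL rule of this kind is usually attached to the Go type as a kubebuilder marker, which controller-gen turns into an `x-kubernetes-validations` entry in the generated CRD. The sketch below is an assumption about how that might look for KubeDescheduler: the message text is made up, `ObjectMeta` is a local stand-in for `metav1.ObjectMeta` so the snippet compiles on its own, and the real type carries more fields.

```go
package main

import "fmt"

// ObjectMeta is a local stand-in for metav1.ObjectMeta (assumption),
// used only to keep this sketch self-contained.
type ObjectMeta struct {
	Name string `json:"name,omitempty"`
}

// KubeDescheduler is a trimmed, hypothetical version of the real type.
// The marker below carries the CEL rule; the API server evaluates it
// against the object once the generated CRD is installed.
//
// +kubebuilder:validation:XValidation:rule="self.metadata.name == 'cluster'",message="kubedescheduler is a singleton, .metadata.name must be 'cluster'"
type KubeDescheduler struct {
	ObjectMeta `json:"metadata,omitempty"`
}

func main() {
	// The marker is only a comment to the Go compiler; this just shows
	// the object name the rule would accept.
	obj := KubeDescheduler{ObjectMeta: ObjectMeta{Name: "cluster"}}
	fmt.Println(obj.Name)
}
```

With this approach the check lives in the CRD schema, so no extra admission plugin is needed in the Kubernetes API server.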

openshift-ci bot commented Oct 20, 2025

@ehearne-redhat: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/okd-scos-e2e-aws-ovn | 9214012 | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/e2e-aws-ovn-hypershift | 9214012 | link | true | /test e2e-aws-ovn-hypershift |
| ci/prow/e2e-aws-crun-wasm | 9214012 | link | true | /test e2e-aws-crun-wasm |
| ci/prow/k8s-e2e-gcp-serial | 9214012 | link | true | /test k8s-e2e-gcp-serial |
| ci/prow/images | 9214012 | link | true | /test images |
| ci/prow/e2e-aws-ovn-runc | 9214012 | link | true | /test e2e-aws-ovn-runc |
| ci/prow/k8s-e2e-conformance-aws | 9214012 | link | true | /test k8s-e2e-conformance-aws |
| ci/prow/e2e-aws-ovn-cgroupsv2 | 9214012 | link | true | /test e2e-aws-ovn-cgroupsv2 |
| ci/prow/verify-commits | 9214012 | link | true | /test verify-commits |
| ci/prow/e2e-aws-ovn-fips | 9214012 | link | true | /test e2e-aws-ovn-fips |
| ci/prow/e2e-gcp | 9214012 | link | true | /test e2e-gcp |
| ci/prow/e2e-aws-ovn-serial | 9214012 | link | true | /test e2e-aws-ovn-serial |
| ci/prow/k8s-e2e-gcp-ovn | 9214012 | link | true | /test k8s-e2e-gcp-ovn |
| ci/prow/e2e-aws-ovn-crun | 9214012 | link | true | /test e2e-aws-ovn-crun |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci-robot

@ehearne-redhat: This pull request references Jira Issue OCPBUGS-62726. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state.

@ehearne-redhat ehearne-redhat deleted the OCPBUGS-62726-enforce-name-cluster-kube-desched branch October 20, 2025 15:06