openshift-cherrypick-robot

This is an automated cherry-pick of #5305

/assign pablintino

@openshift-ci-robot
Contributor

@openshift-cherrypick-robot: Jira Issue OCPBUGS-62341 has been cloned as Jira Issue OCPBUGS-63127. Will retitle bug to link to clone.
/retitle [release-4.20] OCPBUGS-63127: Ensure the node passed to RunCordonOrUncordon comes from the latest updated state

In response to this:

This is an automated cherry-pick of #5305

/assign pablintino

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot changed the title from "[release-4.20] OCPBUGS-62341: Ensure the node passed to RunCordonOrUncordon comes from the latest updated state" to "[release-4.20] OCPBUGS-63127: Ensure the node passed to RunCordonOrUncordon comes from the latest updated state" Oct 15, 2025
@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Oct 15, 2025
@openshift-ci-robot
Contributor

@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-63127, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.


@openshift-ci openshift-ci bot requested review from djoshy and yuqi-zhang October 15, 2025 09:20
@sergiordlr
Contributor

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Oct 15, 2025
@sergiordlr
Contributor

Verified using IPI on AWS

  1. Create a webhook that rejects any attempt to change the .spec.unschedulable value of a node, so that all cordon/uncordon operations fail

This is an example of a webhook failing all cordon/uncordon operations: https://github.com/sergiordlr/temp-testfiles/tree/master/webhook_example

  2. Apply a MachineConfig so that the MCO cordons/uncordons the nodes to apply the config
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-machine-config-0
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,dGVzdA==
        path: /etc/test-file-0.test

  3. Check that the MCO controller cannot cordon the node and starts retrying
I1016 10:48:03.975563       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: cordoning
I1016 10:48:03.975609       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: initiating cordon (currently schedulable: true)
I1016 10:48:03.998915       1 drain_controller.go:580] cordon failed with: cordon error: admission webhook "unschedulable-webhook.default.svc" denied the request: Changing .spec.unschedulable on node is forbidden., retrying
I1016 10:48:13.999342       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: initiating cordon (currently schedulable: false)
I1016 10:48:14.003558       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: RunCordonOrUncordon() succeeded but node is still not in cordon state, retrying
I1016 10:48:34.004256       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: initiating cordon (currently schedulable: true)
I1016 10:48:34.028124       1 drain_controller.go:580] cordon failed with: cordon error: admission webhook "unschedulable-webhook.default.svc" denied the request: Changing .spec.unschedulable on node is forbidden., retrying
I1016 10:49:14.028914       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: initiating cordon (currently schedulable: false)
I1016 10:49:14.033293       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: RunCordonOrUncordon() succeeded but node is still not in cordon state, retrying

  4. Remove the MutatingWebhookConfiguration created in step 1 to allow cordon/uncordon operations to succeed again
  5. Check that the controller can now cordon the node and start applying the config
I1016 10:48:34.028124       1 drain_controller.go:580] cordon failed with: cordon error: admission webhook "unschedulable-webhook.default.svc" denied the request: Changing .spec.unschedulable on node is forbidden., retrying
I1016 10:49:14.028914       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: initiating cordon (currently schedulable: false)
I1016 10:49:14.033293       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: RunCordonOrUncordon() succeeded but node is still not in cordon state, retrying
I1016 10:50:34.033479       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: initiating cordon (currently schedulable: true)
I1016 10:50:34.045013       1 node_controller.go:676] Pool worker[zone=us-east-2a]: node ip-10-0-4-55.us-east-2.compute.internal: Reporting unready: node ip-10-0-4-55.us-east-2.compute.internal is reporting Unschedulable
I1016 10:50:34.052311       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: cordon succeeded (currently schedulable: false)
I1016 10:50:34.080422       1 drain_controller.go:192] node ip-10-0-4-55.us-east-2.compute.internal: initiating drain
I1016 10:50:34.095651       1 node_controller.go:676] Pool worker[zone=us-east-2a]: node ip-10-0-4-55.us-east-2.compute.internal: changed taints

  6. Check that the configuration is properly applied on all nodes

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Oct 16, 2025
@pablintino
Contributor

/jira refresh
/lgtm

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Oct 16, 2025
@openshift-ci-robot
Contributor

@pablintino: This pull request references Jira Issue OCPBUGS-63127, which is valid. The bug has been moved to the POST state.

7 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
  • release note type set to "Release Note Not Required"
  • dependent bug Jira Issue OCPBUGS-62341 is in the state Verified, which is one of the valid states (MODIFIED, ON_QA, VERIFIED)
  • dependent Jira Issue OCPBUGS-62341 targets the "4.21.0" version, which is one of the valid target versions: 4.21.0
  • bug has dependents

Requesting review from QA contact:
/cc @sergiordlr


@openshift-ci openshift-ci bot requested a review from sergiordlr October 16, 2025 16:44
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 16, 2025
@openshift-ci
Contributor

openshift-ci bot commented Oct 16, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: openshift-cherrypick-robot, pablintino

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 16, 2025
@sdodson
Member

sdodson commented Oct 17, 2025

/label staff-eng-approved

@openshift-ci openshift-ci bot added the staff-eng-approved Indicates a release branch PR has been approved by a staff engineer (formerly group/pillar lead). label Oct 17, 2025
@pablintino
Contributor

/label backport-risk-assessed

@openshift-ci openshift-ci bot added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Oct 17, 2025
@pablintino
Contributor

/verified by @sergiordlr

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Oct 17, 2025
@openshift-ci-robot
Contributor

@pablintino: This PR has been marked as verified by @sergiordlr.


@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 763fec8 and 2 for PR HEAD 95d1399 in total

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD f587a1b and 1 for PR HEAD 95d1399 in total

@chizhang21
Contributor

/retest

@openshift-ci
Contributor

openshift-ci bot commented Oct 22, 2025

@openshift-cherrypick-robot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name               Commit   Details  Required  Rerun command
ci/prow/bootstrap-unit  95d1399  link     false     /test bootstrap-unit

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 04d7cb3 into openshift:release-4.20 Oct 22, 2025
15 of 16 checks passed
@openshift-ci-robot
Contributor

@openshift-cherrypick-robot: Jira Issue Verification Checks: Jira Issue OCPBUGS-63127
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-63127 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

