-
Notifications
You must be signed in to change notification settings - Fork 462
NO-ISSUE: Fix Failure Status condition to True when kubeletconfig or container runtime config validation fails #5542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…runtime config validation fails
|
Skipping CI for Draft Pull Request. |
|
/test all |
|
/test unit |
|
/payload-job periodic-ci-openshift-machine-config-operator-release-4.21-periodics-e2e-aws-mco-disruptive-techpreview-1of2 |
|
@ngopalak-redhat: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/24844ac0-efc9-11f0-9415-bd4d485cb97a-0 |
|
/payload-job periodic-ci-openshift-machine-config-operator-release-4.22-periodics-e2e-aws-mco-disruptive-techpreview-1of2 |
|
@ngopalak-redhat: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/3339db20-efc9-11f0-899d-abca408d040f-0 |
sairameshv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this fix requires UTs but they are a good to have anyway.
/lgtm
|
/retest |
|
/retitle: NO-ISSUE: Fix Failure Status condition to True when kubeletconfig or container runtime config validation fails |
|
@ngopalak-redhat: This pull request explicitly references no jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@ngopalak-redhat: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| condition = apihelpers.NewContainerRuntimeConfigCondition( | ||
| mcfgv1.ContainerRuntimeConfigFailure, | ||
| corev1.ConditionFalse, | ||
| corev1.ConditionTrue, | ||
| fmt.Sprintf("Error: %v", err), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is only one condition being set here? If we hit an error case here, does it mean that there could still be a mcfgv1.ContainerRuntimeConfigSuccess condition set to True from a previous update that persists?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. I have raised https://issues.redhat.com/browse/OCPNODE-4040 to fix it. The scope of this fix is to make sure that the reported "Failure" is set to "False"
The scope of this immediate fix is to ensure that the reported 'Failure' condition is set to 'False'.
A larger fix (potentially as part of OCPNODE-4040 or a separate task) might need to define a new status type, such as KubeletConfigAccepted, rather than relying on generic 'Failure' or 'Success' types. That would likely require an API change (see here: [Link]). Therefore, I am limiting the scope of this current change to just fixing the 'Failure' condition logic.
The larger fix as part of https://issues.redhat.com/browse/OCPNODE-4040 might need to define a new status instead of using "Failure" or "Success". It needs to be more meaningful like "KubeletConfigAccepted" or something similar. May be an API change also https://github.com/openshift/api/blob/master/machineconfiguration/v1/types.go#L801.
Hence scope of this fix is to just fix the condition "Failure".
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: haircommander, ngopalak-redhat, sairameshv, umohnani8 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
- What I did
Fixed the
wrapErrorWithConditionfunction in bothkubelet-configandcontainer-runtime-configcontrollers to correctly report failure status asTrueinstead ofFalsewhen validation errors occur.The function was incorrectly setting Failure condition status to False when errors occurred. According to Kubernetes condition semantics, a condition type should be True when that condition is present. So
Failure=Truemeans a failure has occurred, notFailure=False.- How to verify it
- Description for the changelog
Fix KubeletConfig and ContainerRuntimeConfig Failure condition status to correctly report True when validation errors occur.