-
Notifications
You must be signed in to change notification settings - Fork 167
fix: add FilterFunc to skip deleted Pods in ResisterResultSavingToInf… #435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: add FilterFunc to skip deleted Pods in ResisterResultSavingToInf… #435
Conversation
Hi @Park-Jiyeonn. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @utam0k @ordovicia @saza-ku
/ok-to-test
@sanposhiho: GitHub didn't allow me to request PR reviews from the following users: ordovicia, saza-ku. Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution.
While filtering out Pods where DeletionTimestamp != nil
can mitigate the issue, it doesn't prevent it completely. A Pod could still be deleted during the execution of UpdateFunc even if its DeletionTimestamp
was nil
at the beginning. This suggests that it might be impossible to avoid this race condition entirely.
As an alternative, what do you think about configuring the FilterFunc
so that the UpdateFunc
is executed only when necessary? The UpdateFunc
should run when a Pod has been successfully scheduled. Therefore, I suggest triggering the UpdateFunc
only when the nodeName
of the Pod has changed between the update, as this indicates that scheduling is complete.
If, during scheduling, a Pod doesn't pass the filter function and we only consider changes in the node name, then we won't be able to record the reason why it didn't pass the filter function. Do you mean that we don't need to care about Pods that fail to be scheduled? |
Hi @saza-ku Just wanted to kindly follow up on this PR 🙇 Thank you again for your time and feedback! |
Hope you're doing well! Just wanted to follow up on this PR again. I'd really appreciate it if you could take a look or let me know if there's anything else I should improve 🙇 Looking forward to your feedback. Thank you so much for your time! |
@sanposhiho @saza-ku PTAL |
@sanposhiho @saza-ku PTAL, Thx! |
Sorry, I noticed you pinged me many times, but I got my laptop broken and hence will be away from oss work until I get a new one (will arrive this week or next). |
@Park-Jiyeonn Sorry for being late. Actually I'm uncertain about the purpose of this change because of the following reasons.
Could you please provide more context or elaborate on the reasoning behind this change? |
Regarding the concern you mentioned — I believe I addressed this in my earlier comment #issuecomment-2995142507, but let me briefly summarize again:
|
Why not just ignore NotFound error when applying? |
That’s exactly the purpose of this change, I added the |
But, that is not perfect depending on the timing. If we just ignore the not found errors at the fetch API, that's perfect prevention, no? |
do you mean we don't need the |
I'd say we can keep your filter to reduce the num of fetch API calls (performance benefit), and we can add an additional error handling to ignore not-found errors for a perfect prevention of the issue. |
How about add a warning before FilterFunc: func(obj interface{}) bool {
if pod, ok := obj.(*corev1.Pod); ok {
if pod.DeletionTimestamp != nil {
add some warning or error
return false
}
} |
I don't think we need warning. Warning implies something goes wrong, but in this case, nothing goes wrong: It's just that the pod is being deleted (i.e., users would have no action to solve this warning) |
d89840d
to
a50456d
Compare
a50456d
to
9fddd8c
Compare
it's OK! Additionally ignore NotFound on Get/Update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Park-Jiyeonn, sanposhiho The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does
This PR adds a
FilterFunc
to theResisterResultSavingToInformer
method to prevent scheduling results from being written to Pods that are being deleted.Why
As described in issue #426, when a Pod is being deleted from the cluster, the informer may still receive update events, but attempts to annotate such Pods will fail because they no longer exist.
By adding a
FilterFunc
, we ensure that only Pods not marked for deletion (DeletionTimestamp == nil
) are processed, preventing errors during update handling.Related Issue
Fixes: #426
Additional Notes