BootID support for KMM #1061
Welcome @sriram-30!
Hi @sriram-30. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test.
Force-pushed from 4772809 to a81a33f
/ok-to-test
@sriram-30 looks good, just a couple of comments:
Force-pushed from fce1aca to 0ea2c8f
Thanks!
Force-pushed from 0ea2c8f to 06082b3
Force-pushed from 06082b3 to 9908fbd
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: sriram-30, yevgeny-shnaidman
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@            Coverage Diff             @@
##             main    #1061      +/-   ##
==========================================
- Coverage   79.09%   74.42%   -4.67%
==========================================
  Files          51       77      +26
  Lines        5109     6886    +1777
==========================================
+ Hits         4041     5125    +1084
- Misses        882     1549     +667
- Partials      186      212      +26
/lgtm
Done now, thanks!
Sorry, I just noticed the commit message. @sriram-30 Can you please update the commit message to a more descriptive one? (I will re-approve.) Something like
Force-pushed from 9908fbd to 377de92
Sure, I've added that description to the commit message now and re-pushed.
@sriram-30 Sorry to keep bugging you. Can you please keep the lines in the commit message short so that it is more readable?
BootID KMM support: KMM checks whether a node has rebooted by inspecting the Ready timestamp on the node and checking whether it is newer than the last Ready timestamp recorded. Kubernetes has a grace period: if a node stops reporting heartbeats for that long, the k8s API server marks it as not ready. In some cases, the reboot is so fast that the node becomes Ready again before the k8s API has even noticed it went down, and in those cases we need to make sure KMM catches it and reloads the kmod on the node. This is done by comparing the node's status.nodeInfo.bootID, which is unique per boot, with the last recorded value.
Force-pushed from 377de92 to 0cdc3fb
No problem :)
/lgtm
GitHub runners are under maintenance; hopefully CI will be working again soon.
Fixes #1062
KMM loads the driver again after a node reboot once it detects the node moving from Ready to NotReady and back to Ready.
In some cases, such as a VM node reboot, the reboot happens so fast that k8s never reports NotReady because of its reporting frequency. The node status, however, contains a bootID that changes after every reboot.
This PR adds support for KMM to detect a node reboot using this bootID in cases where the reboot happens fast and the NotReady state is never reached.
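The reboot check described above can be sketched roughly as follows. This is a minimal Go illustration of the idea, not the code from this PR: `nodeSnapshot`, `recordedState`, and `rebooted` are hypothetical names, the timestamps are plain Unix seconds rather than Kubernetes types, and treating an empty recorded boot ID as "no reboot" is an assumption made here to avoid a spurious reload when no value was stored before.

```go
package main

import "fmt"

// nodeSnapshot is a hypothetical, simplified view of the node fields the
// check needs. It is not a KMM or Kubernetes API type.
type nodeSnapshot struct {
	ReadyTransition int64  // Unix time of the last Ready condition transition
	BootID          string // value of status.nodeInfo.bootID
}

// recordedState is a hypothetical stand-in for the values stored the last
// time the kmod was loaded on the node.
type recordedState struct {
	ReadyTransition int64
	BootID          string
}

// rebooted reports whether the node appears to have rebooted since rec
// was recorded.
func rebooted(node nodeSnapshot, rec recordedState) bool {
	// Slow reboot: the node went NotReady and came back, so the Ready
	// timestamp is newer than the recorded one.
	if node.ReadyTransition > rec.ReadyTransition {
		return true
	}
	// Fast reboot: the Ready timestamp did not move, but the boot ID did.
	// Assumption: an empty recorded boot ID means "nothing to compare".
	return rec.BootID != "" && node.BootID != rec.BootID
}

func main() {
	rec := recordedState{ReadyTransition: 100, BootID: "boot-a"}

	unchanged := nodeSnapshot{ReadyTransition: 100, BootID: "boot-a"}
	slowReboot := nodeSnapshot{ReadyTransition: 200, BootID: "boot-b"}
	fastReboot := nodeSnapshot{ReadyTransition: 100, BootID: "boot-b"}

	fmt.Println("unchanged:", rebooted(unchanged, rec))
	fmt.Println("slow reboot:", rebooted(slowReboot, rec))
	fmt.Println("fast reboot:", rebooted(fastReboot, rec))
}
```

The `fastReboot` case is the one this PR targets: the Ready timestamp alone would not reveal the reboot, while the boot ID comparison does.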
Verified that KMM detects the bootID change and triggers the worker after the node is rebooted.
BootID present in the NMC: