fix: prevent nil pointer panic in NodeLifecycle handleDisruption #2516
Yashika0724 wants to merge 2 commits into openyurtio:master
Signed-off-by: Yashika0724 <ssyashika1311@gmail.com>
Hi @zyjhtangtang,
…eDisruption Signed-off-by: Yashika <ssyashika1311@gmail.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff
             master    #2516      +/-
Coverage     44.16%   44.17%   +0.01%
Files           399      399
Lines         26579    26584       +5
Hits          11738    11743       +5
Misses        13776    13776
Partials       1065     1065
Hi @zyjhtangtang



What does this PR do?
This PR fixes a potential nil pointer dereference in the NodeLifecycle controller when handling the transition out of disruption mode.
Background / Problem
The handleDisruption logic assumes that all nodes have corresponding entries in nodeHealthMap.
However, during edge or network-disrupted scenarios, this assumption may not always hold.
In particular, if node health updates return early due to API errors or pod listing failures, some nodes
may not be populated in the health map. When the controller later exits disruption mode and iterates
over all nodes, this can lead to a nil dereference and controller panic.
Why is this change needed?
This code path is exercised during recovery from disruption, which is a common and sensitive phase
in edge environments with intermittent connectivity. A controller panic at this point stops node
health processing, including tainting and eviction logic, until the controller is restarted.
What does this change do?
The controller now defensively initializes node health data when it is missing before updating
timestamps. This preserves existing behavior for healthy nodes while preventing a possible crash.
Scope of change
handleDisruption
Testing
This change is small and defensive in nature. It follows existing controller patterns and does not
affect normal execution paths.