26 | 26 | - [Container resource limit update ordering](#container-resource-limit-update-ordering)
27 | 27 | - [Container resource limit update failure handling](#container-resource-limit-update-failure-handling)
28 | 28 | - [CRI Changes Flow](#cri-changes-flow)
| 29 | + - [Kubelet Restart Analysis](#kubelet-restart-analysis) |
29 | 30 | - [Notes](#notes)
30 | 31 | - [Lifecycle Nuances](#lifecycle-nuances)
31 | 32 | - [Atomic Resizes](#atomic-resizes)
@@ -702,6 +703,42 @@ Pod Status in response to user changing the desired resources in Pod Spec.
702 | 703 | in ContainerStatus.Resources to update ContainerStatuses[i].Resources.Limits
703 | 704 | for that Container in the Pod's Status.
704 | 705 |
| 706 | +#### Kubelet Restart Analysis |
| 707 | + |
| 708 | +This section analyzes Kubelet restarts at various points during a resize and how the Kubelet recovers from each. |
| 709 | +Impacts of a restart outside of resource configuration are out of scope. |
| 710 | + |
| 711 | +1. Kubelet admits a new pod |
| 712 | + - Resource allocation checkpointed before sending the pod to the pod workers |
| 713 | + - Restart before checkpointing: pod goes through admission again as if new |
| 714 | + - Restart after checkpointing: pod goes through admission using the allocated resources |
| 715 | +1. Kubelet creates a container |
| 716 | + - Resources acknowledged after CreateContainer call succeeds |
| 717 | + - Restart before acknowledgement: Kubelet issues a superfluous UpdatePodResources request |
| 718 | + - Restart after acknowledgement: No resize needed |
| 719 | +1. Container starts, triggering a pod sync event |
| 720 | + - Kubelet updates status with actual resources reported by the runtime and allocated resources from the checkpoint |
| 721 | + - Allocated == Acknowledged, so no resize needed |
| 722 | + - No races around restart. |
| 723 | +1. Pod is resized in the API, Kubelet observes the update |
| 724 | + - Triggers a pod sync |
| 725 | + - On restart, Kubelet reads the latest pod from the API and triggers a pod sync, so same effect as |
| 726 | + observing the update. |
| 727 | +1. Updated pod is synced: Check if pod can be admitted |
| 728 | + - No: resize status is deferred, no change to allocated resources |
| 729 | + - Restart: redo admission check, still deferred. |
| 730 | + - Yes: resize status is in-progress, update allocated checkpoint |
| 731 | + - Restart before update: readmit, then update allocated |
| 732 | + - Restart after update: allocated != acknowledged --> proceed with resize |
| 733 | +1. Allocated != Acknowledged |
| 734 | + - Trigger an `UpdateContainerResources` CRI call, then update Acknowledged resources on success |
| 735 | + - Restart before CRI call: allocated != acknowledged, will still trigger the update call |
| 736 | + - Restart after CRI call, before acknowledged update: will redo update call |
| 737 | + - Restart after acknowledged update: allocated == acknowledged, resize status cleared |
| 738 | +1. PLEG updates PodStatus cache, triggers pod sync |
| 739 | + - Pod status updated with actual resources, resize status cleared |
| 740 | + - Desired == Allocated == Acknowledged, no resize changes needed. |
| 741 | + |
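The recovery argument in the list above rests on one property: the next step is derived only from comparing the desired, allocated, and acknowledged resources, so redoing a step after a restart is harmless. The following Go sketch illustrates that idea under simplifying assumptions. The names `Resources`, `State`, `NextAction`, `Sync`, and the `fits` admission callback are hypothetical and are not Kubelet APIs; checkpointing and the CRI call are represented only by field assignments.

```go
package main

import "fmt"

// Resources is a simplified, hypothetical stand-in for a container's resources.
type Resources struct {
	CPUMilli int64
	MemBytes int64
}

// State holds the three views compared in the analysis. Desired comes from
// the pod spec in the API. Allocated and Acknowledged are assumed to be
// checkpointed, so they survive a Kubelet restart.
type State struct {
	Desired      Resources
	Allocated    Resources
	Acknowledged Resources
}

// Action names the step a pod sync would take next.
type Action string

const (
	ActionAdmit  Action = "admit-resize"
	ActionUpdate Action = "update-container-resources"
	ActionNone   Action = "none"
)

// NextAction derives the next step purely from the persisted state, which is
// why a restart at any point resumes with the same (possibly repeated) step.
func NextAction(s State) Action {
	switch {
	case s.Desired != s.Allocated:
		return ActionAdmit // run (or rerun) the admission check
	case s.Allocated != s.Acknowledged:
		return ActionUpdate // issue (or reissue) the runtime update
	default:
		return ActionNone
	}
}

// Sync performs one step. fits is a hypothetical admission check.
func Sync(s State, fits func(Resources) bool) (State, Action) {
	a := NextAction(s)
	switch a {
	case ActionAdmit:
		if fits(s.Desired) {
			s.Allocated = s.Desired // stands in for updating the allocated checkpoint
		}
		// If it does not fit, the resize stays deferred and nothing changes.
	case ActionUpdate:
		// A real Kubelet would call UpdateContainerResources here and only
		// record the acknowledgement on success.
		s.Acknowledged = s.Allocated
	}
	return s, a
}

func main() {
	small := Resources{CPUMilli: 250, MemBytes: 1 << 30}
	large := Resources{CPUMilli: 500, MemBytes: 1 << 30}
	s := State{Desired: large, Allocated: small, Acknowledged: small}
	alwaysFits := func(Resources) bool { return true }
	for {
		var a Action
		s, a = Sync(s, alwaysFits)
		fmt.Println(a)
		if a == ActionNone {
			break
		}
	}
}
```

Because `NextAction` reads only persisted values, simulating a restart amounts to calling `Sync` again with the same `State`: the worst case is a repeated admission check or a repeated update call, matching the "redo" outcomes listed above.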
705 | 742 | #### Notes
706 | 743 |
707 | 744 | * If CPU Manager policy for a Node is set to 'static', then only integral