Commit 9cc97fe

KEP-1287: Update sample resize flow
1 parent 4e1ca58 commit 9cc97fe

keps/sig-node/1287-in-place-update-pod-resources/README.md

Lines changed: 58 additions & 64 deletions
@@ -498,113 +498,107 @@ The scheduler will use the maximum of:
498498

499499
The following steps denote the flow of a series of in-place resize operations
500500
for a Pod with ResizePolicy set to PreferNoRestart for all its Containers.
501-
This is intentionally hitting various edge-cases to demonstrate.
501+
This is intentionally hitting various edge-cases for demonstration.
502502

503-
```
504-
T=0: A new pod is created
503+
1. A new pod is created
505504
- `spec.containers[0].resources.requests[cpu]` = 1
505+
- `spec.containers[0].resizePolicy[cpu].restartPolicy` = `"PreferNoRestart"`
506506
- all status is unset
507507

508-
T=1: apiserver defaults are applied
508+
1. Pod is scheduled
509509
- `spec.containers[0].resources.requests[cpu]` = 1
510-
- `status.containerStatuses` = unset
511-
- `status.resize[cpu]` = unset
510+
- status still mostly unset
512511

513-
T=2: kubelet runs the pod and updates the API
512+
1. kubelet runs the pod and updates the API
514513
- `spec.containers[0].resources.requests[cpu]` = 1
515-
- `status.containerStatuses[0].allocatedResources[cpu]` = 1
516-
- `status.resize[cpu]` = unset
514+
- `status.resize` = unset
515+
- `status.allocatedResources.requests[cpu]` = 1
517516
- `status.containerStatuses[0].resources.requests[cpu]` = 1
517+
- actual CPU shares = 1024
518518

519-
T=3: Resize #1: cpu = 1.5 (via PUT or PATCH or /resize)
519+
1. Resize #1: cpu = 1.5 (via PUT or PATCH to /resize)
520520
- apiserver validates the request (e.g. `limits` are not below
521521
`requests`, ResourceQuota not exceeded, etc.) and accepts the operation
522522
- `spec.containers[0].resources.requests[cpu]` = 1.5
523+
- `status.resize` = unset
523524
- `status.containerStatuses[0].allocatedResources[cpu]` = 1
524525
- `status.containerStatuses[0].resources.requests[cpu]` = 1
526+
- actual CPU shares = 1024
525527

526-
T=4: Kubelet watching the pod sees resize #1 and accepts it
527-
- kubelet sends patch {
528-
`resourceVersion` = `<previous value>` # enable conflict detection
529-
`status.containerStatuses[0].allocatedResources[cpu]` = 1.5
530-
`status.resize[cpu]` = "InProgress"'
531-
}
528+
1. Kubelet syncs the pod, sees resize #1 and admits it
532529
- `spec.containers[0].resources.requests[cpu]` = 1.5
530+
- `status.resize` = `"InProgress"`
533531
- `status.containerStatuses[0].allocatedResources[cpu]` = 1.5
534-
- `status.resize[cpu]` = "InProgress"
535532
- `status.containerStatuses[0].resources.requests[cpu]` = 1
533+
- actual CPU shares = 1024
536534

537-
T=5: Resize #2: cpu = 2
535+
1. Resize #2: cpu = 2
538536
- apiserver validates the request and accepts the operation
539537
- `spec.containers[0].resources.requests[cpu]` = 2
540-
- `status.containerStatuses[0].allocatedResources[cpu]` = 1.5
538+
- `status.resize` = `"InProgress"`
539+
- `status.allocatedResources.requests[cpu]` = 1.5
541540
- `status.containerStatuses[0].resources.requests[cpu]` = 1
541+
- actual CPU shares = 1024
542542

543-
T=6: Container runtime applied cpu=1.5
544-
- kubelet sends patch {
545-
`resourceVersion` = `<previous value>` # enable conflict detection
546-
`status.containerStatuses[0].resources.requests[cpu]` = 1.5
547-
`status.resize[cpu]` = unset
548-
}
549-
- apiserver fails the operation with a "conflict" error
550-
551-
T=7: kubelet refreshes and sees resize #2 (cpu = 2)
552-
- kubelet decides this is possible, but not right now
553-
- kubelet sends patch {
554-
`resourceVersion` = `<updated value>` # enable conflict detection
555-
`status.containerStatuses[0].resources.requests[cpu]` = 1.5
556-
`status.resize[cpu]` = "Deferred"
557-
}
543+
1. Container runtime applied cpu=1.5
558544
- `spec.containers[0].resources.requests[cpu]` = 2
559-
- `status.containerStatuses[0].allocatedResources[cpu]` = 1.5
560-
- `status.resize[cpu]` = "Deferred"
545+
- `status.resize` = `"InProgress"`
546+
- `status.allocatedResources.requests[cpu]` = 1.5
547+
- `status.containerStatuses[0].resources.requests[cpu]` = 1
548+
- actual CPU shares = 1536
549+
550+
1. kubelet syncs the pod, and sees resize #2 (cpu = 2)
551+
- kubelet decides this is feasible, but there are currently insufficient resources available
552+
- `spec.containers[0].resources.requests[cpu]` = 2
553+
- `status.resize[cpu]` = `"Deferred"`
554+
- `status.allocatedResources.requests[cpu]` = 1.5
561555
- `status.containerStatuses[0].resources.requests[cpu]` = 1.5
556+
- actual CPU shares = 1536
562557

563-
T=8: Resize #3: cpu = 1.6
558+
1. Resize #3: cpu = 1.6
564559
- apiserver validates the request and accepts the operation
565560
- `spec.containers[0].resources.requests[cpu]` = 1.6
566-
- `status.containerStatuses[0].allocatedResources[cpu]` = 1.5
561+
- `status.resize[cpu]` = `"Deferred"`
562+
- `status.allocatedResources.requests[cpu]` = 1.5
567563
- `status.containerStatuses[0].resources.requests[cpu]` = 1.5
564+
- actual CPU shares = 1536
568565

569-
T=9: Kubelet watching the pod sees resize #3 and accepts it
570-
- kubelet sends patch {
571-
`resourceVersion` = `<previous value>` # enable conflict detection
572-
`status.containerStatuses[0].allocatedResources[cpu]` = 1.6
573-
`status.resize[cpu]` = "InProgress"'
574-
}
566+
1. Kubelet syncs the pod, sees resize #3, and admits it
575567
- `spec.containers[0].resources.requests[cpu]` = 1.6
576-
- `status.containerStatuses[0].allocatedResources[cpu]` = 1.6
577-
- `status.resize[cpu]` = "InProgress"
568+
- `status.resize[cpu]` = `"InProgress"`
569+
- `status.allocatedResources.requests[cpu]` = 1.6
578570
- `status.containerStatuses[0].resources.requests[cpu]` = 1.5
571+
- actual CPU shares = 1536
579572

580-
T=10: Container runtime applied cpu=1.6
581-
- kubelet sends patch {
582-
`resourceVersion` = `<previous value>` # enable conflict detection
583-
`status.containerStatuses[0].resources.requests[cpu]` = 1.6
584-
`status.resize[cpu]` = unset
585-
}
573+
1. Container runtime applied cpu=1.6
574+
- `spec.containers[0].resources.requests[cpu]` = 1.6
575+
- `status.resize[cpu]` = `"InProgress"`
576+
- `status.allocatedResources.requests[cpu]` = 1.6
577+
- `status.containerStatuses[0].resources.requests[cpu]` = 1.5
578+
- actual CPU shares = 1638
579+
580+
1. Kubelet syncs the pod
586581
- `spec.containers[0].resources.requests[cpu]` = 1.6
587-
- `status.containerStatuses[0].allocatedResources[cpu]` = 1.6
588582
- `status.resize[cpu]` = unset
583+
- `status.allocatedResources.requests[cpu]` = 1.6
589584
- `status.containerStatuses[0].resources.requests[cpu]` = 1.6
585+
- actual CPU shares = 1638
590586

591-
T=11: Resize #4: cpu = 100
587+
1. Resize #4: cpu = 100
592588
- apiserver validates the request and accepts the operation
593589
- `spec.containers[0].resources.requests[cpu]` = 100
594-
- `status.containerStatuses[0].allocatedResources[cpu]` = 1.6
590+
- `status.resize[cpu]` = unset
591+
- `status.allocatedResources.requests[cpu]` = 1.6
595592
- `status.containerStatuses[0].resources.requests[cpu]` = 1.6
593+
- actual CPU shares = 1638
596594

597-
T=12: Kubelet watching the pod sees resize #4
598-
- this node does not have 100 CPUs, so kubelet cannot accept
599-
- kubelet sends patch {
600-
`resourceVersion` = `<previous value>` # enable conflict detection
601-
`status.resize[cpu]` = "Infeasible"'
602-
}
595+
1. Kubelet syncs the pod, and sees resize #4
596+
- this node does not have 100 CPUs, so kubelet cannot admit it
603597
- `spec.containers[0].resources.requests[cpu]` = 100
604-
- `status.containerStatuses[0].allocatedResources[cpu]` = 1.6
605-
- `status.resize[cpu]` = "Infeasible"
598+
- `status.resize[cpu]` = `"Infeasible"`
599+
- `status.allocatedResources.requests[cpu]` = 1.6
606600
- `status.containerStatuses[0].resources.requests[cpu]` = 1.6
607-
```
601+
- actual CPU shares = 1638
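
The flow above can be sketched in a few lines of Python. This is an illustrative model only, not kubelet code. `milli_cpu_to_shares` follows the usual cgroup v1 conversion (1 CPU = 1024 shares, integer division), which reproduces the 1024, 1536, and 1638 values in the steps. `classify_resize` is a hypothetical helper, and the node size (4 CPUs allocatable, 2.4 CPUs held by other pods) is an assumption chosen so that the four resizes land on the same `status.resize` values as in the walkthrough.

```python
# Illustrative model of the sample resize flow; names and node sizes are
# assumptions for this sketch, not part of the KEP or the kubelet API.

SHARES_PER_CPU = 1024   # cgroup v1 shares for one full CPU
MILLI_PER_CPU = 1000    # milliCPU units in one CPU
MIN_SHARES = 2          # lower bound accepted for cpu.shares
MAX_SHARES = 262144     # upper bound accepted for cpu.shares


def milli_cpu_to_shares(milli_cpu: int) -> int:
    """Convert a CPU request in milliCPU to cgroup v1 CPU shares."""
    if milli_cpu == 0:
        return MIN_SHARES
    # Integer division truncates, so 1600m gives 1638 rather than 1638.4.
    shares = milli_cpu * SHARES_PER_CPU // MILLI_PER_CPU
    return max(MIN_SHARES, min(shares, MAX_SHARES))


def classify_resize(requested_milli: int,
                    node_allocatable_milli: int,
                    other_pods_milli: int) -> str:
    """Return the resize status a node agent might report for a request.

    Infeasible: the request can never fit on this node.
    Deferred:   it could fit, but not with what other pods hold right now.
    InProgress: it fits and is admitted, so allocation can be updated.
    """
    if requested_milli > node_allocatable_milli:
        return "Infeasible"
    if requested_milli + other_pods_milli > node_allocatable_milli:
        return "Deferred"
    return "InProgress"


if __name__ == "__main__":
    node_allocatable = 4000   # hypothetical node: 4 CPUs allocatable
    other_pods = 2400         # hypothetical: 2.4 CPUs held by other pods
    for label, request in [("#1", 1500), ("#2", 2000),
                           ("#3", 1600), ("#4", 100_000)]:
        status = classify_resize(request, node_allocatable, other_pods)
        print(f"resize {label}: cpu={request}m -> {status}, "
              f"shares if applied = {milli_cpu_to_shares(request)}")
```

Keeping admission (`classify_resize`) separate from actuation (`milli_cpu_to_shares`) mirrors the distinction the steps draw between allocated resources and actual CPU shares: a resize can be admitted before the runtime has applied it.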
608602

609603
#### Container resource limit update ordering
610604
