@@ -364,23 +364,26 @@ classes](https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduli
Upon shutdown Kubelet will:

- 1. Gracefully terminate all non critical system pods with a gracePeriodOverride
- computed as `min(podSpec.terminationGracePeriodSeconds, ShutdownGracePeriod)`
- 2. Gracefully terminate all critical system pods with gracePeriodOverride of 2
- seconds
+ 1. Update the Node's `Ready` condition to `false`, with the reason `Node is
+ shutting down`
+ 2. Gracefully terminate all non critical system pods with a gracePeriodOverride
+ computed as `min(podSpec.terminationGracePeriodSeconds,
+ ShutdownGracePeriod-ShutdownGracePeriodCriticalPods)`
+ 3. Gracefully terminate all critical system pods with gracePeriodOverride of
+ `ShutdownGracePeriodCriticalPods` seconds

Kubelet will use the same existing
[killPod](https://github.com/kubernetes/kubernetes/blob/release-1.19/pkg/kubelet/pod_workers.go#L292)
function to perform the termination of pods, using `gracePeriodOverride` to set
the appropriate grace period. During the termination process, normal [pod
termination](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
- processes will apply, e.g. preStopHooks will be called, SIGTERM to containers
+ processes will apply, e.g. `preStop` hooks will be called, `SIGTERM` to containers
delivered, etc.
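
The grace period rules in the added lines of this hunk can be sketched in Go as follows. This is a minimal illustration, not the kubelet implementation: the `pod` struct, the `gracePeriodOverride` helper, and the 30s/10s values are all hypothetical and chosen only to show the `min(...)` rule for non critical pods and the fixed budget for critical pods.

```go
package main

import (
	"fmt"
	"time"
)

// pod is a hypothetical, simplified stand-in for the fields of a pod that
// matter during node shutdown. It is not a Kubernetes API type.
type pod struct {
	Name                          string
	Critical                      bool  // true for critical system pods
	TerminationGracePeriodSeconds int64 // taken from the pod spec
}

// gracePeriodOverride returns the grace period, in seconds, that would be
// applied when terminating p during node shutdown:
//   - critical pods get ShutdownGracePeriodCriticalPods
//   - all other pods get min(terminationGracePeriodSeconds,
//     ShutdownGracePeriod-ShutdownGracePeriodCriticalPods)
func gracePeriodOverride(p pod, shutdownGracePeriod, criticalPods time.Duration) int64 {
	if p.Critical {
		return int64(criticalPods.Seconds())
	}
	budget := int64((shutdownGracePeriod - criticalPods).Seconds())
	if p.TerminationGracePeriodSeconds < budget {
		return p.TerminationGracePeriodSeconds
	}
	return budget
}

func main() {
	// Illustrative values only; the proposal does not fix these numbers.
	total, critical := 30*time.Second, 10*time.Second
	pods := []pod{
		{Name: "web", TerminationGracePeriodSeconds: 60},
		{Name: "batch", TerminationGracePeriodSeconds: 5},
		{Name: "node-logger", Critical: true, TerminationGracePeriodSeconds: 30},
	}
	for _, p := range pods {
		fmt.Printf("%s: %ds\n", p.Name, gracePeriodOverride(p, total, critical))
	}
}
```

With these assumed values, `web` is capped at 20 seconds, `batch` keeps its own 5 seconds, and `node-logger` receives the 10 seconds reserved for critical pods.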

- 2 seconds as gracePeriodOverride for critical system pods was decided to ensure
- that they can also perform a graceful shutdown and 2 seconds is currently
- [defined](https://github.com/kubernetes/kubernetes/blob/release-1.19/pkg/kubelet/kuberuntime/kuberuntime_container.go#L626-L629)
- as the minimum grace period defined in the kubelet.
+ For `gracePeriodOverride` to take full effect, GitHub issue
+ [#92432](https://github.com/kubernetes/kubernetes/issues/92432) should also be
+ addressed, so that the override is respected for `preStop` hooks as well.

POC: I’ve prototyped an initial POC
[here](https://github.com/bobbypage/kubernetes/tree/shutdown) of the proposed
@@ -412,7 +415,10 @@ Consider including folks who also work outside the SIG or subproject.

* Kubelet does not receive shutdown event or is unable to create inhibitor lock
* Mitigation: Kubelet does not provide graceful shutdown to pods (same as
- today’s existing behavior)
+ today’s existing behavior). For alpha stage, to track shutdown behavior
+ and whether it was successful, we plan to add a debugging log statement just
+ prior to kubelet's shutdown process being completed, so it's possible
+ to verify if kubelet shut down the node gracefully.
* Kubelet is unable to update `InhibitDelayMaxSec` in logind to match that of
`kubeletConfig.ShutdownGracePeriod`
* If there are multiple logind configuration file overrides in
@@ -440,10 +446,18 @@ The design proposes adding a new KubeletConfig field `ShutdownGracePeriod` used
to specify total time period kubelet should delay shutdown by and thus total time
allocated to the graceful termination process.

+ In addition to `ShutdownGracePeriod`, another KubeletConfig field,
+ `ShutdownGracePeriodCriticalPods`, will be added. During the shutdown, the
+ `ShutdownGracePeriod-ShutdownGracePeriodCriticalPods` duration will be the grace
+ period for non critical system pods like user workloads, while the remaining
+ time of `ShutdownGracePeriodCriticalPods` will be the grace period for critical
+ pods like node logging daemonsets.
+
```
type KubeletConfiguration struct {
    ...
    ShutdownGracePeriod metav1.Duration
+   ShutdownGracePeriodCriticalPods metav1.Duration
}
```