experimental OOM handler #11588

@smira

Trigger

  • Monitor memory PSI (pressure stall information) via cgroups
  • Which PSI source matters most is tunable:
    • global cgroup
    • or high-priority cgroups: init/podruntime/runtime (?)
  • Threshold: the PSI value at which the OOM handler is triggered
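The trigger check could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: it parses text in the format of a cgroup v2 `memory.pressure` file and compares one value against a threshold. The names `PSI`, `parsePSI` and `shouldTrigger` are hypothetical, and using the `full avg10` value with a threshold of 5.0 is an assumption; the issue leaves both the PSI source and the threshold open.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// PSILine holds the values of one line ("some" or "full") of a PSI file.
type PSILine struct {
	Avg10, Avg60, Avg300 float64
	Total                uint64
}

// PSI is the parsed content of a memory.pressure file.
type PSI struct {
	Some, Full PSILine
}

// parsePSI parses text such as:
//
//	some avg10=0.00 avg60=0.00 avg300=0.00 total=0
//	full avg10=0.00 avg60=0.00 avg300=0.00 total=0
func parsePSI(text string) (PSI, error) {
	var psi PSI
	for _, line := range strings.Split(text, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue // skip empty lines
		}
		var dst *PSILine
		switch fields[0] {
		case "some":
			dst = &psi.Some
		case "full":
			dst = &psi.Full
		default:
			return PSI{}, fmt.Errorf("unexpected PSI line %q", line)
		}
		for _, kv := range fields[1:] {
			key, value, ok := strings.Cut(kv, "=")
			if !ok {
				return PSI{}, fmt.Errorf("malformed PSI field %q", kv)
			}
			var err error
			switch key {
			case "avg10":
				dst.Avg10, err = strconv.ParseFloat(value, 64)
			case "avg60":
				dst.Avg60, err = strconv.ParseFloat(value, 64)
			case "avg300":
				dst.Avg300, err = strconv.ParseFloat(value, 64)
			case "total":
				dst.Total, err = strconv.ParseUint(value, 10, 64)
			}
			if err != nil {
				return PSI{}, fmt.Errorf("parsing %q: %w", kv, err)
			}
		}
	}
	return psi, nil
}

// shouldTrigger reports whether the handler should act.
// Assumption: the "full avg10" value is the one compared to the threshold.
func shouldTrigger(psi PSI, threshold float64) bool {
	return psi.Full.Avg10 >= threshold
}

func main() {
	// Sample data; a real handler would read a cgroup's memory.pressure file.
	sample := "some avg10=12.50 avg60=3.00 avg300=1.00 total=12345\n" +
		"full avg10=8.25 avg60=2.00 avg300=0.50 total=6789\n"
	psi, err := parsePSI(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("trigger:", shouldTrigger(psi, 5.0)) // 5.0 is a hypothetical threshold
}
```

Weighting the global cgroup against the high-priority cgroups would then amount to calling `shouldTrigger` on several parsed files with different thresholds.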

Pick target

  • First, sort by high-level group:
    • kubepods (workloads)
    • podruntime (CRI, kubelet, etcd)
    • runtime (core containerd, system services)
    • init
  • Second, inside kubepods we have QoS groups:
    • first priority: BestEffort
    • second: Burstable
    • last: Guaranteed
  • Third, look into other attributes, e.g. OOM score.
  • Fourth, look at the remaining headroom: memory max minus memory current (if memory max is set).

Pick one or more targets to kill.
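The four-level ordering above could be expressed as a single comparison function. The sketch below is only one possible reading: all type and function names are hypothetical, and three points the issue does not specify are filled with labeled assumptions (a higher OOM score is killed first, smaller headroom is killed first, and a cgroup without memory max is treated as having unlimited headroom).

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Class is the high-level cgroup group, in kill order (first is killed first).
type Class int

const (
	ClassKubepods Class = iota
	ClassPodRuntime
	ClassRuntime
	ClassInit
)

// QoS is the Kubernetes QoS group inside kubepods, in kill order.
type QoS int

const (
	QoSBestEffort QoS = iota
	QoSBurstable
	QoSGuaranteed
)

// Candidate describes one cgroup that could be killed (hypothetical type).
type Candidate struct {
	Path          string
	Class         Class
	QoS           QoS // only meaningful for ClassKubepods
	OOMScore      int
	MemoryMax     uint64 // 0 means memory max is not set
	MemoryCurrent uint64
}

// headroom returns memory max minus memory current.
// Assumption: no limit counts as unlimited headroom.
func headroom(c Candidate) uint64 {
	if c.MemoryMax == 0 {
		return math.MaxUint64
	}
	if c.MemoryCurrent >= c.MemoryMax {
		return 0
	}
	return c.MemoryMax - c.MemoryCurrent
}

// less reports whether a should be killed before b.
func less(a, b Candidate) bool {
	if a.Class != b.Class {
		return a.Class < b.Class
	}
	if a.Class == ClassKubepods && a.QoS != b.QoS {
		return a.QoS < b.QoS
	}
	if a.OOMScore != b.OOMScore {
		return a.OOMScore > b.OOMScore // assumption: higher score goes first
	}
	return headroom(a) < headroom(b) // assumption: less headroom goes first
}

// rank returns a sorted copy of the candidates, best kill target first.
func rank(candidates []Candidate) []Candidate {
	ranked := append([]Candidate(nil), candidates...)
	sort.SliceStable(ranked, func(i, j int) bool { return less(ranked[i], ranked[j]) })
	return ranked
}

func main() {
	ranked := rank([]Candidate{
		{Path: "init", Class: ClassInit},
		{Path: "kubepods/guaranteed/a", Class: ClassKubepods, QoS: QoSGuaranteed},
		{Path: "kubepods/besteffort/b", Class: ClassKubepods, QoS: QoSBestEffort},
		{Path: "podruntime/kubelet", Class: ClassPodRuntime},
	})
	for _, c := range ranked {
		fmt.Println(c.Path)
	}
}
```

Picking more than one target would then mean taking a prefix of the ranked slice.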

Kill

Kill the whole cgroup.
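On cgroup v2 with Linux 5.14 or newer, one way to kill a whole cgroup is to write `1` to its `cgroup.kill` file, which sends SIGKILL to every process in that cgroup and its descendants. Whether the handler will use this mechanism is an assumption; the issue only says the whole cgroup is killed. The function name `killCgroup` is hypothetical, and `main` below runs against a temporary stand-in directory rather than a real cgroup.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// killCgroup writes "1" to <cgroupDir>/cgroup.kill.
// The file is opened without O_CREATE, so a missing file is an error.
func killCgroup(cgroupDir string) error {
	path := filepath.Join(cgroupDir, "cgroup.kill")
	f, err := os.OpenFile(path, os.O_WRONLY, 0)
	if err != nil {
		return fmt.Errorf("opening %s: %w", path, err)
	}
	_, writeErr := f.WriteString("1")
	closeErr := f.Close()
	if writeErr != nil {
		return fmt.Errorf("writing %s: %w", path, writeErr)
	}
	if closeErr != nil {
		return fmt.Errorf("closing %s: %w", path, closeErr)
	}
	return nil
}

func main() {
	// Stand-in for a real cgroup directory, so the demo kills nothing.
	dir, err := os.MkdirTemp("", "fake-cgroup")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.RemoveAll(dir)

	killFile := filepath.Join(dir, "cgroup.kill")
	if err := os.WriteFile(killFile, nil, 0o644); err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	if err := killCgroup(dir); err != nil {
		fmt.Println("kill failed:", err)
		return
	}
	data, err := os.ReadFile(killFile)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("wrote %q to %s\n", data, killFile)
}
```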

Cooldown

Wait some time before the next kill, then go back to the beginning and resume monitoring PSI.
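The monitor, kill, cooldown cycle could look roughly like the sketch below. It is an illustration under stated assumptions: `Handler`, `Tick` and `runScenario` are hypothetical names, pressure reading and killing are injected as plain functions, and the 30-second cooldown, 10-second polling interval and threshold of 5 are made-up values.

```go
package main

import (
	"fmt"
	"time"
)

// Handler ties the trigger, kill and cooldown steps together (hypothetical).
type Handler struct {
	Threshold float64
	Cooldown  time.Duration
	Pressure  func() float64 // returns the current PSI value
	Kill      func()         // picks a target and kills it

	lastKill time.Time
	killed   bool
}

// Tick runs one monitoring iteration and reports whether a kill happened.
func (h *Handler) Tick(now time.Time) bool {
	// Still cooling down after the previous kill: do nothing.
	if h.killed && now.Sub(h.lastKill) < h.Cooldown {
		return false
	}
	if h.Pressure() < h.Threshold {
		return false
	}
	h.Kill()
	h.lastKill = now
	h.killed = true
	return true
}

// runScenario feeds a fixed series of pressure samples, taken 10 seconds
// apart, through a handler with a 30-second cooldown.
func runScenario() []bool {
	samples := []float64{8, 9, 9, 2, 7}
	var current float64
	h := &Handler{
		Threshold: 5,
		Cooldown:  30 * time.Second,
		Pressure:  func() float64 { return current },
		Kill:      func() {}, // no-op in this simulation
	}
	start := time.Unix(0, 0)
	results := make([]bool, 0, len(samples))
	for i, s := range samples {
		current = s
		now := start.Add(time.Duration(i) * 10 * time.Second)
		results = append(results, h.Tick(now))
	}
	return results
}

func main() {
	fmt.Println(runScenario())
}
```

In this simulation the second and third samples exceed the threshold but fall inside the cooldown window, so no kill happens until pressure is high again after the window has passed.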

Projects

Status

In Progress
