Prediction-based filesystem alert may conflict with kubelet image GC (85% threshold) #2801

@sratslla

What happened?

We observed that kubelet performs automatic image garbage collection when disk usage exceeds the default imageGCHighThresholdPercent (85%).

Reference:
https://kubernetes.io/docs/concepts/architecture/garbage-collection/#containers-images

By default:

  • imageGCHighThresholdPercent = 85%
  • imageGCLowThresholdPercent = 80%

However, kube-prometheus includes a prediction-based alert similar to:

predict_linear(node_filesystem_free_bytes{device=~"/.*"}[2d], 3600 * 24 * 5) < 0

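This fires when a linear extrapolation of the last 2 days of free-space samples drops below zero 5 days ahead. In other words, it predicts filesystem exhaustion within 5 days based on recent growth.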
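
For reference, my understanding is that predict_linear fits a least-squares line through the samples in the range and evaluates it the given number of seconds after the evaluation time. Below is a rough Python sketch of that calculation. It is not Prometheus's actual code; the function name, argument layout, and sample data are my own, for illustration only.

```python
def predict_linear(samples, eval_time_s, horizon_s):
    """Extrapolate a least-squares line fitted to (timestamp_s, value) samples.

    Returns the fitted value horizon_s seconds after eval_time_s.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to fit a line")
    # Measure time relative to the evaluation time so the intercept is "now".
    xs = [t - eval_time_s for t, _ in samples]
    ys = [v for _, v in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("all samples share the same timestamp")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept + slope * horizon_s


# Made-up data: free space in GiB, sampled hourly for 2 days, shrinking 1 GiB/hour.
HOUR = 3600
samples = [(h * HOUR, 100.0 - h) for h in range(49)]
predicted = predict_linear(samples, eval_time_s=48 * HOUR, horizon_s=5 * 24 * HOUR)
would_fire = predicted < 0  # mirrors the "< 0" comparison in the expression above
```

The point is that the fit only sees the recent trend. It has no notion of a threshold at which kubelet will step in and free space.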

In practice, we observed:

  • Disk usage increased rapidly
  • Alert fired predicting exhaustion
  • Kubelet GC triggered automatically at ~85%
  • Disk usage dropped
  • Node remained healthy
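
The sequence above can be reproduced with a toy simulation. This is only a sketch under simplifying assumptions that I chose for illustration: usage grows at a constant rate, all growth comes from images that GC can reclaim, GC brings usage straight back to the low threshold, and time is counted in hours rather than seconds. All names and numbers are hypothetical.

```python
def extrapolate(points, now, horizon):
    """Least-squares fit of (time, value) points, evaluated `horizon` after `now`."""
    xs = [t - now for t, _ in points]
    ys = [v for _, v in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return (mean_y - slope * mean_x) + slope * horizon


def simulate(hours=96, start_used=60.0, growth=0.5,
             gc_high=85.0, gc_low=80.0, window=48, horizon_h=120):
    """Return (hours at which the prediction was < 0, peak used percent)."""
    used = start_used
    history = []        # (hour, free percent) samples, one per hour
    fired_hours = []
    peak_used = used
    for hour in range(hours):
        history.append((hour, 100.0 - used))
        recent = history[-window:]          # roughly a 2-day lookback
        if len(recent) >= 2 and extrapolate(recent, hour, horizon_h) < 0:
            fired_hours.append(hour)
        used += growth                      # steady image growth
        peak_used = max(peak_used, used)
        if used >= gc_high:                 # assumed GC behavior: drop to low threshold
            used = gc_low
    return fired_hours, peak_used


fired_hours, peak_used = simulate()
```

With these numbers the prediction goes negative while usage is still well below 85%, yet usage never exceeds the GC high threshold, which matches what we saw on the node.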

The alert therefore looks like a false positive: it fires during normal kubelet behavior, and the predicted exhaustion never happens.

Question:
Should the prediction alert take kubelet's image GC threshold into account?
For example, should the alert be suppressed if usage is below imageGCHighThresholdPercent, since kubelet will automatically intervene?
Or is the intended design that operators tune this alert manually?
I would appreciate guidance on how to align kubelet GC behavior with disk prediction alerts.
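
To make the suppression idea concrete, here is the condition I have in mind, written as a small Python function. This is a hypothetical sketch, not an existing kube-prometheus rule; the function name and parameters are made up, and 85 is just the kubelet default quoted above.

```python
def should_alert(predicted_free_bytes, size_bytes, free_bytes, gc_high_pct=85.0):
    """Fire only if exhaustion is predicted AND usage has already reached the
    threshold at which kubelet image GC would be expected to start."""
    used_pct = 100.0 * (size_bytes - free_bytes) / size_bytes
    return predicted_free_bytes < 0 and used_pct >= gc_high_pct
```

One caveat I am aware of: image GC only removes unused images, so a gate like this could hide real problems when the growth comes from something else, such as logs or volume data.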

Thanks.

Environment

  • Kubernetes: v1.33.0
  • Prometheus: v3.5.0 (quay.io/prometheus/prometheus:v3.5.0)
  • Prometheus Operator: quay.io/prometheus-operator/prometheus-operator:v0.85.0
