KubeClientCertificateExpiration alert false positives with cloud providers #6161

@romankucherov-cmyk

Problem

The KubeClientCertificateExpiration alert produces false positives on cloud providers that automatically renew certificates 7 days before expiration (for example, DigitalOcean).

Current Behavior

  • Affected cloud providers start certificate renewal exactly 7 days (604,800 seconds) before expiration
  • The warning rule's threshold is the same 7 days (< 604800), with a for: 5m duration
  • The alert fires during the auto-renewal operation and resolves within ~5 minutes
  • The operations team receives warnings even though nothing is wrong with the certificates
  • The result is unnecessary noise and alert fatigue
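
The timing overlap described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: it models the remaining certificate lifetime as a single number (the real rule evaluates a metric from the API server), it treats for: 5m as "the condition held continuously for 300 seconds", and the function name alert_fires and the 10-minute and 2-minute renewal durations are hypothetical values chosen for illustration, not figures from any provider.

```python
SECONDS_PER_DAY = 24 * 60 * 60
RENEWAL_LEAD_S = 7 * SECONDS_PER_DAY  # provider starts renewing at 604800 s remaining


def alert_fires(threshold_s, renewal_duration_s, for_s=300, step_s=30):
    """Return True if 'remaining < threshold' holds for at least for_s
    seconds before the renewal completes (simplified 'for:' semantics)."""
    pending_since = None
    for elapsed in range(0, renewal_duration_s + 1, step_s):
        # The old certificate keeps ageing while the renewal is in progress.
        remaining = RENEWAL_LEAD_S - elapsed
        if remaining < threshold_s:
            if pending_since is None:
                pending_since = elapsed  # condition became true: alert is pending
            if elapsed - pending_since >= for_s:
                return True  # pending long enough: alert fires
        else:
            pending_since = None  # condition cleared: reset the pending timer
    return False


# Hypothetical 10-minute renewal against the current 7-day threshold.
print(alert_fires(threshold_s=604800, renewal_duration_s=600))  # True
# A hypothetical 2-minute renewal finishes before the 'for' window elapses.
print(alert_fires(threshold_s=604800, renewal_duration_s=120))  # False
```

Under these assumptions, any renewal that takes longer than the for: duration is enough to trip a threshold that sits exactly at the renewal point.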

Expected Behavior

  • A buffer should prevent warnings during planned auto-renewal
  • Certificates that are really expiring should still trigger warnings at least 24 hours in advance
  • Minimal changes to existing chart configuration

Proposed Solution

Change the warning threshold from 7 days (604800 seconds) to 6 days 22 hours (597600 seconds).

Why This Works

  • Creates 2-hour buffer between cloud provider operations and alerting
  • Prevents false positives while maintaining security monitoring
  • Critical alert (<24 hours) remains unchanged for real emergencies
  • Minimal impact - still catches genuine certificate issues with 6+ days warning
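
The buffer arithmetic can be checked with a few lines of Python. This is a sketch of the calculation only; the variable names are illustrative and are not taken from the chart.

```python
from datetime import timedelta

renewal_lead = timedelta(days=7)         # when the provider starts renewing
buffer = timedelta(hours=2)              # gap between renewal start and alerting
warning_threshold = renewal_lead - buffer
critical_threshold = timedelta(hours=24)  # existing critical alert, unchanged

print(int(renewal_lead.total_seconds()))       # 604800
print(int(warning_threshold.total_seconds()))  # 597600
print(warning_threshold)                       # 6 days, 22:00:00

# The warning still leads the critical alert by several days.
print(warning_threshold - critical_threshold)  # 5 days, 22:00:00
```

A 7-day renewal point minus a 2-hour buffer therefore corresponds to a threshold of 597600 seconds, which leaves well over the 24 hours of advance warning asked for above.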

Impact

  • Reduces operational noise from planned maintenance
  • Maintains security - real certificate failures still detected
  • Cloud-agnostic - benefits all users of cloud providers with 7-day renewal policies

Additional Context

  • Chart: kube-prometheus-stack
  • Rule: KubeClientCertificateExpiration (warning severity)
