Keeps an eye on Secrets that contain an expiration date

stakater/expiring-secret-operator

Expiring Secrets Operator

A Kubernetes operator that monitors expiring credentials in Secrets and provides alerting through Prometheus metrics.

Overview

Many organizations use Personal Access Tokens (PATs), API keys, and other credentials that have expiration dates. When these credentials expire unnoticed, the result can be CI/CD pipeline failures, failed deployments, and downtime.

This operator watches Kubernetes Secrets containing expiring credentials and:

  • ✅ Exposes Prometheus metrics for monitoring expiration times
  • 📊 Updates CR status with human-readable state information
  • 🚨 Supports configurable alert thresholds (Info, Warning, Critical)
  • 🔄 Works with External Secrets Operator and other secret management tools
  • 🏢 Follows OpenShift operator best practices

Quick Start

1. Deploy the Operator

# Build and deploy to local Kind cluster
make deploy-core

# Or build Docker image
make docker-build

2. Create a Secret with Expiration Label

apiVersion: v1
kind: Secret
metadata:
  name: docker-registry-token
  labels:
    expiringsecret.stakater.com/validUntil: "2026-03-15"  # YYYY-MM-DD format
data:
  token: <base64-encoded-token>

3. Create a Monitor Resource

apiVersion: expiring-secrets.stakater.com/v1alpha1
kind: Monitor
metadata:
  name: docker-registry-monitor
spec:
  service: docker.io
  secretRef:
    name: docker-registry-token
    namespace: default
  alertThresholds:
    infoDays: 30      # Start showing "Info" state 30 days before expiration
    warningDays: 14   # "Warning" state 14 days before  
    criticalDays: 7   # "Critical" state 7 days before

4. Check Status

kubectl get monitor docker-registry-monitor -o wide
# NAME                     REGISTRY   SECRET                 STATE   EXPIRES AT             AGE
# docker-registry-monitor  docker.io  docker-registry-token  Info    2026-03-15T00:00:00Z   1m

Prometheus Metrics

The operator exposes these metrics on /metrics:

# Absolute expiration timestamp (Unix time)
secretmonitor_valid_until_timestamp{registry="docker.io",name="docker-registry-monitor",namespace="default"} 1773532800

# Seconds remaining until expiry (1209600 = 14 days)
secretmonitor_seconds_until_expiry{registry="docker.io",name="docker-registry-monitor",namespace="default"} 1209600

# Reconciliation success/failure counter
secretmonitor_reconcile_total{monitor="docker-registry-monitor",namespace="default",result="success"} 5

Alert States

| State    | Description                                                   | Default Threshold    |
|----------|---------------------------------------------------------------|----------------------|
| Valid    | Secret is healthy and far from expiration                     | > 30 days            |
| Info     | Secret approaching expiration but not urgent                  | 14-30 days           |
| Warning  | Secret needs attention soon                                   | 7-14 days            |
| Critical | Secret expires very soon - action required                    | < 7 days             |
| Expired  | Secret has already expired                                    | Past expiration date |
| Error    | Cannot determine state (missing secret, invalid format, etc.) | N/A                  |

PrometheusRule Example

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: expiring-secrets-alerts
spec:
  groups:
  - name: expiring-secrets
    rules:
    - alert: SecretExpiringSoon
      expr: secretmonitor_seconds_until_expiry < 14 * 24 * 60 * 60  # 14 days
      for: 1h
      labels:
        severity: warning
      annotations:
        summary: "Secret {{ $labels.name }} expires in {{ $value | humanizeDuration }}"
    
    - alert: SecretExpiredCritical  
      expr: secretmonitor_seconds_until_expiry < 7 * 24 * 60 * 60   # 7 days
      for: 30m
      labels:
        severity: critical
      annotations:
        summary: "Secret {{ $labels.name }} expires VERY SOON: {{ $value | humanizeDuration }}"

Use Cases

  • Container Registry PATs: Docker Hub, Quay.io, GitHub Packages tokens
  • CI/CD Pipeline Tokens: GitHub Actions, GitLab CI, Jenkins credentials
  • API Keys: Third-party service credentials with expiration dates
  • Certificates: TLS certs, signing certificates (when stored as secrets)
  • Database Passwords: Auto-rotated credentials from secret management systems

Development

# Run tests (includes controller unit tests)
make test

# Run end-to-end tests
make test-e2e

# Generate CRD manifests
make manifests

# Format and vet code
make fmt vet

Architecture

graph TD
    SECRET[Secret with validUntil label]
    MONITOR_CR[Monitor CR]
    
    subgraph OPERATOR[Expiring Secrets Operator]
        RECONCILER[Reconciler]
        PARSER[Parse expiration date]
        STATUS_UPDATER[Update CR Status]
        METRICS_EXPORTER[Export Prometheus Metrics]
        
        RECONCILER --> PARSER
        PARSER --> STATUS_UPDATER
        PARSER --> METRICS_EXPORTER
    end
    
    subgraph MONITORING[Monitoring Stack]
        SM[ServiceMonitor] 
        PROM[Prometheus]
        AM[Alertmanager]
        
        SM --> PROM
        PROM --> AM
    end
    
    %% Watch relationships
    SECRET <-.->|watches| RECONCILER
    MONITOR_CR -.->|watches| RECONCILER
    
    %% Data flow
    MONITOR_CR -->|references| SECRET
    STATUS_UPDATER -->|status updates| MONITOR_CR
    METRICS_EXPORTER -->|/metrics| SM

Reconciliation Flow

flowchart TD
    START([Reconcile Event]) --> GET_MONITOR{Get Monitor CR}
    
    GET_MONITOR -->|Not Found| CLEANUP[Clean up metrics]
    CLEANUP --> END_SUCCESS([Success - Monitor deleted])
    
    GET_MONITOR -->|Found| DEFAULTS[Apply default alert thresholds]
    DEFAULTS --> GET_SECRET{Get referenced Secret}
    
    GET_SECRET -->|Not Found| ERROR_SECRET[Status: Error - Secret not found]
    ERROR_SECRET --> END_ERROR([Error - Requeue])
    
    GET_SECRET -->|Found| CHECK_LABEL{Has validUntil label?}
    
    CHECK_LABEL -->|No| ERROR_LABEL[Status: Error - Missing label]
    ERROR_LABEL --> END_ERROR
    
    CHECK_LABEL -->|Yes| PARSE_DATE{Parse date format?}
    
    PARSE_DATE -->|Invalid| ERROR_PARSE[Status: Error - Invalid date format]
    ERROR_PARSE --> END_ERROR
    
    PARSE_DATE -->|Valid| CALC_REMAINING[Calculate seconds remaining]
    CALC_REMAINING --> DETERMINE_STATE{Determine state}
    
    DETERMINE_STATE -->|<= 0 days| STATE_EXPIRED[Status: Expired] 
    DETERMINE_STATE -->|<= Critical days| STATE_CRITICAL[Status: Critical]
    DETERMINE_STATE -->|<= Warning days| STATE_WARNING[Status: Warning] 
    DETERMINE_STATE -->|<= Info days| STATE_INFO[Status: Info]
    DETERMINE_STATE -->|> Info days| STATE_VALID[Status: Valid]
    
    STATE_EXPIRED --> UPDATE_METRICS[Update Prometheus metrics]
    STATE_CRITICAL --> UPDATE_METRICS
    STATE_WARNING --> UPDATE_METRICS
    STATE_INFO --> UPDATE_METRICS
    STATE_VALID --> UPDATE_METRICS
    
    UPDATE_METRICS --> UPDATE_STATUS[Update Monitor status]
    UPDATE_STATUS --> REQUEUE[Requeue after 1 minute]
    REQUEUE --> END_SUCCESS
    
    %% Styling
    classDef errorPath fill:#ffcccc
    classDef successPath fill:#ccffcc
    classDef processPath fill:#cce5ff
    
    class ERROR_SECRET,ERROR_LABEL,ERROR_PARSE,END_ERROR errorPath
    class STATE_EXPIRED,STATE_CRITICAL,STATE_WARNING,STATE_INFO,STATE_VALID,END_SUCCESS successPath
    class DEFAULTS,CALC_REMAINING,UPDATE_METRICS,UPDATE_STATUS,REQUEUE processPath

Requirements

  • Kubernetes 1.25+
  • Monitored Secrets must carry the expiringsecret.stakater.com/validUntil label with a date in YYYY-MM-DD format
  • Prometheus operator (for ServiceMonitor/PrometheusRule)

License

Apache License 2.0
