A Kubernetes operator that monitors expiring credentials in Secrets and provides alerting through Prometheus metrics.
Many organizations use Personal Access Tokens (PATs), API keys, and other credentials that have expiration dates. When these expire without notice, it can lead to CI/CD pipeline failures, deployment issues, and downtime.
This operator watches Kubernetes Secrets containing expiring credentials and:
- Exposes Prometheus metrics for monitoring expiration times
- Updates CR status with human-readable state information
- Supports configurable alert thresholds (Info, Warning, Critical)
- Works with External Secrets Operator and other secret management tools
- Follows OpenShift operator best practices
# Build and deploy to local Kind cluster
make deploy-core
# Or build Docker image
make docker-build

apiVersion: v1
kind: Secret
metadata:
  name: docker-registry-token
  labels:
    expiringsecret.stakater.com/validUntil: "2026-03-15" # YYYY-MM-DD format
data:
  token: <base64-encoded-token>

apiVersion: expiring-secrets.stakater.com/v1alpha1
kind: Monitor
metadata:
  name: docker-registry-monitor
spec:
  service: docker.io
  secretRef:
    name: docker-registry-token
    namespace: default
  alertThresholds:
    infoDays: 30     # Start showing "Info" state 30 days before expiration
    warningDays: 14  # "Warning" state 14 days before
    criticalDays: 7  # "Critical" state 7 days before

kubectl get monitor docker-registry-monitor -o wide
# NAME                      REGISTRY    SECRET                  STATE   EXPIRES AT             AGE
# docker-registry-monitor   docker.io   docker-registry-token   Info    2026-03-15T00:00:00Z   1m

The operator exposes these metrics on /metrics:
# Absolute expiration timestamp (Unix time)
secretmonitor_valid_until_timestamp{registry="docker.io",name="docker-registry-monitor",namespace="default"} 1773532800
# Human-friendly seconds until expiry
secretmonitor_seconds_until_expiry{registry="docker.io",name="docker-registry-monitor",namespace="default"} 1209600
# Reconciliation success/failure counter
secretmonitor_reconcile_total{monitor="docker-registry-monitor",namespace="default",result="success"} 5
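
The two gauges are related by simple arithmetic. The following is a minimal Python sketch of that relationship, not the operator's actual code. It assumes the validUntil date is interpreted as midnight UTC, which matches the EXPIRES AT column but is not confirmed by the source. The function names are illustrative.

```python
from datetime import datetime, timezone


def valid_until_timestamp(valid_until: str) -> int:
    """Convert a YYYY-MM-DD label value to a Unix timestamp.

    Assumption: the date means midnight UTC on that day.
    """
    expiry = datetime.strptime(valid_until, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(expiry.timestamp())


def seconds_until_expiry(valid_until: str, now: datetime) -> int:
    """Seconds between `now` and the expiry.

    This sketch lets the value go negative after expiry; the operator
    may clamp or handle that case differently.
    """
    return valid_until_timestamp(valid_until) - int(now.timestamp())


if __name__ == "__main__":
    now = datetime(2026, 3, 1, tzinfo=timezone.utc)
    print(valid_until_timestamp("2026-03-15"))      # 1773532800
    print(seconds_until_expiry("2026-03-15", now))  # 1209600 (14 days)
```

Alerting on the seconds gauge rather than the absolute timestamp keeps PromQL expressions free of `time()` arithmetic.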
| State | Description | Default Threshold |
|---|---|---|
| Valid | Secret is healthy and far from expiration | > 30 days |
| Info | Secret approaching expiration but not urgent | 14-30 days |
| Warning | Secret needs attention soon | 7-14 days |
| Critical | Secret expires very soon - action required | < 7 days |
| Expired | Secret has already expired | Past expiration date |
| Error | Cannot determine state (missing secret, invalid format, etc.) | N/A |
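
The mapping from remaining time to state can be sketched in a few lines of Python. This is an illustration under assumptions, not the operator's implementation: the `AlertThresholds` class and `classify` function are hypothetical names, and treating each threshold as an inclusive upper bound is an assumption drawn from the `<=` comparisons in the reconcile flowchart. The Error state is not covered because it arises before any time calculation.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass(frozen=True)
class AlertThresholds:
    # Defaults taken from the table above.
    info_days: int = 30
    warning_days: int = 14
    critical_days: int = 7


def classify(seconds_remaining: float,
             thresholds: AlertThresholds = AlertThresholds()) -> str:
    """Return the state name for a given number of seconds until expiry."""
    if seconds_remaining <= 0:
        return "Expired"
    days = seconds_remaining / SECONDS_PER_DAY
    # Check the most urgent threshold first so it wins on overlap.
    if days <= thresholds.critical_days:
        return "Critical"
    if days <= thresholds.warning_days:
        return "Warning"
    if days <= thresholds.info_days:
        return "Info"
    return "Valid"


if __name__ == "__main__":
    for d in (45, 20, 10, 3, 0):
        print(d, classify(d * SECONDS_PER_DAY))
```

Checking thresholds from most to least urgent means a misconfigured Monitor (for example, `criticalDays` larger than `warningDays`) still reports the more severe state.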
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: expiring-secrets-alerts
spec:
  groups:
    - name: expiring-secrets
      rules:
        - alert: SecretExpiringSoon
          expr: secretmonitor_seconds_until_expiry < 14 * 24 * 60 * 60 # 14 days
          for: 1h
          labels:
            severity: warning
          annotations:
            summary: "Secret {{ $labels.name }} expires in {{ $value | humanizeDuration }}"
        - alert: SecretExpiredCritical
          expr: secretmonitor_seconds_until_expiry < 7 * 24 * 60 * 60 # 7 days
          for: 30m
          labels:
            severity: critical
          annotations:
            summary: "Secret {{ $labels.name }} expires VERY SOON: {{ $value | humanizeDuration }}"

- Container Registry PATs: Docker Hub, Quay.io, GitHub Packages tokens
- CI/CD Pipeline Tokens: GitHub Actions, GitLab CI, Jenkins credentials
- API Keys: Third-party service credentials with expiration dates
- Certificates: TLS certs, signing certificates (when stored as secrets)
- Database Passwords: Auto-rotated credentials from secret management systems
# Run tests (includes controller unit tests)
make test
# Run end-to-end tests
make test-e2e
# Generate CRD manifests
make manifests
# Format and vet code
make fmt vet

graph TD
SECRET[Secret with validUntil label]
MONITOR_CR[Monitor CR]
subgraph OPERATOR[Expiring Secrets Operator]
RECONCILER[Reconciler]
PARSER[Parse expiration date]
STATUS_UPDATER[Update CR Status]
METRICS_EXPORTER[Export Prometheus Metrics]
RECONCILER --> PARSER
PARSER --> STATUS_UPDATER
PARSER --> METRICS_EXPORTER
end
subgraph MONITORING[Monitoring Stack]
SM[ServiceMonitor]
PROM[Prometheus]
AM[Alertmanager]
SM --> PROM
PROM --> AM
end
%% Watch relationships
SECRET <-.->|watches| RECONCILER
MONITOR_CR -.->|watches| RECONCILER
%% Data flow
MONITOR_CR -->|references| SECRET
STATUS_UPDATER -->|status updates| MONITOR_CR
METRICS_EXPORTER -->|/metrics| SM

flowchart TD
START([Reconcile Event]) --> GET_MONITOR{Get Monitor CR}
GET_MONITOR -->|Not Found| CLEANUP[Clean up metrics]
CLEANUP --> END_SUCCESS([Success - Monitor deleted])
GET_MONITOR -->|Found| DEFAULTS[Apply default alert thresholds]
DEFAULTS --> GET_SECRET{Get referenced Secret}
GET_SECRET -->|Not Found| ERROR_SECRET[Status: Error - Secret not found]
ERROR_SECRET --> END_ERROR([Error - Requeue])
GET_SECRET -->|Found| CHECK_LABEL{Has validUntil label?}
CHECK_LABEL -->|No| ERROR_LABEL[Status: Error - Missing label]
ERROR_LABEL --> END_ERROR
CHECK_LABEL -->|Yes| PARSE_DATE{Parse date format?}
PARSE_DATE -->|Invalid| ERROR_PARSE[Status: Error - Invalid date format]
ERROR_PARSE --> END_ERROR
PARSE_DATE -->|Valid| CALC_REMAINING[Calculate seconds remaining]
CALC_REMAINING --> DETERMINE_STATE{Determine state}
DETERMINE_STATE -->|<= 0 days| STATE_EXPIRED[Status: Expired]
DETERMINE_STATE -->|<= Critical days| STATE_CRITICAL[Status: Critical]
DETERMINE_STATE -->|<= Warning days| STATE_WARNING[Status: Warning]
DETERMINE_STATE -->|<= Info days| STATE_INFO[Status: Info]
DETERMINE_STATE -->|> Info days| STATE_VALID[Status: Valid]
STATE_EXPIRED --> UPDATE_METRICS[Update Prometheus metrics]
STATE_CRITICAL --> UPDATE_METRICS
STATE_WARNING --> UPDATE_METRICS
STATE_INFO --> UPDATE_METRICS
STATE_VALID --> UPDATE_METRICS
UPDATE_METRICS --> UPDATE_STATUS[Update Monitor status]
UPDATE_STATUS --> REQUEUE[Requeue after 1 minute]
REQUEUE --> END_SUCCESS
%% Styling
classDef errorPath fill:#ffcccc
classDef successPath fill:#ccffcc
classDef processPath fill:#cce5ff
class ERROR_SECRET,ERROR_LABEL,ERROR_PARSE,END_ERROR errorPath
class STATE_EXPIRED,STATE_CRITICAL,STATE_WARNING,STATE_INFO,STATE_VALID,END_SUCCESS successPath
class DEFAULTS,CALC_REMAINING,UPDATE_METRICS,UPDATE_STATUS,REQUEUE processPath
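
The three error branches in the flow above (secret not found, missing label, invalid date format) can be sketched as a single validation step. This is a hedged Python illustration, not the operator's code: `resolve_expiry` is a hypothetical helper, the Secret is assumed to be a plain dict shaped like `kubectl get secret -o json` output, and the error strings simply echo the flowchart labels.

```python
from datetime import date, datetime
from typing import Optional, Tuple

VALID_UNTIL_LABEL = "expiringsecret.stakater.com/validUntil"


def resolve_expiry(secret: Optional[dict]) -> Tuple[Optional[date], Optional[str]]:
    """Return (expiry_date, None) on success or (None, error_message).

    Each early return corresponds to one Error branch of the flowchart.
    """
    if secret is None:
        return None, "Secret not found"

    # `labels` may be absent or null on a Secret, so fall back to {}.
    labels = secret.get("metadata", {}).get("labels") or {}
    raw = labels.get(VALID_UNTIL_LABEL)
    if raw is None:
        return None, "Missing label"

    try:
        return datetime.strptime(raw, "%Y-%m-%d").date(), None
    except ValueError:
        return None, "Invalid date format"


if __name__ == "__main__":
    secret = {"metadata": {"labels": {VALID_UNTIL_LABEL: "2026-03-15"}}}
    print(resolve_expiry(secret))
    print(resolve_expiry({"metadata": {}}))
```

Returning an error message instead of raising keeps the caller free to write the message into the Monitor status and requeue, as the flowchart describes.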
- Kubernetes 1.25+
- Secrets must have an expiringsecret.stakater.com/validUntil label in YYYY-MM-DD format
- Prometheus Operator (for ServiceMonitor/PrometheusRule)
Apache License 2.0