---
applyTo: "k8s/**/*.yaml,k8s/**/*.yml,manifests/**/*.yaml,manifests/**/*.yml,deploy/**/*.yaml,deploy/**/*.yml,charts/**/templates/**/*.yaml,charts/**/templates/**/*.yml"
description: "Best practices for Kubernetes YAML manifests including labeling conventions, security contexts, pod security, resource management, probes, and validation commands"
---
Create production-ready Kubernetes manifests that prioritize security, reliability, and operational excellence with consistent labeling, proper resource management, and comprehensive health checks.
Required Labels (Kubernetes recommended):
- `app.kubernetes.io/name`: Application name
- `app.kubernetes.io/instance`: Instance identifier
- `app.kubernetes.io/version`: Version
- `app.kubernetes.io/component`: Component role
- `app.kubernetes.io/part-of`: Application group
- `app.kubernetes.io/managed-by`: Management tool
Additional Labels:
- `environment`: Environment name
- `team`: Owning team
- `cost-center`: For billing
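Taken together, the labels above might look like this on a hypothetical `web` deployment (all names and values are illustrative):

```yaml
# Sketch only: "web", "shop", "platform", etc. are placeholder values.
metadata:
  labels:
    app.kubernetes.io/name: web
    app.kubernetes.io/instance: web-prod
    app.kubernetes.io/version: "1.4.2"
    app.kubernetes.io/component: frontend
    app.kubernetes.io/part-of: shop
    app.kubernetes.io/managed-by: helm
    environment: production
    team: platform
    cost-center: cc-1234
```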
Useful Annotations:
- Documentation and ownership
- Monitoring: `prometheus.io/scrape`, `prometheus.io/port`, `prometheus.io/path`
- Change tracking: git commit, deployment date
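A sketch of these annotations on a pod template; the Prometheus keys are a widely used convention, while the change-tracking keys shown are hypothetical (there is no standard key, so teams typically choose their own domain-prefixed names):

```yaml
metadata:
  annotations:
    prometheus.io/scrape: "true"      # values must be strings, hence the quotes
    prometheus.io/port: "8080"
    prometheus.io/path: /metrics
    # Hypothetical change-tracking keys under a team-owned prefix:
    example.com/git-commit: abc1234
    example.com/deployed-at: "2024-05-01T12:00:00Z"
```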
Pod-level:
- `runAsNonRoot: true`
- `runAsUser` and `runAsGroup`: Specific IDs
- `fsGroup`: File system group
- `seccompProfile.type: RuntimeDefault`
Container-level:
- `allowPrivilegeEscalation: false`
- `readOnlyRootFilesystem: true` (with tmpfs mounts for writable dirs)
- `capabilities.drop: [ALL]` (add only what's needed)
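The pod- and container-level settings above combine into a pod spec like this (image name and UID 10001 are illustrative; note the `emptyDir` tmpfs-style mount that keeps `/tmp` writable despite the read-only root filesystem):

```yaml
spec:
  securityContext:                 # pod-level
    runAsNonRoot: true
    runAsUser: 10001
    runAsGroup: 10001
    fsGroup: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/app:1.4.2   # placeholder image
      securityContext:             # container-level
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: tmp
          mountPath: /tmp          # writable scratch space
  volumes:
    - name: tmp
      emptyDir: {}
```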
Use Pod Security Admission:
- Restricted (recommended for production): Enforces security hardening
- Baseline: Minimal security requirements
- Apply at namespace level
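Pod Security Admission is applied with namespace labels; a minimal sketch enforcing the Restricted profile (the namespace name is a placeholder):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
```

The `warn` mode surfaces violations without blocking, which is useful while migrating existing workloads.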
Always define:
- Requests: Guaranteed minimum (scheduling)
- Limits: Maximum allowed (prevents exhaustion)
QoS Classes:
- Guaranteed: requests == limits (best for critical apps)
- Burstable: requests < limits (flexible resource use)
- BestEffort: No resources defined (avoid in production)
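For a critical app, setting requests equal to limits yields the Guaranteed QoS class; the CPU and memory values below are illustrative:

```yaml
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: 250m        # equal to requests -> Guaranteed QoS
    memory: 256Mi
```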
- Liveness: Restart unhealthy containers
- Readiness: Control traffic routing
- Startup: Protect slow-starting applications
Configure appropriate delays, periods, timeouts, and thresholds for each.
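A sketch of all three probes on a container; the paths, port, and timing values are assumptions to tune per application:

```yaml
livenessProbe:
  httpGet:
    path: /healthz     # placeholder endpoint
    port: 8080
  periodSeconds: 10
  timeoutSeconds: 2
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready       # placeholder endpoint
    port: 8080
  periodSeconds: 5
  failureThreshold: 3
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 5
  failureThreshold: 30   # allows up to ~150s before liveness takes over
```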
Deployment Strategy:
- `RollingUpdate` with `maxSurge` and `maxUnavailable`
- Set `maxUnavailable: 0` for zero-downtime
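A zero-downtime rolling update might be configured like this (replica count is illustrative):

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # one extra pod during rollout
      maxUnavailable: 0    # never drop below desired capacity
```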
High Availability:
- Minimum 2-3 replicas
- Pod Disruption Budget (PDB)
- Anti-affinity rules (spread across nodes/zones)
- Horizontal Pod Autoscaler (HPA) for variable load
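The PDB and anti-affinity points above can be sketched as follows; the `web` selector is a placeholder, and the anti-affinity shown is the soft (`preferred`) variant spreading replicas across nodes:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: web
---
# Pod template snippet: prefer scheduling replicas on different nodes
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          topologyKey: kubernetes.io/hostname   # use topology.kubernetes.io/zone for zones
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: web
```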
Pre-deployment:
- `kubectl apply --dry-run=client -f manifest.yaml`
- `kubectl apply --dry-run=server -f manifest.yaml`
- `kubeconform -strict manifest.yaml` (schema validation)
- `helm template ./chart | kubeconform -strict` (for Helm)
Policy Validation:
- OPA Conftest, Kyverno, or Datree
Deploy:
- `kubectl apply -f manifest.yaml`
- `kubectl rollout status deployment/NAME`
Rollback:
- `kubectl rollout undo deployment/NAME`
- `kubectl rollout undo deployment/NAME --to-revision=N`
- `kubectl rollout history deployment/NAME`
Restart:
- `kubectl rollout restart deployment/NAME`
Checklist:
- Labels: Standard labels applied
- Annotations: Documentation and monitoring
- Security: runAsNonRoot, readOnlyRootFilesystem, dropped capabilities
- Resources: Requests and limits defined
- Probes: Liveness, readiness, startup configured
- Images: Specific tags (never :latest)
- Replicas: Minimum 2-3 for production
- Strategy: RollingUpdate with appropriate surge/unavailable
- PDB: Defined for production
- Anti-affinity: Configured for HA
- Graceful shutdown: terminationGracePeriodSeconds set
- Validation: Dry-run and kubeconform passed
- Secrets: In Secrets resource, not ConfigMaps
- NetworkPolicy: Least-privilege access (if applicable)
Key Practices:
- Use standard labels and annotations
- Always run as non-root with dropped capabilities
- Define resource requests and limits
- Implement all three probe types
- Pin image tags to specific versions
- Configure anti-affinity for HA
- Set Pod Disruption Budgets
- Use rolling updates with zero unavailability
- Validate manifests before applying
- Enable read-only root filesystem when possible