Commit 18948fd

fill in monitoring requirements
1 parent 2162ef4 commit 18948fd

1 file changed, +17 -12 lines changed

keps/sig-auth/2579-psp-replacement/README.md

Lines changed: 17 additions & 12 deletions
@@ -627,7 +627,7 @@ A single metric will be added to track policy evaluations against pods and [temp
 [Namespace evaluations](#namespace-policy-update-warnings) are not counted.
 
 ```
-<component_name>_evaluations_total
+pod_security_evaluations_total
 ```
 
 The metric will use the following labels:
@@ -644,6 +644,8 @@ The metric will use the following labels:
 enabled, every create request and in-scope update request will at least increment the
 `enforce` total.
 6. `request_operation {create, update}` - The operation of the request being checked.
+7. `resource {pod, controller}` - Whether the request object is a Pod, or a [templated
+pod](#podtemplate-resources) resource.
 
 <<[UNRESOLVED]>>
 
@@ -869,21 +871,24 @@ fields of API types, flags, etc.?**
 
 ### Monitoring Requirements
 
-_This section must be completed when targeting beta graduation to a release._
-
 * **How can an operator determine if the feature is in use by workloads?**
-Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
-checking if there are objects with field X set) may be a last resort. Avoid
-logs or events for this purpose.
+- non-zero `pod_security_evaluations_total` metrics indicate the feature is in use
 
 * **What are the SLIs (Service Level Indicators) an operator can use to determine
 the health of the service?**
-- [ ] Metrics
-- Metric name:
-- [Optional] Aggregation method:
-- Components exposing the metric:
-- [ ] Other (treat as last resort)
-- Details:
+- [x] Metrics
+- Metric name: `pod_security_evaluations_total`
+- Components exposing the metric: `kube-apiserver`
+
+* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
+- `pod_security_evaluations_total{decision=error}`
+- any rising count of these metrics indicates an unexpected problem evaluating the policy
+- `pod_security_evaluations_total{decision=error,mode=enforce}`
+- any rising count of these metrics indicates an unexpected problem evaluating the policy that
+is preventing pod write requests
+- `pod_security_evaluations_total{decision=deny,mode=enforce}`
+- a rising count indicates that the policy is preventing pod creation as intended, but is
+preventing a user or controller from successfully writing pods
 
 * **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
 At a high level, this usually will be in the form of "high percentile of SLI
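
Illustrative note (not part of the commit): the checks this diff describes, "non-zero means in use" and "a rising count for a label selector", can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not the `kube-apiserver` implementation. The helper names (`series_key`, `total`, `in_use`, `rising`) and the two sample scrapes are hypothetical, and only label names and values that appear in the diff are used.

```python
from collections import Counter

# Metric name taken from the diff; everything else below is a hypothetical model.
METRIC = "pod_security_evaluations_total"


def series_key(**labels):
    """Build a hashable key identifying one labeled series of the counter."""
    return frozenset(labels.items())


def total(snapshot, **selector):
    """Sum all series whose labels include every pair in the selector."""
    wanted = set(selector.items())
    return sum(value for key, value in snapshot.items() if wanted <= key)


def in_use(snapshot):
    """The feature is considered in use if any evaluation has been counted."""
    return total(snapshot) > 0


def rising(before, after, **selector):
    """True if the selected count grew between two snapshots."""
    return total(after, **selector) > total(before, **selector)


# Two hypothetical scrapes of the counter, taken some time apart.
deny_key = series_key(decision="deny", mode="enforce",
                      request_operation="create", resource="pod")
error_key = series_key(decision="error", mode="enforce",
                       request_operation="update", resource="controller")

before = Counter({deny_key: 3})
after = Counter(before)
after[deny_key] += 2
after[error_key] += 1

print(METRIC, "in use:", in_use(after))
print("errors rising:", rising(before, after, decision="error"))
print("enforced denials rising:",
      rising(before, after, decision="deny", mode="enforce"))
```

In a real deployment the same comparisons would normally be written as queries against the monitoring system. In PromQL, label values in a selector are quoted, for example `decision="error"`.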
