@@ -1186,48 +1188,42 @@ are included with the key provided. E.g.:
1186
1188
}
1187
1189
```
1188
1190
1189
-
#### Namespace scoped policy binding
1190
-
1191
-
For phase 1, policy bindings were only allowed to be cluster scoped. We can
1192
-
support namespace scoped policy bindings as follows:
1193
-
1194
-
- Add a `NamespacePolicyBinding` resource.
1195
-
- If the parameter resource is namespace scoped, it implicitly matches
1196
-
resources only in the namespace it is in, but may further constrain what
1197
-
resources it matches with additional match criteria.
1198
-
1199
-
Benefits: Allows policy of a namespace to be controlled from within the
1200
-
namespace. As an example, ResourceQuota works this way.
1201
-
1202
-
Details to consider:
1191
+
#### Per namespace policy params
1203
1192
1204
-
- Should a policy support both cluster scoped and namespace scoped binding? If
1205
-
so how? It would need two different parameter CRDs (since a CRD must either be
1206
-
cluster scoped or namespace scoped, not both).
1193
+
Currently the policies and bindings are only allowed to be cluster scoped.
1194
+
We want to support per namespace configuration with namespace scoped param resources.
1207
1195
1208
-
#### CEL Expression Composition
1196
+
(Thanks for the input from @dead2k)
1197
+
This sort of mapping allows:
1198
+
- A cluster-admin can write a single resource to say, “this is the policy I want in all my namespaces”.
1199
+
- If namespace admins can read the param resources, but not write that resource, they can understand the limitations they currently have.
1200
+
- A single lenient cluster policy and cluster policybinding can enforce the minimum constraint, and a single cluster policy with a cluster policybinding pointing to a namespace level param can restrict it further.
1209
1201
1210
-
##### Use Cases
1211
-
1212
-
###### Code re-use for complicated expressions
1202
+
A new optional field `namespaceParamRef` could be added inside ValidatingAdmissionPolicyBinding to support this use case.
1203
+
In contrast, a namespace scoped policybinding will require creation and maintenance of both policybindings and parameters
1204
+
in every namespace to enforce the policy itself, versus the single policybinding and many parameters.
1213
1205
1214
-
A CEL expression may not be computationally expensive, but could still be
1215
-
intricate enough that copy-pasting could prove to be a bad decision later on
1216
-
in time. With the addition of the `messageExpression` field, more copy-pasting
1217
-
is expected as well. If a sufficiently complex expression ended up copy-pasted everywhere,
1218
-
and then needs to be updated somehow, it will need that update in every place
1219
-
it was copy-pasted. A variable, on the other hand, will only need to be updated
1220
-
in one place.
1206
+
```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "demo-binding-test.example.com"
spec:
  policyName: "demo-policy.example.com"
  namespaceParamRef:
    name: "param-resource.example.com"
    failAction: "allow"
  validationActions: [Deny]
```
1221
1218
1222
-
###### Reusing/memoizing an expensive computation
1219
+
- A new optional field `namespaceParamRef` is added as a peer to `paramRef`. Users must choose one of the two for parameterization.
1220
+
It allows users to configure a param resource per namespace.
1221
+
- `failAction` defines the behavior when the param resource cannot be found in the current namespace.
1222
+
Setting it to `allow` skips the validation and lets the request through. Setting it to `deny` fails the validation if the specified param resource is not found.
1223
+
- If the resource being validated is cluster scoped and the binding has `namespaceParamRef` set, an error is returned.
1224
+
- The existing behavior should not be affected.
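
As a rough illustration of the per-namespace mapping described above, the sketch below assumes a namespaced param CRD. The kind `ReplicaLimit`, the group `rules.example.com`, the `maxReplicas` field, and the namespaces `team-a` and `team-b` are all hypothetical and are not defined by this KEP. The same param name exists in two namespaces with different values, so the single cluster scoped binding shown earlier would resolve a different param depending on the namespace of the object being admitted.

```yaml
# Hypothetical namespaced param resources. The kind, group, and the
# maxReplicas field are illustrative assumptions, not part of this KEP.
apiVersion: rules.example.com/v1
kind: ReplicaLimit
metadata:
  name: "param-resource.example.com"
  namespace: team-a
maxReplicas: 3
---
apiVersion: rules.example.com/v1
kind: ReplicaLimit
metadata:
  name: "param-resource.example.com"
  namespace: team-b
maxReplicas: 10
```

A request in a namespace that has no resource with this name would then be handled according to `failAction`.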
1223
1225
1224
-
For a CEL expression that runs in O(n^2) time or worse (or otherwise
1225
-
takes a significant amount of time to execute), it would be nice to only run
1226
-
it when necessary. For instance, if multiple validation expressions used the
1227
-
same expensive expression, that expression could be refactored out into a
1228
-
variable.
1229
-
1230
-
##### Match Conditions
1226
+
#### Match Conditions
1231
1227
1232
1228
Note that the syntax of the `matchConditions` resource is intended to
1233
1229
align with the [Admission Webhook Match Conditions KEP #3716](https://github.com/kubernetes/enhancements/pull/3717),
@@ -1283,6 +1279,27 @@ Note that `matchConditions` and `validations` look similar, but `matchConditions
1283
1279
* All match conditions must be satisfied (evaluate to `true`) before `validations` are tested
1284
1280
* If there is an error executing a match condition, the failure policy for the (definition, binding) tuple is invoked
1285
1281
1282
+
#### CEL Expression Composition
1283
+
1284
+
##### Use Cases
1285
+
1286
+
###### Code re-use for complicated expressions
1287
+
1288
+
A CEL expression may not be computationally expensive, but could still be
1289
+
intricate enough that copy-pasting could prove to be a bad decision later on
1290
+
in time. With the addition of the `messageExpression` field, more copy-pasting
1291
+
is expected as well. If a sufficiently complex expression ended up copy-pasted everywhere,
1292
+
and then needs to be updated somehow, it will need that update in every place
1293
+
it was copy-pasted. A variable, on the other hand, will only need to be updated
1294
+
in one place.
1295
+
1296
+
###### Reusing/memoizing an expensive computation
1297
+
1298
+
For a CEL expression that runs in O(n^2) time or worse (or otherwise
1299
+
takes a significant amount of time to execute), it would be nice to only run
1300
+
it when necessary. For instance, if multiple validation expressions used the
1301
+
same expensive expression, that expression could be refactored out into a
1302
+
variable.
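
A minimal sketch of both use cases, assuming a `variables` list of named expressions that validations reference as `variables.<name>`. The policy name, the registry prefix, and the expressions themselves are made up for illustration, and the `apiVersion` simply mirrors the binding example elsewhere in this document.

```yaml
# Illustrative policy fragment; names and expressions are assumptions.
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "image-registry.example.com"
spec:
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  variables:
  # Written once, then shared by the expressions below instead of being
  # copy-pasted into each of them.
  - name: untrustedImages
    expression: "object.spec.template.spec.containers.filter(c, !c.image.startsWith('registry.example.com/')).map(c, c.image)"
  validations:
  - expression: "variables.untrustedImages.size() == 0"
    messageExpression: "'images from untrusted registries: ' + variables.untrustedImages.join(', ')"
```

Here the filter logic lives in one place, so a change to the trusted registry only touches the variable, and both the validation and its message reuse the same result.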
1286
1303
1287
1304
##### Variables
1288
1305
@@ -1480,16 +1497,17 @@ If we were to offer a way to lookup arbitrary other resources, or even if
1480
1497
we provided selective access to just some resources, this might become
1481
1498
easier. This can be explored as future work.
1482
1499
1483
-
#### Access to namespace metadata
1500
+
#### Access to namespace
1484
1501
1485
-
We have general agreement to include this as a feature, but need to provide
1486
-
a concrete design.
1502
+
We have general agreement to grant CEL expressions access to the admission object's namespace through a newly added CEL variable `namespaceObject`.
1503
+
If the resource is cluster scoped, `namespaceObject` will be null.
1487
1504
1488
-
- Namespace labels and annotations are the most commonly needed fields not
1489
-
already available in the resource being validated. Note that
1490
-
namespaceSelectors already allow matches to examine namespace levels, but we
1491
-
also have use cases that need to be able to inspects the fields in CEL
1492
-
expressions.
1505
+
`namespaceObject` will provide access to all existing fields under namespace metadata, namespace spec and namespace status, except for `metadata.managedFields` and `metadata.ownerReferences`.
1506
+
The fields can be accessed directly through the `namespaceObject` variable, e.g. `namespaceObject.metadata.name` or `namespaceObject.status.phase`.
1507
+
1508
+
Namespace labels and annotations are the most commonly needed fields not already available in the resource being validated.
1509
+
Labels and annotations can be accessed through `namespaceObject.metadata.labels` and `namespaceObject.metadata.annotations`, for example `namespaceObject.metadata.labels.env`.
1510
+
Note that we recommend checking whether the specific label or annotation exists before validating against it: `'env' in namespaceObject.metadata.labels`.
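
The following sketch shows how a validation might combine the existence check with a label lookup on `namespaceObject`. The policy name, the `env` label values, and the replica rule are assumptions chosen for illustration, and the `apiVersion` mirrors the binding example elsewhere in this document.

```yaml
# Illustrative policy; the "env" label values and the rule are assumptions.
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "prod-replicas.example.com"
spec:
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  # Check that the label exists first, as recommended above, so that a
  # namespace without the "env" label passes instead of causing an error.
  - expression: "!('env' in namespaceObject.metadata.labels) || namespaceObject.metadata.labels.env != 'prod' || object.spec.replicas >= 2"
    message: "deployments in namespaces labelled env=prod need at least 2 replicas"
```

Because deployments are namespaced, `namespaceObject` is populated here; a policy matching cluster scoped resources would need to handle the null case first.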
1493
1511
1494
1512
#### Transition rules
1495
1513
@@ -1618,6 +1636,32 @@ the number of bindings can become quite large, so let's limit it to
1618
1636
- xref: [Metrics Provided by OPA Gatekeeper](https://open-policy-agent.github.io/gatekeeper/website/docs/metrics/)
The namespace scoped policy binding will require a new API in place.
1645
+
It will be planned separately and will not affect the existing ValidatingAdmissionPolicy behavior.
1646
+
1647
+
For phase 1, policy bindings were only allowed to be cluster scoped. We can
1648
+
support namespace scoped policy bindings as follows:
1649
+
1650
+
- Add a `NamespacePolicyBinding` resource.
1651
+
- If the parameter resource is namespace scoped, it implicitly matches
1652
+
resources only in the namespace it is in, but may further constrain what
1653
+
resources it matches with additional match criteria.
1654
+
1655
+
Benefits: Allows policy of a namespace to be controlled from within the
1656
+
namespace. As an example, ResourceQuota works this way.
1657
+
1658
+
Details to consider:
1659
+
1660
+
- Should a policy support both cluster scoped and namespace scoped binding? If
1661
+
so how? It would need two different parameter CRDs (since a CRD must either be
1662
+
cluster scoped or namespace scoped, not both).
1663
+
1664
+
1621
1665
### User Stories
1622
1666
1623
1667
In addition to "User Stories", see below "Potential Applications" for a list of
@@ -2312,6 +2356,11 @@ in back-to-back releases.
2312
2356
If multiple admission policies require the same conversion, convert only once.
2313
2357
From @liggitt: "webhook code loops up one level, first accumulates all the validation webhooks we'll run, then converts to the versions needed by those webhooks then evaluates in parallel"
2314
2358
- authz check to the specific resource referenced in the policy's paramKind. ([comment](https://github.com/kubernetes/kubernetes/pull/113314#discussion_r1013135860))
2359
+
- complete the feature providing access to namespace metadata
2360
+
- complete type checking for CRDs
2361
+
- add controlled rollout strategy to support future CEL library/function/variable changes
2362
+
- [Quantity](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apimachinery/pkg/api/resource/quantity.go#L100) support in CEL expressions, properly tested
2363
+
- support the list of features mentioned under phase 2
2315
2364
2316
2365
### Upgrade / Downgrade Strategy
2317
2366
@@ -2387,15 +2436,9 @@ well as the [existing list] of feature gates.
- [] Feature gate (also fill in values in `kep.yaml`)
2391
-
- Feature gate name: CELValidatingAdmission
2439
+
- [X] Feature gate (also fill in values in `kep.yaml`)
2440
+
- Feature gate name: ValidatingAdmissionPolicy
2392
2441
- Components depending on the feature gate: kube-apiserver
2393
-
- [ ] Other
2394
-
- Describe the mechanism:
2395
-
- Will enabling / disabling the feature require downtime of the control
2396
-
plane?
2397
-
- Will enabling / disabling the feature require downtime or reprovisioning
2398
-
of a node?
2399
2442
2400
2443
###### Does enabling the feature change any default behavior?
2401
2444
@@ -2506,6 +2549,9 @@ Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
2506
2549
checking if there are objects with field X set) may be a last resort. Avoid
2507
2550
logs or events for this purpose.
2508
2551
-->
2552
+
The following metrics could be used to see if the feature is in use:
2553
+
- validating_admission_policy/check_total
2554
+
- validating_admission_policy/definition_total
2509
2555
2510
2556
###### How can someone using this feature know that it is working for their instance?
2511
2557
@@ -2518,13 +2564,10 @@ and operation of this feature.
2518
2564
Recall that end users cannot usually observe component logs or access metrics.
2519
2565
-->
2520
2566
2521
-
- [ ] Events
2522
-
- Event Reason:
2523
-
- [ ] API .status
2524
-
- Condition name:
2525
-
- Other field:
2526
-
- [ ] Other (treat as last resort)
2527
-
- Details:
2567
+
- Metrics like `validating_admission_policy/check_total` can be used to check how many validations were applied in total
2568
+
- Audit mode can be used to check audit events, following [this documentation](https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy/#audit-annotations)
2569
+
- ValidatingAdmissionPolicy.Status can be used to see whether type checking was performed as expected
2570
+
- Users can also verify that the admission request is rejected or a warning is shown as expected, based on how `validationActions` is set.
2528
2571
2529
2572
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
2530
2573
@@ -2542,26 +2585,28 @@ high level (needs more precise definitions) those may be things like:
2542
2585
These goals will help you determine what you need to measure (SLIs) in the next
2543
2586
question.
2544
2587
-->
2588
+
No impact on latency for admission requests when no ValidatingAdmissionPolicy is present.
2589
+
2590
+
Performance when ValidatingAdmissionPolicies are in use will need to be measured and optimized.
2545
2591
2546
2592
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
2560
2604
2561
2605
<!--
2562
2606
Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
2563
2607
implementation difficulties, etc.).
2564
2608
-->
2609
+
No. We are open to input.
2565
2610
2566
2611
### Dependencies
2567
2612
@@ -2585,6 +2630,7 @@ and creating new ones, as well as about cluster-level services (e.g. DNS):
2585
2630
- Impact of its outage on the feature:
2586
2631
- Impact of its degraded performance or high-error rates on the feature:
2587
2632
-->
2633
+
No.
2588
2634
2589
2635
### Scalability
2590
2636
@@ -2612,6 +2658,7 @@ Focusing mostly on:
2612
2658
- periodic API calls to reconcile state (e.g. periodic fetching state,
2613
2659
heartbeats, leader election, etc.)
2614
2660
-->
2661
+
Yes. A new API group is introduced which will be used for this feature.
2615
2662
2616
2663
###### Will enabling / using this feature result in introducing new API types?
2617
2664
@@ -2621,6 +2668,7 @@ Describe them, providing:
2621
2668
- Supported number of objects per cluster
2622
2669
- Supported number of objects per namespace (for namespace-scoped objects)
2623
2670
-->
2671
+
Yes. We introduced two new kinds for this feature: ValidatingAdmissionPolicy and ValidatingAdmissionPolicyBinding as described in [this doc](https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy/)
2624
2672
2625
2673
###### Will enabling / using this feature result in any new calls to the cloud provider?
2626
2674
@@ -2629,6 +2677,7 @@ Describe them, providing:
2629
2677
- Which API(s):
2630
2678
- Estimated increase:
2631
2679
-->
2680
+
No.
2632
2681
2633
2682
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
2634
2683
@@ -2638,6 +2687,7 @@ Describe them, providing:
2638
2687
- Estimated increase in size: (e.g., new annotation of size 32B)
2639
2688
- Estimated amount of new objects: (e.g., new Object X for every existing Pod)
2640
2689
-->
2690
+
No.
2641
2691
2642
2692
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
2643
2693
@@ -2649,6 +2699,7 @@ Think about adding additional work or introducing new steps in between
We don't expect it to. Especially compared with the existing methods of achieving the same goal, using this feature will not result in a non-negligible increase in resource usage.
2716
+
2717
+
###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
2718
+
2719
+
<!--
2720
+
Focus not just on happy cases, but primarily on more pathological cases
2721
+
(e.g. probes taking a minute instead of milliseconds, failed pods consuming resources, etc.).
2722
+
If any of the resources can be exhausted, how this is mitigated with the existing limits
2723
+
(e.g. pods per node) or new limits added by this KEP?
2724
+
2725
+
Are there any tests that were run/should be run to understand performance characteristics better
2726
+
and validate the declared limits?
2727
+
-->
2728
+
No.
2664
2729
2665
2730
### Troubleshooting
2666
2731
@@ -2676,6 +2741,7 @@ details). For now, we leave it here.
2676
2741
-->
2677
2742
2678
2743
###### How does this feature react if the API server and/or etcd is unavailable?
2744
+
Same as without this feature.
2679
2745
2680
2746
###### What are other known failure modes?
2681
2747
@@ -2691,9 +2757,15 @@ For each of them, fill in the following information by copying the below templat
2691
2757
Not required until feature graduated to beta.
2692
2758
- Testing: Are there any tests for failure mode? If not, describe why.
2693
2759
-->
2760
+
N/A
2694
2761
2695
2762
###### What steps should be taken if SLOs are not being met to determine the problem?
2696
2763
2764
+
- The feature can be disabled by disabling the API or setting the feature gate to false if its performance impact is not tolerable.
2765
+
- Try to run the validations separately to see which rule is slow
2766
+
- Remove the problematic rules or update the rules to meet the requirement