Skip to content

Commit 6af4a04

Browse files
committed
Clarify escaping, root field access and support for int-or-string, embedded and unknown fields
1 parent 28e6461 commit 6af4a04

File tree

1 file changed

+70
-29
lines changed
  • keps/sig-api-machinery/2876-crd-validation-expression-language

1 file changed

+70
-29
lines changed

keps/sig-api-machinery/2876-crd-validation-expression-language/README.md

Lines changed: 70 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -101,9 +101,7 @@ CRDs validation, they significantly complicate the development and
101101
operability of CRDs.
102102

103103
This KEP proposes that an inline expression language be integrated directly into CRDs such that a
104-
much larger portion of validation use cases can be solved without the use of webhooks. When
105-
selecting an exression language, we want to be sure that it can support defaulting and CRD
106-
conversion in the future.
104+
much larger portion of validation use cases can be solved without the use of webhooks.
107105

108106
This KEP proposes the adoption of [Common Expression Language
109107
(CEL)](https://github.com/google/cel-go). It is sufficiently lightweight and safe to be run directly
@@ -237,10 +235,12 @@ will be surfaced when the validation rule evaluates to false.
237235
extension in the schema. In the above example, the validator is scoped to the
238236
`spec` field. `self` will be used to represent the name of the field which the validator
239237
is scoped to.
240-
- Consideration under adding the representative of scoped filed name: There would be composition problem while generating CRD with tools like `controller-gen`.
241-
When trying to add validation as a marker comment to a field, the validation rule will
242-
be hard to define without the actual field name. As the example showing below. When we want to put cel validation on ToySpec, the field name as `spec` has not
243-
been identified yet which makes rule hard to define.
238+
239+
- Consideration under adding the representative of scoped filed name: There would be composition
240+
problem while generating CRD with tools like `controller-gen`. When trying to add validation as
241+
a marker comment to a field, the validation rule will be hard to define without the actual field
242+
name. As the example showing below. When we want to put cel validation on ToySpec, the field
243+
name as `spec` has not been identified yet which makes rule hard to define.
244244

245245
```azure
246246
// +kubebuilder:validation:XValidator=
@@ -283,14 +283,31 @@ like the `all` macro, e.g. `self.all(listItem, <predicate>)` or `self.all(mapKey
283283
Strategy](https://kubernetes.io/docs/reference/using-api/server-side-apply/#merge-strategy) for
284284
details.
285285
- The use of "old" is congruent with how `AdmissionReview` identifies the existing object as
286-
`oldObject`. To avoid name collisions `old<propertyName>` will be treated the same as a CEL
286+
`oldSelf`. To avoid name collisions `old<propertyName>` will be treated the same as a CEL
287287
keyword for escaping purposes (see below).
288288
- xref [analysis of possible interactions with immutability and
289289
validation](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1101-immutable-fields#openapi-extension-x-kubernetes-immutable).
290290

291-
- If a object property collides with a CEL keyword (see RESERVED in [CEL Syntax](https://github.com/google/cel-spec/blob/master/doc/langdef.md#syntax)),
291+
- If a object property name is a CEL keyword (see RESERVED in [CEL Syntax](https://github.com/google/cel-spec/blob/master/doc/langdef.md#syntax)),
292292
it will be escaped by prepending a _ prefix. To prevent this from causing a subsequent collision, all properties with a `_` prefix will always be
293293
prefixed by `__` (generally, N+1 the existing number of `_`s).
294+
295+
- If a object property name is a CEL language identifier (`int`, `uint`, `double`, `bool`, `string`,
296+
`bytes`, `list`, `map`, `null_type`, `type`, see [CEL language
297+
identifiers](https://github.com/google/cel-spec/blob/master/doc/langdef.md#values)) it is not
298+
accessible as a root variable and must be accessed via `self`, .e.g. `seft.int`.
299+
300+
- Rules may be written at the root of an object, and may make field selection into any fields
301+
declared in the OpenAPIv3 schema of the CRD. This includes selection of fields in both the `spec`
302+
and `status` in the same expression, e.g. `status.quantity <= spec.maxQuantity`. Because CRDs only
303+
allow the `name` and `generatedName` to be declared in the `metadata` of an object, these are the
304+
only metadata fields that may be validated using CEL validator rules. For example,
305+
`metadata.name.endsWith('mySuffix')` is allowed, but only when the OpenAPIv3 schema explicitly
306+
declares `metadata` as a field in the root object and declares the `name` field within `metadata`.
307+
`size(metadata.labels) < 3` however, it not allowed. The limit on which `metadata` fields may be
308+
validated is an intentional design choice (that aims to keep metadata behavior uniform across
309+
types) and applies to all validation mechanisms (e.g. the OpenAPIV3 `maxItems` restriction), not
310+
just CEL validator rules.
294311

295312
- We plan to allow access to the current state of the object to allow validation rules to check the
296313
new value against the current value, e.g. for immutability checks (for validation racheting we would
@@ -311,7 +328,6 @@ exact indices of the list items and the keys of map entries it traverses.
311328
A field *pattern* is a path to all nodes in the data tree that match the pattern. I.e.
312329
it may wildcard list item and map keys.
313330

314-
315331
#### Expression lifecycle
316332

317333
When CRDs are written to the kube-apiserver, all expressions will be [parsed and
@@ -330,7 +346,6 @@ List of functions to include for the initial release:
330346
- Regular Expressions
331347
- Some Standard Definitions
332348

333-
334349
Considerations:
335350
- The functions will become VERY difficult to change as this feature matures. We
336351
should limit ourselves initially to functions that we have a high level of
@@ -475,24 +490,50 @@ coverage of interactions in these dimensions:
475490

476491
Types:
477492

478-
| OpenAPIv3 type | CEL type |
479-
| --------------------------------------- | ------------------------------ |
480-
| 'object' with Properties | object / "message type" |
481-
| 'object' with AdditionalProperties | map |
482-
| 'array | list |
483-
| 'array' with x-kubernetes-list-type=map | list with map based Equality & unique key guarantees |
484-
| 'array' with x-kubernetes-list-type=set | list with set based Equality & unique entry guarantees |
485-
| 'boolean' | boolean |
486-
| 'number' (all formats) | double |
487-
| 'integer' (all formats) | int (64) |
488-
| 'null' | null |
489-
| 'string' | string |
490-
| 'string' with format=byte | bytes |
491-
| 'string' with format=binary | bytes |
492-
| 'string' with format=date | timestamp (protobuf.Timestamp) |
493-
| 'string' with format=datetime | timestamp (protobuf.Timestamp) |
494-
495-
xref: [CEL types](https://github.com/google/cel-spec/blob/master/doc/langdef.md#values), [OpenAPI types](https://swagger.io/specification/#data-types).
493+
| OpenAPIv3 type | CEL type |
494+
| -------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- |
495+
| 'object' with Properties | object / "message type" |
496+
| 'object' with AdditionalProperties | map |
497+
| 'object' with x-kubernetes-embedded-type | <treatment is the same as 'object' more details below> |
498+
| 'object' with x-kubernetes-preserve-unknown-fields | <treatment is the same as 'object', more details below> |
499+
| x-kubernetes-embedded-int-or-string | object with 'intVal' (type: int) and 'strVal' (type: string) fields |
500+
| 'array | list |
501+
| 'array' with x-kubernetes-list-type=map | list with map based Equality & unique key guarantees |
502+
| 'array' with x-kubernetes-list-type=set | list with set based Equality & unique entry guarantees |
503+
| 'boolean' | boolean |
504+
| 'number' (all formats) | double |
505+
| 'integer' (all formats) | int (64) |
506+
| 'null' | null_type |
507+
| 'string' | string |
508+
| 'string' with format=byte | string (containing base64 encoded data) |
509+
| 'string' with format=binary | bytes |
510+
| 'string' with format=date | timestamp (google.protobuf.Timestamp) |
511+
| 'string' with format=datetime | timestamp (google.protobuf.Timestamp) |
512+
| 'string' with format=duration | duration (google.protobuf.Duration) |
513+
514+
xref: [CEL types](https://github.com/google/cel-spec/blob/master/doc/langdef.md#values), [OpenAPI
515+
types](https://swagger.io/specification/#data-types), [Kubernetes Structural Schemas](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#specifying-a-structural-schema).
516+
517+
Although `x-kubernetes-preserve-unknown-fields` allows custom resources to contain values without
518+
corresponding schema information, we will not provide validation support of these "schemaless"
519+
values. Reasons for this include:
520+
521+
- Without schema information, types (e.g. `map` vs. `object`), formats (e.g. plain `string`
522+
vs. `date`) and list types (plain `list` vs. `set`) are not available, and this feature depends
523+
on this schema information to integrate custom resources values with CEL.
524+
525+
- The contents of unknown fields are unvalidated. So it is possible for the contents to be any
526+
possible values. This significantly complicates the authoring of validation rules and makes them
527+
far more difficult write safely. Specifically, all unchecked field selection and assumptions about
528+
value types will result in runtime errors when the values contents of the unknown data does not
529+
match the expectations of the validation rule author. None of the benefits of static type checking
530+
at CRD registration time would apply.
531+
532+
- For objects with both `x-kubernetes-embedded-type` and `x-kubernetes-preserve-unknown-fields` set to `true`,
533+
if we were to allow for validation of embedded resource while the validation rules that are part of
534+
the embedded resource are not enforced, we would be allowing for developers to use CEL to validate
535+
the contents of embedded types without allowing developers to write validation rules directly on the
536+
embedded type to validate it. We don't want to do that.
496537

497538
Implementation:
498539

0 commit comments

Comments
 (0)