keps/sig-storage/177-volume-snapshot/tighten-validation-webhook-crd.md
Lines changed: 16 additions & 4 deletions
@@ -232,10 +232,11 @@ CRD validation is preferred over webhook validation due to their lower complexit
 
 Tighten the validation on Volume Snapshot objects. Please see the tables below for detailed information.
 
-Due to backwards compatibility concerns, the tightening will occur in two phases.
+Due to backwards compatibility concerns, the tightening will occur in three phases.
 
 1. The first phase is webhook-only, and will use [ratcheting validation](#backwards-compatibility). It will be the user's responsibility to clean up invalid objects which already existed before the webhook was enabled. Invalid objects are those which fail the new, stricter validation. The controller will not be able to automatically fix invalid objects, however it will apply a [label](#automatic-labelling-of-invalid-objects) to invalid objects so that users can easily locate them.
-2. The second phase can occur once all invalid objects are cleared from the cluster. It will be the cluster admin's responsibility to check and detect when it is safe to move to the second phase. The CRD schema validation will be tightened and the webhook will stick around to enforce immutability until immutable fields come to CRDs (Custom Resource Definition). This will be accompanied by a version change to make it clear the CRD is using different validation.
+2. The second phase can occur once all invalid objects are cleared from the cluster. It will be the cluster admin's responsibility to check and detect when it is safe to move to the second phase. The CRD schema validation will be tightened and the webhook will stick around to enforce immutability until immutable fields come to CRDs (Custom Resource Definition). This will be accompanied by a version change to make it clear the CRD is using different validation, however the storage version will be kept as `v1beta1` to ensure a [rollback](#rollback) is possible at phase 2.
+3. The storage version of the CRD will be changed from `v1beta1` to `v1`
 
 The phases come in separate releases to allow users / cluster admin the opportunity to clean their cluster of any invalid objects. More details are in the Risks and Mitigations section.
 
@@ -280,7 +281,7 @@ Authentication on incoming requests to the webhook server is configurable howeve
 
 Webhooks add latency to each API server call, thus setting up a reasonable timeout for each AdmissionReview request from the webhook server side is critical. The default timeout is 10 seconds if not specified. When an AdmissionReview request sent to the webhook server timed out, `failurePolicy`(default to `Fail` which is equivalent to disallow) will be triggered.
 
-In the ValidatingWebhookConfiguration yaml example, a default timeout of two seconds is provided, cluster admins who wish to change the timeout may change the value of `timeoutSeconds`.
+In the ValidatingWebhookConfiguration yaml [example](#kubernetes-api-server-configuration), a default timeout of two seconds is provided, cluster admins who wish to change the timeout may change the value of `timeoutSeconds`.
 
 To avoid migration pain it is recommended to start with a `failurePolicy` value of `Ignore`, changing it to `Fail` only after the webhook is confirmed to have been installed successfully. Choosing `Ignore` means that it would be possible invalid objects can get created/updated in the system.
 
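As a rough illustration of the two settings discussed in this hunk, a ValidatingWebhookConfiguration along the following lines would set a two-second timeout and start with `failurePolicy: Ignore`. This is a sketch under stated assumptions, not the KEP's actual manifest: the `metadata.name`, the webhook `name`, the `rules`, `admissionReviewVersions`, and `sideEffects` values are hypothetical, while the service namespace, name, path, and the `caBundle` placeholder are taken from the example changed later in this diff.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: "snapshot-validation-webhook"          # hypothetical name
webhooks:
  - name: "snapshot-validation.example.com"    # hypothetical name
    rules:                                     # assumed scope: VolumeSnapshot creates and updates
      - apiGroups: ["snapshot.storage.k8s.io"]
        apiVersions: ["v1beta1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["volumesnapshots"]
        scope: "*"
    clientConfig:
      service:
        namespace: "default"
        name: "snapshot-validation-service"
        path: "/volumesnapshots"
      caBundle: "LS0tLS...base64 encoded of public key...LS0K"
    admissionReviewVersions: ["v1"]            # assumption
    sideEffects: None                          # assumption
    timeoutSeconds: 2                          # two-second timeout, as described above
    failurePolicy: Ignore                      # start permissive; switch to Fail once the webhook is confirmed working
```

With `Ignore`, a timed-out or unreachable webhook lets the request through, which is why invalid objects can still be created or updated until the policy is changed to `Fail`.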
@@ -389,6 +390,8 @@ For `UPDATE` operations, the webhook server will receive the existing object and
 
 Once we are sure no invalid data is persisted, we can switch to CRD schema-enforced validation with validating webhooks for immutability in a subsequent release.
 
+#### Rollback
+
 If users do not completely remove their invalid objects before upgrading their CRD definition, it should be possible to downgrade the CRD definition to allow invalid objects to get deleted.
 
 The rollback procedure would look like this:
@@ -398,6 +401,15 @@ The rollback procedure would look like this:
 4. User upgrades the control plane again.
 5. In an n+2 release, once all the invalid objects are purged, we can switch the storage version to v1.
 
+In phase 2, the storage version will be kept at v1beta1 in order to ensure the rollback is possible.
+
+In phase 3, the storage version will be changed to v1.
+
+```yaml
+v1 (served=true, storage=false)
+v1beta1 (served=false, storage=true)
+```
+
 #### Current Controller validation of OneOf semantic
 
 ##### Handling VolumeSnapshot.
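The phase 2 state added in the hunk above (v1 served but not stored, v1beta1 stored but not served) is written there in shorthand. In an actual CustomResourceDefinition it might look roughly like the fragment below. This is a sketch only: it assumes an `apiextensions.k8s.io/v1` CRD for the `snapshot.storage.k8s.io` group and omits the per-version schemas and other required fields.

```yaml
# Fragment of a VolumeSnapshot CustomResourceDefinition; schemas omitted for brevity.
spec:
  group: snapshot.storage.k8s.io
  versions:
    - name: v1
      served: true
      storage: false   # phase 3 would change this to true
    - name: v1beta1
      served: false
      storage: true    # phase 3 would change this to false
```

Exactly one entry in `versions` may have `storage: true`, so moving to phase 3 means flipping both flags in the same CRD update.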
@@ -451,7 +463,7 @@ webhooks:
 service:
   namespace: "default"
   name: "snapshot-validation-service"
-  path: "/path/to/webhook"
+  path: "/volumesnapshots"
 caBundle: "LS0tLS...base64 encoded of public key...LS0K"