Commit 136525c

Update with style suggestions and to include validating webhooks
1 parent 12db570 commit 136525c

1 file changed

+115
-78
lines changed

content/en/docs/concepts/cluster-administration/admission-webhooks-good-practices.md

Lines changed: 115 additions & 78 deletions
@@ -1,17 +1,17 @@
11
---
2-
title: Mutating Webhook Good Practices
2+
title: Admission Webhook Good Practices
33
description: >
4-
Recommendations for writing mutating admission webhooks in Kubernetes.
4+
Recommendations for designing and deploying admission webhooks in Kubernetes.
55
content_type: concept
66
weight: 60
77
---
88

99
<!-- overview -->
1010

1111
This page provides good practices and considerations when designing
12-
_mutating admission webhooks_ in Kubernetes. This information is intended for
13-
cluster operators who run your own admission webhook servers or third-party
14-
applications that modify your API requests.
12+
_admission webhooks_ in Kubernetes. This information is intended for
13+
cluster operators who run admission webhook servers or third-party applications
14+
that modify or validate your API requests.
1515

1616
Before reading this page, ensure that you're familiar with the following
1717
concepts:
@@ -23,33 +23,54 @@ concepts:
2323

2424
## Importance of good webhook design {#why-good-webhook-design-matters}
2525

26-
Mutating admission control occurs when any create, update, or delete request
27-
is sent to the Kubernetes API. These webhooks are often written to ensure that
28-
specific fields in object specifications exist or have specific allowed values.
29-
30-
With every release, Kubernetes adds or modifies the API with new features,
31-
feature promotions to beta or stable status, and deprecations. Even stable
32-
Kubernetes APIs are likely might change. For example, the `Pod` API changed in
33-
v1.29 to add the
26+
Admission control occurs when any create, update, or delete request
27+
is sent to the Kubernetes API. Admission controllers intercept requests that
28+
match specific criteria that you define. These requests are then sent to
29+
mutating admission webhooks or validating admission webhooks. These webhooks are
30+
often written to ensure that specific fields in object specifications exist or
31+
have specific allowed values.
32+
33+
Webhooks are a powerful mechanism to extend the Kubernetes API. Badly-designed
34+
webhooks often result in workload disruptions because of how much control
35+
the webhooks have over objects in the cluster. Like other API extension
36+
mechanisms, webhooks are challenging to test at scale for compatibility with
37+
all of your workloads, other webhooks, add-ons, and plugins.
38+
39+
Additionally, with every release, Kubernetes adds or modifies the API with new
40+
features, feature promotions to beta or stable status, and deprecations. Even
41+
stable Kubernetes APIs are likely to change. For example, the `Pod` API changed
42+
in v1.29 to add the
3443
[Sidecar containers](/docs/concepts/workloads/pods/sidecar-containers/) feature.
44+
While it's rare for a Kubernetes object to enter a broken state because of a new
45+
Kubernetes API, webhooks that worked as expected with earlier versions of an API
46+
might not be able to reconcile more recent changes to that API. This can result
47+
in unexpected behavior after you upgrade your clusters to newer versions.
3548

36-
Webhooks that worked as expected with earlier versions of an API might not be
37-
able to reconcile more recent changes to that API. This can result in unexpected
38-
behavior after you upgrade your clusters to newer versions.
49+
This page describes common webhook failure scenarios and how to avoid them by
50+
cautiously and thoughtfully designing and implementing your webhooks.
3951

40-
## Identify whether you use mutating webhooks {#identify-mutating-webhooks}
52+
## Identify whether you use admission webhooks {#identify-admission-webhooks}
4153

42-
Even if you don't run your own mutating admission webhooks, some third-party
43-
applications that you run in your clusters might include mutating webhooks.
54+
Even if you don't run your own admission webhooks, some third-party applications
55+
that you run in your clusters might use mutating or validating admission
56+
webhooks.
4457

45-
To check whether your cluster has any mutating webhooks, run the following
46-
command:
58+
To check whether your cluster has any mutating admission webhooks, run the
59+
following command:
4760

4861
```shell
4962
kubectl get mutatingwebhookconfigurations
5063
```
5164
The output lists any mutating admission controllers in the cluster.
5265

66+
To check whether your cluster has any validating admission webhooks, run the
67+
following command:
68+
69+
```shell
70+
kubectl get validatingwebhookconfigurations
71+
```
72+
The output lists any validating admission controllers in the cluster.
73+
5374
## Choose an admission control mechanism {#choose-admission-mechanism}
5475

5576
Kubernetes includes multiple admission control and policy enforcement options.
@@ -122,7 +143,7 @@ control when possible.
122143

123144
If you use
124145
{{< glossary_tooltip text="CustomResourceDefinitions" term_id="customresourcedefinition" >}},
125-
don't use mutating webhooks to validate values in CustomResource specifications
146+
don't use admission webhooks to validate values in CustomResource specifications
126147
or to set default values for fields. Kubernetes lets you define validation rules
127148
and default field values when you create CustomResourceDefinitions.
128149

@@ -140,8 +161,9 @@ latency. In summary, these are as follows:
140161
* Use audit logs to check for webhooks that repeatedly do the same action.
141162
* Use load balancing for webhook availability.
142163
* Set a small timeout value for each webhook.
164+
* Consider cluster availability needs during webhook design.
143165

144-
### Improve latency in mutating webhooks {#improve-latency-mutating-webhooks}
166+
### Design admission webhooks for low latency {#design-admission-webhooks-low-latency}
145167

146168
Mutating admission webhooks are called in sequence. Depending on the mutating
147169
webhook setup, some webhooks might be called multiple times. Every mutating
@@ -197,8 +219,22 @@ For details, see
197219

198220
Admission webhooks should leverage some form of load-balancing to provide high
199221
availability and performance benefits. If a webhook is running within the
200-
cluster, you can run multiple webhook backends behind a Service to use the
201-
Service load balancing.
222+
cluster, you can run multiple webhook backends behind a Service of type
223+
`ClusterIP`.
224+
225+
### Use a high-availability deployment model {#ha-deployment}
226+
227+
Consider your cluster's availability requirements when designing your webhook.
228+
For example, during node downtime or zonal outages, Kubernetes marks Pods as
229+
`NotReady` to allow load balancers to reroute traffic to available zones and
230+
nodes. These updates to Pods might trigger your mutating webhooks. Depending on
231+
the number of affected Pods, the mutating webhook server has a risk of timing
232+
out or causing delays in Pod processing. As a result, traffic won't get
233+
rerouted as quickly as you need.
234+
235+
Consider situations like the preceding example when writing your webhooks.
236+
Exclude operations that are a result of Kubernetes responding to unavoidable
237+
incidents.
202238

203239
## Request filtering {#request-filtering}
204240

@@ -212,20 +248,23 @@ specific webhooks. In summary, these are as follows:
212248

213249
### Limit the scope of each webhook {#webhook-limit-scope}
214250

215-
Mutating webhooks run when an API request matches the webhook configuration.
216-
Limit the scope of each webhook to reduce unnecessary calls to the webhook
217-
server. Consider the following scope limitations:
251+
Admission webhooks are only called when an API request matches the corresponding
252+
webhook configuration. Limit the scope of each webhook to reduce unnecessary
253+
calls to the webhook server. Consider the following scope limitations:
218254

219-
* Don't match objects in the `kube-system` namespace. If you run your own Pods
220-
in the `kube-system` namespace, use an
221-
[objectSelector](/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-objectselector)
255+
* Avoid matching objects in the `kube-system` namespace. If you run your own
256+
Pods in the `kube-system` namespace, use an
257+
[`objectSelector`](/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-objectselector)
222258
to avoid mutating a critical workload.
223-
* Don't match node leases. Intercepting node leases might result in failed node
224-
upgrades.
225-
* Don't match `TokenReview` or `SubjectAccessReview` requests. These are always
226-
read-only requests. Modifying these requests might break your cluster.
259+
* Don't mutate node leases, which exist as Lease objects in the
260+
`kube-node-lease` system namespace. Mutating node leases might result in
261+
failed node upgrades. Only apply validation controls to Lease objects in this
262+
namespace if you're confident that the controls won't put your cluster at
263+
risk.
264+
* Don't mutate TokenReview or SubjectAccessReview objects. These are always
265+
read-only requests. Modifying these objects might break your cluster.
227266
* Limit each webhook to a specific namespace by using a
228-
[namespaceSelector](/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-namespaceselector).
267+
[`namespaceSelector`](/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-namespaceselector).
229268

230269
### Filter for specific requests by using match conditions {#filter-match-conditions}
231270

@@ -242,7 +281,7 @@ server.
242281
For details, see
243282
[Matching requests: `matchConditions`](/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-matchconditions).
244283

245-
### Match all versions of an object {#match-all-versions}
284+
### Match all versions of an API {#match-all-versions}
246285

247286
By default, admission webhooks run on any API versions that affect a specified
248287
resource. The `matchPolicy` field in the webhook configuration controls this
@@ -269,14 +308,14 @@ considerations for object fields. In summary, these are as follows:
269308
### Patch only required fields {#patch-required-fields}
270309

271310
Admission webhook servers send HTTP responses to indicate what to do with a
272-
specific Kubernetes API request. This response is an `AdmissionReview` object.
311+
specific Kubernetes API request. This response is an AdmissionReview object.
273312
A mutating webhook can add specific fields to mutate before allowing admission
274313
by using the `patchType` field and the `patch` field in the response. Ensure
275314
that you only modify the fields that require a change.
276315

277316
For example, consider a mutating webhook that's configured to ensure that
278317
`web-server` Deployments have at least three replicas. When a request to
279-
create a `Deployment` object matches your webhook configuration, the webhook
318+
create a Deployment object matches your webhook configuration, the webhook
280319
should only update the value in the `spec.replicas` field.
281320

282321
### Don't overwrite array values {#dont-overwrite-arrays}
@@ -301,14 +340,14 @@ Consider the following when modifying arrays:
301340

302341
### Avoid side effects {#avoid-side-effects}
303342

304-
Ensure that your webhooks operate only on the content of the `AdmissionReview`
343+
Ensure that your webhooks operate only on the content of the AdmissionReview
305344
that's sent to them, and do not make out-of-band changes. These additional
306345
changes, called _side effects_, might cause conflicts during admission if they
307346
aren't reconciled properly. The `.webhooks[].sideEffects` field should
308347
be set to `None` if a webhook doesn't have any side effect.
309348

310349
If side effects are required during the admission evaluation, they must be
311-
suppressed when processing an `AdmissionReview` object with `dryRun` set to
350+
suppressed when processing an AdmissionReview object with `dryRun` set to
312351
`true`, and the `.webhooks[].sideEffects` field should be set to `NoneOnDryRun`.
313352

314353
For details, see
@@ -320,17 +359,17 @@ A webhook running inside the cluster might cause deadlocks for its own
320359
deployment if it is configured to intercept resources required to start its own
321360
Pods.
322361

323-
For example, a mutating admission webhook is configured to admit `CREATE` Pod
324-
requests only if a certain label is set in the Pod (such as `"env": "prod"`).
325-
The webhook server runs in a Deployment that doesn't set the `"env"` label.
362+
For example, a mutating admission webhook is configured to admit **create** Pod
363+
requests only if a certain label is set in the Pod (such as `env: prod`).
364+
The webhook server runs in a Deployment that doesn't set the `env` label.
326365

327366
When a node that runs the webhook server Pods becomes unhealthy, the webhook
328367
Deployment tries to reschedule the Pods to another node. However, the existing
329-
webhook server rejects the requests since the `"env"` label is unset. As a
368+
webhook server rejects the requests since the `env` label is unset. As a
330369
result, the migration cannot happen.
331370

332371
Exclude the namespace where your webhook is running with a
333-
[namespaceSelector](/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-namespaceselector).
372+
[`namespaceSelector`](/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-namespaceselector).
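
As a sketch of that exclusion, the following fragment assumes the webhook server runs in a namespace named `webhook-system`, which is a hypothetical name. It relies on the `kubernetes.io/metadata.name` label that the control plane sets on every namespace. The webhook name is also hypothetical, and other required fields are omitted.

```yaml
# Fragment of a webhook configuration; the names are hypothetical.
webhooks:
- name: example.webhook.example.com
  namespaceSelector:
    matchExpressions:
    # Never intercept requests in the namespace that hosts the webhook server,
    # so that its own Pods can always be recreated.
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values: ["webhook-system"]
```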
334373

335374
### Fail open and validate the final state {#fail-open-validate-final-state}
336375

@@ -366,7 +405,7 @@ added a `restartPolicy` field to the Pod API.
366405
Mutating webhooks that respond to a broad range of API requests might
367406
unintentionally trigger themselves. For example, consider a webhook that
368407
responds to all requests in the cluster. If you configure the webhook to create
369-
`Event` objects for every mutation, it'll respond to its own `Event` object
408+
Event objects for every mutation, it'll respond to its own Event object
370409
creation requests.
371410

372411
To avoid this, consider setting a unique label in any resources that your
@@ -381,7 +420,10 @@ kubelet on the node creates a
381420
server to track the static Pod. However, changes to the mirror Pod don't
382421
propagate to the static Pod.
383422

384-
Don't attempt to mutate these objects during admission.
423+
Don't attempt to mutate these objects during admission. All mirror Pods have the
424+
`kubernetes.io/config.mirror` annotation. To exclude mirror Pods while reducing
425+
the security risk of ignoring an annotation, allow static Pods to only run in
426+
specific namespaces.
385427

386428
## Mutating webhook ordering and idempotence {#ordering-idempotence}
387429

@@ -394,7 +436,7 @@ webhooks. In summary, these are as follows:
394436
* Ensure that the set of mutating webhooks is idempotent, not just the
395437
individual webhooks.
396438

397-
### Don't rely on mutating webhook order {#dont-rely-webhook-order}
439+
### Don't rely on mutating webhook invocation order {#dont-rely-webhook-order}
398440

399441
Mutating admission webhooks don't run in a consistent order. Various factors
400442
might change when a specific webhook is called. Don't rely on your webhook
@@ -431,22 +473,22 @@ challenging. The following recommendations might help:
431473

432474
The following examples show idempotent mutation logic:
433475

434-
1. For a `CREATE` Pod request, set the field
476+
1. For a **create** Pod request, set the field
435477
`.spec.securityContext.runAsNonRoot` of the Pod to true.
436478

437-
1. For a `CREATE` Pod request, if the field
479+
1. For a **create** Pod request, if the field
438480
`.spec.containers[].resources.limits` of a container is not set, set default
439481
resource limits.
440482

441-
1. For a `CREATE` Pod request, inject a sidecar container with name
483+
1. For a **create** Pod request, inject a sidecar container with name
442484
`foo-sidecar` if no container with the name `foo-sidecar` already exists.
443485

444486
In these cases, the webhook can be safely reinvoked, or admit an object that
445487
already has the fields set.
446488

447489
The following examples show non-idempotent mutation logic:
448490

449-
1. For a `CREATE` Pod request, inject a sidecar container with name
491+
1. For a **create** Pod request, inject a sidecar container with name
450492
`foo-sidecar` suffixed with the current timestamp (such as
451493
`foo-sidecar-19700101-000000`).
452494

@@ -455,12 +497,12 @@ The following examples show non-idempotent mutation logic:
455497
webhook can inject duplicated containers if the sidecar already exists in
456498
a user-provided pod.
457499

458-
1. For a `CREATE`/`UPDATE` Pod request, reject if the Pod has label `"env"` set,
459-
otherwise add an `"env": "prod"` label to the Pod.
500+
1. For a **create**/**update** Pod request, reject if the Pod has label `env`
501+
set, otherwise add an `env: prod` label to the Pod.
460502

461503
Reinvoking the webhook will result in the webhook failing on its own output.
462504

463-
1. For a `CREATE` Pod request, append a sidecar container named `foo-sidecar`
505+
1. For a **create** Pod request, append a sidecar container named `foo-sidecar`
464506
without checking whether a `foo-sidecar` container exists.
465507

466508
Reinvoking the webhook will result in duplicated containers in the Pod, which
@@ -471,10 +513,20 @@ The following examples show non-idempotent mutation logic:
471513
This section provides recommendations for testing your mutating webhooks and
472514
validating mutated objects. In summary, these are as follows:
473515

516+
* Test webhooks in staging environments.
474517
* Avoid mutations that violate validations.
475518
* Test minor version upgrades for regressions and conflicts.
476519
* Validate mutated objects before admission.
477520

521+
### Test webhooks in staging environments {#test-in-staging-environments}
522+
523+
Robust testing should be a core part of your release cycle for new or updated
524+
webhooks. If possible, test any changes to your cluster webhooks in a staging
525+
environment that closely resembles your production clusters. At the very least,
526+
consider using a tool like [minikube](https://minikube.sigs.k8s.io/docs/) or
527+
[kind](https://kind.sigs.k8s.io/) to create a small test cluster for webhook
528+
changes.
529+
478530
### Ensure that mutations don't violate validations {#ensure-mutations-dont-violate-validations}
479531

480532
Your mutating webhooks shouldn't break any of the validations that apply to an
@@ -524,19 +576,18 @@ webhooks. In summary, these are as follows:
524576
* Limit access to edit the webhook configuration resources.
525577
* Limit access to the namespace that runs the webhook server, if the server is
526578
in-cluster.
527-
* Consider cluster availability needs during webhook design.
528579

529580
### Install and enable a mutating webhook {#install-enable-mutating-webhook}
530581

531582
When you're ready to deploy your mutating webhook to a cluster, use the
532583
following order of operations:
533584

534585
1. Install the webhook server and start it.
535-
1. Set the `failurePolicy` field in the `MutatingWebhookConfiguration` object
586+
1. Set the `failurePolicy` field in the MutatingWebhookConfiguration manifest
536587
to Ignore. This lets you avoid disruptions caused by misconfigured webhooks.
537-
1. Set the `namespaceSelector` field in the `MutatingWebhookConfiguration`
538-
object to a test namespace.
539-
1. Deploy the `MutatingWebhookConfiguration` object to your cluster.
588+
1. Set the `namespaceSelector` field in the MutatingWebhookConfiguration
589+
manifest to a test namespace.
590+
1. Deploy the MutatingWebhookConfiguration to your cluster.
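
A manifest that follows the steps above might look like the following sketch. The object name, webhook name, Service name, namespaces, path, timeout, and rule are all hypothetical placeholders, and the `caBundle` value is left for you to supply.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: example-mutating-webhook        # hypothetical
webhooks:
- name: example.webhook.example.com     # hypothetical
  admissionReviewVersions: ["v1"]
  sideEffects: None
  # Step 2: failures of the webhook don't block API requests.
  failurePolicy: Ignore
  timeoutSeconds: 5
  # Step 3: limit the webhook to a single test namespace (hypothetical name).
  namespaceSelector:
    matchLabels:
      kubernetes.io/metadata.name: webhook-test
  clientConfig:
    service:
      name: example-webhook             # hypothetical Service in front of the server
      namespace: webhook-system         # hypothetical
      path: /mutate                     # hypothetical
    caBundle: <base64-encoded CA bundle>
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
```

Once the webhook behaves as expected in the test namespace, widening the `namespaceSelector` is one way to roll it out further.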
540591

541592
Monitor the webhook in the test namespace to check for any issues, then roll the
542593
webhook out to other namespaces. If the webhook intercepts an API request that
@@ -547,30 +598,16 @@ webhook configuration.
547598

548599
Mutating webhooks are powerful Kubernetes controllers. Use RBAC or another
549600
authorization mechanism to limit access to your webhook configurations and
550-
servers. Ensure that the following access is only available to trusted
601+
servers. For RBAC, ensure that the following access is only available to trusted
551602
entities:
552603

553-
* Verbs: `create`, `update`, `patch`, `delete`, `deletecollection`
604+
* Verbs: **create**, **update**, **patch**, **delete**, **deletecollection**
554605
* API group: `admissionregistration.k8s.io/v1`
555-
* Resources: `MutatingWebhookConfigurations`
606+
* API kind: MutatingWebhookConfigurations
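
For reference, an RBAC rule that grants this access could look like the following sketch; the ClusterRole name is hypothetical. Note that the RBAC `apiGroups` field takes the group name without a version, and `resources` takes the lowercase plural resource name. Bind a role like this only to trusted subjects, and review existing roles for equivalent or wildcard rules.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: webhook-config-editor   # hypothetical
rules:
- apiGroups: ["admissionregistration.k8s.io"]
  resources: ["mutatingwebhookconfigurations"]
  verbs: ["create", "update", "patch", "delete", "deletecollection"]
```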
556607

557608
If your mutating webhook server runs in the cluster, limit access to create or
558609
modify any resources in that namespace.
559610

560-
### Use a high-availability deployment model {#ha-deployment}
561-
562-
Consider your cluster's availability requirements when designing your webhook.
563-
For example, during node downtime or zonal outages, Kubernetes marks Pods as
564-
`NotReady` to allow load balancers to reroute traffic to available zones and
565-
nodes. These updates to Pods might trigger your mutating webhooks. Depending on
566-
the number of affected Pods, the mutating webhook server has a risk of timing
567-
out or causing delays in Pod processing. As a result, traffic won't get
568-
rerouted as quickly as you need.
569-
570-
Consider situations like the preceding example when writing your webhooks.
571-
Exclude operations that are a result of Kubernetes responding to unavoidable
572-
incidents.
573-
574611
## Examples of good implementations {#example-good-implementations}
575612

576613
{{% thirdparty-content %}}
