content_type: concept
weight: 20
---

<!-- overview -->

You can constrain a {{< glossary_tooltip text="Pod" term_id="pod" >}} so that it is
_restricted_ to run on particular {{< glossary_tooltip text="node(s)" term_id="node" >}},
or to _prefer_ to run on particular nodes.
There are several ways to do this and the recommended approaches all use
[label selectors](/docs/concepts/overview/working-with-objects/labels/) to facilitate the selection.
Often, you do not need to set any such constraints; the
{{< glossary_tooltip text="scheduler" term_id="kube-scheduler" >}} will automatically do a reasonable placement
(for example, spreading your Pods across nodes so as not to place Pods on a node with insufficient free resources).
However, there are some circumstances where you may want to control which node
the Pod deploys to, for example, to ensure that a Pod ends up on a node with an SSD attached to it,
or to co-locate Pods from two different services that communicate a lot into the same availability zone.

You can use any of the following methods to choose where Kubernetes schedules
specific Pods:

- [nodeSelector](#nodeselector) field matching against [node labels](#built-in-node-labels)
- [Affinity and anti-affinity](#affinity-and-anti-affinity)
- [nodeName](#nodename) field
- [Pod topology spread constraints](#pod-topology-spread-constraints)

## Node labels {#built-in-node-labels}

Adding labels to nodes allows you to target Pods for scheduling on specific
nodes or groups of nodes. You can use this functionality to ensure that specific
Pods only run on nodes with certain isolation, security, or regulatory
properties.

If you use labels for node isolation, choose label keys that the {{< glossary_tooltip text="kubelet" term_id="kubelet" >}}
cannot modify. This prevents a compromised node from setting those labels on
itself so that the scheduler schedules workloads onto the compromised node.

The [`NodeRestriction` admission plugin](/docs/reference/access-authn-authz/admission-controllers/#noderestriction)
prevents the kubelet from setting or modifying labels with a
`node-restriction.kubernetes.io/` prefix.

To make use of that label prefix for node isolation:

1. Ensure you are using the [Node authorizer](/docs/reference/access-authn-authz/node/) and have
   enabled the `NodeRestriction` admission plugin.
1. Add labels with the `node-restriction.kubernetes.io/` prefix to your nodes, and use those labels
   in your [node selectors](#nodeselector). For example,
   `example.com.node-restriction.kubernetes.io/fips=true` or
   `example.com.node-restriction.kubernetes.io/pci-dss=true`.

## nodeSelector

`nodeSelector` is the simplest recommended form of node selection constraint.
You can add the `nodeSelector` field to your Pod specification and specify the
[node labels](#built-in-node-labels) you want the target node to have.
Kubernetes only schedules the Pod onto nodes that have each of the labels you
specify.
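
For instance, a minimal Pod manifest along the lines of the sketch below only becomes
eligible for nodes that carry a matching label; the `disktype: ssd` label is a
hypothetical example, not a label Kubernetes defines for you:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    disktype: ssd   # only nodes labelled disktype=ssd are eligible
```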

See [Assign Pods to Nodes](/docs/tasks/configure-pod-container/assign-pods-nodes) for more
information.

## Affinity and anti-affinity

`nodeSelector` is the simplest way to constrain Pods to nodes with specific
labels. Affinity and anti-affinity expand the types of constraints you can
define. Some of the benefits of affinity and anti-affinity include:

- The affinity/anti-affinity language is more expressive. `nodeSelector` only
  selects nodes with all the specified labels. Affinity/anti-affinity gives you
  more control over the selection logic.
- You can indicate that a rule is *soft* or *preferred*, so that the scheduler
  still schedules the Pod even if it can't find a matching node.
- You can constrain a Pod using labels on other Pods running on the node (or other topological domain),
  instead of just node labels, which allows you to define rules for which Pods
  can be co-located on a node.

The affinity feature consists of two types of affinity:

- *Node affinity* functions like the `nodeSelector` field but is more expressive and
  allows you to specify soft rules.
- *Inter-pod affinity/anti-affinity* allows you to constrain Pods against labels
  on other Pods.

### Node affinity

Node affinity is conceptually similar to `nodeSelector`, allowing you to constrain which nodes your
Pod can be scheduled on based on node labels. There are two types of node
affinity:

- `requiredDuringSchedulingIgnoredDuringExecution`: The scheduler can't
  schedule the Pod unless the rule is met. This functions like `nodeSelector`,
  but with a more expressive syntax.
- `preferredDuringSchedulingIgnoredDuringExecution`: The scheduler tries to
  find a node that meets the rule. If a matching node is not available, the
  scheduler still schedules the Pod.

{{< note >}}
In the preceding types, `IgnoredDuringExecution` means that if the node labels
change after Kubernetes schedules the Pod, the Pod continues to run.
{{< /note >}}

You can specify node affinities using the `.spec.affinity.nodeAffinity` field in
your Pod spec.
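
For example, consider a Pod spec along the lines of the following sketch, which expresses
one required and one preferred node affinity rule; the zone names and the
`another-node-label-key` label are placeholder values used throughout this example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      # hard requirement: only nodes in one of these zones are eligible
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - antarctica-east1
            - antarctica-west1
      # soft preference: favour nodes that also carry this label
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: registry.k8s.io/pause:3.8
```
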
In this example, the following rules apply:

- The node *must* have a label with the key `topology.kubernetes.io/zone` and
  the value of that label *must* be either `antarctica-east1` or `antarctica-west1`.
- The node *preferably* has a label with the key `another-node-label-key` and
  the value `another-node-label-value`.

You can use the `operator` field to specify a logical operator for Kubernetes to use when
interpreting the rules. You can use `In`, `NotIn`, `Exists`, `DoesNotExist`,
`Gt` and `Lt`.

`NotIn` and `DoesNotExist` allow you to define node anti-affinity behavior.
Alternatively, you can use [node taints](/docs/concepts/scheduling-eviction/taint-and-toleration/)
to repel Pods from specific nodes.
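
As an illustration, a single `matchExpressions` list can mix these operators. The sketch
below shows only the `.spec.affinity` portion of a Pod; `kubernetes.io/arch` is a standard
node label, while `example.com/gpu-count` is a hypothetical numeric label used to show
`Gt` (the values used with `Gt`/`Lt` must be strings that parse as integers):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/arch
          operator: NotIn      # anti-affinity style: exclude nodes with these values
          values:
          - arm64
        - key: example.com/gpu-count
          operator: Gt         # numeric comparison against the label value
          values:
          - "3"
```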

{{< note >}}
If you specify both `nodeSelector` and `nodeAffinity`, *both* must be satisfied
for the Pod to be scheduled onto a node.

If you specify multiple terms in `nodeSelectorTerms` associated with `nodeAffinity`
types, then the Pod can be scheduled onto a node if one of the specified terms
can be satisfied (terms are ORed).

If you specify multiple expressions in a single `matchExpressions` field associated with a
term in `nodeSelectorTerms`, then the Pod can be scheduled onto a node only
if all the expressions are satisfied (expressions are ANDed).
{{< /note >}}

#### Node affinity weight

You can specify a `weight` between 1 and 100 for each instance of the
`preferredDuringSchedulingIgnoredDuringExecution` affinity type. When the
scheduler finds nodes that meet all the other scheduling requirements of the Pod, the
scheduler iterates through every preferred rule that the node satisfies and adds the
value of the `weight` for that expression to a sum.

The final sum is added to the score of other priority functions for the node.
Nodes with the highest total score are prioritized when the scheduler makes a
scheduling decision for the Pod.

For example, consider the following Pod spec:

{{< codenew file="pods/pod-with-affinity-anti-affinity.yaml" >}}

### Inter-pod affinity and anti-affinity

Inter-pod affinity and anti-affinity allow you to constrain which nodes your
Pods can be scheduled on based on the labels of Pods already running on that
node, instead of the node labels.

Similar to [node affinity](#node-affinity) are two types of Pod affinity and
anti-affinity as follows:

- `requiredDuringSchedulingIgnoredDuringExecution`
- `preferredDuringSchedulingIgnoredDuringExecution`

For example, you could use
`requiredDuringSchedulingIgnoredDuringExecution` affinity to tell the scheduler to
co-locate Pods of two services in the same cloud provider zone because they
communicate a lot with each other. Similarly, you could use
`preferredDuringSchedulingIgnoredDuringExecution` anti-affinity to spread Pods from a
service across multiple cloud provider zones.

To use inter-pod affinity, use the `affinity.podAffinity` field in the Pod spec.
For inter-pod anti-affinity, use the `affinity.podAntiAffinity` field in the Pod spec.

#### Pod affinity example
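
Consider a Pod spec along the lines of the following sketch, which defines one "hard"
`requiredDuringSchedulingIgnoredDuringExecution` Pod affinity rule and one "soft"
`preferredDuringSchedulingIgnoredDuringExecution` Pod anti-affinity rule; the
`security=S1` and `security=S2` labels are example values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      # hard rule: schedule near Pods labelled security=S1, per zone
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: topology.kubernetes.io/zone
    podAntiAffinity:
      # soft rule: prefer to avoid zones that run Pods labelled security=S2
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: topology.kubernetes.io/zone
  containers:
  - name: with-pod-affinity
    image: registry.k8s.io/pause:3.8
```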

The affinity rule says that the scheduler can only schedule a Pod onto a node if
the node is in the same zone as one or more existing Pods with the label
`security=S1`. More precisely, the scheduler must place the Pod on a node that has the
`topology.kubernetes.io/zone=V` label, as long as there is at least one node in
that zone that currently has one or more Pods with the Pod label `security=S1`.

The anti-affinity rule says that the scheduler should try to avoid scheduling
the Pod onto a node that is in the same zone as one or more Pods with the label
`security=S2`. More precisely, the scheduler should try to avoid placing the Pod
on a node that has the `topology.kubernetes.io/zone=R` label if there are other
nodes in the same zone currently running Pods with the `security=S2` Pod label.

You can use the `In`, `NotIn`, `Exists` and `DoesNotExist` values in the
`operator` field for Pod affinity and anti-affinity.

In principle, the `topologyKey` can be any allowed label key with the following
exceptions for performance and security reasons:

- For Pod affinity and anti-affinity, an empty `topologyKey` field is not allowed in both `requiredDuringSchedulingIgnoredDuringExecution`
  and `preferredDuringSchedulingIgnoredDuringExecution`.
- For `requiredDuringSchedulingIgnoredDuringExecution` Pod anti-affinity rules,
  the admission controller `LimitPodHardAntiAffinityTopology` limits
  `topologyKey` to `kubernetes.io/hostname`. You can modify or disable the
  admission controller if you want to allow custom topologies.

In addition to `labelSelector` and `topologyKey`, you can optionally specify a list
of namespaces which the `labelSelector` should match against using the
`namespaces` field at the same level as `labelSelector` and `topologyKey`.
If omitted or empty, `namespaces` defaults to the namespace of the Pod where the
affinity/anti-affinity definition appears.

#### Namespace selector

{{< feature-state for_k8s_version="v1.24" state="stable" >}}

You can also select matching namespaces using `namespaceSelector`, which is a label query over the set of namespaces.
The affinity term is applied to namespaces selected by both `namespaceSelector` and the `namespaces` field.
Note that an empty `namespaceSelector` ({}) matches all namespaces, while a null or empty `namespaces` list and
null `namespaceSelector` matches the namespace of the Pod where the rule is defined.
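
For instance, an affinity term can combine a `labelSelector` with a `namespaceSelector`.
The sketch below shows only the `.spec.affinity` portion of a Pod; the `team: backend`
namespace label and the `app: web` Pod label are hypothetical examples:

```yaml
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: web              # Pods the rule matches against
      namespaceSelector:        # label query over namespaces to search in
        matchLabels:
          team: backend
      topologyKey: topology.kubernetes.io/zone
```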

#### More practical use-cases

Inter-pod affinity and anti-affinity can be even more useful when they are used with higher
level collections such as ReplicaSets, StatefulSets, Deployments, etc. These
rules allow you to specify that a set of workloads should
be co-located in the same defined topology; for example, preferring to place two related
Pods onto the same node.

For example, imagine a three-node cluster that runs a web application together with an
in-memory cache such as Redis, where you want the web servers to be co-located with the
cache as much as possible. You could express this with two Deployments: one for the cache,
whose Pod template uses `podAntiAffinity` so that the cache replicas are spread across
nodes, and one for the web servers, whose Pod template combines `podAffinity` (run near a
cache Pod) with `podAntiAffinity` (do not place two web server replicas on the same node).
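
A sketch of what those two Deployments could look like is shown below; the names, labels,
and images are illustrative, and `topologyKey: kubernetes.io/hostname` makes each rule
apply per node:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cache
spec:
  replicas: 3
  selector:
    matchLabels:
      app: store
  template:
    metadata:
      labels:
        app: store
    spec:
      affinity:
        podAntiAffinity:
          # never place two cache replicas on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: kubernetes.io/hostname
      containers:
      - name: redis-server
        image: redis:3.2-alpine
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-store
  template:
    metadata:
      labels:
        app: web-store
    spec:
      affinity:
        podAntiAffinity:
          # never place two web server replicas on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web-store
            topologyKey: kubernetes.io/hostname
        podAffinity:
          # place each web server on a node that already runs a cache Pod
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: kubernetes.io/hostname
      containers:
      - name: web-app
        image: nginx:1.16-alpine
```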

Creating the two preceding Deployments results in the following cluster layout,
where each web server is co-located with a cache, on three separate nodes.

| node-1        | node-2        | node-3        |
| :-----------: | :-----------: | :-----------: |
| *webserver-1* | *webserver-2* | *webserver-3* |
| *cache-1*     | *cache-2*     | *cache-3*     |

The overall effect is that each cache instance is likely to be accessed by a single client
that is running on the same node. This approach aims to minimize both skew (imbalanced load) and latency.

## nodeName

`nodeName` is a more direct form of node selection than affinity or
`nodeSelector`. `nodeName` is a field in the Pod spec. If the `nodeName` field
is not empty, the scheduler ignores the Pod and the kubelet on the named node
tries to place the Pod on that node. Using `nodeName` overrules using
`nodeSelector` or affinity and anti-affinity rules.

Some of the limitations of using `nodeName` to select nodes are:

- If the named node does not exist, the Pod will not run, and in
  some cases may be automatically deleted.
- If the named node does not have the resources to accommodate the
  Pod, the Pod will fail and its reason will indicate why,
  for example OutOfmemory or OutOfcpu.
- Node names in cloud environments are not always predictable or stable.

{{< note >}}
`nodeName` is intended for use by custom schedulers or advanced use cases where
you need to bypass any configured schedulers. Bypassing the schedulers might
lead to failed Pods if the assigned Nodes get overloaded. You can use
[node affinity](#node-affinity) or the [`nodeSelector` field](#nodeselector) to assign
a Pod to a specific Node without bypassing the schedulers.
{{< /note >}}
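
Here is a sketch of a Pod spec using the `nodeName` field; `kube-01` stands in for
whatever node name actually exists in your cluster:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  nodeName: kube-01   # place this Pod directly onto the node named kube-01
```

A Pod like the one above only runs on the node named `kube-01`.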

## Pod topology spread constraints

You can use _topology spread constraints_ to control how Pods are spread across your
cluster among failure-domains such as regions, zones, nodes, or among any other
topology domains that you define. You might do this to improve performance, expected
availability, or overall utilization.

Read [Pod topology spread constraints](/docs/concepts/scheduling-eviction/topology-spread-constraints/)
to learn more about how these work.

## {{% heading "whatsnext" %}}

- Read more about [taints and tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/).
- Read the design docs for [node affinity](https://git.k8s.io/design-proposals-archive/scheduling/nodeaffinity.md)
  and for [inter-pod affinity/anti-affinity](https://git.k8s.io/design-proposals-archive/scheduling/podaffinity.md).
- Learn about how the [topology manager](/docs/tasks/administer-cluster/topology-manager/) takes part in node-level
  resource allocation decisions.
- Learn how to use [nodeSelector](/docs/tasks/configure-pod-container/assign-pods-nodes/).
- Learn how to use [affinity and anti-affinity](/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/).